
C++?? A Critique of C++

and Programming and Language Trends of the 1990s

3rd Edition

Ian Joyner

The views in this critique in no way reflect the position of my employer

© Ian Joyner 1996


1. INTRODUCTION

2. THE ROLE OF A PROGRAMMING LANGUAGE
2.1 PROGRAMMING
2.2 COMMUNICATION, ABSTRACTION AND PRECISION
2.3 NOTATION
2.4 TOOL INTEGRATION
2.5 CORRECTNESS
2.6 TYPES
2.7 REDUNDANCY AND CHECKING
2.8 ENCAPSULATION
2.9 SAFETY AND COURTESY CONCERNS
2.10 IMPLEMENTATION AND DEPLOYMENT CONCERNS
2.11 CONCLUDING REMARKS

3. C++ SPECIFIC CRITICISMS
3.1 VIRTUAL FUNCTIONS
3.2 GLOBAL ANALYSIS
3.3 TYPE-SAFE LINKAGE
3.4 FUNCTION OVERLOADING
3.5 THE NATURE OF INHERITANCE
3.6 MULTIPLE INHERITANCE
3.7 VIRTUAL CLASSES
3.8 TEMPLATES
3.9 NAME OVERLOADING
3.10 NESTED CLASSES
3.11 GLOBAL ENVIRONMENTS
3.12 POLYMORPHISM AND INHERITANCE
3.13 TYPE CASTS
3.14 RTTI AND TYPE CASTS
3.15 NEW TYPE CASTS
3.16 JAVA AND CASTS
3.17 ‘.’ AND ‘->’
3.18 ANONYMOUS PARAMETERS IN CLASS DEFINITIONS
3.19 NAMELESS CONSTRUCTORS
3.20 CONSTRUCTORS AND TEMPORARIES
3.21 OPTIONAL PARAMETERS
3.22 BAD DELETIONS
3.23 LOCAL ENTITY DECLARATIONS
3.24 MEMBERS
3.25 INLINES
3.26 FRIENDS
3.27 CONTROLLED EXPORTS VS FRIENDS
3.28 STATIC
3.29 UNION
3.30 STRUCTS
3.31 TYPEDEFS
3.32 NAMESPACES
3.33 HEADER FILES
3.34 CLASS INTERFACES
3.35 CLASS HEADER DECLARATIONS
3.36 GARBAGE COLLECTION
3.37 LOW LEVEL CODING
3.38 SIGNATURE VARIANCE
3.39 PURE VIRTUAL FUNCTIONS
3.40 PROGRAMMING BY CONTRACT
3.41 C++ AND THE SOFTWARE LIFECYCLE
3.42 CASE TOOLS
3.43 REUSABILITY AND COMMUNICATION
3.44 REUSABILITY AND TRUST
3.45 REUSABILITY AND COMPATIBILITY
3.46 REUSABILITY AND PORTABILITY
3.47 IDIOMATIC PROGRAMMING
3.48 CONCURRENT PROGRAMMING
3.49 STANDARDISATION, STABILITY AND MATURITY
3.50 COMPLEXITY
3.51 C++: THE OVERWHELMING OOL OF CHOICE?

4. GENERIC C CRITICISMS
4.1 POINTERS
4.2 ARRAYS
4.3 FUNCTION ARGUMENTS
4.4 VOID AND VOID *
4.5 VOID FN ()
4.6 FN ()
4.7 FN (VOID)
4.8 METADATA IN STRINGS
4.9 ++, --
4.10 DEFINES
4.11 NULL VS 0
4.12 CASE SENSITIVITY
4.13 ASSIGNMENT OPERATOR
4.14 CHAR; SIGNED AND UNSIGNED
4.15 SEMICOLONS
4.16 BOOLEANS
4.17 COMMENTS
4.18 CPAGHE++I
4.18.1 Cpaghe++i Gotos
4.18.2 Cpaghe++i Globals
4.18.3 Cpaghe++i Pointers

5. CONCLUSIONS

6. BIBLIOGRAPHY

7. WEBLIOGRAPHY



1. Introduction

This is now the third edition of this critique; it has been four years since the last edition. The main factor to precipitate a new edition is that there are now more environments and languages available that rectify the problems of C++. The last edition was addressed to people who were considering adopting C++, in particular managers who would have to fund projects. There are now more choices, so comparison to the alternatives makes the critique less hypothetical. The critique was not meant as an academic treatise, although some of the aspects relating to inheritance, etc., required a bit of technical knowledge.

The critique is long; it would be good if it were shorter, but that would be possible only if there were fewer flaws in C++. Even so, the critique is not exhaustive of the flaws: I find new traps all the time. Instead of documenting every trap, the critique attempts to arrange the traps into categories and principles. This is because the traps are not just one-off things, but more deeply rooted in the principles of C++. Neither is the critique a repository of ‘guess what this obscure code does’ examples.

One desired outcome of this critique is that it should awaken the industry to the C++ myth and to the fact that there are now viable alternatives to C++ that do not suffer from as many technical problems. The industry needs less hype and more sensible programming practices. No language can be perfect in every situation, and tradeoffs are sometimes necessary, but you can now feel freer to choose a language which is more closely suited to your needs. The alternatives to C++ provide no silver bullet, but significantly reduce the risks and costs of software development compared to C++. The alternatives do not suffer under the complexities of C++ and do not burden the programmer with many trivialities which the compiler should handle; and they avoid many of the flaws and inanities of C/C++.

The language events which have made an update desirable are the introduction of Java, the wider availability of more stable versions of Eiffel, and the finalisation of the Ada 95 standard. Java in particular set out to correct the flaws of C++, and most sections in the original critique now make some comment on how Java addresses the problems. Eiffel never did have the same flaws as C++, and has been around since long before the original critique. Eiffel was designed to be object-oriented from the ground up, rather than a bolt-on. Java offers better integration with OO than C++. Now that there are language comparisons in the critique the arguments are less hypothetical, and the criticisms of C++ are more concrete.

Another factor has been the publishing of Bjarne Stroustrup’s “Design and Evolution of C++” [Stroustrup 94]. This has many explanations of the problems of extending C with object-oriented extensions while retaining compatibility with C. In many ways, Stroustrup reinforces comments that I made in the original critique, but I differ from Stroustrup in that I do not view the flaws of C++ as acceptable, even if they are widely known, and many programmers know how to avoid the traps. Programming is a complex endeavour: complex and flawed languages do not help.

A question which has been on my mind in the last few years is when is OO applicable? OO is a universal paradigm. It is very general and powerful. There is nothing that you could not program in it. But is this always appropriate? Lower level programmers have tended to keep writing such things as device drivers in C. It is not lower levels that I am interested in, but the higher levels. OO might still be too low level for a number of applications. A recent book [Shaw 96] suggests that software engineers are too busy designing systems in terms of stacks, lists, queues, etc., instead of adopting higher level, domain-oriented architectures. [Shaw 96] offers some hope to the industry that we are learning how to architect to solve problems, rather than distorting problems to fit particular technologies and solutions.

For instance, commercial and business programming might be faster using a paradigm involving business objects. While these could be provided in an OO framework, the generality is not needed in commercial processing, and will slow and limit the flexibility of the development process. By analogy, walking is a fine mode of transport, but do I choose to walk everywhere? There seems to be a potentially large market for specialised paradigms, which support rapid application development (RAD) techniques. These paradigms may be based on some OO language, framework and libraries in the background. In anything though, we should be cautious, as this is an industry particularly prone to buzzwords and fads.

The second edition generated a lot of interest, and it was published in a number of places: Software Design in Japan translated it into Japanese, and published it over a series of months in 1993; it was published in an abridged form in TOOLS Pacific 1992; it was also published in Gregory’s A Series Technical Journal. However, I resisted handing over copyright to anyone, as I wanted the paper to be freely available on the Internet; it is now available on more sites than I know about. My thanks to all those who have been so supportive of the 2nd edition.

Another reason for the 3rd edition is that the original critique was very much a product of newsgroup discussions. In this edition, I have attempted to at least improve the readability and flow, while not changing the overall structure or embarking on a complete rewrite. The primary goal has been to annotate the original with comparisons to Java and Eiffel.

C++ has become even more widely used over the last few years. However, people are starting to realise that it is not the answer to all programming problems, and that retaining compatibility with C is not a good thing. In some sectors there has been a backlash, precipitated by the fact that people have found the production of defect free quality software an extremely difficult and costly task. OO has been over-hyped, but neither are its real benefits present in C++.

It is important and timely to question C++’s success. Several books are already published on the subject [Sakkinen 92], [Yoshida 92], and [Wiener 95]. A paper on the recommended practices for use in C++ [Ellemtel 92] suggests “C++ is a difficult language in which there may be a very fine line between a feature and a bug. This places a large responsibility upon the programmer.” Is this a responsibility or a burden? The ‘fine line’ is a result of an unnecessarily complicated language definition. The C++ standardisation committee warns “C++ is already too large and complicated for our taste” [X3J16 92].

Sun’s Java White Paper [Sun 95] says that in designing Java, “The first step was to eliminate redundancy from C and C++. In many ways, the C language evolved into a collection of overlapping features, providing too many ways to do the same thing, while in many cases not providing needed features. C++, even in an attempt to add “classes in C” merely added more redundancy while retaining the inherent problems of C.”

The designer of Eiffel, Bertrand Meyer, states in the appendix “On language design and evolution” in [Meyer 92] some guiding principles of language design: simplicity vs complexity, uniqueness, consistency. “The Principle of Uniqueness,” Meyer says, “is easily expressed: the language should provide one good way to express every operation of interest; it should avoid providing two.”
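As a small illustration of the kind of redundancy these quotes have in mind, the sketch below shows three spellings of one operation, reading an array element, all of which C and C++ accept. The function names are hypothetical and chosen only for this example; neither [Sun 95] nor [Meyer 92] is the source of the code.

```cpp
// Three equivalent ways to read the element at position i of a built-in array.
// All function names here are hypothetical, for illustration only.

// The conventional subscript form.
int element_by_index(const int* a, int i) { return a[i]; }

// Explicit pointer arithmetic followed by a dereference.
int element_by_pointer_sum(const int* a, int i) { return *(a + i); }

// The operands of the built-in subscript operator may be swapped,
// because a[i] is defined as *(a + i).
int element_by_swapped_index(const int* a, int i) { return i[a]; }
```

A language following the Principle of Uniqueness would offer only one of these forms, so that a reader never has to ask whether two spellings differ in meaning.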

Meyer has produced a seminal work on OO: Object-oriented Software Construction, [Meyer 88]. All software engineers and object-oriented practitioners should read and absorb this work. A completely revised 2nd edition is soon to appear. A later short book “Object Success” is directed to managers (probably the reason for the pun in the name), with an overview of OO, [Meyer 95].

While C programmers can immediately use C++ to write and compile C programs, this does not take advantage of OO. Many see this as a strength, but it is often stated that the C base is C++’s greatest weakness. However, C++ adds its own layers of complexity, like its handling of multiple inheritance, overloading, and others. I am not so sure that C is C++’s greatest weakness. Java has shown that, by removing C constructs that do not fit with object-oriented concepts, C can provide an acceptable, albeit not perfect, base.

Adoption of C++ does not suddenly transform C programmers into object-oriented programmers. A complete change of thinking is required, and C++ actually makes this difficult. A critique of C++ cannot be separated from criticism of the C base language, as it is essential for the C++ programmer to be fluent in C. Many of C’s problems affect the way that object-orientation is implemented and used in C++. This critique is not exhaustive of the weaknesses of C++, but it illustrates the practical consequences of these weaknesses with respect to the timely and economic production of quality software.

This paper is structured as follows: section 2 considers the role of a programming language; section 3 examines some specific aspects of C++; section 4 looks specifically at C; and the conclusion examines where C++ has left us, and considers the future.

I have tried to keep the sections reasonably self contained, so that you can read the sections that interest you, and use the critique in a reference style. There are some threads that occur throughout the critique, and you will find some repetition of ideas to achieve self contained sections.

Having said that, I hope that you find this critique useful, and enjoyable: so please feel free to distribute it to your management, peers and friends.

2. The Role of a Programming Language

A programming language functions at many different levels and has many roles, and should be evaluated with respect to those levels and roles. Historically, programming languages have had a limited role, that of writing executable programs. As programs have grown in complexity, this role alone has proved insufficient. Many design and analysis techniques have arisen to support other necessary roles.

Object-oriented techniques help in the analysis and design phases; object-oriented languages support the implementation phase of OO, but in many cases these lack uniformity of concepts, integration with the development environment and commonality of purpose. Traditional problematic software practices are infiltrating the object-oriented world with little thought. Often these techniques appeal to management because they are outwardly organised: people are assigned organisational roles such as project manager, team leader, analyst, designer and programmer. But these techniques are simplistic and insufficient, and result in demotivated and uncreative environments.

Object-orientation, however, offers a better rational approach to software development. The complementary roles of analysis, design, implementation and project organisation should be better integrated in the object-oriented scheme. This results in economical software production, and more creative and motivated environments.

The organisation of projects also requires tools external to the language and compiler, like ‘make.’ A re-evaluation of these tools shows that often the division of labour between them has not been done along optimal lines: firstly, programmers need to do extra bookkeeping work which could be automated; and secondly, inadequate separation of concerns has resulted in inflexible software systems.



C++ is an interesting experiment in adapting the advantages of object-orientation to a traditional programming language and development environment. Bjarne Stroustrup should be recognised for having the insight to put the two technologies together; he ventured into OO not only before solutions were known to many issues, but before the issues were even widely recognised. He deserves better than a back full of arrows. But in retrospect, we now treat concepts such as multiple inheritance with a good deal of respect, and realise that the Unix development environment with limited linker support does not provide enough compiler support for many of the features that should be in a high level language.

There are solutions to the problems that C++ uncovered. C++ has gone down a path in research, but now we know what the problems are and how to solve them. Let’s adopt or develop languages that solve them. Fortunately, such languages have been developed, which are of industrial strength, meant for commercial projects, and are not just academic research projects. It is now up to the industry to adopt them on a wider scale.

C++, however, retains the problems of the old order of software production. C++ has an advantage over C as it supports many facets of object-orientation. These can be used for some analysis and design. The processes of analysis, design, and organisation, however, are still largely external to C++. C++ has not realised the important advantages of integrated software development that lead to improved economies of software production.

Java is an interesting development taking a different approach to C++: strict compatibility with C is not seen as a relevant goal. Java is not the only C based alternative to C++ in the object-oriented world. There has also been Objective-C from Brad Cox, mainly used in NeXT’s OpenStep environment. Objective-C is more like Smalltalk, in that all binding is done dynamically at run time.

A language should not only be evaluated from a technical point of view, considering its syntactic and semantic features; it should also be analysed from the viewpoint of its contribution to the entire software development process. A language should enable communication between project members acting at different levels, from management, who set enterprise level policies, to testers, who must test the result. All these people are involved in the general activity of programming, so a language should enable communication between project members separated in space and time. A single programmer is not often responsible for a task over its entire lifetime.

2.1 Programming

Programming and specification are now seen as the same task. One man’s specification is another’s program. Eventually you get to the point of processing a specification with a compiler, which generates a program which actually runs on a computer. Carroll Morgan banishes the distinction between specifications and programs: “To us they are all programs.” [Morgan 90]. Programming is a term that not only refers to implementation; programming refers to the whole process of analysis, design and implementation.

The Eiffel language integrates the concept of specification and programming, rejecting the divided models of the past in favour of a new integrated approach to projects. Eiffel achieves this in several ways: it has a clean clear syntax which is easy to read, even by non-programmers; it has techniques such as preconditions and postconditions so that the semantics of a routine can be clearly documented, these being borrowed from formal specification techniques, but made easy for the ‘rest of us’ to use; and it has tools to extract the abstract specification from the implementation details of a program. Thus Eiffel is more than just a language, providing a whole integrated development environment.
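To make the idea of preconditions and postconditions concrete, here is a minimal sketch in C++, the language under critique. It only approximates the idea with the standard `assert` macro: in Eiffel the conditions are written in `require` and `ensure` clauses that form part of a routine's documented interface, whereas here they are ordinary run-time checks buried in the body. The `Account` class and its members are hypothetical names invented for this example.

```cpp
#include <cassert>

// Hypothetical example class; not taken from the critique.
class Account {
public:
    explicit Account(long opening_balance) : balance_(opening_balance) {
        // Precondition on creation: an account never starts overdrawn.
        assert(opening_balance >= 0);
    }

    // Remove `amount` from the account.
    void withdraw(long amount) {
        // Precondition: what the caller must guarantee before the call.
        assert(amount > 0 && amount <= balance_);
        const long old_balance = balance_;  // remembered for the postcondition

        balance_ -= amount;

        // Postcondition: what the routine guarantees in return.
        assert(balance_ == old_balance - amount);
    }

    long balance() const { return balance_; }

private:
    long balance_;
};
```

For instance, after `Account a(100); a.withdraw(30);` the call `a.balance()` yields 70, while `a.withdraw(500)` would violate the precondition. Note that a reader of the class declaration alone cannot see these conditions, which is the documentation gap the paragraph above describes.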

Chris Reade [Reade 89] gives the following explanation of programming and languages. “One, rather narrow, view is that a program is a sequence of instructions for a machine. We hope to show that there is much to be gained from taking the much broader view that programs are descriptions of values, properties, methods, problems and solutions. The role of the machine is to speed up the manipulation of these descriptions to provide solutions to particular problems. A programming language is a convention for writing descriptions which can be evaluated.”

[Reade 89] also describes programming as being a “Separation of concerns”. He says:

“The programmer is having to do several things at the same time, namely,

(1) describe what is to be computed;
(2) organise the computation sequencing into small steps;
(3) organise memory management during the computation.”

Reade continues, “Ideally, the programmer should be able to concentrate on the first of the three tasks (describing what is to be computed) without being distracted by the other two, more administrative, tasks. Clearly, administration is important but by separating it from the main task we are likely to get more reliable results and we can ease the programming problem by automating much of the administration.

“The separation of concerns has other advantages as well. For example, program proving becomes much more feasible when details of sequencing and memory management are absent from the program. Furthermore, descriptions of what is to be computed should be free of such detailed step-by-step descriptions of how to do it if they are to be evaluated with different machine architectures. Sequences of small changes to a data object held in a store may be an inappropriate description of how to compute something when a highly parallel machine is being used with thousands of processors distributed throughout the machine and local rather than global storage facilities.

“Automating the administrative aspects means that the language implementor has to deal with them, but he/she has far more opportunity to make use of very different computation mechanisms with different machine architectures.”

These quotes from Reade are a good summary of the principles from which I criticise C++. What Reade calls administrative tasks, I call bookkeeping. Bookkeeping adds to the cost of software production, and reduces flexibility which in turn adds more to the cost. C and C++ are often criticised for being cryptic. The reason is that C concentrates on points 2 and 3, while the description of what is to be computed is obscured.

High level languages describe ‘what’ is to be computed; that is the problem domain. ‘How’ a computation is achieved is in the low-level machine-oriented deployment domain. Automating the bookkeeping tasks enhances correctness, compatibility, portability and efficiency. Bookkeeping tasks arise from having to specify ‘how’ a computation is done. Specifying ‘how’ things are done in one environment hinders portability to other platforms.

The most significant way high level languages replace bookkeeping is by using a declarative approach, whereas low level languages use operators, which make them more like assemblers. C and C++ provide operators rather than the declarative approach, so are low level. The declarative approach centralises decisions and lets the compiler generate the underlying machine operators. With the operator approach, the burden is on the programmer to use the correct operator to access an entity, and if a decision changes, the programmer will have to change all operators, rather than changing the single declaration and simply recompiling. Thus in C and C++ the programmer is often concerned with the access mechanisms to data, whereas high level languages hide the implementation detail, making program development and maintenance far more flexible.
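The cost of the operator approach can be sketched in a few lines of C++. This is an illustration added for clarity, not an example from the cited sources; the names Point, ByValue, ByPointer, x_of and demo are hypothetical. It assumes a single design decision changes: whether origin is stored inside the object or reached through a pointer.

```cpp
#include <cassert>

struct Point { int x; int y; };

// Decision A: the origin is stored inside the object.
struct ByValue { Point origin; };

// Decision B: the origin lives elsewhere and is reached through a pointer.
struct ByPointer { Point* origin; };

// The access operator at every use site must match the declaration.
int x_of(const ByValue& s)   { return s.origin.x; }   // '.' for an embedded object
int x_of(const ByPointer& s) { return s.origin->x; }  // '->' once origin is a pointer

// Small self-check of both variants.
void demo() {
    Point p = {3, 4};
    ByValue v = {p};
    ByPointer q = {&p};
    assert(x_of(v) == 3);
    assert(x_of(q) == 3);
}
```

The two bodies of x_of differ only in the operator, yet that difference has to be repeated wherever origin is used. In a language that hides the access mechanism, only the declaration would be expected to change.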

While C and C++ syntax is similar to high level language syntax, C and C++ cannot be considered high level, as they do not remove from the programmer the bookkeeping that high level languages should, by requiring the compiler to take care of these details. The low level nature of C and C++ severely impacts the development process.

The most important quality of a high level language is to remove the bookkeeping burden from the programmer in order to enhance speed of development, maintainability and flexibility. This attribute is more important than object-orientation itself, and should be intrinsic to any modern programming paradigm. C++ more than cancels the benefits of OO by requiring programmers to perform much of the bookkeeping instead of it being automated.

The industry should be moving towards these ideals, which will help in the economic production of software, rather than the costly techniques of today. We should consider what we need, and assess the problems of what we have against that. Object-orientation provides one solution to these problems. The effectiveness of OO, however, depends on the quality of its implementation.

2.2 Communication, abstraction and precision

The primary purpose of any language is communication. A specification is communication from one person to another entity of a task to be fulfilled. At the lowest level, the task to be fulfilled is the execution of a program by a computer. At the next level it is the compilation of a program by a compiler. At higher levels, specifications communicate to other people what is to be accomplished by the programming task. At the lowest level, instructions must be precisely executed, but there is no understanding; it is purely mechanical. At higher levels, understanding is important, as human intelligence is involved, which is why enlightened management practices emphasise training rather than forced processes. This is not to say that precision is not important; precision at the higher levels is of utmost importance, or the rest of the endeavour will fail. Most projects fail due to lack of precision in the requirements and other early stages.

Unfortunately, often those who are least skilled in programming work at the higher levels, so specifications lack the desirable properties of abstraction and precision. Just as in the Dilbert Principle [Adams 96], the least effective programmers are promoted to where they will seemingly do the least damage. This is not quite the winning strategy that it seems, as that is where they actually do the most damage, as teams of confused programmers are then left to straighten out their specifications, while the so-called analysts move on to the next project or company to sow the seeds of disaster there.

(Indeed, since many managers have not read or understood the works of Deming [Deming 82], [L&S 95], De Marco and Lister [DM&L 87], and Tom Peters’ later works, the message that the physical environment and attitudes of the workplace lead to quality has not got through. Perhaps the humour of Scott Adams is now the only way this message will have impact.)

At higher levels, abstraction facilitates understanding. Abstraction and precision are both important qualities of high level specifications. Abstraction does not mean vagueness, nor the abandonment of precision. Abstraction means the removal of irrelevant detail from a certain viewpoint. With an abstract specification, you are left with a precise specification; precisely the properties of the system that are relevant.

Abstraction is a fundamental concept in computing. Aho and Ullman say “An important part of the field [computer science] deals with how to make programming easier and software more reliable. But fundamentally, computer science is a science of abstraction -- creating the right model for a problem and devising the appropriate mechanizable techniques to solve it.” [Aho 92]. They also say “Abstraction in the sense we use it often implies simplification, the replacement of a complex and detailed real-world situation by an understandable model within which we can solve the problem.”

A well known example that exhibits both abstraction and precision is the London Underground map designed by Harry Beck. This is a diagrammatic map that has abstracted irrelevant details from the real London geography to result in a conveniently sized and more readable map. Yet the map precisely shows the underground stations and where passengers can change trains. Many other city transport systems have adopted the principles of Beck’s map. Using this model passengers can easily solve such problems as “How do I get from Knightsbridge to Baker Street?”

2.3 Notation

A programming language should support the exchange of ideas, intentions, and decisions between project members; it should provide a formal, yet readable, notation to support consistent descriptions of systems that satisfy the requirements of diverse problems. A language should also provide methods for automated project tracking. This ensures that modules (classes and functionality) that satisfy project requirements are completed in a timely and economic fashion. A programming language aids reasoning about the design, implementation, extension, correction, and optimisation of a system.

During requirements analysis and design phases, formal and semi-formal notations are desirable. Notations used in analysis, design, and implementation phases should be complementary, rather than contradictory. Currently, analysis, design and modelling notations are too far removed from implementation, while programming languages are in general too low level. Both designers and programmers must compromise to fill the gap. Many current notations provide difficult transition paths between stages. This ‘semantic gap’ contributes to errors and omissions between the requirements, design and implementation phases.

Better programming languages are an implementation extension of the high level notations used for requirements analysis and design, which will lead to improved consistency between analysis, design and implementation. Object-oriented techniques emphasise the importance of this, as abstract definition and concrete implementation can be separate, yet provided in the same notation.

Programming languages also provide notations to formally document a system. Program source is the only reliable documentation of a system, so a language should explicitly support documentation, not just in the form of comments. As with all language, the effectiveness of communication is dependent upon the skill of the writer. Good program writers require languages that support the role of documentation, with a notation that is perspicuous and easy to learn. Those not trained in the skill of ‘writing’ programs can still read them to gain understanding of the system. After all, it is not necessary for newspaper readers to be journalists.

2.4 Tool Integration

A language definition should enable the development of integrated automated tools to support software development, for example browsers, editors and debuggers. The compiler is just another tool, having a twofold role. Firstly, code generation for the target machine. The role of the machine is to execute the produced programs. A compiler has to check that a program conforms to the language syntax and grammar, so it can ‘understand’ the program in order to translate it into an executable form. Secondly, and more importantly, the compiler should check that the programmer’s expression of the system is valid, complete, and consistent; i.e., perform semantic checks that a program is internally consistent. Generating a system that has detectable inconsistencies is pointless.

2.5 Correctness

Deciding what constitutes an inconsistency and how to detect it often raises passionate debate. The discord arises because the detectable inconsistencies do not exactly match real inconsistencies. There are two opposing views: firstly, that languages that overcompensate are restrictive, and you should trust your programmers; secondly, that programmers are human and make mistakes, and program crashes at run-time are intolerable.

This is the key to the following diagrams:

[Figure key: Real inconsistencies; Obscure failures; False alarms; Superfluous run-time checks/inefficiency]

In the first figure the black box represents the real inconsistencies, which must be covered by either compile-time checks or run-time checks.

In the scenario of this diagram, checks are insufficient so obscure failures occur at run-time, varying from obscure run-time crashes to strangely wrong results to being lucky and getting away with it. Currently too much software development is based on programming until you are in the lucky state, known as hacking. This sorry situation in the industry must change by the adoption of better languages to remove the ad hoc nature of development.

Some feel that compiler checks are restrictive and that run-time checks are not efficient, so passionately defend this model, as programmers are supposedly trustworthy enough to remove the rest of the real inconsistencies. Although most programmers are conscientious and trustworthy people, this leaves too much to chance. You can produce defect-free software this way, as long as the programmer does not introduce the inconsistencies in the first place, but this becomes much more difficult as the size and complexity of a software system increases, and many programmers become involved. The real inconsistencies are often removed by hacking until the program works, with a resultant dependency on testing to find the errors in the first place. Sometimes companies depend on the customers to actually do the testing and provide feedback about the problems. While fault reporting is an essential path of communication from the customer, it must be regarded as the last and most costly line of defence.

C and C++ are in this category. Software produced in these languages is prone to obscure failures.

The second figure shows that the language detects inconsistencies beyond the real inconsistency box. These are false alarms. The run-time environment also doubles up on inconsistencies that the compiler has detected and removed, which results in run-time inefficiency. The language will be seen as restrictive, and the run-time as inefficient. You won’t get any obscure crashes, but the language will get in the way of some useful computations. Pascal is often (somewhat unfairly) criticised for being too restrictive.

The above figure shows an even worse situation, where the compiler generates false alarms on fictional inconsistencies, does superfluous checks at run-time, but fails to detect real inconsistencies.

The best situation would be for a compiler to statically detect all inconsistencies without false alarms. However, it is not possible to statically detect all errors with the current state of technology, as a significant class of inconsistencies can only be detected at run-time; inconsistencies such as: divide by zero; array index out of bounds; and a class of type checks that are discussed in the section on RTTI and type casts.
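An index that is only known at run-time shows why some checks cannot be moved to compile time. The following is a minimal sketch, assuming the standard library container std::vector, whose at() member throws std::out_of_range on a bad index; the function names index_rejected and demo are hypothetical and not part of the original critique.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if a checked access at `index` is rejected at run-time.
bool index_rejected(const std::vector<int>& values, std::size_t index) {
    try {
        (void)values.at(index);  // at() performs a run-time bounds check
        return false;
    } catch (const std::out_of_range&) {
        return true;             // the inconsistency surfaces as an exception
    }
}

// Small self-check: index 2 is valid for three elements, index 3 is not.
void demo() {
    std::vector<int> values(3, 0);
    assert(!index_rejected(values, 2));
    assert(index_rejected(values, 3));
}
```

The unchecked form values[index] compiles just as readily, and with an out-of-range index its behaviour is undefined rather than reported.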

The current ideal is to have the detectable and real inconsistency domains exactly coincide, with as few checks left to run-time as possible. This has two advantages: firstly, that your run-time environment will be a lot more likely to work without exceptions, so your software is safer; and secondly, that your software is more efficient, as you don’t need so many run-time checks. A good language will correctly classify inconsistencies that can be detected at compile time, and those that must be left until run-time.

This analysis shows that, as some inconsistencies can only be detected at run-time, and such detection results in exceptions, exception handling is an exceedingly important part of software. Unfortunately, exception handling has not received serious enough attention in most programming languages.

Eiffel has been chosen for comparison in this critique as the language that is as close to the ideal as possible; that is, all inconsistencies are covered, while false alarms are minimised, and the detectable inconsistencies are correctly categorised as compile-time or run-time. Eiffel also pays serious attention to exception handling.

[Figures: diagrams whose regions are labelled ‘Compile Time’ and ‘Run Time’.]

2.6 Types

In order to produce correct programs, syntax checks for conformance to a language grammar are not sufficient: we should also check semantics. Some semantics can be built into the language, but mostly this must be specified by the programmer about the system being developed.

Semantics checking is done by ensuring that a specification conforms to some schema. For example, the sentence: “The boy drank the computer and switched on the glass of water” is grammatically correct, but nonsense: it does not conform to the mental schema we have of computers and glasses of water. A programming language should include techniques for the detection of similar nonsense. The technique that enables detection of the above nonsense is types. We know from the computer’s type that it does not have the property ‘drinkable’. Types define an entity’s properties and behaviour.

Programming languages can either be typed or untyped; typed languages can be statically typed or dynamically typed. Static typing ensures at compile time that only valid operations are applied to an entity. In dynamically typed languages, type inconsistencies are not detected until run-time. Smalltalk is a dynamically typed language, not an untyped language. Eiffel is statically typed.

C++ is statically typed, but there are many mechanisms that allow the programmer to render it effectively untyped, which means errors are not detected until a serious failure. Some argue that sometimes you might want to force someone to drink a computer, so without these facilities, the language is not flexible enough. The correct solution though is to modify the design, so that now the computer has the property drinkable. Undermining the type system is not needed, as the type system is where the flexibility should be, not in the ability to undermine the type system. Providing and modifying declarations is declarative programming. Eiffel tends to be declarative with a simple operational syntax, whereas C++ provides a plethora of operators.
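The ‘drink a computer’ case can be sketched directly. This is an illustrative example only; Water, Computer, pretend_drinkable and demo are hypothetical names, and the sketch assumes a compiler supporting static_assert and the type_traits header, both of which are newer than this critique. The static type system refuses the conversion, and a single cast overrides it.

```cpp
#include <cassert>
#include <type_traits>

struct Water    { int millilitres; };
struct Computer { int megabytes; };

// The type system knows that a Computer is not a Water: no implicit conversion.
static_assert(!std::is_convertible<Computer*, Water*>::value,
              "a Computer* must not convert to a Water* implicitly");

// A cast overrides that knowledge, and the compiler accepts it silently.
// Reading the result as a Water would be undefined behaviour, so this
// sketch only returns the pointer and never dereferences it.
Water* pretend_drinkable(Computer* c) {
    return reinterpret_cast<Water*>(c);
}

// Small self-check: the 'Water' is the very same storage as the Computer.
void demo() {
    Computer c = {16};
    Water* w = pretend_drinkable(&c);
    assert(static_cast<void*>(w) == static_cast<void*>(&c));
}
```

Nothing in the signature of pretend_drinkable warns a client that the returned pointer does not refer to a Water at all.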

Defining complex types is a central concept of object-oriented programming: “Perhaps the most important development [in programming languages] has been the introduction of features that support abstract data types (ADTs). These features allow programmers to add new types to languages that can be treated as though they were primitive types of the language. The programmer can define a type and a collection of constants, functions, and procedures on the type, while prohibiting any program using this type from gaining access to the implementation of the type. In particular, access to values of the type is available only through the provided constants, functions, and procedures.” [Bruce 96].

Object-oriented programming also provides two specific ways to assemble new and complex types: “objects can be combined with other types in expressive and efficient ways (composition and hierarchy) to define new, more complex types.” [Ege 96].

2.7 Redundancy and Checking

Redundant information is often needed to enable correctness checking. Type definitions define the elements in a system’s universe, and the properties governing the valid combinations and interactions of the elements. Declarations define the entities in a system’s universe. The compiler uses redundant information for consistency checking, and strips it away to produce efficient executable systems. Types are redundant information. You can program in an entirely typeless language: however, this would be to deny the progress that has been made in making programming a disciplined craft that produces correct programs economically.

It is a misconception that consistency checks are ‘training wheels’ for student programmers, and that ‘syntax’ errors are a hindrance to professional programmers. Languages that exploit techniques of schema checking are often criticised as being restrictive and therefore unusable for real world software. This is nonsense and misunderstands the power of these languages. It is an immature conception; the best programmers realise that programming is difficult. As a whole, the computing profession is still learning to program.

While C++ is a step in this direction, it is hindered by its C base, importing such mechanisms as pointers with which you can undermine the logic of the type system. Java has abandoned these C mechanisms where they hinder: “The Java compiler employs stringent compile-time checking so that syntax-related errors can be detected early, before a program is deployed in service” [Sun 95]. The programming community has matured in the last few years, and while there was vehement argument against such checking in the past by those who saw it as restrictive and disciplinarian, the majority of the industry now accepts, and even demands it.

Checking has also been criticised from another point of view. This point of view says that checking cannot guarantee software quality, so why bother? The premise is correct, but the conclusion is wrong. Checking is neither necessary, nor sufficient to produce quality software. However, it is helpful and useful, and is a piece in a complicated jigsaw which should not be ignored.

In fact there are few things that are necessary for quality software production. Mainly, software quality is dependent on the skill and dedication of the people involved, not methodologies or techniques. There is nothing that is sufficient. As Fred Brooks has pointed out, there is no Silver Bullet [Brooks 95]. Good craftsmen choose the right tools and techniques, but the result is dependent on the skill used in applying the tools. Any tool is worthless in itself. But the Silver Bullet rationale is not a valid rationale against adopting better programming languages, tools and environments; unfortunately, Brooks’ article has been misused.

Another example of consistency checking comes from the user interface world. Instead of correcting a user after an erroneous action, a good user interface will not offer the action as a possibility in the first place. It is cheaper to avoid error than to fix it. Most people drive their cars with this principle in mind: smash repair is time consuming and expensive.

Program development is a dynamic process; program descriptions are constantly modified during development. Modifications often lead to inconsistencies and error. Consistency checks help prevent such ‘bugs’, which can ‘creep’ into a previously working system. These checks help verify that as a program is modified, previous decisions and work are not invalidated.

It is interesting to consider how much checking could be integrated in an editor. The focus of many current generation editors is text. What happens if we change this focus from text to program components? Such editors might check not only syntax, but semantics. Signalling potential errors earlier and interactively will shorten development times, alerting programmers to problems, rather than wasting hours on changes which later have to be undone. Future languages should be defined very cleanly in order to enable such editor technology.

2.8 Encapsulation

There is much confusion about encapsulation, mostly arising from C++ equating encapsulation with data hiding. The Macquarie dictionary defines the verb to encapsulate as “to enclose in or as in a capsule.” The object-oriented meaning of encapsulation is to enclose related data, routines and definitions in a class capsule. This does not necessarily mean hiding.

Implementation hiding is an orthogonal concept which is possible because of encapsulation. Both data and routines in a class are classified according to their role in the class as interface or implementation.

To put this another way: first you encapsulate information and operations together in a class, then you decide what is visible, and what is hidden because it is implementation detail. Most often only the interface routines and data should appear at design time, the implementation details appearing later.

Encapsulation provides the means to separate the abstract interface of a class from its implementation: the interface is the visible surface of the capsule; the implementation is hidden in the capsule. The interface describes the essential characteristics of objects of the class which are visible to the exterior world. Like routines, data in a class can also be divided into characteristic interface data which should be visible, and implementation data which should be hidden. Interface data are any characteristics which might be of interest to the outside world. For example when buying a car, the purchaser might want to know data such as the engine capacity and horse-power, etc. However, the fact that it took John Engineer six days to design the engine block is of no interest.

Implementation hiding means that data can only be manipulated, that is updated, within the class, but it does not mean hiding interface data. If the data were hidden, you could never read it, in which case, classes would perform no useful function as you could only put data into them, but never get information out.

In order to provide implementation hiding in C++ you should access your data through C functions. This is known as data hiding in C++. It is not the data that is actually being hidden, but the access mechanism to the data. The access mechanism is the implementation detail that you are hiding. C++ has visible differences between the access mechanisms of constants, variables and functions. There is even a typographic convention of upper case constant names, which makes the differences between constants and variables visible. The fact that an item is implemented as a constant should also be hidden. Most non-C languages provide uniform functional access to constants, variables and value returning routines. In the case of variables, functional access means they can be read from the outside, but not updated. An important principle is that updates are centralised within the class.
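Uniform functional access can be approximated by hand in C++, at the price of the bookkeeping described above. The following is a minimal sketch continuing the car example; the class Car and its member names are hypothetical. A constant, a stored variable and a computed value are all read through the same function-call syntax, and the only update path is a routine of the class.

```cpp
#include <cassert>

class Car {
public:
    explicit Car(int capacity_cc) : capacity_cc_(capacity_cc) {}

    // Three different implementations behind one access mechanism.
    int wheels() const { return 4; }                        // a constant
    int engine_capacity_cc() const { return capacity_cc_; } // stored data
    int engine_capacity_decilitres() const {                // a computed value
        return capacity_cc_ / 100;
    }

    // Updates are centralised within the class.
    void rebore(int extra_cc) {
        assert(extra_cc >= 0);
        capacity_cc_ += extra_cc;
    }

private:
    int capacity_cc_;  // readable from outside only via engine_capacity_cc()
};

// Small self-check of the read and update paths.
void demo() {
    Car car(2000);
    assert(car.wheels() == 4);
    assert(car.engine_capacity_cc() == 2000);
    car.rebore(200);
    assert(car.engine_capacity_decilitres() == 22);
}
```

Had wheels been written as a public constant and the capacity as a public data member, clients would have used three different notations, and turning any of them into a function later would mean editing every client.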

Above I indicated that encapsulation was grouping operations and information together. Where do functions fit into this? The wrong answer is that functions are operations. Functions are actually part of the information, as a function returns information derived from an object’s data to the outside world.

This theme and its adverse consequences, that place the burden of encapsulation on the programmer rather than being transparent, recur throughout this critique.

2.9 Safety and Courtesy Concerns

This critique makes two general types of criticism: about ‘safety’ concerns and about ‘courtesy’ concerns. These themes recur throughout this critique, as C and C++ have flaws that often compromise them. Safety concerns affect the external perception of the quality of the program; failure to meet them results in unfulfilled requirements, unsatisfied customers and program failures.

Courtesy concerns affect the internal view of the quality of a program in the development and maintenance process. Courtesy concerns are usually stylistic and syntactic, whereas safety concerns are semantic. The two often go together. It is a courtesy concern for an airline to keep its fleet clean and well maintained, which is also very much a safety concern.

Courtesy issues are even more important in the context of reusable software. Reusability depends on the clear communication of the purpose of a module. Courtesy is important to establish social interactions, such as communication. Courtesy implies inconvenience to the provider, but provides convenience to others. Courtesy issues include choosing meaningful identifiers, consistent layout and typography, meaningful and non-redundant commentary, etc. Courtesy issues are more than just a style consideration: a language design should directly support courtesy issues. A language, however, cannot enforce courtesy issues, and it is often pointed out that poor, discourteous programs can be written in any language. But this is no reason for being careless about the languages that we develop and choose for software development.

Programmers fulfilling courtesy and safety concerns provide a high quality service. They fulfil their obligations by providing benefits to other programmers who must read, reuse and maintain the code, and by producing programs that delight the end-user.

The programming by contract model has been advocated in the last few years as a model for programming by which safety and courtesy concerns can be formally documented. Programming by contract documents the obligations of a client and the benefits to a provider in preconditions; and the benefits to the client and obligations of the provider in postconditions [Meyer 88], [Kilov and Ross 94].
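A contract of this kind can be approximated in C++ with the assert macro, although the conditions then live in the routine body rather than in its interface. The following is a minimal sketch under that assumption; the routine integer_sqrt and the function demo are hypothetical, and the checks disappear when NDEBUG is defined.

```cpp
#include <cassert>

// Hypothetical routine: the largest r such that r * r <= n.
int integer_sqrt(int n) {
    // Precondition: the client's obligation, the provider's benefit.
    assert(n >= 0);

    long long r = 0;  // wide type so (r + 1) * (r + 1) cannot overflow
    while ((r + 1) * (r + 1) <= n) {
        ++r;
    }

    // Postcondition: the provider's obligation, the client's benefit.
    assert(r * r <= n && n < (r + 1) * (r + 1));
    return static_cast<int>(r);
}

// Small self-check on values either side of a perfect square.
void demo() {
    assert(integer_sqrt(15) == 3);
    assert(integer_sqrt(16) == 4);
    assert(integer_sqrt(17) == 4);
}
```

A client that honours the precondition may rely on the postcondition without reading the loop; that division of responsibility is what the contract documents.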

2.10 Implementation and Deployment Concerns

Class implementors are concerned with the implementation of the class. Clients of the class only need to know as much information about the class as is documented in the abstract interface. The implementation is otherwise hidden.

Another aspect that is just as important to shield programmers from is deployment concerns. Deployment is how a system is installed on the underlying technology. If deployment issues are built into a program, then the program lacks portability, and flexibility. One kind of deployment concern is how a system is mapped to the available computing resources. For example, in a distributed system, this is what parts of the system are run in which location. As things can move around a distributed system, programmers should not build into their code location knowledge of other entities. Locations should be looked up in a directory.

Another deployment issue is how individual units of a system are plugged together to form an integrated whole. This is particularly important in OO, where several libraries can come from different vendors, but their combination results in conflicts. A solution to this is some kind of language that binds the units. Thus if you purchase two OO libraries, and they have clashes of any kind, you can resolve this deployment issue without having to change the libraries, which you might not be able to do anyway.

Programmers should not only be separated from implementation concerns of other units, but separated from deployment concerns as well.

2.11 Concluding Remarks

It is relevant to ask if grafting OO concepts onto a conventional language realises the full benefits of OO. The following parable seems apt: “No one sews a patch of unshrunk cloth on to an old garment; if he does, the patch tears away from it, the new from the old, and leaves a bigger hole. No one puts new wine into old wineskins; if he does, the wine will burst the skins, and then wine and skins are both lost. New wine goes into fresh skins.” Mark 2:21-22

We must abandon disorganised and error-prone practices, not adapt them to new contexts. How well can hybrid languages support the sophisticated requirements of modern software production? In my experience bolt-on approaches to object-orientation usually end in disaster, with the new tearing away from the old leaving a bigger hole.

Surely a basic premise of object-oriented programming is to enable the development of sophisticated systems through the adoption of the simplest techniques possible? Software development technologies and methodologies should not impede the production of such sophisticated systems.

3. C++ Specific Criticisms

3.1 Virtual Functions

This is the most complicated section in the critique, due to C++’s complex mechanisms. Although this issue is central as polymorphism is a key concept of OOP, feel free to skim if you want an overview, without the details.

In C++ the keyword virtual enables the possibility for a function to be polymorphic when it is overridden (redefined) in one or more descendant classes, but the virtual keyword is unnecessary, as any function which is redefined in a descendant class could be polymorphic. A compiler only needs to generate dynamic dispatch for truly polymorphic routines.

The problem in C++ is that if a parent class designer does not foresee that a descendant class might want to redefine a function, then the descendant class cannot make the function polymorphic. This is a most serious flaw in C++ because it reduces the flexibility of software components and therefore the ability to write reusable and extensible libraries.

C++ also allows functions to be overloaded, in which case the correct function to call depends on the arguments. The actual arguments in the function call must match the formal arguments of one of the overloaded functions. The difference between overloaded functions and polymorphic (overridden) functions is that with overloaded functions, the correct function to call is determined at compile-time; with polymorphic functions the correct function to call is determined at run-time.
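The difference between the two mechanisms can be seen by passing the same object through a reference of the parent type. This is an illustrative sketch only; Shape, Circle, describe and demo are hypothetical names. The overload is chosen from the static type of the argument, while the virtual call is resolved from the dynamic type of the object.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const { return "circle"; }  // overrides Shape::name
};

// Two overloads: the compiler picks one from the static argument type.
std::string describe(const Shape&)  { return "describe(Shape)"; }
std::string describe(const Circle&) { return "describe(Circle)"; }

// Small self-check: one Circle object, viewed through two static types.
void demo() {
    Circle circle;
    const Shape& as_shape = circle;

    assert(describe(circle) == "describe(Circle)");   // static type Circle
    assert(describe(as_shape) == "describe(Shape)");  // static type Shape
    assert(as_shape.name() == "circle");              // dynamic type Circle
}
```

Through as_shape the overload resolution has lost the fact that the object is a Circle, whereas the virtual call has not.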

When a parent class is designed the programmer can only guess that a descendant class might override or overload a function. A descendant class can overload a function at any time, but this is not the case for the more important mechanism of polymorphism, where the parent class programmer must specify that the routine is virtual in order for the compiler to set up a dispatch entry for the function in the class jump table. So the burden is on the programmer for something which could be automatically done by the compiler, and is done by the compiler in other languages. However, this is a relic from how C++ was originally implemented with Unix tools, rather than specialised compiler and linker support.

There are three options for overriding, corresponding to ‘must not’, ‘can’, and ‘must’ be overridden:

1) Overriding a routine is prohibited; descendant classes must use the routine as is.

2) A routine can be overridden. Descendant classes can use the routine as provided, or provide their own implementation as long as it conforms to the original interface definition and accomplishes at least as much.

3) A routine is abstract. No implementation is provided and each non-abstract descendant class must provide its own implementation.

The base class designer must decide options 1 and 3. Descendant class designers must decide option 2. A language should provide direct syntax for these options.

Option 1

C++ does not cater for the prohibition of overriding a routine in a descendant class. Even private virtual routines can be overridden. [Sakkinen 92] points out that a descendant class can redefine a private virtual function even though it cannot access the function in other ways.
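The behaviour attributed to [Sakkinen 92] can be checked with a small sketch. The class names Base and Derived and the routines run and step are hypothetical and chosen only for illustration. Derived cannot call Base::step, since it is private, yet its own step replaces it whenever Base::run is invoked on a Derived object.

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() {}
    int run() const { return step(); }  // public, non-virtual entry point
private:
    virtual int step() const { return 1; }  // private, yet still overridable
};

class Derived : public Base {
private:
    // Redefines Base::step even though Derived has no access to it.
    int step() const { return 2; }
};

// Small self-check: the private redefinition is reached through Base::run.
void demo() {
    Base base;
    Derived derived;
    const Base& as_base = derived;

    assert(base.run() == 1);
    assert(as_base.run() == 2);
}
```

Declaring step private therefore limits who may call it, but not who may redefine it.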

Not using a virtual function is the closest, but in that case the routine can be completely replaced. This causes two problems. Firstly, a routine can be unintentionally replaced in a descendant. The redeclaration of a name within the same scope should cause a name clash; the compiler should report a ‘duplicate declaration’ syntax error as the entities inherited from the parent are included in the descendant’s namespace. Allowing two entities to have the same name within one scope causes ambiguity and other problems. (See the section on name overloading.)

The following example illustrates the second problem:

    class A
    {
    public:
        void nonvirt ();
        virtual void virt ();
    };

    class B : public A
    {
    public:
        void nonvirt ();
        void virt ();
    };

    A a;
    B b;
    A *ap = &b;
    B *bp = &b;

    bp->nonvirt ();  // calls B::nonvirt as you would expect.
    ap->nonvirt ();  // calls A::nonvirt, even though this
                     // object is of type B.
    ap->virt ();     // calls B::virt, the correct version of
                     // the routine for B objects.

In this example, class B has extended or replaced routines in class A. B::nonvirt is the routine that should be called for objects of type B. It could be pointed out that C++ gives the client programmer flexibility to call either A::nonvirt or B::nonvirt, but this can be provided in a simpler more direct way: A::nonvirt and B::nonvirt should be given different names. That way the programmer calls the correct routine explicitly, not by an obscure and error prone trick of the language. The different name approach is as follows:

    class B : public A
    {
    public:
        void b_nonvirt ();
        void virt ();
    };

    B b;
    B *bp = &b;
    bp->nonvirt ();    // calls A::nonvirt
    bp->b_nonvirt ();  // calls B::b_nonvirt

Now the designer of class B has direct control over B’s interface. The application requires that clients of B can call both A::nonvirt and B::b_nonvirt, which B’s designer has explicitly provided for. This is good object-oriented design, which provides strongly defined interfaces. C++ allows client programmers to play tricks with the class interfaces, external to the class, and B’s designer cannot prevent A::nonvirt from being called. Objects of class B have their own specialised nonvirt, but B’s designer does not have control over B’s interface to ensure that the correct version of nonvirt is called.

C++ also does not protect class B from other changes in the system. Suppose we need to write a class C that needs nonvirt to be virtual. Then nonvirt in A will be changed to virtual. But this breaks the B::nonvirt trick. The requirement of class C to have a virtual function forces a change in the base class, which affects all other descendants of the base class, instead of the specific new requirement being localised to the new class. This runs against the reason for OOP having loosely coupled classes, which is that new requirements and modifications should have localised effects, and not require changes elsewhere which can potentially break other existing parts of the system.

Another problem is that statements should consistently have the same semantics. The polymorphic interpretation of a statement like a->f() is that the most suitable implementation of f() is invoked for the object referred to by ‘a’, whether the object is of type A, or a descendant of A. In C++, however, the programmer must know whether the function f() is defined virtual or non-virtual in order to interpret exactly what a->f() means. Therefore, the statement a->f() is not implementation independent and the principle of implementation hiding is broken. A change in the declaration of f() changes the semantics of the invocation. Implementation independence means that a change in the implementation DOES NOT change the semantics of executable statements.

If a change in the declaration changes the semantics, this should generate a compiler detected error. The programmer should make the statement semantically consistent with the changed declaration. This reflects the dynamic nature of software development, where you’ll see perpetual change in program text.

For yet another case of the inconsistent semantics of the statement a->f() vs constructors, consult section 10.9c, p 232 of the C++ ARM.

Neither Eiffel nor Java has these problems. Their mechanisms are clearer and simpler, and don’t lead to the surprises of C++. In Java, everything is virtual, and to gain the effect where a method must not be overridden, the method may be defined with the qualifier final.

Eiffel allows the programmer to specify a routine as frozen, in which case the routine cannot be redefined in descendants.

Option 2

Using the function as is or overriding it should be left open for the programmers of descendant classes. In C++, the possibility must be enabled in the base class by specifying virtual. In object-oriented design, the decisions you decide not to make are as important as the decisions you make. Decisions should be made as late as possible. This strategy prevents mistakes being built into the system at early stages. By making early decisions, you are often stuck with assumptions that later prove to be incorrect; or the assumptions could be correct in one environment, but false in another, making software brittle and non-reusable.

C++ requires the parent class to specify potential polymorphism by virtual (although an intermediate class in the inheritance chain can introduce virtual). This prejudges that a routine might be redefined in descendants. This can be a problem because routines that aren’t actually polymorphic are accessed via the slightly less efficient virtual table technique instead of a straight procedure call. (This is never a large overhead but object-oriented programs tend to use more and smaller routines making routine invocation a more significant overhead.) The policy in C++ should be that routines that might be redefined should be declared virtual. What is worse is that it says that non-virtual routines cannot be redefined, so the descendant class programmer has no control.

Rumbaugh et al put their criticism of C++’s virtual as follows: “C++ contains facilities for inheritance and run-time method resolution, but a C++ data structure is not automatically object-oriented. Method resolution and the ability to override an operation in a subclass are only available if the operation is declared virtual in the superclass. Thus, the need to override a method must be anticipated and written into the origin class definition. Unfortunately, the writer of a class may not expect the need to define specialised subclasses or may not know what operations will have to be redefined by a subclass. This means that the superclass often must be modified when a subclass is defined and places a serious restriction on the ability to reuse library classes by creating subclasses, especially if the source code library is not available. (Of course, you could declare all operations as virtual, at a slight cost in memory and function-calling overhead.)” [RBPEL91]

Virtual, however, is the wrong mechanism for the programmer to deal with. A compiler can detect polymorphism, and generate the underlying virtual code, where and only where necessary. Having to specify virtual burdens the programmer with another bookkeeping task. This is the main reason why C++ is a weak object-oriented language as the programmer must constantly be concerned with low level details, which should be automatically handled by the compiler.

Another problem in C++ is mistaken overriding. The base class routine can be overridden unwittingly. The compiler should report an erroneous name redefinition within the same namespace unless the descendant class programmer specifies that the routine redefinition is really intended. The same name can be used, but the programmer must be conscious of this, and state this explicitly, especially in environments where systems are assembled out of preexisting components. Unless the programmer explicitly overrides the original name a syntax error should report that the name is a duplicate declaration. C++, however, adopted the original approach of Simula. This approach has been improved upon, and other languages have adopted better, more explicit approaches, that avoid the error of mistaken redefinition.

The solution is that virtual should not be specified in the parent. Where run-time polymorphic dynamic-binding is required, the child class should specify override on the function. When compile-time static-binding is required, the child class should specify overload on the function. This has two advantages. Firstly, the compiler can check, in the case of polymorphic functions, that the function signatures conform, and in the case of overloaded functions, that the function signatures are different in some respect. Secondly, during the maintenance phases of a program, the original programmer’s intention is clear. As it is, later programmers must guess if the original programmer had made some kind of error in choosing a duplicate name, or whether overloading was intended.
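Later revisions of the language (C++11 onward) added an override specifier that covers the checking half of this proposal, although virtual must still be declared in the parent and there is no matching overload specifier. A minimal sketch, with hypothetical class and function names:

```cpp
// Hypothetical classes; needs a C++11 (or later) compiler.
struct Base {
    virtual ~Base() {}
    virtual int scale(int v) { return v * v; }
};

struct Derived : Base {
    // 'override' asks the compiler to confirm that this function really
    // overrides a virtual function of a base class with a matching signature.
    int scale(int v) override { return v * 20; }

    // Uncommenting the next line gives a compile-time error, because no
    // base class function has this signature:
    // int scale(int v, int factor) override { return v * factor; }
};

int callThroughBase(Base& b, int v) { return b.scale(v); }
```

The specifier is optional, so it documents and checks the intention only where the descendant class programmer chooses to write it.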

In Java, there is no virtual keyword; all methods are potentially polymorphic. Java uses direct call instead of dynamic method lookup when the method is static, private or final. This means that there will be non-polymorphic routines that must be called dynamically, but the dynamic nature of Java means further optimisation is not possible.

Eiffel and Object Pascal cater for this option as the descendant class programmer must specify that redefinition is intended. This has the extra benefit that a later reader or maintainer of the class can easily identify the routines that have been redefined, and that this definition is related to a definition in an ancestor class without having to refer to ancestor class definitions. Thus option 2 is exactly where it should be, in descendant classes.

Both Eiffel and Object Pascal optimise calls: they only generate dispatch table entries for dynamic binding where a routine is truly polymorphic. How this is possible is covered in the section on global analysis.

Option 3

The pure virtual function caters for leaving a function abstract, that is a descendant class must provide its implementation if it is to be instantiated. Any descendants that do not define the routine are also abstract classes. This concept is correct, but see the section on pure virtual functions for criticism of the terminology and syntax.

Java also has abstract methods, and in Eiffel, the implementation is marked as deferred.

Summary

The main problem with virtual is that it forces the base class designer to guess that a function might be polymorphic in one or more derived classes. If this requirement is not foreseen, or not included as an optimisation to avoid dynamically dispatched calls, the possibility is effectively closed, rather than being left open. As implemented in C++, virtual coupled with the independent notion of overloading makes an error prone combination.

Virtual is a difficult notion to grasp. The related concepts of polymorphism and dynamic binding, redefinition, and overriding are easier to grasp, being oriented towards the problem domain. Virtual routines are an implementation mechanism which instructs the compiler to set up entries in the class’s virtual table; where global analysis is not done by the compiler, this burden is left to the programmer. Polymorphism is the ‘what’, and virtual is the ‘how’. Smalltalk, Objective-C, Java, and Eiffel all use a different mechanism to implement polymorphism.

Virtual is an example of where C++ obscures the concepts of OOP. The programmer has to come to terms with low level concepts, rather than the higher level object-oriented concepts. Virtual leaves optimisation to the programmer. Other approaches leave the optimisation of dynamic dispatch to the compiler, which can remove 100% of cases where dynamic dispatch is not required. Interesting as underlying mechanisms might be for the theoretician or compiler implementor, the practitioner should not be required to understand or use them to make sense of the higher level concepts. Having to use them in practice is tedious and error-prone, can prevent the adaptation of software to further advances in the underlying technology and execution mechanisms (see concurrent programming), and reduces the flexibility and reusability of the software.

3.2 Global Analysis

[P&S 94] note that there are two world assumptions about type safety. The first is the closed-world assumption, where all parts of the program are known at compilation time, and type checking is done for the entire program. The second is the open-world assumption, where type checking is done independently for each module. The open-world assumption is useful when developing and prototyping. However, “When a finished product has matured, it makes sense to adopt the closed-world assumption, since it enables more advanced compilation techniques. Only when the entire program is known, is it possible to perform global register allocation, flow analysis, or dead code detection.” [P&S 94].

One of the major problems with C++ is the way analysis is divided between the compiler, which works under the open-world assumption, and the linker which is depended on to do very limited closed-world analysis. Closed-world or global analysis is essential for two reasons: firstly, to ensure that the assembled system is consistent; and secondly to remove burden from the programmer by providing automatic optimisations.

The main burden that can be removed from the programmer is that of a base class designer having to help the compiler build class virtual tables with the virtual function modifier. As explained in the section on virtual functions, this adversely affects software flexibility. Virtual tables should not be built when a class is compiled: rather virtual tables should only be built when the entire system is assembled. During the system assembly (linker) phase, the compiler and linker can entirely determine which functions need virtual table entries. Other burdens are that the programmer must use operators to help the compiler with information in other modules it cannot see, and the maintenance of header files.

In Eiffel and Object Pascal, global analysis of the entire system is done to determine the truly polymorphic calls and accordingly construct the virtual tables. In Eiffel this is done by the compiler. In Object Pascal, Apple extended the linker to perform global analysis. Such global analysis is difficult in a C/Unix style environment, so in C++ it was not included, leaving this burden to the programmer.

In order to remove this burden from the programmer, global analysis should have been put in the linker. However, as C++ was originally implemented as the Cfront preprocessor, necessary changes to the linker weren’t undertaken. The early implementations of C++ were a patchwork, and this has resulted in many holes. The design of C++ was severely limited by its implementation technology, rather than being guided by the principles of better language design, which would require dedicated compilers and linkers. That is, C++ has been severely limited by its original experimental implementation.

I am now convinced that such technology dependence has severely damaged C++ as an object-oriented language and as a high level language. A high level language removes the bookkeeping burden from the programmer and places it in the compiler, which is the primary aim of high level languages. Lack of global or closed-world analysis is a major deficiency of C++, which leaves C++ substantially lacking when compared to languages such as Eiffel. As Eiffel insists on system level validity and therefore global analysis, it means that Eiffel implementations are more ambitious than C++ implementations, and this is a major reason why Eiffel implementations have been slower to appear.

Java dynamically loads pieces of software and links them into a running system as required. Thus static compile-time global analysis is not possible, as Java is designed to be dynamic. However, Java has made the valid assumption that all methods are virtual. This is one reason why Java and Eiffel are substantially different tools, although Eiffel has recently introduced Dynamic Linking in Eiffel (DLE).

3.3 Type-safe linkage

The C++ ARM explains that type-safe linkage is not 100% type safe. If it is not 100% type-safe, then it is unsafe. Statistical analysis showed that in the Challenger disaster, the probability against an individual O-ring failure was .997. But in a combination of 6 this small margin for failure became significant, meaning the combination was very likely to fail. In software, we often find strange combinations cause failure. It is the primary objective of OO to reduce these strange combinations.

It is the subtle errors that cause the most problems, not the simple or obvious ones. Often such errors remain undetected in the system until critical moments. The seriousness of this situation cannot be overstated. Many forms of transport, such as planes, and space programs depend on software to provide safety in their operation. The financial survival of organisations can also depend on software. To accept such unsafe situations is at best irresponsible.

C++ type safe linkage is a huge improvement over C, where the linker will link a function f (p1, ...) with parameters to any function f (), maybe one with no or different parameters. This results in failure at run time. However, since C++ type safe linkage is a linker trick, it does not deal with all inconsistencies like this.

The C++ ARM summarises the situation as follows - “Handling all inconsistencies - thus making a C++ implementation 100% type-safe - would require either linker support or a mechanism (an environment) allowing the compiler access to information from separate compilations.”

So why do C++ compilers (at least AT&T’s) not provide for accessing information from separate compilations? Why is there not a specialised linker for C++, that actually provides 100% type safety? C++ lacks the global analysis of the previous section. Building systems out of preexisting elements is the common Unix style of software production. This implements a form of reusability, but not in the truly flexible and consistent manner of object-oriented reusability.

In the future, Unix might be replaced by object-oriented operating systems, that are indeed ‘open’ to be tailored to best suit the purpose at hand. By the use of pipes and flags, Unix software elements can be reused to provide functionality that approximates what is desired. This approach is valid and works with efficacy in some instances, like small in-house applications, or perhaps for research prototyping, but is unacceptable for widespread and expensive software, or safety critical applications. In the last ten years the advantages of integrated software have been acknowledged. Classic Unix systems don’t provide those advantages. Integrated systems are more ambitious, and place more demands on their developers, but this is the sort of software now being demanded by end users. Systems that are cobbled together are unacceptable. Today the emphasis is on software component technologies such as the public domain OpenDoc or Microsoft’s OLE.

A further problem with linking is that different compilation and linking systems should use different name encoding schemes. This problem is related to type-safe linkage, but is covered in the section on ‘reusability and compatibility’.

Java uses a different dynamic linking mechanism, which is well defined and does not use the Unix linker. Eiffel does not depend on the Unix or other platform linkers to detect such problems. The compiler must detect these problems.

Eiffel defines system-level validity. An Eiffel compiler is therefore required to perform closed-world analysis, and not rely on linker tricks. You can thus be sure that Eiffel programs are 100% type safe. A disadvantage of Eiffel is that compilers have a lot of work to do. (The common terminology is ‘slow’, but that is inaccurate.) This is overcome to some extent by Eiffel’s melting-ice technology, where changes can be made to a system, and tested without the need to recompile every time.

To summarise the last two sections: global or closed-world analysis is needed for two reasons: consistency checks and optimisations. This removes many burdens from the programmer, and its lack is a great shortcoming of C++.

3.4 Function Overloading

C++ allows functions to be overloaded if the arguments in the signature are different types. Overloaded functions are different to polymorphic functions: for each invocation the correct function is selected at compile time; with polymorphic functions, the correct function is bound dynamically at run-time. Polymorphism is achieved by redefining or overriding routines. Be careful not to confuse overriding and overloading. Overloading arises when two or more functions share a name. These are disambiguated by the number and types of the arguments. Overloading is different to multiple dispatching in CLOS, as multiple dispatching on argument types is done dynamically at run-time.

[Reade 89] points out the difference between overloading and polymorphism. Overloading means the use of the same name in the same context for different entities with completely different definitions and types. Polymorphism though has one definition, and all types are subtypes of a principal type. C. Strachey referred to polymorphism as parametric polymorphism and overloading as ad hoc polymorphism. The qualification mechanism for overloaded functions is the function signature.

Overloading can be useful as these examples show:

    max (int, int);
    max (real, real);

This will ensure that the best max routine for the types int and real will be invoked. Object-oriented programming, however, provides a variant on this. Since the object is passed to the routine as a hidden parameter (‘this’ in C++), an equivalent but more restricted form is already implicitly included in object-oriented concepts. A simple example such as the above would be expressed as:

    int i, j;
    real r, s;
    i.max (j);
    r.max (s);

but i.max (r) and r.max (j) result in compilation errors because the types of the arguments do not agree. By operator overloading of course, these can be better expressed, i max j and r max s, but min and max are peculiar functions that could accept two or more parameters of the same type so they can be applied to an arbitrarily sized list. So the most general code in Eiffel style syntax will be something like:

    il: COMPARABLE_LIST [INTEGER]
    rl: COMPARABLE_LIST [REAL]

    i := il.max
    r := rl.max

The above examples show that the object-oriented paradigm, particularly with genericity can achieve function overloading, without the need for the function overloading of C++. C++, however, does make the notion more general. The advantage is that more than one parameter can overload a function, not just the implicit current object parameter.
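The more general C++ form can be sketched as a set of free functions. The name largest is hypothetical, chosen instead of max to avoid any clash with the standard library, and double stands in for the real type used above.

```cpp
// Hypothetical overload set, for illustration only.
// The compiler selects a function from the number and types of the arguments.
int largest(int a, int b) { return a > b ? a : b; }

double largest(double a, double b) { return a > b ? a : b; }

// Overloads may also differ in the number of parameters.
int largest(int a, int b, int c) { return largest(largest(a, b), c); }
```

A mixed call such as largest(2, 2.5) is rejected as ambiguous at compile time, since neither two-argument overload is a better match than the other.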

Another factor to consider is that overloading is resolved at compile time, but overriding at run-time, so it looks as if overloading has a performance advantage. However, global analysis can determine whether the min and max functions are at the end of the inheritance line, and therefore can call them directly. That is, the compiler examines the objects i and r, looks at their corresponding max function, sees that at that point no polymorphism is involved, and so generates a direct call to max. By contrast, if the object was n which was defined to be a NUMBER which provided the abstract max function from which REAL.max and INTEGER.max were derived, then the compiler would need to generate a dynamically bound call, as n could refer to either an INTEGER or a REAL.

If it is felt that C++’s scheme of having parameters of different types is useful, it should be realised that object-oriented programming provides this in a more restricted and disciplined form. This is done by specifying that the parameter needs to conform to a base class. Any parameter passed to the routine can only be a type of the base class, or a subclass of the base class. For example:

    A.f (B someB) {...};
    class B ...;
    class D : public B ...
    A a;
    D d;
    a.f (d);

The entity ‘d’ must conform to the class ‘B’, and the compiler checks this.
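The outline above can be filled in as a compilable sketch. The member function who and the class Unrelated are assumptions added only to make the conformance check visible; they are not part of the original outline.

```cpp
#include <string>

// Hypothetical classes, for illustration only.
struct B {
    virtual ~B() {}
    virtual std::string who() const { return "B"; }
};

struct D : B {
    virtual std::string who() const { return "D"; }
};

struct Unrelated {};

struct A {
    // Accepts a B, or any object of a class publicly derived from B.
    std::string f(const B& someB) const { return "f got " + someB.who(); }
};

// A a; a.f(Unrelated());   // would not compile: Unrelated does not conform to B
```

Passing a D is accepted because D conforms to B, and the virtual call inside f still reaches D::who.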

The alternative to function overloading by signature is to require functions with different signatures to have different names. Names should be the basis of distinction of entities. The compiler can cross check that the parameters supplied are correct for the given routine name. This also results in better self-documented software. It is often difficult to choose appropriate names for entities, but it is well worth the effort.

[Wiener 95] contributes a nice example on the hazards of virtual functions with overloading:

    class Parent
    {
    public:
        virtual int doIt (int v)
        {
            return v * v;
        }
    };

    class Child : public Parent
    {
    public:
        int doIt (int v, int av = 20)
        {
            return v * av;
        }
    };

    int main()
    {
        int i;
        Parent *p = new Child();
        i = p->doIt(3);
    }

What is the value in i after execution of this program? One might expect 60, but it is 9 as the signature of doIt in Child does not match the signature in Parent. It therefore does not override the Parent doIt, merely overloads it, and the default is unusable.
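The hazard can be made concrete by making the same call through both pointer types. This is a sketch based on the example above; the helper functions throughParent and throughChild and the added virtual destructor are assumptions, not part of the cited example.

```cpp
// Classes follow the example above; the helpers are hypothetical.
struct Parent {
    virtual ~Parent() {}
    virtual int doIt(int v) { return v * v; }
};

struct Child : Parent {
    // Different signature: this does not override Parent::doIt.
    int doIt(int v, int av = 20) { return v * av; }
};

// Lookup starts in Parent, finds doIt(int), and dispatches to Parent::doIt
// because Child supplies no override with that signature.
int throughParent(Parent* p) { return p->doIt(3); }

// Lookup starts in Child, finds doIt(int, int = 20), and uses the default.
int throughChild(Child* c) { return c->doIt(3); }
```

The same object therefore answers 9 or 60 to what reads as the same call, depending only on the declared type of the pointer.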

Java also provides method overloading, where several methods can have the same name, but have different signatures.

The Eiffel philosophy is not to introduce a new technique, but to use genericity, inheritance and redefinition. Eiffel provides covariant signatures, which means the signatures of descendant routines do not have to match exactly, but they do have to conform, according to Eiffel’s strong typing scheme.

Eiffel uses covariance with anchored types to implement examples such as max. The Vintage 95 Kernel Library specifies max as:

    max (other: like Current): like Current

This says that the type of the argument to max must conform to the type of the current class. Therefore you get the same effect by redefinition without the overloading concept. You also get type checking to see that the parameter conforms to the current object. Genericity is also a mechanism that overcomes most of the need for overloading.

3.5 The Nature of Inheritance

Inheritance is a close relationship providing a fundamental OO way to assemble software components, along with composition and genericity. Objects that are instances of a class are also instances of all ancestors of that class. For effective object-oriented design the consistency of this relationship should be preserved. Each redefinition in a subclass should be checked for consistency with the original definition in an ancestor class. A subclass should preserve the requirements of an ancestor class. Requirements that cannot be preserved indicate a design error and perhaps inheritance is not appropriate. Consistency due to inheritance is fundamental to object-oriented design. C++’s implementation of non-virtual overloading means that the compiler does not check for this consistency. C++ does not provide this aspect of object-oriented design.

Inheritance has been classified as ‘syntactic’ inheritance and ‘semantic’ inheritance. Saake et al describe these as follows: “Syntactic inheritance denotes inheritance of structure or method definitions and is therefore related to the reuse of code (and to overriding of code for inherited methods). Semantic inheritance denotes inheritance of object semantics, ie of objects themselves. This kind of inheritance is known from semantic data models, where it is used to model one object that appears in several roles in an application.” [SJE 91]. Saake et al concentrate on the semantic form of inheritance. Behavioural or semantic inheritance expresses the role of an object within a system.

Wegner, however, believes code inheritance to be of more practical value. He classifies the difference between syntactic and semantic inheritance as code and behaviour hierarchies [Weg91] (p43). He suggests these are rarely compatible with each other and are often negatively correlated. Wegner also poses the question of “How should modification of inherited attributes be constrained?” Code inheritance provides a basis for modularisation. Behavioural inheritance provides modelling by the ‘is-a’ relationship. Both are useful in their place. Both require consistency checks that combinations due to inheritance actually make sense.

It seems that inheritance is most powerful in the most restrictive form of a semantics preserving relationship; a subclass should preserve the assumptions of ancestor classes.

Meyer [Meyer 96a and 96b] has also produced a classification of inheritance techniques. In his taxonomy he identifies 12 uses of inheritance, all of which he finds useful. This analysis also gives a good idea of when inheritance can be used, and when it should not.

Software components are like jig-saw pieces. When assembling a jig-saw the shape of the pieces must fit, but more importantly, the resulting picture must make sense. Assembling software components is more difficult. A jig-saw is reassembling a picture that was complete before. Assembling software components is building a picture that has never been seen before. What is worse, is that often the jig-saw pieces are made by different programmers, so when the whole system is assembled, the pictures must fit.

Inheritance in C++ is like a jig-saw where the pieces fit together, but the compiler has no way of checking that the resultant picture makes sense. In other words C++ has provided the syntax for classes and inheritance but not the semantics. Reusable C++ libraries have been slow to appear, which suggests that C++ might not support reusability as well as possible. By contrast Java, Eiffel and Object Pascal are packaged with libraries. Object Pascal went very much in hand with the MacApp application framework. Java has been released coupled with the Java API, a comprehensive library. Eiffel is also integrated with an extremely comprehensive library, which is even larger than Java’s. In fact the concept of the library preceded Eiffel as a project to reclassify and produce a taxonomy of all common structures used in computer science. [Meyer 94].

3.6 Multiple Inheritance

Both Eiffel and C++ provide multiple inheritance. Java does not, claiming it results in many problems. Instead Java provides interfaces, which are similar to Objective C’s protocols. Sun claims interfaces provide all the desirable features of multiple inheritance.

Sun’s claim that multiple inheritance results in problems is true particularly in the way that C++ has implemented multiple inheritance. What seems like a simple generalisation of inheriting from multiple classes instead of just one, turns out to be non-trivial. For example, what should be the policy if you inherit an item of the same name from two classes? Are they compatible? If so should they be merged into a single entity? If not, how do you disambiguate them? And so the list goes on.

Java’s interface mechanism implements multiple inheritance, with one important difference: the inherited interfaces must be abstract. This does obviate the need to choose between different implementations, as with interfaces there are no implementations. Java allows the declaration of constant fields in an interface. Where these are multiply inherited, they merge to form one entity so that no ambiguity arises, but what happens if the constants have different values?

Since Java does not have multiple inheritance, you cannot do mixins as you can in C++ and Eiffel. Mixin is the ability to inherit sets of non-abstract routines from different classes to build a new complex class. For example, you might want to import utility routines from a number of different sources. However, you can achieve the same effect using composition instead of inheritance, so this is probably not a great minus against Java.

Eiffel solves multiple inheritance problems without having to introduce a separate, interface mechanism.

Some feel that single inheritance is elegant byitself, but that multiple inheritance is not. This isone particular standpoint.

BETA [Madsen 93] falls into the ‘multipleinheritance is inelegant’ category: “Beta does nothave multiple inheritance, due to the lack of aprofound theoretical understanding, and alsobecause the current proposals seem technically verycomplicated.” They cite Flavors as a language thatmixes classes together, where according to Madsen,the order of inheritance matters, that is inheriting(A, B) is different from inheriting (B, A).

Ada 95 is also a language that avoids multipleinheritance. Ada 95 supports single inheritance asthe tagged type extension.

Others feel that multiple inheritance can provide elegant solutions to particular modelling problems so is worth the effort. Although the above list of questions arising from multiple inheritance is not complete, it shows that the problems with multiple inheritance can be systematically identified, and once the problems are recognised, they can be solved elegantly. While [Sakkinen 92] goes into the problems of multiple inheritance in great depth, he defends it.

Eiffel has taken the approach that multiple inheritance poses some interesting and challenging problems, but rises to the challenge, and solves them elegantly. Nor does the order of inheritance matter. All resolutions that the programmer must specify are given in the inheritance clause of a class. This includes renaming to ensure that multiple features inherited with the same name end up as multiple features with unambiguous names, redefining, new export policies for inherited features, undefining, and disambiguating with select. In all cases, the action taken by the compiler, whether using fork or join semantics, is made clear, and the programmer has complete control.

C++ has a different disambiguation mechanism to Eiffel. In Eiffel, one or both of the features must be given a different name in the rename clause. In C++ the members must be disambiguated using the scope resolution operator ‘::’. The advantage of the Eiffel approach is that the ambiguity is dealt with declaratively in one place. Eiffel’s inheritance clause is considerably more complex than C++’s, but the code is considerably simpler, more robust and flexible, which is the advantage of the declarative approach as against the operator approach. In C++, you must use the scope resolution operator in the code, every time you run into an ambiguity problem between two or more members. This clutters the code, and makes it less malleable, as if anything changes that affects the ambiguity, you potentially have to change the code everywhere the ambiguity occurs.
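The clutter that the operator approach produces can be seen in a small C++ sketch. The classes Printer, Scanner and Copier and the helper scanner_name are hypothetical names chosen for illustration; they do not come from the ARM or [Stroustrup 94].

```cpp
#include <string>

// Hypothetical classes: both bases declare a member called name().
struct Printer {
    std::string name() const { return "printer"; }
};

struct Scanner {
    std::string name() const { return "scanner"; }
};

// Copier inherits two members called name(). An unqualified call to
// name() on a Copier is ambiguous and is rejected by the compiler.
struct Copier : Printer, Scanner {
    // Every use inside the class must qualify the name with '::'.
    std::string describe() const {
        return Printer::name() + "+" + Scanner::name();
    }
};

// Client code carries the same obligation at every call site.
inline std::string scanner_name(const Copier& c) {
    return c.Scanner::name();
}
```

If one of the base classes later renames or removes its name() member, each qualified use has to be found and revisited, which is the maintenance cost described above.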

According to [Stroustrup 94] section 12.8, the ANSI committee considered renaming, but the suggestion was blocked by one member who insisted that the rest of the committee go away and think about it for two weeks. The example in section 12.8 shows how the effect of renaming is achieved, without explicit renaming. The problem is, if it took this group of experts two weeks to work this out, what chance is there for the rest of us?

The scope resolution operator is used for more than just multiple inheritance disambiguation. Since ambiguities could be avoided by cleaner language design, the scope resolution operator is an ugly complication.

The question of whether the order of declaration of multiple parents matters in C++ is complex. It does affect the order in which constructors are called, and can cause problems if the programmer does really want to get low level. However, this would be considered poor programming practice.
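The effect of declaration order on constructor calls can be observed directly. The following is a minimal sketch; construction_log and build_both are hypothetical helpers introduced only to record and return the order.

```cpp
#include <string>

// Hypothetical log that records the order in which constructors run.
inline std::string& construction_log() {
    static std::string log;
    return log;
}

struct A { A() { construction_log() += "A"; } };
struct B { B() { construction_log() += "B"; } };

// Base class constructors run in the order the bases are declared.
struct AB : A, B { AB() { construction_log() += "."; } };
struct BA : B, A { BA() { construction_log() += "."; } };

// Builds one object of each class and returns the recorded order.
inline std::string build_both() {
    construction_log().clear();
    AB ab;
    BA ba;
    return construction_log();
}
```

Here build_both() yields "AB.BA.": the two classes have the same parents, but swapping them in the declaration swaps the order of construction.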

Another difference between C++ and Eiffel is direct repeated inheritance. Eiffel allows:

class B inherit A, A end

but

class B : public A, public A { };

is disallowed in C++.

3.7 Virtual Classes

The meaning of the keyword virtual is quite different when used in the context of a class to the context of a function: with a class it means that multiply inherited features are merged; with a function it means polymorphism. Virtual class does not mean that members in the class are all polymorphic. In fact the two uses of virtual actually mean quite the opposite of each other: virtual functions mean that there could be more than one function; virtual classes mean that if the class is multiply inherited, you only get a single copy.

C++ saves on keywords by overloading one keyword in several contexts, even though the uses have different or even opposite meanings. Static is another case, which is used in three different contexts. The keyword count metric does not show that C++ is a small non-complex language: fewer keywords have made C++ more complex and confusing.

So what do virtual classes do? If class D multiply inherits class A via classes B and C, then if D wants to inherit only a single shared copy of A, the inheritance of A must be specified as virtual in both B and C. C++ virtual classes raise two questions. Firstly, what happens if A is declared virtual in only one of B or C? Secondly, what if another class E wants to inherit multiple copies of A via B and C? In C++, the virtual class decision must be made early, reducing the flexibility that might be required in the assembly of derived classes. In a shared software environment different vendors might supply classes B and C. It should be left to the implementor of class D or E, exactly how to resolve this problem. And this is the simplest case: what if A is inherited via more than two paths, with more than two levels of inheritance? Flexibility is key to reusable software. You cannot envisage when designing a base class all the possible uses in derived classes, and attempting to do so considerably complicates design.
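The D and E cases can be sketched in C++ as follows. The member value, the second pair of intermediate classes B2 and C2, and the two helper functions are assumptions added for illustration. Note that E cannot reuse B and C: because the choice is fixed in the intermediate classes, a second pair without the virtual keyword is needed.

```cpp
struct A { int value = 0; };

// Shared copy: both intermediate classes must say 'virtual'.
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};            // D contains a single A

// Separate copies: the intermediate classes omit 'virtual'.
struct B2 : A {};
struct C2 : A {};
struct E : B2, C2 {};          // E contains two A subobjects

inline bool shares_one_copy() {
    D d;
    d.B::value = 7;            // writes the one shared A
    return d.C::value == 7;    // the same A is seen through C
}

inline bool has_two_copies() {
    E e;
    e.B2::value = 7;           // writes only the A reached through B2
    e.C2::value = 0;           // the A reached through C2 is distinct
    return e.B2::value == 7 && e.C2::value == 0;
}
```

Both functions return true, which shows that the difference between D and E is decided entirely in the declarations of their parents.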

As Java has no multiple inheritance, there is no problem to be solved here.

The Eiffel mechanism allows two classes D and E inheriting multiple copies of A to inherit A in the appropriate way independently. You do not have to choose in intermediate classes whether A is virtual, i.e., inherited as a single copy, or not. The inheritance is more flexible and done on a feature by feature basis, and each feature from A will either fork, in which case it becomes two new features; or join, in which case there is only one resultant feature. The programmer of each descendant class can decide whether it is appropriate to fork or join each feature independently of the other descendants, or any policy in A.

The fine grained approach of Eiffel is a significant benefit over C++. While the Eiffel approach is more sophisticated and flexible, the syntax is far simpler, and the concepts are easier to understand.

3.8 Templates

Templates are C++’s mechanism to implement the concept of genericity. Templates are much the same as parameterised classes, which is the mechanism Eiffel uses for genericity. Genericity is a major feature of Ada and Algol 68 and is a valuable addition to C++. Some see genericity as a more fundamental software assembly mechanism than inheritance, and certainly less problematic. Ada is an example where genericity is more fundamental than inheritance. In C++’s Standard Template Library (STL), genericity is used almost exclusively instead of inheritance. Meyer [Meyer 88] states that genericity is an essential part of an object-oriented language. [P&S 94] see genericity as a mechanism that achieves type substitution, which you cannot do with inheritance. Thus genericity is essential as a complementary concept to inheritance.

Genericity allows you to build collections of items, where the type of items is known, and items can be retrieved from the collection as that type, without type casting. In a language without genericity you code a LIST class, and objects of any type can be added to lists. If the list is only for shopping items, it makes semantic nonsense to add a person to the list. Without genericity there is no static type check to ensure you can’t add people to your shopping list. You might be able to catch this occurrence at run time, but the advantage of static typing is lost.

Without genericity you could code specific lists for shopping items, people, and every other item you could put in lists. The basic functionality of all lists is the same, but you must duplicate effort, and manually replicate code. That is, you must duplicate effort if you are going to preserve semantics and be type safe.

Languages such as Eiffel and C++ allow you to declare a LIST of shopping items, so the compiler can ensure that you cannot add people to such a list. You can also easily add lists that contain any other type of entity, just by a simple declaration. You do not have to manually replicate the basic functionality of the list for every type of element you are going to put in it.
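The shopping list can be sketched with a C++ template. The names List, ShoppingItem, Person and make_shopping_list are hypothetical, and std::vector is used for storage only to keep the sketch short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical element types for illustration.
struct ShoppingItem { std::string label; };
struct Person { std::string name; };

// One definition of the list serves every element type.
template <typename T>
class List {
public:
    void add(const T& item) { items_.push_back(item); }
    const T& at(std::size_t i) const { return items_.at(i); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

inline List<ShoppingItem> make_shopping_list() {
    List<ShoppingItem> shopping;
    shopping.add(ShoppingItem{"milk"});
    shopping.add(ShoppingItem{"bread"});
    // shopping.add(Person{"Ann"});  // rejected at compile time: wrong type
    return shopping;
}
```

A list of people is obtained by declaring List<Person>, with no further code, and items come back from at() as the declared type without a cast.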

This has led to a criticism of the C++ template mechanism that you get ‘code bloat’. That is, for every type based on a template definition the compiler might replicate the code. Seeing that the purpose of templates is to save the programmer from manual replication, this does not seem like a bad thing. A good implementation of C++ will avoid ‘code bloat’ where possible. In fact it is allowed for in the C++ ARM: “This can cause the generation of unnecessarily many function definitions. A good implementation might take advantage of the similarity of such functions to suppress spurious replications.”

Thus I don’t criticise C++ as others have done on the basis of ‘code bloat’. The whole concept of generics and templates is simple and yet powerful, and allows the generation of quite sophisticated programs from simple specifications. If you are overly worried about ‘code bloat’, simply do not use genericity. As [Stroustrup 94] points out “What you don’t use, you don’t pay for.” This is a good principle for compiler implementors. Many people will use genericity though, as few will find it practical to code a different kind of LIST for every possible list element.

While the concept of genericity and templates is correct, there are several problems with templates in C++. The syntax leaves a lot to be desired. Readers can of course form their own opinions of that. However, again C++ masks what is a simple and powerful mechanism with complicated syntax, so people will baulk at using it. There are examples of where the quirky syntax is a trap for young players [Stroustrup 94]. For example, declaring a list of a list of integers would easily be notated:

List<List<int>> a;

However, this results in a syntax error as ‘>>’ is the right shift or output operator. You must notate this as ‘> >’:

List<List<int> > a;

Further, “template” is confusing terminology, as the conceptual view is that a class is a template for a set of objects. “Object-oriented languages allow one to describe a template, if you will, for an entire set of objects. Such a template is called a class.” [Ege 96]. This is not the meaning of the C++ term template, which refers to genericity.

Another more serious problem is that there is no constraint on the types that can be used as the parameters to the templates; the coder of a template class can make no assumptions about the type of the generic parameter. Thus the class coder cannot rely on any operation of the generic type being available: a call from within the template class to the generic type is checked only when the template is used with an actual type.

As the ARM says on this topic: “Specifying no restrictions on what types can match a type argument gives the programmer the maximum flexibility. The cost is that errors - such as attempting to sort objects of a type that does not have comparison operators - will not in general be detected until link time.”

This shows the need for at least an optional type constraint on the actual types passed to the template. Eiffel has such optional constraints in the form of constrained genericity. For example:

    class SORTED_LIST [T -> COMPARABLE]
    ...
    feature
        insert (item: T) is ... end
    end

ensures that the type of the item to insert has appropriate comparison operators from type COMPARABLE in order to insert item in the right place in the SORTED_LIST. Note that multiple inheritance is important, so that any type eligible for insertion in the SORTED_LIST includes the comparison operators.
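For contrast, here is a sketch of the unconstrained C++ counterpart. SortedList, Opaque and sorted_demo are hypothetical names. The requirement that T be comparable appears only in a comment, whereas Eiffel states it in the class header. The ARM speaks of detection at link time; compilers I am aware of today report the error when the offending member is instantiated, but in either case the declaration of the template itself says nothing.

```cpp
#include <algorithm>
#include <vector>

// The requirement "T must support operator<" is stated only in this
// comment; nothing in the template's declaration expresses it.
template <typename T>
class SortedList {
public:
    void insert(const T& item) {
        // upper_bound compares elements using operator< on T.
        typename std::vector<T>::iterator pos =
            std::upper_bound(items_.begin(), items_.end(), item);
        items_.insert(pos, item);
    }
    const std::vector<T>& items() const { return items_; }
private:
    std::vector<T> items_;
};

struct Opaque {};   // hypothetical type with no comparison operators

inline std::vector<int> sorted_demo() {
    SortedList<int> list;
    list.insert(3);
    list.insert(1);
    list.insert(2);
    // SortedList<Opaque> bad;   // declaring this is accepted...
    // bad.insert(Opaque());     // ...the error surfaces only at this use
    return list.items();
}
```

The integers come back in ascending order. The commented-out lines mark where an unsuitable type would first be rejected, well away from the definition of SortedList.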

Java, alas, has no genericity mechanism. The Java recommendation is to use type casts whenever retrieving an object from a container class [Flan 96].

[P&S 94] have a good chapter on genericity. Genericity is the ability to build a derived class from a base class by type substitution. Compare this with inheritance, where you can add class members and redefine inherited routines. They criticise the parameterised class/template mechanisms of Eiffel and C++ for three reasons: firstly, there are two kinds of class, generic and non-generic; secondly, you can apply generic instantiation only once; and thirdly, a generic instance is not a subclass.

BETA uses a different mechanism, virtual binding, which is more flexible than the Eiffel/C++ parameterised classes, but [P&S 94] shows that you can produce derived classes that are not statically type correct.

A significant problem with the parameterised class mechanism is that the base class designer must think about it in advance, and then only the types nominated in the parameter list can be substituted. This reduces flexibility. [P&S 94] suggests a genericity mechanism known as class substitution, which makes inheritance and genericity orthogonal rather than independent concepts. Class substitution has the advantage that a base class designer does not need to design genericity into the base class; any subclass can perform class substitution; and any type in the base class may be substituted, not only those given in the parameter list. Furthermore, class substitution can be applied repeatedly, whereas instantiation of a parameterised class can be done only once.

An example of class substitution in Eiffel-like syntax is:

    class A
    feature
        x, y: T

        assign is
            do
                x := y
            end
    end

This can be modified using class substitution:

    A [T <- INTEGER]
    A [T <- ANIMAL]

You can also use constrained genericity with exactly the same syntax that Eiffel now has, as in the SORTED_LIST example, except that semantically the [T -> COMPARABLE] only specifies that any class substituting T must be a subclass of COMPARABLE. [T -> COMPARABLE] is not a parameter list though. You can build new types out of sorted list:

    SORTED_LIST [T <- INTEGER]
    SORTED_LIST [T <- STRING]

Java might be in the best position to implement this flexible class substitution mechanism for genericity, as it has not implemented genericity yet. Eiffel and C++ could extend their mechanisms, but then there would be two ways of doing the same thing, except the class substitution mechanism is more flexible than parameterised classes. I do not know of any languages that implement class substitution as yet, and other consequences must be thought through before adding it to languages, so don’t dispose of your Eiffel and C++ compilers just yet!

3.9 Name Overloading

Clear names are fundamental in producing self-documenting software, helping to produce maintainable and reusable software components. Names are fundamental in freeing programmers from low level manipulation of addresses. Naming is the basis for differentiating between different entities in a software module. In programming, when we use the term name, we usually mean identifier. To be precise, a name is a label which can refer to more than one entity, in which case the name is ambiguous. An identifier is a name that unambiguously identifies an entity. (To be mathematical, a name is a relation, an identifier is a function.) Where a name is ambiguous, it needs qualification to form an identifier to the entity. For example, there could be two people named John Doe; to disambiguate the reference, you would qualify each as John Doe of Washington or John Doe of New York.

Name overloading allows the same name to refer to two or more different entities. The problem with an ambiguous name is whether the resultant ambiguity is useful, and how to resolve it, as ambiguity weakens the usefulness of names to distinguish entities.

Name overloading is useful for two purposes. Firstly, it allows programmers to work on two or more modules without concern about name clashes. The ambiguity can be tolerated as within the context of each module the name unambiguously refers to a unique entity; the name is qualified by its surrounding environment. Secondly, name overloading provides polymorphism, where the same name applied to different types refers to different implementations for those types. Polymorphism allows one word to describe ‘what’ is computed. Different classes might have different implementations of ‘how’ a computation is done. For example ‘draw’ is an operation that is applicable to all different shapes, even though circles and squares, etc., are ‘drawn’ differently.

These two uses of name overloading provide a powerful concept. The use of the same name in the same context must be resolved. Errors can result from ambiguity, in which case the programmer must differentiate between entities with some form of qualification of the name. A common way to do this is to introduce extra distinguishing names. For example, in a group of people where two or more share the same first name, they can be distinguished by their surname. Similarly a unique first name will distinguish the members of a family with a common surname.

This is analogous to classes, where each class in a system is given a unique name. Each member within a class is also given a unique name. Where two objects with members of the same name are used within the same context, the object name can qualify the members. In this case the dot operator acts as a qualifier, for example, a.mem and b.mem.

Locals in a recursive environment are an example of ambiguity which is resolved at run-time. A single local identifier in the static text of a function can refer to many entities. When the function is called recursively, the name is qualified by the call history of the function to give the exact memory cell where it resides.

Many block structured languages provide overloading by scoping. Scoping allows the same name to be used in different contexts without clash or confusion, but nested blocks have a subtle problem. Names in an outer block are in scope in inner blocks, but many languages allow a name to be overloaded in an inner block, creating a ‘scope hole’ hiding the outer entity, preventing it from being accessed. The name in the inner block has no relationship with the entity of the same name in the outer block. Textually nested blocks ‘inherit’ named entities from outer blocks. Inheritance accomplishes this in object-oriented languages, eliminates the need to textually nest entities, and accomplishes textual loose coupling. Nesting results in tightly coupled text.

Contrary to most languages, a name should not be overloaded while it is in scope. The following example illustrates why:

    {
        int i;
        {
            int i;   // hide the outer i.
            i = 13;  // assign to the inner i.
            // Can’t get to the outer i here.
            // It is in scope, but hidden.
        }
    }

Now delete the inner declaration:

    {
        int i;
        {
            i = 13;  // Syntactically valid,
                     // but not the intention.
        }
    }

The inner overloaded declaration is removed, and references to that name do not result in syntax errors due to the same name being in the outer environment. The inner instruction now mistakenly changes the value of the outer entity. A compiler cannot detect this situation unless the language definition forbids nested redeclarations. E.W. Dijkstra uses similar reasoning in ‘An essay on the Notion: “The Scope of Variables”’ in “A Discipline of Programming,” [Dijkstra 76].

The above example demonstrates how nesting results in less maintainable programs due to tight coupling between the inner and outer blocks, making each sensitive to changes in the other. The advantage of keeping components decoupled and separate is that a programmer can confidently make modifications to one component without affecting other components. Testing can be limited to the changed component, rather than a combination of components, which quickly leads to an exponentiation in the number of tests required.

In Eiffel, overloading is recognised as being problematic, so even this form is disallowed: routine arguments and local variables cannot overload names of class features.

C++ has another analogous form of hiding: a non-virtual function in a derived class hides a function with the same signature in an ancestor class. This hiding is explained in section 13.1 of the C++ ARM. This is confusing and error prone. Learning all these ins and outs of the language is extremely burdensome to the programmer, often being learnt only after falling into a trap. Java does not have this problem as everything is virtual, so a function with the same signature will override rather than hide the ancestor function.
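The difference between hiding and overriding shows up as soon as an object is used through a reference to its base class. The following sketch uses hypothetical names (Base, Derived, plain, poly) and is not the example from the ARM.

```cpp
#include <string>

struct Base {
    std::string plain() const { return "Base"; }            // non-virtual
    virtual std::string poly() const { return "Base"; }     // virtual
    virtual ~Base() {}
};

struct Derived : Base {
    std::string plain() const { return "Derived"; }   // hides Base::plain
    std::string poly() const { return "Derived"; }    // overrides Base::poly
};

// The caller sees only a reference to Base.
inline std::string call_plain(const Base& b) { return b.plain(); }
inline std::string call_poly(const Base& b) { return b.poly(); }
```

Given a Derived object d, call_plain(d) returns "Base" while call_poly(d) returns "Derived", even though the two member functions are declared almost identically in the derived class.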

In order to overcome the effects of hiding, you can use the scope resolution operator ‘::’. The scope resolution operator of C++ provides an interesting twist to the above argument. Consider the following example from p16 of the ARM:

    int g = 99;

    int f(int g)  // hide the outer g.
    {
        return g ? g : ::g;  // return argument if it
                             // is nonzero otherwise
                             // return global g
    }

This would be simpler if the compiler reported an error on the redefinition of g in the parameter list: the programmer would simply change the name of one of the entities with no need for the scope resolution operator:

    int g = 99;

    int f(int h)
    {
        return h ? h : g;
    }

With the introduction of namespaces in 1993, the ‘::’ operator now resolves names in namespaces. For example A::x means the entity x in namespace A. Above, ::g means the entity g in the global namespace. Since declarations in a namespace are really just members of a fixed structure, it would have been cleaner to just use the access operator ‘.’, and avoid the ugly scope resolution operator.
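Both uses of the operator can be put side by side in a short sketch. The function f follows the ARM example quoted above; the namespace A, its member x and the helper read_a are assumptions added to show the namespace case.

```cpp
int g = 99;                 // global, as in the ARM example

namespace A {
    int x = 1;              // hypothetical namespace member
}

inline int f(int g) {       // the parameter hides the global g
    return g ? g : ::g;     // '::g' names the g in the global namespace
}

inline int read_a() {
    return A::x;            // 'A::x' names the x inside namespace A
}
```

The same ‘::’ token thus serves for globals, namespaces and, as seen earlier, base class members.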

Java does not provide a scope resolution operator. However, there are no globals, so the only case where the above is a problem is between class members, and method parameters or locals.

Java does have a similar problem though. The problem is with shadowed variables. With shadowed variables, a variable named x in a superclass can be hidden from the current class by another variable named x. You can still access both variables by the use of this.x and super.x, which are the equivalents of scope resolution. The ambiguity problem would have been better avoided altogether by reporting a duplicate identifier.

Eiffel also has no globals, so a construct such as namespaces is not needed. Eiffel does not allow name clashes: you must either change the name of one of the entities, or when combining classes with inheritance, use a rename clause. With this scheme there is no need for scope resolution or ‘super’ operators, making the imperative part of the language simpler, by using declarative techniques.

3.10 Nested Classes

Simula provided textually nested classes similar to nested procedures in ALGOL. Textual (syntactic) nesting should not be confused with semantic nesting, nor static modelling with dynamic run-time nesting. Modelling is done in the semantic domain, and should be divorced from syntax; you do not need textually nested classes to have nested objects. Nested classes are contrary to good object-oriented design, and the free spirit of object-oriented decomposition, where classes should be loosely coupled, to support software reusability.

Instead of tightly coupled environments:

[Figure: textually nested environments A, B, C, ..., Z]

You should decouple depending on the modelling requirements:

    A

    B inherit A      or      B a: A
    C inherit A      or      C a: A
    ...                      ...
    Z inherit A      or      Z a: A

    is-a                     component-of/related-to

This is a more flexible arrangement, both in terms of modelling and program maintenance.

There are two problems with nested classes: firstly, the inner class is dependent on the outer class, and so is not reusable, contrary to good object-oriented design, where classes are independent; secondly, the inner class has access to the implementation of the outer class, so implementation hiding is violated. Where access to a class’s implementation is needed, you should use inheritance, but note this models the is-a relationship, not the component-of relationship that nested classes do.

Semantic nesting is achieved independently of textual nesting. In object-oriented design all objects should interact only via well defined interfaces, but objects of a class that is textually nested in another class have access to the outer object without the benefit of a clean interface. C avoided the complexity of nested functions, but C++ has chosen to implement this complexity for classes, which is of less use than nested functions, and is contrary to good object-oriented design.

Pascal and ALGOL programmers sometimes use nested procedures in order to group things together, but nested procedures are not necessary, and if you want to use a nested procedure in another environment, you have to dig it out of where it is and make it global, which is a maintenance problem. If the procedure uses locals from the outer environment, you have more problems. You will have to change these to parameters, which is a cleaner approach anyway, and you will probably have to unindent all the text by one or more levels. Textually nested classes have worse problems.

Semantically, OOP achieves nesting in two ways: by inheritance and object-oriented composition. Modelling nesting is achieved without tight textual coupling. Consider a car. In the real world the engine is embedded in the car, but in object-oriented modelling embedding is modelled without textual nesting. Both car and engine are separate classes: the car contains a reference to an engine object. This allows the vehicle and engine hierarchies to be independently defined. Engine is derived independently into petrol, diesel, and electric engines. This is simpler, cleaner and more flexible than having to define a petrol engine car, a diesel engine car, etc., which you have to do if you textually nest the engine class in the car. In the real world you can change the car’s engine, so it does not even make sense to tightly couple the car and the engine.
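The car and engine arrangement might be sketched in C++ as below. The class and member names (Engine, PetrolEngine, DieselEngine, Car, fuel, replace_engine) are hypothetical, and a pointer member stands in for the reference mentioned in the text.

```cpp
#include <string>

class Engine {
public:
    virtual ~Engine() {}
    virtual std::string fuel() const = 0;
};

class PetrolEngine : public Engine {
public:
    std::string fuel() const { return "petrol"; }
};

class DieselEngine : public Engine {
public:
    std::string fuel() const { return "diesel"; }
};

// Car refers to an engine; the Engine class is not nested inside it.
class Car {
public:
    explicit Car(Engine& engine) : engine_(&engine) {}
    void replace_engine(Engine& engine) { engine_ = &engine; }
    std::string fuel() const { return engine_->fuel(); }
private:
    Engine* engine_;   // refers to a separately defined engine object
};
```

One Car class serves every kind of engine, and the engine can be swapped at run time, which a textually nested engine class would not allow.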

In C++, not only can classes be nested within other classes, but also within functions, thereby tightly coupling a class to a function. This confuses class definition with object declaration. The class is the fundamental structure in object-oriented programming and nothing has existence separate from a class (including globals).

Neither Java nor Eiffel provides nested classes, and yet everything you can model in C++, you can also model in these languages, without the problems associated with textual nesting.

Chapter 18 of [Madsen 93] provides very good insights about modelling; classification and composition are the means to organise complexity in terms of hierarchies. [Madsen 93] enumerates four kinds of composition: whole-part composition, reference composition, localisation, and concept composition. They say that these are not altogether independent as one composition relationship could fall into two or more categories. Whole-part composition models the car example above, where the engine is part of the car. Reference composition is illustrated where a person makes a hotel reservation. The person is not a part of the reservation, but the reservation references the person. [Madsen 93] can be consulted for definitions of localisation and concept composition.

As examples can be given of composition that can be modelled in terms of more than one of the categories of composition, it is better not to provide direct modelling of this in the programming language; your opinion might later change. BETA does have mechanisms for modelling the whole-part composition as embedded objects, and reference as references. However, this is quite different to textual nesting. There is no real need to support these different categories in your programming language. It is more important for the analyst to be cognisant of these different flavours so that he can recognise different kinds of composition in the problem domain.

3.11 Global Environments

There are two important properties of globals: firstly, a global is visible to the whole program, which is a compile-time view; and secondly, a global is active for the entire execution of a program, which is a run-time property. The first property is not desirable in the object-oriented paradigm, as will be explained below. The second property can easily be provided. The life of any entity is the life of the enclosing object, so to have entities that are active for the whole execution of the program, you create some objects when the program starts, which don’t get deallocated until the program completes.

The global environment provides a special case of nested classes. When classes are nested in a global environment, dependencies can arise that make the classes difficult to decouple from the original program, and therefore not reusable by themselves. You might be forced to relocate a large amount of the global environment as well. There are also problems with the related mechanisms of header files and namespaces. Even if a class is not intended for use in another context, it will benefit from the discipline of object-oriented design. Each class is designed independently of the surrounding environment, and relationships and dependencies between classes are explicitly stated.

In C++ functions can change the global environment, beyond the object in which they are encapsulated. Such changes are side-effects that limit the opportunity to produce loosely-coupled objects, which is essential to enable reusable software. This is a drawback of both global and nested environments. A good OO language will only permit routines in an object to change its state.

Removing the global environment is trivial: simply encapsulate it in an object or set of objects. The previously global entities are then subject to the discipline of object-oriented design; globals circumvent OOD. Objects can also provide a clean interface to the external environment, or operating system, without loss of generality, for a negligible performance penalty. Classes are independent of the surrounding environment, and the project for which they were first developed, and are more easily adaptable to new environments and projects.
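As a minimal sketch of this encapsulation, assume a program whose global state is a user name and a verbosity level. The classes Settings and Reporter are hypothetical. The object is created once when the program starts and handed to the classes that need it.

```cpp
#include <string>

// Hypothetical settings that might otherwise be global variables.
class Settings {
public:
    Settings(const std::string& user, int verbosity)
        : user_(user), verbosity_(verbosity) {}
    const std::string& user() const { return user_; }
    int verbosity() const { return verbosity_; }
    void set_verbosity(int level) { verbosity_ = level; }
private:
    std::string user_;
    int verbosity_;
};

// A class that needs the settings receives them explicitly, so the
// dependency is visible in its interface.
class Reporter {
public:
    explicit Reporter(const Settings& settings) : settings_(settings) {}
    std::string banner() const {
        return settings_.verbosity() > 0 ? "report for " + settings_.user()
                                         : std::string();
    }
private:
    const Settings& settings_;
};
```

Reporter can be moved to another project together with Settings alone; nothing else from the original program has to follow it, and the state can only change through the interface of Settings.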

Java has removed globals from the language altogether. Eiffel is another example of a language where there are no globals. Both these languages show that globals are not needed for, and are even detrimental to, the development of large computer systems.

In concurrent and distributed environments you are better off without globals. In a distributed environment, the global state of the system may be impossible to determine. In order to develop distributed systems, you cannot have globals. Similarly with concurrent environments, problems arise when two or more process threads access shared resources at the same time. Shared resources should only be accessed via an object which manages the resource, and prevents contention for the shared resource. Such a resource should not be a global.

3.12 Polymorphism and InheritanceInheritance provides a textually decoupled form ofsubblock. The scope of a name is the class in whichit occurs. If a name occurs twice in a class, it is asyntax error. Inheritance introduces some questionsover and above this simple consideration of scope.Should a name declared in a base class be in scopein a derived class? There are three choices:

1) Names are in scope only in the immediateclass but not in subclasses. Subclasses can freelyreuse names because there is no potential for a clash.This precludes software reusability. Since subclasseswill not inherit definitions of implementation, case 1is not worth considering.

2) The name is in scope in a subclass, but thename can be overloaded without restriction. This isclosest to the overloading of names in nested blocks.This is C++’s approach. Two problems arise: firstly,the name can be reused so the inherited entity isunintentionally hidden; secondly, because the newentity is not assumed to have any relationship to theoriginal, its signature cannot be type checked withthe original entity. Since consistency checksbetween the superclass and subclass are notpossible, the tight relationship that inheritanceimplies, which is fundamental to object-orienteddesign, is not enforced. This can lead toinconsistencies between the abstract definition of abase class, and the implementation of a derivedclass. If the derived class does not conform to thebase class in this way, it should be questioned whythe derived class is inheriting from the base class inthe first place. (See the nature of inheritance.)

3) The name is in scope in the subclass, but can only be overridden in a disciplined way to provide a specialisation of the original. Other uses of the name are reported as duplicate name errors. This form of overriding in a subclass ensures the entity referred to in the subclass is closely related to the entity in the ancestor class. This helps ensure design consistency. The relationship of name scope is not symmetric. Names in a subclass are not in scope in a superclass (although this is not the case in dynamically typed languages such as Smalltalk). In order to provide the consistent customisation of reusable software components, the same name should only be used when explicitly redefining the original entity. The programmer of the descendant class should indicate that this is not a syntax error due to a duplicate name, but that redefinition is intended (the suggested keyword override has already been covered in the virtual section). This choice ensures that the resultant class is logically constructed. This might seem restrictive, but is analogous to strong typing, and makes inheritance a much more powerful concept.
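The difference between cases 2 and 3 can be sketched in C++ itself. The following is a minimal illustration, not taken from the original text: the classes Shape, Circle and Square and the routine describe are hypothetical names. It assumes a C++11 or later compiler, where the override specifier (which postdates this critique) provides the kind of check that case 3 asks for.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, named only for this illustration.
struct Shape {
    virtual ~Shape() {}
    // Intended to be specialised by subclasses.
    virtual std::string describe(int detail) const { return "shape"; }
};

struct Circle : Shape {
    // The parameter type differs, so this does NOT override
    // Shape::describe(int); it silently hides it (case 2).
    std::string describe(double detail) const { return "circle"; }
};

struct Square : Shape {
    // 'override' asks the compiler to verify that a matching virtual
    // function exists in a base class (case 3).
    std::string describe(int detail) const override { return "square"; }
};

// Calls through the base interface, as client code would.
std::string describe_via_base(const Shape& s) { return s.describe(1); }

void demo_hiding() {
    assert(describe_via_base(Circle()) == "shape");   // base version runs
    assert(describe_via_base(Square()) == "square");  // checked redefinition runs
    assert(Circle().describe(1) == "circle");         // unrelated routine with the same name
}
```

With override in place, a signature that drifts away from the base class is reported at compile time instead of silently introducing a new, unrelated routine.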

3.13 Type Casts

“Syntactically and semantically, casts are one of the ugliest features of C and C++.” These are not my words, nor those of any other detractor of C++, but come from [Stroustrup 94].

Mathematical functions map values from one type to values of another type. For example arithmetic multiplication maps the type ‘pair of integers’ to an integer:

Mult : INTEGER x INTEGER -> INTEGER

A language type system enables a programmer to specify which mappings make sense. Like functions, type casts map values of one type onto values of another type, but this forces one type to another, against the defined mappings, undermining the value of the type system. A strongly typed language with a well defined type system does not need casts: all type to type mapping is achieved with functions that are defined within the type system; no casts outside the type system are needed.

Type casts have been useful in computer systems. Sometimes it is required to map one type onto another, where the bit representation of the value remains the same. Type casts are a trick to optimise certain operations, but provide no useful concept that general functions don’t provide. In many languages, the type system is not consistently defined, so programmers feel that type casts are necessary, or the language would be restrictive.

An example often used in programming is to cast between characters and integers. Type casts between integers and characters are easily expressed as functions using abstract data types (ADTs).

TYPE CHARACTER

FUNCTIONS
    ord: CHARACTER -> INTEGER
        // convert input character to integer
    char: INTEGER /-> CHARACTER
        // convert input integer to character

PRECONDITION
    // check i is in range
    pre char (i: INTEGER) = 0 <= i and i <= ord (last character)

The notation ‘->’ means every character will map to an integer. The partial function notation ‘/->’ means that not every integer will map to a character, and a precondition, given in the pre char statement, specifies the subset of integers that maps to characters. Object-oriented syntax provides this consistently with member functions on a class:

i: INTEGER
ch: CHARACTER

i := ch.ord
// i becomes the integer value of the character.
ch := i.char
// ch becomes the character corresponding to i.

but a routine char would probably not be defined on the integer type so this would more likely be:

ch.char (i)
// set ch to the character corresponding to i.

The hardware of many machines caters for such basic data types as character and integer, and it is probable that a compiler will generate code that is optimal for any target hardware architecture. Thus many languages have characters and integers as built in types. An object-oriented language can treat such basic data types consistently and elegantly, by the implicit definition of their own classes.
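The ord and char mappings described above can be sketched as ordinary C++ functions. This is only an illustration under assumptions: the names ord, to_char and in_char_range are hypothetical, and ‘last character’ is taken to be UCHAR_MAX. The conversion is wrapped once, inside a function that checks the precondition, so client code needs no cast of its own.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

// ord: total function, every character maps to an integer.
int ord(char ch) { return static_cast<unsigned char>(ch); }

// The precondition of the partial function: 0 <= i <= ord(last character).
bool in_char_range(int i) { return 0 <= i && i <= UCHAR_MAX; }

// to_char: partial function; integers outside the range are rejected
// rather than being silently reinterpreted.
char to_char(int i) {
    if (!in_char_range(i)) throw std::out_of_range("to_char: i out of range");
    return static_cast<char>(static_cast<unsigned char>(i));
}

void demo_char_conversion() {
    assert(to_char(ord('x')) == 'x');   // round trip through the integer value
    assert(in_char_range(ord('x')));
    assert(!in_char_range(-1));         // outside the precondition
}
```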

Another example of type conversion is from real to integer; but there are several options. Do you truncate or round?

TYPE REAL

FUNCTIONS
    truncate: REAL -> INTEGER
    round: REAL -> INTEGER

r: REAL
i: INTEGER

i := r.truncate
// i becomes the closest integer <= r
i := r.round
// i becomes the closest integer to r

Again many hardware platforms provide specific instructions to achieve this, and an efficient object-oriented language compiler will generate code best optimised for the target machine. Such inbuilt class definitions might be a part of the standard language definition.
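In C++ the same choice can be made explicit with named library functions instead of a bare cast. The sketch below assumes a C++11 or later standard library; the wrapper names are hypothetical. Note that ‘closest integer <= r’ corresponds to std::floor, whereas std::trunc discards the fraction, and the two differ for negative values.

```cpp
#include <cassert>
#include <cmath>

// Largest integer <= r (the behaviour the comment above gives for truncate).
long floor_to_int(double r) { return static_cast<long>(std::floor(r)); }

// Discards the fractional part, i.e. rounds toward zero.
long truncate_to_int(double r) { return static_cast<long>(std::trunc(r)); }

// Nearest integer; halfway cases are rounded away from zero.
long round_to_int(double r) { return std::lround(r); }

void demo_real_to_integer() {
    assert(floor_to_int(2.7) == 2);
    assert(truncate_to_int(2.7) == 2);
    assert(round_to_int(2.7) == 3);
    assert(floor_to_int(-2.5) == -3);     // differs from truncation below zero
    assert(truncate_to_int(-2.5) == -2);
}
```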

3.14 RTTI and Type casts

Since the second edition of this critique in 1992, C++ added Run-Time Type Information (RTTI) in March 1993. This is a good and necessary feature, and a discussion of it helps clarify the notion of casts.

[P&S 94] makes a case against rejecting all programs that are not statically type correct. If a program is shown to be statically type correct, its type correctness is guaranteed, but static type checks can reject a class of programs that are otherwise type valid.

List classes are an example of where static type checking can reject a valid program. A list class can contain objects of many different types. Genericity and templates allow constructions such as list of objects, list of animals, etc. These are types built from the generic list class.

In the list of animals, you might know that squirrels occur in even numbered slots in the list. You could then assign an even numbered list element to a variable of type squirrel. Dynamically, this is correct, but statically the compiler must reject it as it does not know that only squirrels occur in even locations in the list.

Things aren’t always this simple. The programmer probably won’t know the pattern of how particular animals are stored in the list. Consider a vet’s waiting room. The vet might view his waiting room as being the type: list of animals. Calling in the first animal from the waiting room, it is important to know whether the animal is a cat or a hamster if the vet is to perform an operation on the animal. For many such cases object-oriented dynamic binding and polymorphism will suffice, so that the programmer does not have to know the exact type of the object, as long as the objects are sufficiently the same that the same operations can be applied, even though the implementations might be different.

However, this is not always sufficient, and sometimes it is important to know that you have retrieved a hamster from a list of animals.

For example, once our vet has performed the operation on the hamster or cat, he must know enough about their type to decide whether to now put the animal in the hamster cage, or the cat basket.

Casting can solve this problem, but it is a sledgehammer approach where much more elegant and precise solutions exist. [Stroustrup 94] notes: “The C and C++ cast is a sledgehammer.”

Eiffel has such an elegant and precise solution called the assignment attempt, notated as ‘?=’ instead of ‘:=’. A simple example is:

waiting_room: LIST [ANIMAL]
fluffy: HAMSTER
h_cage: HAMSTER_CAGE

fluffy := waiting_room.first -- error.

-- The above assignment will be rejected by the
-- compiler as type (fluffy) = HAMSTER and
-- ANIMAL is not a subtype of HAMSTER. Even
-- though we know that the animal will be a
-- HAMSTER, and the program is valid, static
-- type checking considers it invalid.

fluffy ?= waiting_room.first

-- If the first animal in the waiting room is
-- indeed a HAMSTER, then fluffy will refer
-- to that animal, else fluffy will be Void.

if fluffy /= Void then
    h_cage.put (fluffy)
end



The Eiffel assignment attempt provides a precise and elegant solution to the dynamic type problem. Since the assignment attempt has the desired effect of bypassing static type checking and leaving it to run time, type casting is not needed.

If you want to be as flexible as Smalltalk, you could use assignment attempt instead of straight assignment everywhere, but as this invokes run time type checks, and you must check for Void references, there is a large overhead to assignment attempt over straight assignment. This shows that not only is static typing important for proving compile-time correctness, but also for run-time efficiency. The only real effect of ?= as far as the programmer is concerned is that it suppresses the compiler’s static type checking and puts in a run-time check.

As I said, C++ introduced Run-Time Type Information (RTTI) in March 1993. RTTI has the operator dynamic_cast, which achieves the same effect as the Eiffel assignment attempt. dynamic_cast returns a pointer to a derived class from a pointer to a base class if the object is an object of the derived class; otherwise it returns 0 (or should that be null? But 0 isn’t really zero, but any bit pattern representing null).

In C++, the above assignment attempt would be coded:

fluffy = dynamic_cast<hamster*>(waiting_room.first());

A few observations. Wow! Eiffel uses an operator, and C++ uses a keyword. It should be noted though that in correctly designed programs, neither assignment attempt, nor dynamic_cast will be used very often. So this is a small point.

The second observation is that in C++ you must specify the type. In this example it is superfluous as the compiler can determine type (fluffy) = HAMSTER, as it does in Eiffel.

In C++ you can dynamically cast to any derived class from hamster* but that does not seem to gain anything. A second point is that you don’t need to use dynamic_cast directly in an assignment, but can use it in a general expression. However, again it is stressed that run time casting should be so little used that this is of little advantage. Perhaps the only small advantage is the ability to be able to pass a dynamically cast pointer:

h_cage.put(dynamic_cast<hamster*>(waiting_room.first()));

Looks good right? But remember, if the first animal out of the waiting room is not a hamster, but a rat, you get 0 (well null...etc) returned which will cause h_cage.put() to fail.

This shows that the use of dynamic_cast in an expression is not such a good idea, as it might cause the whole expression to fail.
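A safer C++ formulation mirrors the Eiffel version: assign the result of dynamic_cast to a variable and test it before use. The sketch below is illustrative only; the classes Animal, Hamster, Rat and HamsterCage and the routine cage_if_hamster are hypothetical stand-ins for the vet example, and nullptr assumes C++11 or later.

```cpp
#include <cassert>

struct Animal { virtual ~Animal() {} };   // dynamic_cast needs a polymorphic base
struct Hamster : Animal {};
struct Rat : Animal {};

struct HamsterCage {
    int occupants;
    HamsterCage() : occupants(0) {}
    // Precondition: h is not null.
    void put(Hamster* h) { assert(h != nullptr); ++occupants; }
};

// Returns true if the animal was a hamster and has been caged.
bool cage_if_hamster(Animal* a, HamsterCage& cage) {
    Hamster* h = dynamic_cast<Hamster*>(a);   // null when a is not a Hamster
    if (h == nullptr) return false;           // plays the role of 'fluffy /= Void'
    cage.put(h);
    return true;
}

void demo_waiting_room() {
    Hamster fluffy;
    Rat roland;
    HamsterCage cage;
    assert(cage_if_hamster(&fluffy, cage));
    assert(!cage_if_hamster(&roland, cage));  // the rat never reaches put()
    assert(cage.occupants == 1);
}
```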

Thus Eiffel’s assignment attempt is safer and syntactically cleaner. And there is another reason for this remark: if you don’t put the if fluffy /= Void then test in, either deliberately or because you forgot, then the precondition that is most likely in the Eiffel version of h_cage.put tests that the argument is not Void. If you deliberately left out the Void test, you will have included a rescue clause to handle this exception.

Although the Eiffel syntax ‘?=’ for assignment attempt is cleaner, [Stroustrup 94] points out that such clean syntax would be inappropriate for C++. This is because the ‘?=’ would be “difficult to spot” in C++’s otherwise clumsy syntax. This is why it is possible to use this neat notation in Eiffel, as Eiffel’s syntax is much clearer, and since programmers will code small routines, the ‘?=’ is not difficult to spot in an Eiffel program. The reasoning against ‘?=’ in C++ is strange, since C already provides assignment operators like ‘+=’ and ‘-=’, which are just a small syntactic convenience.

Another RTTI feature is the typeid operator. [Stroustrup 94] warns against using this to determine program flow control based on type information. You should not use switch statements, but use dynamic binding on polymorphic (virtual) functions. This will need to be built into your style rules that programmers will hate, or you will end up having to fix the dirty deed after the fact, which adds to the expense of your software developments.

Eiffel has no built in operator to achieve this, so the object-oriented principle of using dynamic binding instead of switch statements is better enforced. Eiffel removes type identification from the language, but places it in the libraries in some routines built into the GENERAL class. So in Eiffel, it is harder to commit the bad programming practices that [Stroustrup 94] warns about.
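The recommended alternative to branching on typeid can be sketched in a few lines of C++. The classes and the routine housing are hypothetical, chosen to continue the vet example; the point is only that the decision ‘cage or basket’ is made by dynamic binding and no type test appears in the client code. The override specifier assumes C++11 or later.

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() {}
    // Each kind of animal knows where it belongs after the operation.
    virtual std::string housing() const = 0;
};

struct Hamster : Animal {
    std::string housing() const override { return "hamster cage"; }
};

struct Cat : Animal {
    std::string housing() const override { return "cat basket"; }
};

// Client code: no typeid, no switch, no cast.
std::string where_to_put(const Animal& a) { return a.housing(); }

void demo_dynamic_binding() {
    assert(where_to_put(Hamster()) == "hamster cage");
    assert(where_to_put(Cat()) == "cat basket");
}
```

Adding a new kind of animal then means adding one class, with no existing switch statements to find and extend.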

3.15 New Type Casts

Not only did C++ introduce RTTI and dynamic_cast in March 1993, but also three more cast operators in November 1993. These operators are:

static_cast<T>(e),
reinterpret_cast<T>(e), and
const_cast<T>(e).

Again for all these the specification of the <type> seems superfluous, as the compiler can derive that from the context. These casts just about cover all the cases where you would need to use C style casts.
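For readers who have not met the three operators, the following sketch shows one typical use of each. The function names are hypothetical, and std::uintptr_t is assumed to be available (it is optional in the standard but provided by mainstream implementations from C++11).

```cpp
#include <cassert>
#include <cstdint>

// static_cast: a value conversion that the type system defines.
int whole_part(double d) { return static_cast<int>(d); }

// const_cast: removes const from the pointer type. Writing through the
// result is well defined only because the caller's object is not const.
void set_through_const(const int* p, int v) { *const_cast<int*>(p) = v; }

// reinterpret_cast: same bits, different type; here a pointer is viewed
// as an integer and converted back.
bool round_trips(int* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<int*>(bits) == p;
}

void demo_casts() {
    assert(whole_part(3.9) == 3);   // the fraction is discarded
    int n = 1;
    set_through_const(&n, 7);
    assert(n == 7);
    assert(round_trips(&n));
}
```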

[Stroustrup 94] indicates a desire to discard the C casts: “I intended the new-style casts as a complete replacement for the (T)e notation. I proposed to deprecate (T)e; that is, for the committee to give users warning that the (T)e notation would most likely not be part of a future revision of the C++ standard. ... However, that idea didn’t gain a majority, so that cleanup of C++ will probably never happen.”



The bottom line to these sections on type casts comes again from [Stroustrup 94]: “In all cases, it would be better if the cast - new or old - could be eliminated.” It can! Use Eiffel or another one of the languages in which the type system is more cleanly defined.

3.16 Java and Casts

Unfortunately, Java needs casts in the above examples, but has improved the situation: “Not all casts are permitted by the Java language. Some casts result in an error at compile time. For example, a primitive value may not be cast to a reference type. Some casts can be proven, at compile time, always to be correct at run time. For example, it is always correct to convert a value of a class type to the type of its superclass; such a cast should require no special action at run time. Finally, some casts cannot be proven to be either always correct or always incorrect at compile time. Such casts require a test at run time. A ClassCastException is thrown if a cast is found at run time to be impermissible.” - from the Java Language Specification.

3.17 ‘.’ and ‘->’

The ‘.’ and ‘->’ member access syntax came from C structures, and illustrates where the C base adversely affects flexibility. Semantically both access a member of an object. They are, however, operationally defined in terms of how they work. The dot (‘.’) syntax accesses a member in an object directly: ‘x.y’ means access the member y in the object x.

OBJ x;  // declare object x of class obj
        // with a member y.
x.y;    // access y in object x directly
x->y;   // syntax error “. expected”

The specific error is:

error: type 'OBJ' does not have an overloaded member 'operator ->'

error: left of '->y' must point to class/struct/union

The ‘->’ syntax means access a member in an object referenced by a pointer: ‘x->y’ (or the equivalent (*x).y) means access the member y in the object pointed to by x.

OBJ *x; // declare a pointer x to an
        // object of class obj.
x->y;   // access y via pointer x
x.y;    // syntax error “-> expected”

The specific error is:

error: '.OBJ::y' : left operand points to 'class', use '->'

In these examples, ‘what’ is to be computed is “access the element y of object x.” In C++, however, the programmer must specify for every access the detail of ‘how’ this is done. That is, the access mechanism to the member is made visible to the programmer, which is an implementation detail. Thus the distinction between ‘.’ and ‘->’ compromises implementation hiding and, very seriously, the benefit of encapsulation. We will see in the section on inlines how the visible difference of access mechanisms between constants, variables and functions also breaks the implementation hiding principle, and how the burden is on the programmer to restore hiding, rather than fix the language.

The compiler could easily restore implementation hiding by providing uniform access and remove this burden from the programmer, as in fact most languages do. The major benefit of implementation hiding is that if the implementation changes, the effect is contained within the class itself; not manifest beyond the interface. Where implementation hiding is broken, the effects of implementation change become visible, and this reduces flexibility.

For example, if the ‘OBJ x’ declaration is changed to ‘OBJ *x’, the effect is widespread as all occurrences of ‘x.y’ must be changed to ‘x->y’. Since the compiler gives a syntax error if the wrong access mechanism is used, this shows that the compiler already knows what access code is required and can generate it automatically. Good programming centralises decisions: the decision to access the object directly or via a pointer should be centralised in the declaration. So again, C++ uses low level operators, rather than the high level declarative approach of letting the compiler hide the implementation and take care of the detail for us.
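The three access forms can be put side by side in a short sketch. OBJ and its member y follow the example above; the function names are hypothetical. The third function is included only to show that a C++ reference, although usually implemented as an indirection, is accessed with the dot, so the language already contains one case where the compiler chooses the access code itself.

```cpp
#include <cassert>

struct OBJ { int y; };

int read_direct(OBJ x)         { return x.y;  }   // object: dot
int read_via_pointer(OBJ* x)   { return x->y; }   // pointer: arrow, i.e. (*x).y
int read_via_reference(OBJ& x) { return x.y;  }   // reference: dot again

void demo_member_access() {
    OBJ o = {5};
    assert(read_direct(o) == 5);
    assert(read_via_pointer(&o) == 5);
    assert(read_via_reference(o) == 5);
}
```

Changing a parameter from OBJ to OBJ& leaves every access in the body untouched, whereas changing it to OBJ* forces each ‘.’ to be rewritten as ‘->’.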

Java only supports the dot form of access. The ‘->’ form is superfluous. Java objects are only accessed by reference; there are no embedded objects.

Eiffel provides a more interesting case. In Eiffel an optimisation is provided as an object can be expanded in line in another object, in order to save a reference. Eiffel calls such objects expanded objects. There is still no need for explicit dereferencing. The compiler knows exactly whether the object is expanded or referenced, and thus the dot accessor is used for both, so uniform access is provided, and the access mechanism is hidden. This makes the program more malleable, as the programmer can later change an object to expanded, and not have to worry about changing every ‘->’ to a dot. Conversely, if expansion turns out to be inappropriate, as in the case of a circular reference, then the expanded status of the object can be removed from the declaration, without having to change another single line of code. Thus Eiffel preserves the implementation hiding principle, which results in convenience for the programmer.

There is even more to Eiffel’s scheme, which is particularly relevant to concurrent and distributed processing. Meyer points out in [Meyer 96c] that the form x.f means passing the message f to the object x. x may be anywhere on the network. In other words, x might not be a reference that is implemented by an underlying C pointer, but it may be a network address, for example a URL.

3.18 Anonymous parameters in Class Definitions

C++ does not require parameters in function declarations to be named. The type alone can be specified. For example a function f in a class header can be declared as f (int, int, char). This gives the client no clue to the purpose of the parameters, without referring to the implementation of the function. Meaningful identifiers are essential in this situation, because this is the abstract definition of a routine; a client of the class and routine must know that the first int represents a ‘count of apples’, etc. It is true that well known routines might not require a name, for example sqrt (int). But this is not appropriate for large scale software development.
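The contrast is easy to see when the two forms of declaration are written next to each other. In this sketch the function price and its parameter names are hypothetical; both declarations are legal C++ and declare the same function.

```cpp
#include <cassert>

// Legal, but says nothing about what each argument means.
int price(int, int, char);

// The same signature with names that document the intent.
int price(int apple_count, int cents_per_apple, char currency_code);

int price(int apple_count, int cents_per_apple, char currency_code) {
    (void)currency_code;   // unused in this sketch
    return apple_count * cents_per_apple;
}

void demo_named_parameters() {
    // Without the names, a client could easily swap the two ints
    // and the compiler would not object.
    assert(price(3, 20, 'A') == 60);
}
```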

The use of anonymous parameters handicaps the purpose of abstract descriptions of classes and members: to facilitate the reusability of software. This is covered in more detail in the section on ‘Reusability and Communication’. Program text captures the meaning of the system for some future activity, such as extension or maintenance. To achieve reusability, communication of intent of a software element is essential.

Names are not strictly necessary in programming. Naming exists to help the human reader identify different entities within the program, and to reason about their function. For this reason naming is essential; without it, development of sophisticated systems would be nearly impossible. Some languages access parameters by their address (position) in the parameter list ($1, $2, etc). This is unsatisfactory, even for shell scripts. Anonymous parameters can save typing in a function template, but then programming is not a matter of convenience as it is inconvenient for later readers. The redundancy is beneficial and saves later programmers having to look up the information in another place. A real convenience in function templates would be that abstract function templates be automatically generated from the implementation text (see header files for more details).

Anonymous parameters illustrate the link between courtesy and safety issues in programming. Due to pressure of work, a client programmer might wrongly guess the purpose of a parameter from the type. The failure of the original programmer to provide a courtesy has caused a client programmer to breach safety. However, the client programmer will probably be blamed for not taking due care. An interface client must know the intention of the interface for it to be used effectively.

Both Java and Eiffel do away with the distinction between a function definition and declaration. The first reason for this is that you don’t need forward declarations, as entities can be referenced before they are declared. The second reason is that in Eiffel, there are tools to automatically extract abstract interface definitions from the main code.

3.19 Nameless Constructors

Multiple constructors must have different signatures, similar to overloaded functions. This precludes two or more constructors having the same signature. Constructors are also not named (apart from the same name as the class), which makes it difficult to tell from the class header the purpose of the different constructors. Constructors suffer from all of the problems described with regards to overloaded functions. Firstly, it would be easy to mark routines as constructors, for example:

constructor make (...)
...
constructor clone (...)
...
constructor initialise (...)
...

where each constructor leaves the object in valid, but potentially different states. Named constructors would aid comprehension as to what the constructor is used for in the same way as function names document the purpose of a function. Secondly, named constructors would allow multiple constructors with the same signature. Thirdly, it is easier to match up an object creation with the constructor actually called. Fourthly, the compiler could check the arguments given in the invocation to the constructor signature.
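C++ has no constructor keyword of this kind, but a similar effect can be approximated with static member functions that return a new object, a workaround often called the named constructor idiom. The sketch below is one possible arrangement, not something the language provides directly; the class Point and the routines cartesian and polar are hypothetical.

```cpp
#include <cassert>
#include <cmath>

class Point {
public:
    // Both creation routines take (double, double); as plain
    // constructors they could not coexist.
    static Point cartesian(double x, double y) { return Point(x, y); }
    static Point polar(double r, double theta) {
        return Point(r * std::cos(theta), r * std::sin(theta));
    }
    double x() const { return x_; }
    double y() const { return y_; }
private:
    Point(double x, double y) : x_(x), y_(y) {}   // only reachable via the named routines
    double x_, y_;
};

void demo_named_constructors() {
    Point p = Point::cartesian(3, 4);
    assert(p.x() == 3 && p.y() == 4);
    Point q = Point::polar(2, 0);       // angle 0 lies on the x axis
    assert(q.x() == 2 && q.y() == 0);
}
```

The call site now states which creation is meant, which is the comprehension benefit argued for above.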

Java’s constructor scheme is the same as C++. Eiffel allows a series of creation routines. These are indeed independently named as suggested above.

Eiffel has another advantage in that creation routines can also be exported as normal routines which can be called to reinitialize an object. In C++ you cannot call a constructor after the object is created.

3.20 Constructors and Temporaries

A ‘return <expression>’ can result in a different value than the result of <expression>. In section 6.6.3, the C++ ARM says: “If required the expression is converted, as in an initialisation, to the return type of the function in which it appears. This may involve the construction and copy of a temporary object (§12.2).”

Section 12.2 explains: “In some circumstances it may be necessary or convenient for the compiler to generate a temporary object. Such introduction of temporaries is implementation dependent. When a compiler introduces a temporary object of a class that has a constructor it must ensure that a constructor is called for the temporary object.”



A note says: “The implementation’s use of temporaries can be observed, therefore, through the side effects produced by constructors and destructors.”

Putting this together, creation of a temporary is implementation dependent, so might or might not be done. If a temporary is created, a constructor is called as a side effect, which can change the state of the object. Different C++ implementations could therefore return different results for the same code.
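The observable side effect can be demonstrated with a copy constructor that counts its own calls. This is a minimal sketch with hypothetical names (Tracked, make_tracked); how many copies are made depends on the compiler and on the language version, so the code deliberately asserts only a range and not a single value.

```cpp
#include <cassert>

struct Tracked {
    static int copies;                 // counts copy-constructor calls
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

Tracked make_tracked(int v) {
    Tracked local(v);
    return local;    // the compiler may or may not construct a temporary here
}

int copies_made_by_one_call() {
    Tracked::copies = 0;
    Tracked t = make_tracked(42);
    assert(t.value == 42);             // the value itself is always preserved
    return Tracked::copies;            // somewhere between 0 and 2
}
```

A program whose behaviour depends on the value of Tracked::copies would therefore behave differently from one implementation to the next.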

3.21 Optional Parameters

Optional parameters that assume a default value according to the routine’s declaration are supposed to provide a shorthand notation. Shorthand notations are intended to speed up software development. Such shorthand notations can be convenient in shell scripts, and interactive systems. In large scale software production, however, precision is mandatory, and defaults can lead to ambiguities and mistakes. With optional parameters the programmer could assume the wrong default for a parameter. More importantly, optional parameters undermine type safety. The type of a function is defined by the composition of its input types, and its output type:

f: T1 x T2 x T3... -> T4

The entire signature determines the type of the function, not just the return type. Optional parameters mean that C++ is not type safe, and that the compiler cannot check that the parameters in the call exactly match the function signature.

Furthermore, they do not provide a great deal of convenience. If a routine has five parameters, the last three of which are optional, and the caller wants to assume the defaults for parameters 3 and 4, but must specify parameter 5, then all five parameters must be specified. A better scheme would be to have a ‘default’ keyword in function calls:

f (a, b, default, default, e);

Other means, already in the language, can easily provide this mechanism. For example, a call to another (possibly inline) function could provide the defaults for the optional parameters:

g (a, b, e);                  // the call

void g (int a, int b, int e)  // the function
{
    f (a, b, 0, 0, e);
}

This not only provides the convenience of optional parameters, but is more powerful. Any parameter or combination can be filled in with any combination of defaults, not just the last parameters. Multiple intermediate routines can provide multiple sets of defaults.
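A compilable version of the intermediate-routine scheme might look as follows. The routine f, its encoding of the arguments into one number, and the wrapper f_with_e are all hypothetical; the encoding exists only so that the effect of each argument is visible in the result.

```cpp
#include <cassert>

// Hypothetical routine with five parameters; the last three have defaults.
int f(int a, int b, int c = 0, int d = 0, int e = 0) {
    return a + 10 * b + 100 * c + 1000 * d + 10000 * e;
}

// Intermediate routine: supplies the defaults for c and d and lets the
// caller specify e, which default arguments alone cannot express.
inline int f_with_e(int a, int b, int e) { return f(a, b, 0, 0, e); }

void demo_intermediate_routine() {
    assert(f(1, 2) == 21);                          // all three defaults assumed
    assert(f_with_e(1, 2, 5) == f(1, 2, 0, 0, 5));  // same call, written once
    assert(f_with_e(1, 2, 5) == 50021);
}
```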

Neither Java nor Eiffel has optional parameters. Strong typing is enforced, so that the parameters of a call must match the routine signature.

3.22 Bad Deletions

The following example is given on p.63 in the C++ ARM as a warning about bad deletions that cannot be caught at compile-time, and probably not immediately at run-time:

p = new int[10];
p++;
delete p; // error
p = 0;
delete p; // ok

One of the restrictions of the design of C++ is that it must remain compatible with C. This results in examples like the above, that are ill-defined language constructs, that can only be covered by warnings of potential disaster. Removal of such language deficiencies would result in loss of compatibility with C. This might be a good thing if problems such as the above disappear. But then the resultant language might be so far removed from C that C might be best abandoned altogether.

Bad deletions are the kind of problem the Java designers set out to avoid. You do not get bad deletions in either Java or Eiffel for two reasons: firstly, they do not have pointers; secondly, they provide garbage collection so don’t delete objects.
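Within C++ itself, the usual defence is to keep the pointer that owns the storage separate from any pointer that is moved. The sketch below uses std::unique_ptr, a library facility that arrived in C++11 and so postdates this critique; it is shown only as one possible way of avoiding the bad deletion, and the function name is hypothetical.

```cpp
#include <cassert>
#include <memory>

int sum_first_ten() {
    // 'block' owns the array and releases it with delete[] automatically.
    std::unique_ptr<int[]> block(new int[10]);
    for (int i = 0; i < 10; ++i) block[i] = i;

    int total = 0;
    // A second, non-owning pointer may be advanced freely;
    // it is never passed to delete.
    for (int* p = block.get(); p != block.get() + 10; ++p) total += *p;
    return total;
}   // the array is released here, from its original address

void demo_owned_array() {
    assert(sum_first_ten() == 45);   // 0 + 1 + ... + 9
}
```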

3.23 Local entity declarations

Declaring an entity close to where it is used has advantages and disadvantages as it is convenient, but can make a routine appear more complex and cluttered. A problem is that an identifier can be mistakenly overloaded within a nested block in a function, with the resultant problems covered in the section on name overloading. C does not have nested routines or blocks so does not have this problem. ALGOL uses this simple form of name overloading. (A block in the ALGOL sense contains both declarations and instructions.)
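The mistaken reuse of a name in a nested block can be shown in a few lines. The function and variable names here are hypothetical; the inner declaration compiles without error and simply hides the outer entity for the extent of the block.

```cpp
#include <cassert>

int outer_after_inner_block() {
    int count = 1;
    {
        int count = 99;   // a new entity; hides the outer count in this block
        count += 1;       // modifies the inner count only
        (void)count;
    }
    return count;         // the outer count was never touched
}

void demo_shadowing() {
    assert(outer_after_inner_block() == 1);
}
```

If the programmer had intended to update the outer count, the program is wrong, yet nothing in the language reports it.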

The ARM explains problems of local declarations with branching, which shows the complications in intermingling declarations and instructions. Caveats cannot make up for or fix a faulty language definition.

In well written object-oriented software, routines will be small, typically performing one atomic operation per routine, so localised declarations will not be of much value. Small routines that implement atomic operations are fundamental to loose coupling. For example, a base class that provides a single routine that logically performs operations A and B, is not useful to a subclass that needs to provide its own implementation of B, but does not want to change A: the descendant must reimplement the logic of both A and B, missing an opportunity to reuse the logic of A. Splitting A and B into different routines accomplishes loose coupling, and therefore flexibility. Tight coupling reduces flexibility.
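The A and B example can be sketched as follows. The classes Report and CustomBodyReport are hypothetical, with header standing for operation A and body for operation B; the override specifier assumes C++11 or later.

```cpp
#include <cassert>
#include <string>

class Report {
public:
    virtual ~Report() {}
    // Composes the two atomic operations.
    std::string build() const { return header() + body(); }
protected:
    virtual std::string header() const { return "[base header]"; }  // operation A
    virtual std::string body() const { return "[base body]"; }      // operation B
};

class CustomBodyReport : public Report {
protected:
    // Replaces B only; the logic of A is reused unchanged.
    std::string body() const override { return "[custom body]"; }
};

void demo_split_routines() {
    assert(Report().build() == "[base header][base body]");
    assert(CustomBodyReport().build() == "[base header][custom body]");
}
```

Had build contained the logic of both operations inline, the subclass would have had to duplicate the header logic in order to change the body.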

Efficiency is also attained without the mess of local entity declarations. Good design and clean modularisation achieve efficiency, as the entities which would be locals to a block in C++ are only created when the routine is entered. Furthermore small routines can be inlined, and in this case, the locals will only be created when the expanded inline block is entered, which is the same effect as if the programmer had included the block manually.

Java implements locals in the same way as C++. In Eiffel the philosophy is to use good design to make routines sufficiently small and atomic. That is one operation, one routine. With this approach, having local declarations only in one place in the routine and not throughout is sufficient. If you find a place where you want to introduce local variables within the code, this is an indication that you should write it as a separate routine. An objection could be that small routines with lots of overhead calling them are not efficient. Eiffel compilers solve this by automatically inlining routines. Thus the integrity of a design is preserved in the program text, but efficiency is retained. In C++ you could manually inline such functions.

3.24 Members

Care should be taken with the C++ use of the term member. In general use, an object is a member of a class. For example, squirrels are a member of the class animal. This corresponds to members in set theory. But in C++, the term member means a data item, or function of the class. Some people might say that set theory is one thing, but programming is another, so there is no problem with using the terminology. However, set theory underpins the theory of computation and programming, and sets, classes and types are related. Sets are a means of describing groups of entities which have some similarity. Supersets group entities according to broad concepts; subsets group entities according to narrower concepts, that is, more restrictive criteria. So sets also underpin our understanding of classes and subclasses.

In set theory we say: 3 ∈ N, or 3 is a member of the set of natural numbers. In objects we would say that Fred is a member of the class person. In C++ the field name which for some object contains the string “Fred” is a member of the class person. This is not mathematically correct, and the confusion could have been avoided.

Java does not seem to use the term member. Itmight stick from C++. Eiffel uses the term features.

3.25 Inlines

The problems described in this section are a consequence of placing the burden of encapsulation on the programmer. You might wish to review the section on encapsulation at this point.

The main reason inlines were introduced in C++ was to alleviate the cost of crossing the ‘protection barrier’, [Stroustrup 94]. The protection barrier in C++ is data hiding. When accessing a data item in C++, it is recommended not to do it directly, but via a class member function. For example, given an object reference c you should not access the data member di directly:

i = c.di; // Not recommended C++ style.

instead di should be private and accessed as follows:

i = c.get_di();

where get_di is:

int C::get_di() {return di;}

However, Stroustrup found that some programmers were not using an access function because of the overhead of a function call. So inlines were introduced:

inline int C::get_di() {return di;}

Note that this style of data hiding clutters the name space and text of a class.

The inline mechanism has two conceptual mistakes and a practical one. Firstly, data hiding and implementation hiding are not the same. Implementation hiding is more to do with hiding the mechanics of the access mechanism, so that you can’t tell whether it is a constant, variable or function you are accessing. Inlines are the wrong solution to this problem: the correct solution is uniform access. The OO concept is to hide implementation; data need not be private, but may be functionally exported from the class’s interface.

This leads to the second conceptual mistake: functional access and C functions are different things. Functional access hides the access mechanism. C functions, however, make the access mechanism visible: you know you are invoking a piece of code that will be jumped to. Functional access by contrast is any entity name that can occur in the context of an expression. This entity could be a constant, variable or value returning routine, but you can’t tell which if the implementation of the access mechanism is hidden. The statement i = c.di is functional access. C++ has solved this problem in exactly the wrong way in order to stay compatible with the flawed concept of function in C.

The programmer is required to bear this burden, which in turn makes software development more costly for every company using C++, and again flexibility is reduced. In order to restore information hiding, that is access transparency between constants, variables and C functions, programmers must as a matter of style hide constants and variables behind a C function, as is the case with get_di(). A fix to the language would have been better, but was not possible while keeping compatibility with C.

The practical mistake is that a compiler can automatically generate inlines. Requiring a programmer to specify inline is a manual bookkeeping task. It is not hard for a compiler or optimiser to work out that C::get_di() {return di;} or even more complex routines could be inlined. This is exactly the kind of optimisation that Eiffel and other sophisticated languages perform.

[Flan 96] says: “A good Java compiler should automatically be able to “inline” short Java methods where appropriate.” An article in Byte of September 1996 suggests that to optimise Java method calls, “you should make liberal use of the final keyword.” Byte also suggests that instead of writing small methods, programmers should inline them by hand. Byte further says: “The trade-off, then, is either better performance or code flexibility. You must decide which is most important to the program’s operation in that situation.”

In this respect, Eiffel again proves itself superior. Eiffel automatically determines that a routine is final, or in C++’s terminology, that a routine is not virtual. Also Eiffel automatically inlines. Therefore the Eiffel programmer does not need to bend the code to gain performance, or consider trade-offs: you do not have to trade off flexibility to gain performance.

Eiffel has a further advantage that it understands the difference between implementation hiding and data hiding and provides implementation hiding. It also accesses data and constants functionally, so in the instruction:

i := c.di

you can’t tell and don’t need to know whether di is implemented as a constant, variable or routine function. The implementation is hidden: access is uniform as access to a constant or variable looks the same as a value returning routine, and the different access mechanisms behind these are hidden and automatically generated by the compiler. And since this implementation distinction is hidden, the need is greatly reduced for either the programmer to manually inline, or for the compiler to automatically inline. In this case Eiffel provides the maximum flexibility.

Since C functions are poor cousins to mathematical functions, and C++ also confuses data hiding and implementation hiding, the language includes otherwise unnecessary mechanisms like inline.

3.26 Friends

Friends are a mechanism to override data hiding. Friends of a class have access to its private data. Friend is a ‘limited export’ mechanism. Friends have three problems:

1) They can change the internal state of objects from outside the definition of the class.

2) They introduce extra coupling between components, and therefore should be used sparingly.

3) They have access to everything, rather than being restricted to the members of interest to them.
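The first and third problems listed above can be seen in a small C++ sketch. The class Account, its members and the friend function audit_reset are hypothetical names invented for this illustration; they do not appear in the original.

```cpp
#include <cassert>

// Hypothetical class with two private data members.
class Account {
public:
    explicit Account(int opening) : balance(opening), pin(1234) {}
    int get_balance() const { return balance; }

    // Granting friendship exports every private member to audit_reset,
    // not only the one it is interested in.
    friend void audit_reset(Account& a);

private:
    int balance;
    int pin;
};

// Defined outside the class, yet able to change its internal state.
void audit_reset(Account& a) {
    a.balance = 0;  // the member of interest
    a.pin = 0;      // also reachable, although nothing suggests it should be
}

// Small self-check that can be called from main().
inline void demo_friend() {
    Account a(100);
    audit_reset(a);
    assert(a.get_balance() == 0);
}
```

Nothing in the declaration of Account restricts which members the friend may touch, which is the coarse granularity the text objects to.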

Friends are useful, and a case can be made for shades of grey between public, protected and private members. An alternative to friends is multiple interfaces which provide the functionality of friends and avoid the above problems. Each interface to a class can be exported to everything, or to selected classes only. A selective export mechanism is more general than public, private, protected and friend, and explicitly documents the couplings between entities in the system. Selective export specifies not only that a member is exported but to which classes it is exported.

One reason given for friends is they allow more efficient access to data members than a member function call. The way C++ is often used is that data members are not put in the public section, because this breaks the data hiding principle.

As mentioned in the section on inlines, implementation hiding is different to data hiding. As long as you access your data functionally, you do not have to hide your data, just the access mechanism.

Another question is, since there are inlines, is there a need for the similar mechanism of friends? If you mark a function inline, it is going to expand inline, and avoid the function call overhead. So in this case, friend is a superfluous mechanism.

In Java, classes in the same package can access instance variables from other classes in a friendly fashion. This is contrary to good programming practice and OO design, as it means you can access things without going through the published interface of a class. However, in Java, explicit friends are gone.

Eiffel offers the pure OO approach, where everything must go through publicised interfaces. Note in Eiffel that data attributes in a class may be exported in the published interface, as access is uniform. In that case, external entities can read the data, as if they were invoking a function, but you cannot write to a data item in an external class. To update a data item, you must call an update procedure. Part of the purpose of friend is to update an item directly, without the overhead of a procedure call. In Eiffel the compiler will automatically inline procedures where possible, so the efficiency concern is addressed.

To summarise: Eiffel does not need the friend mechanism for two reasons: firstly, external classes can access data attributes for reading; secondly, for update, a procedure is expanded inline where practical. Accessing a data item does not contravene encapsulation or implementation hiding. Data hiding is not encapsulation, although with encapsulation implementation data is hidden, the operative word being ‘implementation’, not ‘data’.

3.27 Controlled exports vs friends

As noted in the section on friends, there is a case for finer grained control of exports than public, private and protected. Except for friends, Java uses the same mechanism as C++, but adds two more categories, default and private protected. This complicates the mechanism, and it is difficult to remember exactly what each category does. Eiffel does not have friends; it allows classes to be related by a finer grained export mechanism: for any set of features, you can specify exactly what classes they are exported to. Classes that are closely related export to each other interfaces that are not available to other classes outside of that group.

Also in Eiffel, you can export a routine to a different set of classes based on whether it is called as a creation routine (constructor), or normal routine call.

In Eiffel all features are implicitly public. Public can also be explicitly stated by exporting to class ANY, i.e., the universal set. If a set of features is to be protected, i.e., internal and not visible to clients, it is exported to class NONE. Such a set of features is secret. NONE is the equivalent of the empty set in set theory, which is notionally a subset of all sets: NONE is a subclass of all other classes, and has only one possible value: Void.

There is no equivalent of private in Eiffel, where features can be hidden from sub-classes. But this is not necessary, and in most cases private is undesirable. The Eiffel philosophy is that with inheritance you get unrestricted access to the implementation as this is key to the flexibility of reuse and extension. As a subclass, you can redefine any routine inherited from a parent. When you redefine a routine, you are changing the implementation. Since you are changing the implementation, the private restriction could be a nuisance to some subclass that hasn’t been written yet. If you need to access a variable, and the parent class designer has made it private, you are out of luck. At the best you could go to the programmer who owns that class, and try to convince them to make the variable protected. Good luck: that kind of request often generates a lot of heat. At the worst you can do nothing about it because the class might be from outside and closed to you. Again in C++ the parent class designer is forced to make decisions that should be left open. I would recommend against using private; use protected instead. At least protected leaves the class open under inheritance.

In C++, private only restricts access; it does not restrict visibility in a subclass. With private, it is still possible to redefine a private virtual function from a base class in a subclass. This is not a problem, but you cannot prevent redefinition in a subclass, as you can with the Eiffel frozen mechanism.
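A short sketch of this point, using hypothetical classes Base and Derived that are not part of the original text: the subclass cannot call the private virtual function of its parent, yet it can still redefine it, and the redefinition is the one selected at run time.

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() {}

    // Public entry point that forwards to the private virtual function.
    int run() { return step(); }

private:
    virtual int step() { return 1; }
};

class Derived : public Base {
private:
    // Derived cannot call Base::step (it is private there),
    // but it is still allowed to redefine it.
    virtual int step() { return 2; }
};

// Small self-check that can be called from main().
inline void demo_private_virtual() {
    Derived d;
    Base& b = d;
    assert(b.run() == 2);  // the redefinition in Derived is used
}
```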

In Java you cannot override a private method, but you can overload it: “Note that a private method is never accessible to subclasses and so cannot be hidden or overridden in the technical sense of those terms. This means that a subclass can declare a method with the same signature as a private method in one of its superclasses, and there is no requirement that the return type or throws clause of such a method bear any relationship to those of the private method in the superclass.” [Sun 96].

A further complication in C++ is that public, private, protected can be specified when inheriting a base class. This gives one policy for how every inherited member from the base class is to be treated in the new class. A problem with this is that once a member is private or protected, it cannot be reexported, i.e., protected cannot be made public, and private cannot be made protected or public. Thus the temptation for a C++ programmer is to keep things public, as a derived class might want something to be public, even though it does not make sense to be public in the base class. Again decisions must be made early on issues you don’t know about.

Java has no equivalent. Each member is inherited with the same public, private, protected attribute as the base class.

Eiffel again has a more fine grained approach. The export policy for each feature inherited from a parent class can be reviewed on a case by case basis. The export status of each feature can be changed and made more or less restrictive. If there is no new export policy, the default is the same as the parent class. The designer of a parent class does not have to consider what descendant classes need, or worry about the case where their needs will be in conflict with each other, as the designer of the descendant class has complete flexibility, which enhances reuse and extensibility. Eiffel’s export mechanism is therefore vastly superior to the C++ approach.

3.28 Static

The word ‘static’ is confusing in C++. Page 98 of the C++ Annotated Reference Manual (ARM) mentions this confusion and gives two meanings: a class can have static members, and a function can have static entities; and the second meaning comes from C, where a static entity is local in scope to the current file. The choice of different keywords would easily solve this confusing use of the same keyword for several meanings. There is also a third more general meaning that objects are statically or automatically allocated and deallocated on the stack when a block is entered and exited, as opposed to dynamically allocated in free space. Another general use of the word ‘static’ is in ‘static type checking’, which obviously has no relation to the C uses, but overloads the language even further.

Static class members are useful. Page 181 of the ARM states that statics reduce the need for global variables, which is a good thing, but the C syntax obscures the purpose.

Locals declared in functions can also be static. These are not needed in an object-oriented language. The reason and history is this: ALGOL has the notion of ‘OWN’ locals in blocks. The semantics of an OWN entity is that when a block is exited, the value of the OWN is preserved for the next entry to the block, i.e., the value is persistent. The implementation is that at compile time, the OWN entity is limited in scope to the block, but at run time, it is located in the global stack frame. The same instance of the variable is used in all invocations of the procedure, rather than each invocation using separate local storage on the stack. This causes complication in recursion.
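In C++ terms, the OWN semantics described above correspond to a static local, and the class-level meaning corresponds to a static data member. The sketch below illustrates both; the names next_id, Widget and live are hypothetical and chosen only for this example.

```cpp
#include <cassert>

// Static local: one instance shared by every invocation of the function.
// Its value persists between calls, like an ALGOL OWN variable.
int next_id() {
    static int counter = 0;  // initialised once, not on every call
    return ++counter;
}

// Static class member: one instance shared by all objects of the class.
class Widget {
public:
    Widget() { ++live; }
    ~Widget() { --live; }
    static int live;  // number of Widget objects currently alive
};

int Widget::live = 0;

// Small self-check that can be called from main().
inline void demo_widget_count() {
    int before = Widget::live;
    {
        Widget a;
        Widget b;
        assert(Widget::live == before + 2);
    }
    assert(Widget::live == before);  // both destroyed at the end of the block
}
```

The same keyword is doing two different jobs here, which is the overloading of ‘static’ that the section describes.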

Simula’s designers generalised the ALGOL notion of block into class, and so object-orientation was born. Instead of discarding a class block on exit, it is made ‘persistent’. Declarations within the class block are persistent, and therefore provide the functionality of static and OWN, which was removed from Simula. Classes are more flexible than statics. Statics are persistent in the same way as globals, i.e., for the duration of the program. Class member lifetime is governed by the lifetime of the object so object-oriented languages do not need globals, OWNs or statics.

Java implements class variables with static. Eiffel uses once routines in order to do away with globals.

3.29 Union

Union is another construct that is superfluous in OOP. Similar constructs in other languages are recognised as problematic: for example, FORTRAN’s equivalences, COBOL’s REDEFINES, and Pascal’s variant records. When used to overload memory space these force the programmer to think about memory allocation. Recursive languages use a stack mechanism that makes overloading memory space unnecessary, as it is allocated and deallocated automatically for locals when procedures are entered and exited. The compiler and run time system automatically allocate and deallocate storage as required, ensuring that two pieces of data never clash for the same memory space at one time. This is essential so that the programmer can concentrate on the problem domain, rather than machine oriented details. When union is used similarly to FORTRAN’s equivalences it is not needed.

Union is also not needed to provide the equivalent to COBOL REDEFINES or Pascal’s variants. Inheritance and polymorphism provide this in OOP. A reference to a superclass can also be used to refer to any subclass, and thus provides the same semantics as union, only in a type safe manner, as the alternatives can never be confused. An object reference is implicitly a union of all subclasses.
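The contrast drawn above can be sketched in C++. The union Number and the classes Value, IntValue and RealValue are hypothetical names introduced for illustration; the sketch only assumes that the alternatives are an integer and a real number.

```cpp
#include <cassert>

// A C-style union: both members share storage, and nothing records
// which of them was written last.
union Number {
    int as_int;
    double as_real;
};

// The polymorphic alternative: a reference to Value can refer to either
// subclass, and the object itself knows which alternative it is.
class Value {
public:
    virtual ~Value() {}
    virtual bool is_integer() const = 0;
};

class IntValue : public Value {
public:
    explicit IntValue(int v) : value(v) {}
    virtual bool is_integer() const { return true; }
    int value;
};

class RealValue : public Value {
public:
    explicit RealValue(double v) : value(v) {}
    virtual bool is_integer() const { return false; }
    double value;
};

// Works on any alternative through the superclass reference.
inline bool holds_integer(const Value& v) {
    return v.is_integer();
}

// Small self-check that can be called from main().
inline void demo_value() {
    IntValue i(3);
    RealValue r(1.5);
    assert(holds_integer(i));
    assert(!holds_integer(r));
}
```

With the union, reading as_real after writing as_int is not diagnosed by the compiler; with the class hierarchy, the alternatives cannot be confused in that way.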

Union can also be used to suppress type checking. [Stroustrup 94] says “programmers should know that unions and unchecked function arguments are inherently dangerous, should be avoided whenever possible, and should be handled with special care when actually needed.”

Sun recognises that the union construct is unnecessary, and has removed it from Java. No equivalent exists in Eiffel.

3.30 Structs

Struct is only in C++ as a compatibility mechanism to C. When you have classes you don’t need structs. Again, C++ is unnecessarily complicated with unneeded features.

[Sun 95] says: “The Java language has no structures or unions as complex data types. You don’t need structures and unions when you have classes - you can achieve the same effect simply by using instance variables of a class.”

Eiffel and Smalltalk similarly have no equivalents to struct.

3.31 Typedefs

Typedef is yet another mechanism not needed. Java, Eiffel and Smalltalk all build their type mechanisms around classes.

3.32 Namespaces

Namespaces are a new concept introduced in July 1993. Namespaces address the problem that global names imported from different .h header files can clash. The C++ solution is namespaces, where globals are put in a namespace. Access to these entities must be qualified with the namespace name. For example, A::x means access entity x in namespace A. Another namespace B might also have an entity named x, but these names will not clash. Entities not in a namespace are considered to be in the global namespace.
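The A::x and B::x example can be written out as a small sketch. The namespaces A and B and the entity x come from the text; the int type, the values and the helper sum_all are assumptions made for illustration.

```cpp
#include <cassert>

namespace A {
    int x = 1;
}

namespace B {
    int x = 2;  // same name as A::x, but no clash
}

int x = 3;      // not in any namespace, so it lives in the global namespace

// Each x is reached through its qualified name.
inline int sum_all() {
    assert(A::x != B::x);
    return A::x + B::x + ::x;
}
```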

In pure OO languages, namespaces are not needed; classes themselves are namespaces. There are no global environments, so C++ introduces complexities not needed in Java, Eiffel and Smalltalk.

Java and Smalltalk have class variables, which can be used in place of globals. Eiffel provides once routines, so that you can access object instances where your ‘globals’ are stored.

Namespaces address the problem of name clashing entities. However, the names of the namespaces themselves can clash. For example, if two header files have namespaces called MY_NS, you have a clash.

As you might be aware by now, name clashes are a nuisance whenever you mix and match software entities together. An example we have seen is multiple inheritance. Eiffel provides a good solution to this with the rename clause in the inheritance clause.

Eiffel could also have a problem with class name clashes, as class names are global. The solution to this is to use a deployment language separate from Eiffel itself. This language is called LACE, Language for the Assembly of Classes in Eiffel. The concern of LACE is to mix and match class libraries together, and it provides mechanisms to rename classes, and resolve other conflicts. That way, deployment concerns are kept separate from the programming concerns.

While namespaces in C++ address a problem, they rely on programmers to be courteous, and place globals in namespaces. Perhaps a better way would be to have a separate mechanism equivalent to Eiffel’s LACE where such conflicts are resolved, rather than making the language even more complex.

3.33 Header Files

In C++ a class interface must be maintained separately from its body. An abstract class interface is just the class with the implementation detail removed so the interface and implementation can both be maintained in one source. In C++ though, programmers must maintain the two sets of information. This is because of the C/Unix style of programming with separate modules but little or no global analysis. Replicated information has the well known drawback that in the event of change, both copies must be updated. Sun calls this “The Fragile Superclass Problem.” [Sun 95] This can lead to inconsistencies that must be detected and corrected. Classes that depend on another class must be recompiled if the layout of that class changes. Tools can automatically extract abstract class descriptions from class implementations, and guarantee consistency.

Splitting C and C++ programs into a myriad of small, separately compiled files turns out not to be a good way to organise projects, and not a good way to program, as you must maintain many header files. Some people are now finding it more convenient to keep an entire large system in one file as it solves many maintenance problems, and also makes it easier to find things during editing. Unfortunately, while this scheme on many systems allows for global analysis, this will still not solve the problems arising from lack of global analysis in C++.

The programmer must also use #include to manually import class headers. #include is an old and unsophisticated mechanism to provide modularity. #include is a weak form of inheritance and import. C++ still uses this 30 year old technique for modularisation, while other languages have adopted more sophisticated approaches, for example, Pascal with Units, Modula with modules, Ada with packages. In Eiffel the unit of modularisation is the class itself, and includes are handled automatically. The OOP class is a more sophisticated way to modularise programs. Inheritance implements reusability and modularisation, so #include is superfluous.

Another problem is that if header A includes header B, and header B includes header A, a circular dependency occurs. A related problem, multiple inclusion, occurs if header A includes headers B and C, and header B also includes header C. A simple but messy fix in all headers solves this problem:

#ifndef thismod
#define thismod

... rest of header

#endif
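The effect of this guard can be demonstrated in a single file by writing the guarded text twice, which is what the preprocessor sees when the same header is included by two routes. This is only a sketch: the macro name THISMOD_H and the struct ThisMod are hypothetical stand-ins for the thismod placeholder above.

```cpp
#include <cassert>

// --- first inclusion of the hypothetical header ---
#ifndef THISMOD_H
#define THISMOD_H
struct ThisMod {
    int value;
};
#endif

// --- second inclusion of the same header ---
#ifndef THISMOD_H
#define THISMOD_H
struct ThisMod {   // skipped: THISMOD_H is already defined,
    int value;     // so the struct is not defined a second time
};
#endif

// Small self-check that can be called from main().
inline int guard_demo() {
    ThisMod m;
    m.value = 4;
    assert(m.value == 4);
    return m.value;
}
```

Without the #ifndef lines the second copy would be a redefinition error, which is why every header has to carry this bookkeeping by hand.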

Headers show how C++ addresses the problem of independent modules with a non-object-oriented approach that is sub-optimal; the programmer must supply this bookkeeping information manually. #include relates to the organisation and administration of a project. Rational language design eliminates such manual bookkeeping mechanisms.

A class interface is equivalent to a module header. A module header contains data and routines exported to other modules. This is exactly the purpose of the class interface. Furthermore, in C++ a tool like make must be used to specify the dependencies.

A class definition contains all knowledge of accessed classes and their dependencies (inheritance and client) in the class text. Dependency analysis is derivable from the class text, and much of the functionality of tools like make can be integrated into the compiler, so the errors and tedium encountered in the use of make are avoided. Dependency analysis also implements a level of dead code elimination.

A traditional system is assembled by combining modules; an object-oriented system is assembled by combining classes. Modules are a primitive form of classes; classes are more sophisticated. They express more precisely relationships with other classes. C++ #include and modules have problems. This primitive method is not required in an object-oriented language.

According to Stroustrup C++ would be a better language without the C preprocessor. Most uses of #define are now covered by other mechanisms. To remove #include would require some other import mechanism. [Stroustrup 94] says: “I’d like to see Cpp abolished.”

Neither Java nor Eiffel need header files or the #include mechanism. This means that programmers do not have to maintain headers separately. When Eiffel sees any declaration:

c: C

it knows the current class has a dependency on the class C. C is implicitly imported, so there is no #include mechanism: Eiffel has done the dependency analysis for you. If you add a new declaration to a class that hasn’t been used before, the dependency is automatically generated the next time the class is compiled.

Java maps qualified class names such as java.lang.Math to the environment’s file directory structure, for example java/lang/Math in Unix.

Eiffel provides a utility short that extracts class interface definitions from the class implementation. However, the function of this is for human readability, not to provide the compiler with class definitions as in a C header file.

Eiffel also separates the bookkeeping concerns from the language. These functions are provided by the LACE language, Language for the Assembly of Classes in Eiffel. LACE is used separately to the Eiffel language, but is processed by the compiler to map class names to their location (directory and file name in Unix style systems).

Java and Eiffel also remove the need for make. Gone is the manual dependency analysis, or remembering to rerun makemake when your dependencies change.

3.34 Class Interfaces

Section 9.1c of the C++ ARM points out that C++ has no direct support for “interface definition” and “implementation module”. In a C++ class definition, all private and protected members must be included in the public text of the class. The ARM points out that whenever the private or protected parts are changed, the whole program must be recompiled. Further to what the ARM says, all modules that are dependent on the header file must be recompiled, even though the private and protected members do not affect other modules. Private members should not be in the abstract class interface, as this exposes implementation details to programmers of client modules.

3.35 Class Header Declarations

C’s syntax for function declarations is [<type>] <identifier> (<parameters>). For (a very simple) example:

class C
{
    a ();
    b ();
    int c ();
    d ();
    char e ();
    virtual void f ();
}

To find an identifier in this layout, the eye must trace a course around the type specifications and modifiers, which is a tiring activity. There is a greater chance of missing the sought identifier, and the programmer must resort to using the search function of a text editor to help out.

Other languages place the entity names first. For example:

class C
{
    a ();
    b ();
    c () int;
    d ();
    e () char;
    f () virtual void;
}

To those used to the ALGOL and FORTRAN style of type first, this seems backwards. But name first is logical as a real world example illustrates: imagine if a dictionary was published where the keywords were not placed first, but rather the entry order is -

noun /obvrzen/ obversion, the act or result of obverting

Such a dictionary would not sell many copies, unless the marketeers managed to fool many people that the explanation of the meaning was better because the order of layout was mysteriously magical. This example illustrates how important subtle syntax decisions are, and why Pascal style languages have ordered things contrary to FORTRAN, ALGOL and others. The language designer must consider these trivial but important alternatives. The layout of programming entities is essential for effective communication. The dual roles of language syntax, and programming style affect comprehension. A dictionary or index style layout suggests placing entity names first, followed by their definition.

Java obviously has to retain this problem since it is C based. In fact the hello world program in Java shows how putting an entity name after modifiers can obscure the program:

public static void main(...)

Eiffel mostly puts the feature name first, except for the frozen case, so that features are easier to find. The frozen modifier is not used very often though.

3.36 Garbage Collection

One of the hallmarks of high level languages is that programmers declare data without regard to how the data is allocated in memory. In block structured languages, local variables are automatically allocated on the stack, and automatically deallocated when the block exits. This relieves the programmer of the burden of allocating and deallocating memory. Garbage collection provides equivalent relief in languages with dynamic entity allocation.

In C++ the programmer must manually manage storage due to the lack of garbage collection. This is the most difficult bookkeeping task C++ programmers face, and it leads to two opposite problems: firstly, an object can be deallocated prematurely, while valid references still exist (dangling pointers); secondly, dead objects might not be deallocated, leading to memory filling up with dead objects (memory leaks). Attempts to correct either problem can lead to overcompensation and the opposite problem occurring. A correct system is a fine balance. This is illustrated in the figure below.

[Figure: Dangling Pointers | Correct System | Memory Leaks]
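The two opposite failure modes can be sketched in C++ with a class that counts its live instances. The names Tracked, live, make_tracked and dangling_scenario are hypothetical, and the sketch deliberately never dereferences the dangling pointer, since doing so is undefined behaviour.

```cpp
#include <cassert>

// Counts objects that have been created but not yet destroyed.
class Tracked {
public:
    Tracked() { ++live; }
    ~Tracked() { --live; }
    static int live;
};

int Tracked::live = 0;

// Memory leak side: the caller now owns the object. If the caller forgets
// to delete it, Tracked::live never returns to its previous value.
Tracked* make_tracked() {
    return new Tracked;
}

// Dangling pointer side: the object is destroyed while a second
// reference to it still exists.
int dangling_scenario() {
    Tracked* owner = new Tracked;
    Tracked* alias = owner;  // second reference to the same object
    delete owner;            // deallocated prematurely
    (void)alias;             // alias now dangles and must not be used
    return Tracked::live;    // the object is gone, whatever alias says
}
```

Deleting too early produces the first problem and deleting too late (or never) produces the second; the programmer has to get every case exactly right by hand.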

These problems contribute to the fragility of C++ programs, and usually result in system failure. Garbage-collection solves both problems, but has an undeserved bad reputation due to some early garbage-collectors having performance problems, instead of working transparently in the background, as they can and should. These problems are often over-emphasised as a justification for C++ ignoring garbage collection. A possible solution is to build garbage collection into the run-time architecture, but allow the programmer to activate and deactivate it manually. Garbage collection can be disabled in systems where it is inappropriate.

In C++ it might be argued that the lack of garbage-collection is not an engineering compromise. Its inclusion is nearly an engineering impossibility, as a programmer can undermine the structures required for implementing correctly working garbage-collection. While garbage-collection might not actually be an impossibility in C++ (EC++), it is difficult, and programmers would have to settle for a more restricted way of programming. This could be a good thing. But then the compromise to remain compatible with C becomes difficult, if the compiler is to detect practices inconsistent with the operation of garbage-collection.

[Sun 95] states that “explicit memory management has proved to be a fruitful source of bugs, crashes, memory leaks and poor performance.” Sun have built garbage collection into Java.

Bertrand Meyer lists garbage collection in his steps to object-oriented happiness. This is not surprising: in a language that has exception handling, keeping track of live and dead objects is even more difficult, so Eiffel is also based on built-in garbage collection.

Stroustrup is also an advocate of optional garbage collection. In [Stroustrup 94] he states “When (not if) garbage collection becomes available, we will have two ways of writing C++ programs.” My question is not if or when, but how? Unless you restrict pointers and pointer operations, garbage collection will be very difficult, and probably inefficient. By inefficient, I mean either slow, or it won’t clean up very well, or even both.

In Eiffel garbage collection is also optional. The garbage collector can be disabled during critical real time phases of program execution. It cannot be completely disabled, as if a program runs out of memory in this state, the garbage collector will be invoked, which is always preferable to the application crashing irrecoverably.

3.37 Low level coding

One of the stated advantages of C++ is that you can get free and easy access to machine level details. This comes with a down side: if you make a great deal of use of low level coding your programs will not be economically portable.

Java has removed all of this from C, and one of Java’s great strengths is its portability between systems, even without recompilation.

The Eiffel solution is somewhat different again. In Eiffel you have no access to machine and environment level details, in the language itself. You can use libraries that provide access to routines written in external languages like C. You can still write your low level C routines, and easily access this level from Eiffel. The major advantage of this approach is that all system level code is centralised in a few places, and this provides good separation of concerns. If you have to port your system, you know exactly which parts of code will need attention. System interfaces are thus provided in a set of well designed classes and routines. In C++ you can only enforce this as a matter of discipline over your programmers.

3.38 Signature Variance

When redefining a routine, there is an opportunity to redefine the signature as well. There are three ways a language can do this known as: no variance, contra-variance, and co-variance. This is an issue of type safety.

No variance means that the language does not permit the signature to change. The signature must exactly match the signature inherited from the parent.

Contra-variance means that the signature in a subclass can modify each argument so that it is a superclass of the matching parent argument. For example, if you have classes A and B, and B inherits from A, then given a parameter of type B in your parent, you can keep it as B or modify it to A. This does seem counter intuitive, but there are some good examples of where it works.

Co-variance is the opposite of contra-variance. In the above example, if your parent has a parameter of type A, you can keep it as A, or redefine it to any descendant of A. This is more intuitive than contra-variance. In either scheme, a compiler can check for type-safety.

C++ and Java offer no variance for polymorphic methods. The reason for this is that if you have a routine with a different signature, even if the parameters of the parent and child are type conformant, the method overloads rather than overrides the original method. Overloading can be a major cause of confusion and errors. Many other languages require that a redefined routine must be explicitly marked as redefined or overridden.
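A sketch of the C++ behaviour just described, using the classes A and B from the example above plus hypothetical classes Parent and Child with a hypothetical method handle. The child attempts a co-variant parameter; C++ treats the result as a separate function rather than a redefinition.

```cpp
#include <cassert>

class A {
public:
    virtual ~A() {}
};

class B : public A {};

class Parent {
public:
    virtual ~Parent() {}
    virtual int handle(A&) { return 1; }
};

class Child : public Parent {
public:
    // Co-variant attempt: the parameter is narrowed from A to B.
    // In C++ this declares a new function; it does not override
    // Parent::handle(A&).
    virtual int handle(B&) { return 2; }
};

// Small self-check that can be called from main().
inline void demo_no_variance() {
    B b;
    Child c;
    Parent& p = c;
    assert(p.handle(b) == 1);  // through the parent: the original method
    assert(c.handle(b) == 2);  // through the child: the new function
}
```

The same object answers differently depending on the static type of the reference, which is the kind of confusion the text attributes to implicit overloading. Some compilers can warn about this pattern, but it is not an error.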

As stated before a simple solution to the overloading problem would be to require that programmers mark the methods: override or overload. The compiler could then check for consistency, that the parameters for an overriding method are an exact, or co/contra-variant match, and that for an overloaded method, the parameters are different. Making overriding and overloading explicit is also good documentation, as it is a double check of what the original programmer really intended. Remember that overriding chooses between the alternative methods at run-time, based on the type of the owning object; overloading chooses between the alternative methods at compile time based on the argument types.
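The difference between run-time and compile-time selection can be shown in a few lines of C++. The classes Figure and Circle and the functions kind and describe are hypothetical names used only for this sketch.

```cpp
#include <cassert>

class Figure {
public:
    virtual ~Figure() {}
    virtual int kind() const { return 0; }
};

class Circle : public Figure {
public:
    // Overriding: same signature, selected at run time
    // from the type of the owning object.
    virtual int kind() const { return 1; }
};

// Overloading: same name, different argument types,
// selected at compile time from the declared type of the argument.
inline int describe(const Figure&) { return 0; }
inline int describe(const Circle&) { return 1; }

// Small self-check that can be called from main().
inline void demo_dispatch() {
    Circle c;
    const Figure& f = c;        // declared as Figure, actually a Circle
    assert(f.kind() == 1);      // run-time choice: Circle::kind
    assert(describe(f) == 0);   // compile-time choice: describe(const Figure&)
    assert(describe(c) == 1);   // compile-time choice: describe(const Circle&)
}
```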

Eiffel is an interesting case. Contrary to many strong opinions and theoretical arguments in support of contra-variance, Eiffel chooses the intuitive co-variant approach, claiming this is useful in many more situations. Eiffel has also implemented co-variance in such a way that it is type safe.

3.39 Pure Virtual Functions

Pure virtual functions provide a means of leaving a function undefined and abstract. While the concept is correct, this section shows that both the syntax and the terminology ‘pure virtual’ leave something to be desired. A class that has such an abstract function cannot be directly instantiated. A non-abstract descendant class must define the function. The C++ pure virtual syntax is:

virtual void fn () = 0;

This leaves the reader new to C++ to guess its meaning, even those well versed in object-oriented concepts. ‘=0’ might make sense for the compiler writer, as the implementation is to put a zero entry in the virtual table. This shows how implementation details which should not concern the programmer are visible in C++.
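For readers who have not met the construct, here is a minimal sketch of a pure virtual function in use. The classes Shape and Triangle and the function sides are hypothetical names chosen for illustration.

```cpp
#include <cassert>

class Shape {
public:
    virtual ~Shape() {}
    // Pure virtual: declared but left undefined, so Shape is abstract.
    virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
    // A non-abstract descendant must define the function.
    virtual int sides() const { return 3; }
};

// Shape s;  // would not compile: an abstract class cannot be instantiated

// Small self-check that can be called from main().
inline void demo_pure_virtual() {
    Triangle t;
    const Shape& s = t;
    assert(s.sides() == 3);
}
```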

A better choice would have been a keyword such as ‘abstract’. Abstract should have syntactic significance as abstract functions are an important concept in object-oriented design. The C++ decision in keeping with the C philosophy of avoiding keywords is at the expense of clarity. A keyword would implement this concept more clearly. For example:

pure virtual void fn ();

or

abstract void fn ();

The mathematical notation used in C++ suggests that values other than zero could be used. What if the function is equated (or is that assigned) to 13?

virtual void fn () = 13;

A function is either implemented or undefined. This to any analyst suggests a boolean state, which a single keyword conveys. A simple suggestion to fix this is to define ‘= 0’ as abstract:

#define abstract = 0

then

virtual void fn () abstract;

Let’s look at =0 a slightly different way, as a key phrase, or a keyword which is spelt with the characters ‘=0’. If you do that, then the objection to keywords becomes a non-issue.

As for the terminology, ‘pure virtual’ is a contortion of natural language. It combines words that are somewhat opposite in meaning. Pure means something that really is what it appears to be, as in pure gold. Virtual means something that appears to be what it actually is not, as in virtual memory. Perhaps pure virtual gold is fool’s gold. As has been said before, virtual is a difficult concept to grasp. When it is combined with a word such as ‘pure’, the meaning becomes more obscure.

[Stroustrup 94] gives the curious tale about the ‘curious =0’ syntax: “The curious =0 syntax was chosen over the obvious alternative of introducing a keyword pure or abstract because at the time I saw no chance of getting a new keyword accepted. Had I suggested pure, Release 2.0 would have shipped without abstract classes. Rather than risking delay and incurring the certain fights over pure, I used the traditional C and C++ convention of using 0 to represent “not there.””

Mathematically, 0 does not normally represent “not there”. Usually, 0 is just another number. Using 0 to represent “not there” leads to semantic problems which lead to many interesting discussions on topics such as 3 value and 4 value logic, etc. In the C world, there are constant arguments over whether NULL is 0 or something else. In the database world, a value is needed for “not known.” If 0 is used for “not known,” then there is a problem if the value is known, but happens to be 0. The =0 syntax is an aggregation of errors. Not only are keywords such as virtual and static overloaded, but worse, a number such as zero is made to mean things that it does not mathematically represent.

Java and Eiffel use much clearer syntax. Java simply uses:

abstract void fn ();

In Eiffel you specify the routine as deferred, meaning the details of implementation are deferred to a descendant class:

r is deferred end

The ‘end’ might look like syntactic baggage, but you can specify other abstract properties of a deferred routine in the form of preconditions and postconditions.

Eiffel uses the best terminology, as deferred means the implementation is deferred. A routine that has an implementation still has an abstract form. The abstract definition of the routine is obtained by the short tool, which extracts the routine signature, that is name, parameters, type, and preconditions and postconditions, from the other details. The term abstract does not necessarily mean ‘not implemented’.

3.40 Programming by Contract

A common problem programmers face is that implementation hiding is very nice in theory, but often, you actually have to look at the internals of a class and its routines to determine what the class does and how to use it. Often you must examine the internals of a routine before you call the routine so that it works correctly, and to determine its exact effect after the routine has executed. The signature specification of a routine is not enough; routines often have side effects.

Eiffel extends the concept of routine signature: what you must set up prior to calling a routine is documented as preconditions in the require clause, and the exact effect of a routine is documented as postconditions in the ensure clause. The short tool extracts the preconditions and postconditions with the abstract part of a routine signature, as documentation for clients of a class. Preconditions document the obligations of the caller and benefits to the called routine, and postconditions document the obligations of the called routine and benefits to the caller: hence the term programming by contract.

Programming by contract is a major technique in saving programmers from having to look at implementation code, and is most important to library vendors who don’t want to give away the internals of their implementation, but do want people to buy and use their library.

Programming by contract is not just a fancy documentation scheme: the preconditions and postconditions also provide run time checks to ensure that all units of the program are behaving correctly, and thus fulfilling their contracts. This is the mechanism that detects the run-time inconsistencies discussed in the section on correctness. In Eiffel, this mechanism is integrated with the exception handling mechanism. In C++ and Java you can use assertions for run time checks, but these are not integrated into the programmer’s mindset as in Eiffel.
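As a rough illustration of what assertions can and cannot do in C++, the sketch below uses the standard assert macro to state a precondition and postconditions by hand. BoundedStack and its member names are hypothetical. Unlike Eiffel, nothing here is extracted as interface documentation or inherited by descendants, and the checks vanish when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bounded stack; the class and its names are illustrative.
class BoundedStack {
public:
    explicit BoundedStack(std::size_t capacity) : capacity_(capacity) {}

    bool empty() const { return items_.empty(); }
    bool full() const { return items_.size() == capacity_; }
    std::size_t size() const { return items_.size(); }

    int top() const {
        assert(!empty());  // precondition: the caller must not ask an empty stack
        return items_.back();
    }

    void push(int value) {
        assert(!full());                      // precondition: caller's obligation
        const std::size_t old_size = size();  // captured for the postcondition
        items_.push_back(value);
        assert(size() == old_size + 1);       // postcondition: routine's obligation
        assert(top() == value);               // postcondition: routine's obligation
    }

private:
    std::size_t capacity_;
    std::vector<int> items_;
};
```

A client that calls push on a full stack has broken its side of the contract, and the failed assertion points at the caller rather than at the stack.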

Programming by contract is the equivalent of integrated circuit specifications in the electronic component world, and also of tolerances in more physical engineering disciplines. In Eiffel, the combination of static type checking with preconditions and postconditions, integrated with exception handling, forms a significant way to test that the software jig-saw puzzle fits together, and that the resulting picture makes sense. These techniques significantly reduce dependence on ‘after the fact’ manual testing.

Neither Java nor C++ has this mechanism. Another interesting case is CORBA IDL: as it is an interface language for distributed objects, contract information is important. It is a glaring omission from CORBA IDL, which has glaring inclusions of struct, typedef, union, etc., none of which are helpful in a distributed object environment, where the concept of programming by contract is even more important in considering how to connect all the system components together, and you want more confidence that the distributed jig-saw fits together. In fact this biases CORBA to C implementations. The industry should stop and think, design things carefully and correctly, and stop designing things to look like C. So often C constructs are inappropriate, and make adopting more advanced and necessary concepts difficult.

3.41 C++ and the software lifecycle

The software lifecycle has attracted a great deal of attention. It is at least generally accepted that the activities in the lifecycle are analysis of requirements, design, implementation, testing and error correction, and extension. Unfortunately, identifying these activities has resulted in a school of thought that the boundaries between these activities are fixed, and that they should be systematically separate, each being completed before the next is commenced. It is often argued that if they are not cleanly separated, then you are not practicing disciplined system development.

This view is incorrect; someone who writes a program straight away is actually doing all the steps in parallel. It might not be the best way to do things in many circumstances, might or might not suit the style and thinking of different people, but this works in some scenarios, and can be the methodology of choice of disciplined thinkers. While that is an extreme example, the ideal way to work probably lies between that and a strictly regimented environment that assigns different people or teams to the lifecycle phases.

Some people can hold a whole problem and solution in their head and work in a disciplined fashion until the solution is complete. Mozart is said to have composed this way, producing his last three symphonies in as many months in 1788. Beethoven toiled far more over the production of his works, taking years to complete one symphony. Both composers produced masterpieces. Mozart wrote music directly, whereas Beethoven wrote themes and ideas in his famous sketchbooks. While Beethoven and Mozart had their own methods, the production of masterpieces depends on skill, not on methodologies.

A view that is gaining acceptance is that the software lifecycle should be an integrated process. Analysis, design and implementation should be a seamless continuum. The activities of the lifecycle should progress in parallel to expedite software development. Facts found out only as late as the implementation stage can be fed back into the analysis and design stages. The object-oriented approach supports this process. Artificial separation of the steps leads to a large semantic gap between the steps. The transformations required to bridge such semantic gaps are prone to misinterpretation, time consuming and costly.

We should cease dependence on testing. This is not to say that systematic or even random testing by an independent test group is not important, but we should rely more on better techniques in the preceding phases. Software testing can never prove the absence of error; it can only be used to detect errors if they are there.

The same people should be responsible for all stages, so that they take responsibility for the system as a whole, rather than passing the buck and blame which occurs when analysts, designers and implementors are different groups. This is not a popular view in traditional hierarchical management structures where organisational structure is prized over quality and programmers get promoted to designers who get promoted to analysts, and managers stay aloof from the technical process, just making sure the old structure is maintained. Or even worse, those who become analysts, designers and managers have little knowledge or experience of programming and large scale software engineering. Since the second edition of the critique, Scott Adams’ Dilbert comics have become widely known as accurate comments on such organisational problems. Hierarchical management discourages people from feeling responsible for a product. This culture must radically change if we are to produce quality systems.

We should have learnt from the extremes of SA/SD. Some quarters believed that methodology was all important, while programming and programming languages were unimportant. Arcane and machine-oriented programming languages strengthened this attitude, concentrating on the ‘how’ of computation, whereas the modellers correctly demand notations that express the ‘what’, in order to be implementation independent. A modern software language supports the integration of the activities of design and implementation by being readable, and problem-oriented. A language should be as close to design as possible. The needs and requirements of an enterprise can change much more rapidly than programmers can keep up, especially in a highly competitive and commercial world.

So how does C++ fit into this picture? Well it is based on C that was designed mainly as an implementation and machine-oriented language. It is an old language, that did not need to consider the integrated lifecycle approach. C++ might have some of the trappings of object-oriented concepts, but it is an uncomfortable marriage of a problem-oriented technique with a machine-oriented language. It addresses implementation, but does not address other aspects of the software lifecycle so well. Since C++ is not so well integrated with analysis and design, the transformation required to go from analysis and design to implementation is costly. There is a large semantic gap between design languages and the implementation language.

We should have learnt from the structured world that this is the incorrect approach to the software lifecycle. But in the OO world we are again falling into the trap of dividing the lifecycle into artificially distinct activities of OOA, OOD and OOP, instead of adopting an integrated approach. Modern languages provide a much more integrated approach to the complete software development process than C++. C++ supports classes and inheritance and other concepts of object-orientation, but fails to address the entire software lifecycle.

Eiffel is specifically designed around the clusterfall model of the project lifecycle. In this model, several subparts of a project may be in different phases at any instant. It also recognises that feedback occurs from later phases to earlier phases. Eiffel itself is quite a good specification language. Its assertions and invariants are something like you would see in a formal specification language like Z. While not as comprehensive as Z, Eiffel’s specification mechanisms suffice in most cases. (Bertrand Meyer was involved in the early work on Z). Thus you can use Eiffel as a documentation language in phases as early as analysis. The problem of different notations in different phases, and error-prone translation between them is removed.

The mechanism that Eiffel includes to cease dependence on testing is the assertion mechanism, integrated with exception handling. Organisations will find it difficult to make significant progress towards the higher levels of the Software Engineering Institute Capability Maturity Model (SEI CMM) until techniques such as this in Eiffel are in widespread use.

Eiffel is also integrated with a graphical CASE tool called BON (Business Object Notation) for those who feel more comfortable with classification and component relationship diagrams. Most importantly, Eiffel and BON are based on the same underlying abstract concepts. Eiffel can be generated from BON and vice-versa. This means you can easily “reverse engineer” your text, but the major advantage is that your diagrams and your text are always synchronised. There is no costly maintenance when your program changes and diagrams have to be updated to reflect this fact. Thus Eiffel is a step towards seamless software engineering.

3.42 CASE Tools

The previous section raises the question of CASE tools. [Madsen 93] has a good discussion on graphical notation (18.8). BETA is a language that can be used for analysis, modelling and design. To a certain extent, this comes with any language that supports classes, as these are the elements of OO analysis and design, but it is important to develop the language with analysis and design specifically in mind.

If you are using both graphics and textual notations, it is important that both are based on the same underlying abstract language: text and graphics should represent the same concepts. A major problem with SA/SD was that the graphical notations and programming notations were so far apart that costly and error-prone manual translation was required between the two. Unfortunately, this has set the precedent in people’s minds that graphical and textual notations are necessarily far apart, and they are surprised to see how close these are in good object-oriented systems.

It should not be thought that graphics are high level, and text is low level; that is the nature of abstractions, not the tools or notations. In fact it should be pointed out that text is a highly evolved form of graphics; both forms of information enter our brains through our eyes. Because of the nature of graphical notations less detail can be shown. With an integrated editor detail in text can be suppressed. In identifying classes during analysis, it really makes no difference whether you document them as a series of graphical boxes with class names in the middle, or a textual list of class names. In fact many people will find the list easier to work with and later read. At any stage the notations should be interchangeable. In some cases the graphical notation will abstract away details, which is an advantage, when you don’t want to see the details. As you add details though, graphical forms become unwieldy, and text is easier to manage. Unfortunately, many sectors of the industry have become convinced that graphical forms are more formal and result in magically better designs than text equivalents.

Graphics and text are best in an integrated environment. A programmer may have a class diagram as a starting point, like GUI file icons. Selecting a class will expand the class so that the interface of the class can be seen. At a different level, internal features of the class might be seen. Eventually, a level where text is seen is reached. The major failing of most CASE tools is they do not support this level of seamless integration. For the most benefit they should flow into the programming language. So called ‘visual’ environments do little better than putting program text in a GUI window.

Why bother with graphics then? For the simple reason that looking at the same problem in different ways aids understanding. It is also a matter of taste. Some people will find they understand graphics better, and some text. It is a good idea to cater for personal tastes, as long as there aren’t too many options, in which case everyone will end up speaking their own language, and there will be no effective communication, a tower of Babel. But this has already been the case in the industry, as design methodology notations are far apart, with the analysts/designers not wanting to read programs, and programmers not wanting to read structure charts and data flow diagrams.

A common design method with C++ is to use OMT (UML) or some equivalent methodology. However, the object models are different as the graphical and textual languages are not based on the same underlying abstract language. Thus there is a semantic gap between the text and graphics. This results in more costly and error-prone development. But then as the OMT people have said “Eiffel is arguably the best commercial OO language in terms of its technical capabilities.” [RBPEL91], p327. The object model of Eiffel is certainly closer to OMT than C++.

In conclusion, if CASE tools and graphical notations are to be of use, they and the programming language must be based on the same abstract concepts.

3.43 Reusability and Communication

Reusability is a matter of communication.

Clear communication is a courtesy concern. In order to use a software component, you must be able to understand it. The writer must communicate the purpose, intent, and correct usage of the component to the client. In the object-oriented world, clear and concise definition of software modules is not a mere nicety, but essential for reusability. Arising out of the issue of reusability is extendibility. In order to maximise the reuse of software, it must often be tailored for new applications. The client programmer must decide whether a software component is suitable for a new task, and if so, what is the best way to extend it?

Communication is aided by having integrated text and graphics environments, where the concrete languages of both are based on the same underlying abstract languages, or object models. Communication is also dependent on clear and clean syntax.

As C/C++ suffer from arcane and cryptic syntax, they do not support the goal of clear communication.

Java cleans up a fair bit of C/C++. The mess that is caused by the preprocessor is removed. However, Java still suffers from some of the deficiencies of C in this regard.

Eiffel has been designed with communication in mind, and is not bound by the shackles of C syntax. It borrowed from the clean syntax of Ada. Style guidelines were designed along with the Eiffel syntax, so the syntax lends itself to a clear style.

Eiffel also has utilities like short, which extracts the abstract interface of classes from the full details.

Eiffel provides an extra significant mechanism, that of integrated assertions. The short tool will extract the assertions with the interface descriptions. This has been described in the section on programming by contract. Programming by contract helps decide whether a class is usable in a new situation, and then how to use it, so this is an important tool for communicating the purpose, intent and correct usage of a software module. Thus assertions are very much a courtesy concern.

Reusability is well supported with clear communication in Eiffel.

3.44 Reusability and Trust

Reusability is a matter of trust.

Building trustworthy components is a safety concern. Trust results from confidence that safety concerns have been met. If you do not have confidence in a software component, then you won’t want to reuse it. You could doubt that the software component provides enough functionality, or correct functionality. You could doubt that the component is efficient enough, or worse it might fail. As so many traps in C++ result in ‘bugs’, it is difficult to trust a software module, so it is less reusable.

In the real world of reusability, the ideal of trusting programmers is inappropriate, and results in less trustworthy software; in reality, customers doubt the claims of suppliers. The onus is on the supplier to prove their claims, and thus the trustworthiness of the software. The client is not required to trust the supplier’s programmers. Potential clients of a software component require assurance that the component is trustworthy.

Trusting programmers is against the commercial interest of both parties. This is not to cast aspersions on programmers, but merely recognises that computers are good at performing mundane tasks and checks, but people are not. If people were good at such things, we would not need computers in the first place.

Even though you might not trust your programmers, this is not an excuse to employ anything but the best skilled programmers, and programmers should also be given the best training. Consider a Stradivarius violin: it will sound bad in the hands of a bad violinist. But a good violinist will insist on a Stradivarius, rather than a cheap brand where he won’t sound his best. In computing, we frequently argue whether it is the tools or the programmers. It is a combination of the two; if either is lacking, trustworthy software will not result.

Java “eliminates entire classes of programming errors that bedevil C and C++ programmers” [Sun 95]. This means that you can better rely on externally developed Java packages.

Eiffel also is not bedevilled by the same classes of errors. Thus you are more likely to produce software that can be used in other contexts, and be able to find software that can be reused in your context.

Eiffel assertions are also important here. As assertions are checked at run time, they help ensure that the software is working correctly, so the level of trust in external components is higher, and you reuse them with more confidence.

3.45 Reusability and Compatibility

Different compiler implementations need to be compatible in order to realise reusability between libraries and components. Different C++ compilers generate different class layouts, virtual function calling techniques, etc. The name encoding schemes used for type safe linkage can also be different. If two different compilers generate different run-time organisations, then different name encodings are desirable as it will prevent two incompatible libraries from being linked. The C++ ARM (p122) states: “If two C++ implementations for the same system use different calling sequences or in other ways are not link compatible it would be unwise to use identical encodings of type signatures.”

This can be solved in two ways: firstly, a library vendor could provide the entire source of a library so it can be compiled with the customer’s compiler; secondly, if the sources are proprietary, the vendor will need a separate release for every environment, and every compiler in that environment.

Because of this problem a strong case exists for a universal intermediate machine readable representation of programs. Interestingly, some systems are already using C as a ‘universal assembler’, notably AT&T C++ and Eiffel. But this cannot solve the above problems of compatibility between components without a standardisation effort on run time layouts and name encoding schemes.

An important feature of Java is that it is architecture neutral as Java compilers produce byte code instructions for a virtual machine. Java provides a “universal intermediate machine readable representation of programs” as I called for in this paper’s second edition.

Eiffel implementations provide a high level of source code compatibility. However, the generated C from different implementations can have different object layouts. Thus a class library will have to be recompiled if it is to be used in a system compiled with a different vendor’s implementation.

Another form of incompatibility between libraries is incompatibility of type definitions. A glaring example in C++ is the number of ways the simple type boolean can be defined. For more on this see the section on booleans.

3.46 Reusability and Portability

Since true OOP ensures that objects are loosely coupled to the external environment, portability to diverse environments is possible. C is highly coupled to Unix style environments, and as such is not particularly portable to diverse environments.

Java is also the winner in this category, due to its virtual machine, and removal of pointers. Eiffel code is also highly portable, but you are currently confined to systems where Eiffel compilers exist, of which there are many. As most Eiffel compilers generate C, you can port the generated C to platforms where there is no Eiffel compiler. With Java, only a virtual machine interpreter needs to be available on the system in order to run Java programs.

As the Java virtual machine seems to be sufficiently semantically rich, it could be that other languages target the Java virtual machine, and that it becomes a universal machine code. Such a marriage might not be as easy as it appears, if the object models of different languages are sufficiently different from the Java model. Sun does seem to have kept the virtual machine independent of physical object layout, and free of any assumptions that would make this too hard.

3.47 Idiomatic Programming

The ability to program in different idioms is argued as a strength of C++. Idiomatic programming, however, is a weak form of paradigmatic programming; it is programming in a paradigm without necessarily having compiler support for that paradigm. The compiler cannot check for inconsistencies with the idiom, or paradigm. Defines can often be used to invent idioms. Anyone who has attempted to do object-oriented programming in a conventional language using defines will realise that it is impossible to realise the benefits easily, if at all, without compiler support.
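A small sketch of such an invented idiom is given below, written in the C subset of C++. The Account structure, the CALL macro and the fee functions are all hypothetical names chosen for this illustration. Dynamic binding is imitated with a function pointer stored in each record, and the point to notice is that the compiler has no idea an object-oriented idiom is intended, so it cannot object when the idiom is broken.

```cpp
#include <cstddef>

// Hypothetical "object" built by hand in the C idiom: a struct carrying a
// function pointer that stands in for a virtual function.
struct Account {
    long balance;
    long (*fee)(const Account*);  // hand-rolled dispatch slot
};

// Illustrative macro that imitates a method call on an "object".
#define CALL(obj, method) ((obj)->method(obj))

long flat_fee(const Account*) { return 5; }
long percent_fee(const Account* a) { return a->balance / 100; }

Account make_flat(long balance) { Account a = { balance, flat_fee }; return a; }
Account make_percent(long balance) { Account a = { balance, percent_fee }; return a; }

// Nothing stops a programmer from leaving the slot empty; the compiler
// accepts this, and the mistake surfaces only if the slot is called.
Account make_broken(long balance) { Account a = { balance, NULL }; return a; }

long charge(const Account* a) { return CALL(a, fee); }
```

With language support for classes, the equivalent of make_broken would be rejected at compile time, because a class with an undefined abstract function cannot be instantiated.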

Both Java and Eiffel are strongly object-oriented: the idiom is OO. You don’t have to bring together various sub-projects each of which might have used their own favourite idiom.

3.48 Concurrent Programming

The object of concurrent programming is that computing resources can be harnessed to efficiently compute problems that would otherwise be inefficient to compute using a single processor. In the next ten years multiple processor arrays that execute programs concurrently will likely become common. Concurrency requires much cleaner languages than the single processor languages of today.

Object-oriented concepts support concurrent programming. Objects can execute state changing code independently of each other. Concurrent programming will be enabled by the division of the state space of a system into modules to achieve a high degree of independent processing. Objects provide a scheme to cleanly divide state spaces. The demand that everything be divided into loosely coupled modules, that only interact through well defined interfaces might be perceived as inefficient; but it is precisely this scheme that will mean that concurrent solutions can be developed efficiently and transparently to the programmer.

Concurrency should be transparent to the programmer, as concurrency is a low level implementation consideration; concurrency is how a computation is done, not what is to be computed. However, there are examples where concurrency is manifest in the problem domain, such as many simulation problems like multiple queues, for example check-outs in a supermarket. The implementation issue of concurrency is how processes are allocated to processors. The programmer should not be concerned with this, but rather with what is to be computed, not how. How something is computed is the concern of the target environment, i.e., the compilers, operating system, and hardware.

The aim of concurrent processing is to keep all the processors in a processor array as fully utilised as possible, so that processor resources are not wasted. There is nothing more mysterious to concurrent programming than the efficient use of resources. Keeping all processors busy is an inherently dynamic problem, which the programmer cannot determine statically at compile time. All the processors can be kept busy, as long as there are enough threads in the system.

In concurrent programming, a thread is a unit of sequential execution. Concurrency is achieved by the splitting of threads. A thread can be split when a state changing routine is invoked, but not a value returning function, because it must wait for the value. State changing routines can easily be invoked on another processor. Object level granularity seems to be a natural candidate for concurrent processing. An object can have only one update thread at a time to avoid simultaneous update problems. Other levels of concurrency are instruction level, and task or process level. Task or process level is the level used in conventional multi-processing systems currently commercially produced, and instruction level is quite difficult, best left to instruction pipelines.

Object level is natural for the programmer, and has the advantage that a programmer can implement a system without taking into account parallel processing at all. The same program will run and produce identical results irrespective of whether the customer is running a single processor, or a processor array. This way the programmer concentrates on the model and design of the problem, not on deployment concerns.

Side effects must be avoided in concurrent systems. Suppose a computation depends on combining the results of two functions f and g, such as f + g. f and g are parameters to the + function. Routine parameters can be computed concurrently, as long as the computation of each causes no side effects. If f and g are independent, then they can be computed concurrently. If however, f produces side effects that g depends on, they must be computed sequentially.
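The f + g argument can be made concrete without any threads at all. In the sketch below, every name is hypothetical and invented for this illustration. The two helper routines evaluate their operands in opposite orders: when the operands have no side effects the order is irrelevant, which is exactly what makes concurrent evaluation safe, and when one operand writes state that the other reads, the result depends on the order, so the two cannot be run independently.

```cpp
// Hypothetical functions for illustration. 'shared_counter' is shared state.
int shared_counter = 0;

int pure_f() { return 2; }                          // no side effects
int pure_g() { return 3; }                          // no side effects
int impure_f() { shared_counter += 10; return 2; }  // side effect: writes the counter
int dependent_g() { return shared_counter + 3; }    // reads what impure_f wrote

// Evaluate the two operands in an explicit order, then combine them.
int f_then_g(int (*f)(), int (*g)()) {
    int a = f();
    int b = g();
    return a + b;
}

int g_then_f(int (*f)(), int (*g)()) {
    int b = g();
    int a = f();
    return a + b;
}
```

Starting from a counter of zero, the side-effect-free pair gives the same sum either way, while the dependent pair gives 15 when f runs first and 5 when g runs first.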

C++ does not preclude the use of a global environment. Access to shared global data potentially causes a thread to lock, and if many such accesses occur, the advantage of concurrency is lost. This is because updates to a global environment are side effects. Programming in such an environment requires complex locking mechanisms to ensure that things happen in the correct order. Locks are rather like waiting for a plane to take off when it has to wait for another connecting flight. This cannot be entirely avoided, but should be reduced as much as possible.

It might not be impossible to implement concurrent processing in C++, but it is difficult as in many ways C++ is not suited to concurrent processing.

Java provides threads. It also removes C features like globals that are problematic to concurrency.

Eiffel has a recommendation [Meyer 96c] that extends Eiffel with a single keyword separate to provide concurrency. Both Java and Eiffel have simple concurrency mechanisms because they have a cleaner base than C++.

3.49 Standardisation, Stability and Maturity

Object-orientation is now nearly 30 years old, since Simula 67. Smalltalk is about 20 years old, Ada 95 is only one year old, but based on Ada 83, which is about 13 years old. C++ is 13 years old. Eiffel is 10 years old, and Java is just one year old.

The age of a language does not relate to its stability and maturity. Java is the youngest language, but Java appears to have a well thought out and stable language base, also having a comprehensive set of OO libraries. Thus Java is off to a good start, but only time will tell. It already has quite a number of books.

Ada 95 is one year old. But that is one yearsince the standard was ratified, so it is a good dealolder than a year. Ada 95 is the product of anISO/ANSI/DoD standard. Thus Ada 95 vendorshave a very stable base from which to implement.This gives Ada 95 a good start over other languages,where there might be implementations, but they areshooting at a moving target.

Eiffel is not subject to the ‘formal’ ISO/ANSI standards; it has its own non-aligned standards body NICE (Non-profit International Consortium for Eiffel). Eiffel is now in its third incarnation, Eiffel 3, which is fully described in Eiffel: The Language [Meyer 92], the Eiffel equivalent of the C++ ARM. The definition of Eiffel 3 has been very stable since 1992, requiring only a few extra validity rules, and small clarifications: Eiffel is probably the best designed language ever intended for commercial use. The largest change to the language is now under consideration, which is to add the separate keyword to allow support for concurrent and distributed processing. This will not affect existing programs, and early releases of implementations with this mechanism are now available. Eiffel also has a standard library. The standard library is more changeable than the base language, but is also under the control of NICE. Thus Eiffel has attained a great deal of maturity over 10 years, and the standards are very stable. This gives Eiffel a considerable advantage in that libraries are much easier to update to address new and changed requirements than compilers. Therefore, Eiffel should evolve more quickly into new problem domains, without the traditional resistance from compiler vendors.

The most serious problem that Eiffel has faced in the past was stability of implementations. As Eiffel is an ambitious language and environment, many new and difficult concepts have been pioneered and made into industrial strength packages. Eiffel is very demanding on compilers, which need to do things like global analysis, which is an issue that C++ conveniently avoids. Eiffel does not concede to compromises which place burdens on the programmer in the same way that C++ does.

However, stable forms of Eiffel environments are now becoming widely available. In 1996 Tower Technology released version 2 of its compiler and environment, and ISE has announced version 4 of its environment, which addresses many issues that users did not like previously, and now includes menus and other facilities, which gives it a more Macintosh/Windows look and feel. SIG Computer has also announced its Visual Eiffel for release in October 1996. There is also an independent experimental version known as SmallEiffel, which can be downloaded for free.

Another problem that Eiffel has had is the lack of titles. [Meyer 88] is the classic book on OO; however, it is based on Eiffel 2.0, not version 3. Meyer’s next book “Eiffel: The Language” [Meyer 92] is the language lawyer’s reference, but it is possible to navigate for an overview. However, there are now over ten titles on programming in Eiffel, quite a few of which are used to teach university courses on OO.

Smalltalk is now a widely used language, and has proven to be very effective in some environments. Different implementations of Smalltalk do not share libraries, and do not interoperate.

Out of all the languages here, C++, although 12 years old, provides the fastest moving target for vendors. It is claimed to be standardised, as it is subject to ANSI/ISO standardisation, but this work is still very much in progress. (You can check the status of the standard on the X3J16 WEB page in the WEBliography.) The number of issues to be addressed by the committee keeps increasing, rather than decreasing. C++ was submitted to the standardisation process too early, and the committee has had to do too much design work that should have been done before C++ was submitted to the standardisation process.

The committee hopes to progress the standard to CD (Committee Draft) this year (1996). The FAQ shows a timetable which will produce an IS (International Standard) by December 1998 (see WEBliography: http://reality.sgi.com/employees/austern_mti/std-c++/faq.html#B8). After IS is achieved, it will probably be several more years before a significant number of vendors are fully compliant. By that stage, users will probably be clamouring for more features and fixes to old problems. I have already heard stories of C++ tool vendors complaining that the standard is too horrendous to understand, let alone to implement in a compliant way. Standardisation should stabilise the specification, but C++ has continued to become less stable. The fact that the C++ standard is so unstable indicates that the C++ committee realises there are many shortcomings in C++ that they must rectify. There are many flaws that the committee knows about that I do not cover in this critique. There are also many flaws covered in this critique that the committee has no intention of addressing, as that would break too many existing programs and C compatibility.

In the preface to [Stroustrup 94], Bjarne Stroustrup writes “C++ is still a young language. Some of the issues discussed here are yet unknown to many users. Many implications of decisions described here will not become obvious for years to come.”

Coming to consensus in the C++ world is a difficult task. [Stroustrup 94] states this frustration as “Dealing with stubborn old-time C users, would-be C experts, and genuine C/C++ compatibility issues has been one of the most difficult and frustrating aspects of developing C++. It still is.”

Many comments in [Stroustrup 94] show that C++ is still a moving target. Garbage collection is mentioned as “when (not if)”. Thus when GC is fitted to C++, developers will be faced with quite a transition in paradigm. All of this uncertainty in C++ might keep the programmers busy, after all many of them want to code exclusively in C++, while ignoring all else; but it will be very costly for the companies that are locked into C++.

There are still unresolved things the X3J16 committee must sort out, especially in the area of C compatibility. [Stroustrup 94] says “The “compatibility wars” now seem petty and boring, but some of the underlying issues are still unresolved, and we are still struggling with them in the ANSI/ISO standards committee. I strongly suspect that the reason the compatibility wars were drawn out and curiously inconclusive was that we never quite faced the deeper issues related to the differing goals of C and C++ and saw compatibility as a set of separate issues to be resolved individually.” Since C compatibility results in so many problems, serious consideration should be given to this basic tenet of C++.

The C++ community seems to think using a fundamentally flawed tool is acceptable and that the rest of the world must wait for them to straighten these issues out, which in many cases isn’t even possible. It is also a hidden cost to companies that their programmers must continually keep up to date, and abreast of the arguments for and against certain constructs. Many other languages have solved these problems.

As a postscript to this section, I will remark that a lot of argument for or against particular languages seems to come from people who believe that there will be an eventual winner in the evolution of languages, and they want it to be their favourite, so will fight for dominance. I can see no evidence that this will happen. I think new languages will continue to be invented: some will be based on continuing mistakes from old languages while adding new features for compatibility; others will avoid previous errors while adopting new paradigms. I can’t see that the programming language world will ever become stable. If people in the industry can accept that, then we will have programmers who are more amenable to changing language, being able to use the language that is best suited for the purpose, and the maturity of language criticism will improve, as we see each language as a passing phase, to which we owe no long term allegiance.

3.50 Complexity

There are several kinds of complexity. This critique focuses mainly on the complexity of the C++ language itself. When considering complexity, one needs to consider the complexity of the development task as a whole. The complexity of the language might only be a small part of that.

Apart from the language, we need to consider the programming environment (that is, editors and tools such as make), the methodologies and their tools, and the supporting libraries.

With C++ the conventional wisdom is often to use a methodology such as OMT. Here the concepts of the methodology do not exactly match the concepts in the programming language. Thus you have a semantic gap, where translation must occur. This translation is costly, and frequently ends in specifications that do not match what was eventually implemented.

Both Eiffel and BETA see it as important to develop their methodologies and graphical notations based on the same underlying concepts. The importance of this integrated approach should not be under-appreciated.

As for environments, [Stroustrup 94] has the following to say: “Every language in nontrivial use grows to meet the needs of its user community. This invariably implies an increase of complexity. C++ is part of a trend towards greater language complexity to deal with the even greater complexity of the programming tasks attempted. If the complexity doesn’t appear in the language itself, it appears in libraries and tools. Examples of languages/systems that have grown enormously compared to their simpler origins are Ada, Eiffel, Lisp (CLOS), and Smalltalk. Because of C++’s emphasis on static type checking, much of the increase in complexity has appeared in the form of language extensions.”

“C++ was designed for serious programmers and grew to serve them in the increasing large and complex tasks they face.”

P.J. Plauger in [Plauger 93] argues that the complexity of C++ has put it on par with PL/I, Ada (83) and Algol 68. He does not accept the complexity in C++ as a good thing. Criticising the complexity of Ada is somewhat unfair. An amount of Ada’s complexity is due to its support of multitasking and real-time programming. Simula also has facilities for co-routines and processes, and Ada and Simula are reasonably unique for their inbuilt support of these facilities. In the 1980s, the need for such facilities was not widely recognised. However, the need for concurrency and distribution is now becoming recognised.

Another feature of Ada that might contribute to the perception of complexity is genericity. Again the charge that this makes the language over complex is based on not understanding genericity. I have already covered this topic in the section on templates. Thus Ada has been criticised for being complex, but most of this criticism is due to not understanding essential features such as genericity and concurrency.

Many C programmers have been guilty of dismissing features they don’t understand as complexity, and Ada has been a favourite target. I am not saying that Plauger is in this category, as he makes some valid points about Ada. But the accusation of complexity against Ada should not be overstated, as it too frequently and too emotionally has been in the past. In the computing industry, there is a low level of understanding and experience that one must have before becoming an expert or vocal critic, particularly of languages like Pascal and Ada.

C++’s complexity is not solely due to static type checking. Eiffel is more strongly type checked than C++, but doesn’t suffer from the same complexity problems.

As for the environment, the burden is far less in the cases of Eiffel, Java and Ada 95. In Eiffel, a separate simple language, LACE, exists to specify to the compiler how to compile the program. This contains such things as environment variables, debug and other options, etc. It also provides the basis for separation of concerns so that environmental details are completely removed from the Eiffel language. Eiffel is also integrated with complete editing and development environments.

Java has removed such environmental considerations as #include and make. Edmond Schonberg writes that the environmental baggage for Ada and Ada 95 is far smaller than that of C++ (see WEBliography for his Ada contrast to C++).

The Eiffel libraries are very large and comprehensive; but this only reflects the richness of data structures that exist, and the number of application domains. Eiffel libraries are available for networking, compiling and parsing, Windows programming as well as platform independent user interfaces and many other things. The Eiffel libraries simplify naming complexity by standardising the vocabulary between classes. For example, put is used to enter an item in any collection data structure like ARRAY, LIST, QUEUE, and even STACK where the routine would normally be named push. The libraries enable the complexity of specific domains to be removed from the language, which is simple and yet general purpose.

Smalltalk also has a large library, which extends an otherwise small and simple language. Classes that a programmer adds also become part of the Smalltalk environment.

Java also provides a comprehensive library to deal with many aspects, including java.net, java.awt (Abstract Window Toolkit), etc. Eiffel, Smalltalk and Java do not ignore the issue of complexity; they put it where it should be: in the libraries. In terms of complexity, they implement Stroustrup’s principle that “what you don’t use, you don’t pay for.” In C++ you pay very much for complexity, as it is in the language.

C++ can to some extent be extracted from the complexity of its environment. But as long as the mechanisms of #include persist, the environments that C++ is ported to will have to adapt to the C/Unix way of doing things. Where the environment is separate from the language, there is no environmental adaptation that needs to be done, and less retraining of programmers for each environment they need to program in.

I can accept that C++ was designed for serious programmers. However, Ada 95 and Eiffel are both designed for the serious software engineer. (Java remains to prove itself in this arena.) Eiffel in particular shows that complexity can be dealt with in a serious industrial strength software engineering environment.

Complexity is not the necessary companion of seriousness. This does not ignore the complexity of any application domain; in fact it enables you to focus on the complexity of the programming task in hand, not on the complexity of the tool.

3.51 C++: the Overwhelming OOL of Choice?

This headline comes from Cutter Information Corp’s “Object-oriented Strategies” May 1996 edition. Based on their findings, C++ accounts for 80% of all OOLs, with Smalltalk running a distant second at 11%. They claim that in 1995 OO software development products hit $1.3 billion. However, let’s examine how C++ is used: many C programmers have not wanted to touch C++, but they do use a C++ compiler to compile their C. This greatly exaggerates the market penetration of C++ and the size of the OO market, so it is impossible to determine the true market penetration of OO. You are not doing OO just because you are compiling with C++.

Microsoft and Borland have put most of their development environment energies into C++, so this makes it attractive to buy a C++ environment, even if you are just programming C. Probably the true proportion of C++ installations being used for OO would be between 10% and 50%. This cuts down by a large amount both the size of the OO market and C++’s predominance in that market, and means the other OOLs in the market have a much higher significance than Cutter makes out. Smalltalk and Eiffel are pure OOLs, so every one of their sales you can count as an OO installation, whereas the same is not true of C++. Measured C++ sales are riding on C’s success. C++’s success is less than overwhelming. It is a marketing success, rather than a technical or programming success. Companies using C++ are paying for it with longer development cycle times, and less reliable end products.

One way a manager might perceive C++ to be a winner is the sheer number of books one sees in a bookshop on C++. This is matched by a huge number of courses. An observation about the nature of many of these books is that they are often titled something like “How to build a widget in C++,” or “Compiler Construction in C++.” “Books appear like mushrooms after rain” [Plauger 93].

The mushrooming book market is a great boon for publishers, as it implies that for every possible software artefact you can build, they can publish a book about it in every possible programming language. All you really need is the books “Programming in C++,” and “How to build widgets,” or “Compiler Principles and Construction.” Then your programmer needs experience, lots of it. Don’t be fooled by this trick to get a high title count.

Many C++ books are on how to avoid the traps and pitfalls, and develop rigorous coding standards, which might appeal to management as the solution, but they don’t solve the root cause of the problem. Making sure everyone is well trained and versed in these style standards is an expensive and usually ineffective band-aid measure, especially where different companies have different standards and expectations, so you need to retrain every new recruit, who will probably decide they don’t like your way of doing it anyway, and leave after a short period. Of course you can satisfy yourself that his dissatisfaction was due to his inappropriateness for your organisation, which is better organised than most. After all, you are ISO 9000 accredited and are turning out a very successful line of ‘concrete life-jackets’ (a Tom Peters quote).

[Sakkinen 92] observes the “Endemic C++ Culture.” He notes that too many courses on “design” have the appended clause “with C++.” This is because C++ has its own curious terminology, which is in many ways different to the rest of the OO world. He makes a case that concepts and principles should be taught, then how to map them onto any particular language.

Of course books are aimed at different audiences: professionals versus those who just program for a hobby; those who have an academic interest in languages; implementors of compilers and other language processing tools, who need formal non-ambiguous statements about how the language works; beginners versus those for whom this is their fourth or fifth language. C++ should not be for beginners, as it is better to learn the principles from a clearer language than be confused by what all the syntactic knobs and dials, and superfluous constructs do in C++.

As for courses, C++ has proven so difficult to learn that you need lots of courses. Not only do you need to learn the language, but the complexities of the environment add an even more substantial overhead. It will probably be best to start on C++ with a course. However, with simpler languages such as Java and Eiffel, buying a good book, and self experimentation will quickly cover every aspect of the language. It is a bonus if you can get a course, but it is not essential to get started.

4. Generic C Criticisms

These criticisms apply to the C base language, but in general adversely affect C++. R.P. Mody [Mody 91] gives an excellent general criticism of C. Mody says that to properly understand C you must understand the insides of the compiler, giving many examples of how C obscures rather than clarifies software engineering. He concludes that he is “appalled at the monstrous messes that computer scientists can produce under the name of ‘improvements’. It is to efforts such as C++ that I here refer. These artefacts are filled with frills and features but lack coherence, simplicity, understandability and implementability. If computer scientists could see that art is at the root of the best science, such ugly creatures could never take birth.”

4.1 Pointers

C pointers are a low level mechanism that should not be the concern of programmers. Pointers mean the programmer must manipulate low level address mechanisms, and be concerned with lvalue and rvalue semantics, which are machine oriented and not problem oriented as you would expect of a high level language. A compiler can easily handle such issues without loss of generality or efficiency. Memory models of different environments often affect the definition of pointers. Memory model details such as near and far pointers should be transparent to the programmer.

The programmer must also be concerned with correct dereferencing of pointers to access referenced entities. Use of pointers to emulate by-reference function parameters is an example. The programmer has to worry about the correct use of &s and *s. (See the section on function parameters.)

Pointer arithmetic is error prone. Pointers can be incremented past the end of the entities they reference, with subsequent updates possibly corrupting other entities. This is a major source of the undetected inconsistencies that result in obscure failures, as discussed in the section on correctness. In the STL, iterators are provided as the generalisation of C pointers for access to elements of structures such as arrays.
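
As a hedged illustration of the iterator idea, the following C++ sketch walks a std::vector through the half-open range from begin() to end() rather than incrementing a raw pointer against a hand-computed limit. The function name sum_elements is an assumption made for this example. Standard iterators are not required to be bounds-checked, so the sketch shows only that the end of the range travels with the container, not that overruns become impossible.

```cpp
#include <vector>

// Sum the elements by walking the range [begin, end).
// The container supplies the end marker, so no separate length has to be
// kept in step with the data by hand.
int sum_elements(const std::vector<int>& values) {
    int total = 0;
    for (std::vector<int>::const_iterator it = values.begin();
         it != values.end(); ++it) {
        total += *it;
    }
    return total;
}
```

A call such as sum_elements(v) works for an empty vector as well, because begin() and end() are then equal and the loop body never runs.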

Programmers can by-pass encapsulation with pointers; C undermines OOP by providing a mechanism where state outside an object’s boundaries can be changed. Since pointers are intrinsic to writing software in C this exacerbates the problem. Pointers as implemented in C make the introduction of advanced concepts like garbage collection and concurrency difficult.

Another consideration is that dynamic memory implementations vary between platforms. Some environments make memory block relocation easier by having all pointers reference objects via a master pointer which contains the actual address of the block. The location of the master pointer never changes, so relocation of the block is hidden from all pointers that reference it. When the block is relocated, only the master pointer needs to be updated.
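
The master pointer scheme can be sketched in C++ as follows. This is only an illustration under stated assumptions: the names MasterPointer, Handle, allocate_block, relocate, element and release are hypothetical and do not correspond to any real platform API. Clients hold a Handle (a pointer to the master pointer), so moving the block changes one address only.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical master pointer: the only place the block's address is stored.
struct MasterPointer {
    int* block;
    std::size_t count;
};

// Clients refer to the block indirectly, through the master pointer.
typedef MasterPointer* Handle;

// Create a zero-initialised block of `count` integers.
MasterPointer allocate_block(std::size_t count) {
    MasterPointer master = { new int[count](), count };
    return master;
}

// Move the block to fresh storage. Only the master pointer is updated;
// every Handle that refers to it remains valid.
void relocate(MasterPointer& master) {
    int* moved = new int[master.count];
    std::copy(master.block, master.block + master.count, moved);
    delete[] master.block;
    master.block = moved;
}

// Double indirection: handle -> master pointer -> block.
int& element(Handle handle, std::size_t index) {
    return handle->block[index];
}

// Free the block and leave the master pointer empty.
void release(MasterPointer& master) {
    delete[] master.block;
    master.block = 0;
    master.count = 0;
}
```

The cost of the scheme is visible in element: every access goes through two pointers instead of one.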

On the Macintosh, for example, the double indirection mechanism of ‘handles’ facilitates relocation of objects. Object Pascal makes handles transparent to the programmer. This is similar to the Unisys A Series approach where object descriptors access target objects via master descriptors that store the actual addresses of objects. On the A Series this is transparent to programmers in all languages, as this transparency is realised at a level lower than languages. The A Series descriptor mechanism also provides hardware safety checks that mean that pointers cannot overrun, and arrays cannot be indexed out of bounds. C cannot be implemented particularly well on such machines, as C’s pointer mechanisms are lower level than the target environment.

Simpler environments might not provide object relocation, so double indirection would be an unnecessary overhead. In order for programs to be portable and efficient in different target environments, such system details should be the concern of the target compilation system, not of the programmer.

C’s pointer declaration syntax causes another small problem:

int* i, j;

This does not mean, as might be easily read -

int *i, *j;

but

int *i, j;

and should be written thus to avoid confusion.

Java has abolished pointers as “Most studies agree that pointers are one of the primary features that enable programmers to put bugs into their code. Given that structures are gone, and arrays and strings are objects, the need for pointers to these constructs goes away,” [Sun 95]
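
Returning to the declaration int* i, j; discussed above, the compiler itself can confirm how it is parsed. The following C++ sketch uses static_assert with std::is_same (both standard since C++11); the namespace decl_demo and the names first and second are assumptions made for this example.

```cpp
#include <type_traits>

namespace decl_demo {

int* i, j;   // i is a pointer to int, but j is a plain int

static_assert(std::is_same<decltype(i), int*>::value,
              "i is a pointer to int");
static_assert(std::is_same<decltype(j), int>::value,
              "j is an int, not a pointer");

// One declaration per line leaves no room for the misreading.
int* first = nullptr;
int* second = nullptr;

}  // namespace decl_demo
```

Because the checks are static, a wrong assumption about either type would stop the program from compiling at all.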

Eiffel also has no pointers, only object references. In Eiffel, the exact referencing mechanism does not matter. For example in the expression x.f the reference x might be a pointer to an object in the same address space, or it might be an Internet address of an object. References enable the location and access method of an object to be transparent.

4.2 Arrays

Page 137 of the C++ ARM notes that C arrays are low level, yet not very general, and unsafe. Page 212 admits, “the C array concept is weak and beyond repair.” Modern software production is less dependent on arrays, especially in the object-oriented environment. The trade off to be optimal, rather than general and safe no longer applies for most applications. C arrays provide no run-time bounds checking, not even in test versions of software. This compromises safety and undermines the semantics of an array declaration, i.e., an array is a particular size, and can only be indexed by values within the bounds of the array. The array size might not be determined at compile-time, but dynamically at run-time. An index to an array is a parameter in the domain of the array function. An index out of bounds is not a member of the domain, and should be treated as severely as divide by zero. But in C this is another significant source of undetected inconsistency, which can result in obscure failures.
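
To make the idea of treating a bad index as an error concrete, here is a small C++ sketch using std::vector::at, which the standard library specifies to throw std::out_of_range when the index is not less than size(). The helper name index_is_rejected is an assumption made for this example; the sketch shows one library-level remedy, not a change to built-in arrays.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading values[index] through at() is rejected
// because the index lies outside the bounds of the vector.
bool index_is_rejected(const std::vector<int>& values, std::size_t index) {
    try {
        (void)values.at(index);   // at() checks the index against size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

For a vector of four elements, index 3 is accepted and index 4 is rejected, whereas the unchecked operator[] would have undefined behaviour for index 4.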

C has no notion of dynamically allocated arrays, whose bounds are determined at run time, as in ALGOL 60. This limits the flexibility of arrays. You cannot resize C arrays. Multidimensional arrays are only really one dimensional. You cannot individually resize the rows of a multidimensional array. The C definition of arrays compromises both safety and flexibility.

There are many ways you can undermine arrays in C and C++, as an array declaration is really just equivalent to a pointer. The following example comes from [GWS 94]:
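
For contrast, a library container can give each row its own run-time size. The C++ sketch below builds a triangular table from std::vector; the function name make_triangle is an assumption made for this example, and the approach is a library workaround rather than a property of built-in arrays.

```cpp
#include <cstddef>
#include <vector>

// Build a table whose row r holds r + 1 zero-initialised elements.
// Each row is a separate vector, so rows can differ in length and
// can be resized independently later.
std::vector<std::vector<int>> make_triangle(std::size_t rows) {
    std::vector<std::vector<int>> table(rows);
    for (std::size_t r = 0; r < rows; ++r) {
        table[r].resize(r + 1, 0);
    }
    return table;
}
```

Resizing one row, for example table[0].resize(5), leaves the lengths of the other rows untouched.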

char *str = "bugy";

then the following are true:

0[str] == 'b';
*(str+1) == 'u';
*(2+str) == 'g';
str[3] == 'y';

This is amazingly flexible syntax for something as inflexible as C arrays, which is against Meyer’s “Principle of Uniqueness” (see introduction), providing several ways to do the same thing, but still not doing it particularly well.

The unsafeness of C arrays is shown in the next example:

#include <stdio.h>
#include <string.h>
main ()
{
    char str[] = "TEST";            /* 5 bytes, including the terminating '\0' */
    char *p = "TEST2";
    const char str3[] = "TEST3";
    char *p3;

    printf ("str = %s p = %s str3 = %s\n", str, p, str3);
    p3 = &str;
    strcpy (p3, "some junk");       /* copies 10 bytes into the 5-byte str */
    printf ("str3 = %s\n", str3);
    str[6] = 'X';                   /* index 6 is outside the bounds of str */

    printf ("str = %s p = %s str3 = %s\n", str, p, str3);
}

The results (at least from my C compiler) are:

str = TEST p = TEST2 str3 = TEST3
str3 = junk
str = some Xunk p = TEST2 str3 = Xunk

One view is that an array is just another object-oriented entity which should be treated in an object-oriented manner as a class of data structure. It should have interface definitions, and consistency checks inherent in object-oriented systems. Another view is that an array is an implementation of a function, where pairs of values explicitly map the domain uniquely to the range, rather than being computed. This suggests that Algol was incorrect in syntactically distinguishing arrays by using square brackets. An array just maps the input argument (the index) to the value stored in that location in the array.

[Ince 92] considers that arrays and pointers need not be relied upon so heavily in modern software production, as higher level abstractions such as sets, sequences, etc., are better suited to the problem domain. Arrays and pointers can be provided in an object-oriented framework, and used as low level implementation techniques for the higher level data abstractions. Ince suggests that arrays and pointers should be regarded in the same way as gotos in the seventies. He suggests that languages such as Pascal and Modula-2 should be regarded in the same way as assembler languages in the seventies. This applies even more to C and C++, because pointers and arrays are far more intrinsic in the use of C and C++, with lower level, less flexible arrays. Although Pascal arrays are weak compared to those of ALGOL, they are still much better than C arrays.

In both Eiffel and Java, arrays are first class objects. Neither language has any need of the sizeof operator. In Java to get the size of an array you use myArray.length. In Eiffel this is my_array.count. In Eiffel, arrays can also be resized.

Both Eiffel and Java provide bounds checking on arrays. Java’s checking is built-in. Eiffel’s checking is integrated with the assertion mechanism.

Eiffel goes a step further in array element access. You access an element with the item function as follows:

v := my_array.item (i)

This can also be accessed by an infix operator, @:

v := my_array @ i

The item function is defined as:

item (i: INTEGER): G
    require
        lower <= i;
        i <= upper

This shows how Eiffel’s assertion mechanism is used to document semantics in the interface, as well as for a checking mechanism.
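
A rough C++ approximation of this precondition, offered as a sketch and not as a rendering of the Eiffel library, can be written with the standard assert macro. The class name CheckedArray and its member names are assumptions that mirror the Eiffel feature names item, put, lower, upper and count. Unlike Eiffel's require clause, the check here sits in the routine body and does not appear in the interface, and it is compiled out when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array with arbitrary integer bounds [lower, upper] and checked access.
template <typename G>
class CheckedArray {
public:
    // Assumes lower <= upper; elements start as default-constructed G.
    CheckedArray(int lower, int upper)
        : lower_(lower), upper_(upper),
          items_(static_cast<std::size_t>(upper - lower + 1)) {}

    int lower() const { return lower_; }
    int upper() const { return upper_; }
    int count() const { return upper_ - lower_ + 1; }

    // Precondition, as in the Eiffel text: lower <= i and i <= upper.
    const G& item(int i) const {
        assert(lower_ <= i && i <= upper_);
        return items_[static_cast<std::size_t>(i - lower_)];
    }

    // Same precondition for storing a value at index i.
    void put(const G& value, int i) {
        assert(lower_ <= i && i <= upper_);
        items_[static_cast<std::size_t>(i - lower_)] = value;
    }

private:
    int lower_;
    int upper_;
    std::vector<G> items_;
};
```

With CheckedArray<int> a(1, 3), the call a.item(2) is valid, while a.item(0) violates the precondition and aborts in a build where assertions are enabled.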

4.3 Function Arguments

Arguments are a fundamental mechanism for reuse in software construction. Without arguments you would be forced to write a different routine for every possible input parameter. Arguments allow one algorithm to be reused on sets of input values.

Arguments pass routines simple values (by-value arguments), or references to entities (by-reference arguments). (Actually, there are more possibilities than this. [Hext 90] is an excellent text on the possibilities.) Arguments are inputs to routines, and should not be changed. When memory was expensive, reusing parameter space could conserve space. Changing arguments, however, is semantic nonsense, and most languages get this wrong.

By-reference arguments enable a routine to change the value of an entity external to the routine. Such updates beyond the environment of a routine are side-effects. This introduces a mechanism of updating the state space, other than straight assignment (although the routine can use assignment to achieve the ‘dirty deed’). The problem is that the state of an object can be changed without using the well defined interface of the object, so encapsulation is compromised. By-reference arguments should not be used to change external entities. Values should only be passed to external entities by the return value of a function. Semantically, this is different to assignment to a reference parameter; data flows through the program in one direction, in via arguments, and out via return values. Mathematically this maps a value of an input type to a value of an output type. Both input and output types can be compositions of other types, i.e., f: I1 x I2 x ... x Im -> O1 x O2 x ... x On. Abstract data types can be used to design such systems. This will also help target environments to increase parallelism and concurrency in a way transparent to programmers.
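
The difference between the two styles can be sketched in C++. Both functions below compute a quotient and a remainder; the names div_mod_by_reference and div_mod are assumptions made for this example, and both assume a non-zero divisor. The first writes its results through reference arguments, the second is a mapping from I1 x I2 to O1 x O2 that returns a composite value.

```cpp
#include <utility>

// Side-effect style: results are written into entities owned by the caller.
// Assumes b != 0.
void div_mod_by_reference(int a, int b, int& quotient, int& remainder) {
    quotient = a / b;
    remainder = a % b;
}

// Value style: inputs flow in as arguments, outputs flow out as a
// single composite return value. Assumes b != 0.
std::pair<int, int> div_mod(int a, int b) {
    return std::make_pair(a / b, a % b);
}
```

In the second form a call such as div_mod(17, 5) cannot disturb any variable of the caller; the caller decides what to do with the returned pair.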

In object-oriented programming, by-reference arguments are used to pass the original object, not a copy. The called routine, however, should not change the state of the referenced object. Only calling a routine in the passed object’s interface can change the state, although introducing side effects into arguments like this is dubious and should be avoided. Passing objects by-reference has the desired effect of the object being given to you, without being yours to change, although you can effect change in the object. C++ does have a nice concept called const correctness, which provides a modifier on arguments, const, which disallows any changes to that argument.
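
A minimal C++ sketch of const correctness follows. The class Counter and the functions read_only and bump are assumptions made for this example. Through a const reference only member functions declared const may be called, so the compiler rejects any attempt to change the object.

```cpp
// A small class with one query (const) and one command (non-const).
class Counter {
public:
    Counter() : value_(0) {}
    int value() const { return value_; }   // does not modify the object
    void increment() { ++value_; }         // modifies the object
private:
    int value_;
};

// The object is passed by reference, not copied, but cannot be changed here.
int read_only(const Counter& counter) {
    // counter.increment();   // would not compile: counter is const
    return counter.value();
}

// A non-const reference allows change, but only via the class interface.
void bump(Counter& counter) {
    counter.increment();
}
```

The signature alone tells the caller which routine may alter the object: read_only cannot, bump can.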

C shares faulty arguments with many other languages. The interaction of C’s pointer mechanism with a faulty parameter mechanism, however, makes C considerably worse than most other languages. In C, pointers are used to simulate by-reference arguments with by-value arguments. The programmer must perform tedious bookkeeping by specifying *s and &s for referencing and dereferencing. Distinguishing between by-value and by-reference arguments is not just a syntactic nicety, included in most high level languages, but a valuable compiler technique, as the compiler can automatically generate the referencing and dereferencing, without burdening the programmer. Again C adopts operators to provide the functionality, rather than a declarative approach, which would centralise decisions and let the compiler do the rest.
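
The bookkeeping can be seen by writing the same routine twice. In this C++ sketch, swap_with_pointers follows the C idiom, while swap_with_references uses a C++ reference declaration so that the compiler generates the referencing and dereferencing; both function names are assumptions made for this example.

```cpp
// C idiom: by-reference simulated with by-value pointer arguments.
// The routine must dereference with *, and callers must pass &x, &y.
void swap_with_pointers(int* a, int* b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

// Declarative form: the parameter list states that a and b are references,
// and the compiler supplies the indirection.
void swap_with_references(int& a, int& b) {
    int temp = a;
    a = b;
    b = temp;
}
```

The calls differ accordingly: swap_with_pointers(&x, &y) against swap_with_references(x, y).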

In Java arguments can only be passed by-value (as in C). However, there are no pointers, so passing by-reference cannot be simulated.

Eiffel routine arguments are read-only. This means that they are pass-by-constant, which is stronger than pass-by-value: pass-by-value treats the arguments as local variables which may be updated, whereas pass-by-constant disallows this.

4.4 void and void *“Passing paths that climb half way into the void” -Close to the Edge, Yes.

Is void* the C equivalent of an oxymoron? A pointer to void suggests some sort of semantic nonsense, a dangling pointer perhaps? Maybe we should tell the astronomers we have found a black hole! While we can have some fun conjecturing what some of the obscure syntax of C suggests, a serious problem is that void* declarations are used to compromise the purpose of the type system. A consistent strongly-typed system does not require such facilities. In object-oriented type systems, the root class of the inheritance hierarchy provides the equivalent of void.

When an entity is assigned to a reference of void*, it loses its type information. When it is assigned back to a typed reference the programmer must explicitly specify the type information with a type cast. This is error prone and should at least result in a run-time check. Without a run-time type check, the routines of one class can be mistakenly applied to objects of another class, which results in undetected inconsistencies leading to obscure failures.
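The difference between an unchecked cast from void* and a run-time checked cast can be sketched as follows. Shape, Circle and Square are hypothetical classes invented for this illustration, and dynamic_cast stands in for the kind of run-time check the text asks for.

```cpp
#include <cassert>

// Hypothetical classes, for illustration only.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// Unchecked: the type information was lost on conversion to void*,
// so the cast simply trusts the programmer. Nothing verifies it.
Circle* as_circle_unchecked(void* p) {
    assert(p != nullptr);
    return static_cast<Circle*>(p);
}

// Checked: with a typed root class, dynamic_cast verifies the type at
// run time and returns nullptr when the object is not a Circle.
Circle* as_circle_checked(Shape* p) {
    return dynamic_cast<Circle*>(p);
}
```

Handing a Square to as_circle_unchecked would compile and silently yield a wrong pointer, whereas as_circle_checked reports the mismatch by returning nullptr.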

As [Stroustrup 94] points out: “having void* unsafe can be considered acceptable because everybody knows - or at least ought to know - that casts from void* are inherently tricky.”

Interestingly, void* is the exact opposite of void, so yes this is a programming oxymoron. Void means no object of any type; that is the empty set. Void* on the other hand means any object of any type; that is all objects of the all encompassing set, or the universal set of all objects that can exist in a system. So void and void* represent complementary sets.

Eiffel and Java both provide a class that is at the root of the inheritance tree. In Java it is Object, and in Eiffel it is ANY. Any object can be assigned to a reference of these types. In C++ this is provided by void*, but void* is not at the root of the inheritance tree, hence its type unsafeness.

Eiffel also defines the type NONE at the bottom of the inheritance tree, which is a class to which no objects belong. NONE is the complement of ANY and vice versa. Type NONE has the single value Void, which signifies no object. Void is the equivalent of 0 (meaning NULL) in C++. This means that Eiffel’s type system is more consistent, as ANY and NONE reside within the type hierarchy at the top and bottom respectively. However, void and void* do not fit into the type hierarchy in C++.

4.5 void fn ()

The default return type of a function is int. A typeless routine returning nothing should be the default, but this must be specified by void. Syntactically no <type> suggests nothing to return. This is an example of where C’s syntax is not well matched to the concepts and semantics. Also a typed function can be invoked independently of an expression, which is a shorthand way of discarding the returned value, but compromises type safety. Using a typed function as a void should result in a type error.

In fact there should be no such thing as a void function. A void function is a procedure. Procedures and functions should be distinguished. This distinction belongs to the problem ‘what’ domain. A procedure is a routine that changes the state of its object, but returns no value. A function should, in general, not cause any change to the state of an object, but just return some result dependent upon the object’s state. Mathematically, a function is an entity that returns a value of a given type. Procedures are untyped, and do not return a value, so it is incorrect to regard procedures as functions. Functions have more in common with variables than procedures. Procedures may have side effects; functions should not cause side effects. These distinctions are useful when considering concurrency.
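C++ has no separate keyword for procedures, but the distinction can be approximated by convention. The sketch below assumes one such convention (void for procedures, const for functions); IntStack is a hypothetical class written only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stack, for illustration only.
class IntStack {
public:
    // Functions (queries): const, return a value, leave the state untouched.
    bool empty() const { return items_.empty(); }
    int top() const {
        assert(!items_.empty());
        return items_.back();
    }

    // Procedures (commands): change the state, return nothing.
    void push(int value) { items_.push_back(value); }
    void pop() {
        assert(!items_.empty());
        items_.pop_back();
    }

private:
    std::vector<int> items_;
};
```

Because top() does not change the stack, calling it twice in a row gives the same answer; removal is a separate procedure, pop().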

[Stroustrup 94] also voices the opinion that default int is bad. He had tried to make the type specifier explicit, but was forced to withdraw by users: “I backed out the change. I don’t think I had a choice. Allowing that implicit int is the source of many of the annoying problems with C++ grammar today. Note the pressure came from users, not management or arm-chair language experts. Finally, ten years later, the C++ ANSI/ISO standard committee has decided to deprecate implicit int.”

One improvement in Java is that the result type of the method is not optional. That is, you don’t get int by default. Otherwise, Java does not clean up most of the deficiencies of C. In order to specify a procedure rather than function, Java still requires the void specifier. Java does discard the C term function (which was wrongly used anyway), but makes the situation no better by calling both procedures and functions methods. Thus there is no clear distinction between procedure and function. Java also allows you to ignore returned values.

Eiffel uses the term routine for called units of code and distinguishes two kinds of routine, procedures and functions. It is recommended practice that only procedures change object state, and functions do not. Functions always return a value. That is, they follow the mathematical definition of a function that takes a value of one type (the type may be compound, hence multiple arguments), and maps it to a value of another type.

4.6 fn ()

We have already seen that C functions are a poor cousin of mathematical functions in the section on inlines. C functions expose implementation detail; that is, whether an entity is implemented as a constant, variable or value returning routine. C functions are different to the mathematical concept of a function. C functions are really parameterised invokable code, which other languages call procedures, subroutines, etc. Java calls them methods. Data can be accessed functionally in the mathematical sense, but this is different to insisting that all data is accessed through a C function. Functional access to data really means that data can only be retrieved, not assigned to.

Empty parentheses represent the function call operator in C. Even though ‘()’ is mathematical looking, it is semantically equivalent to FORTRAN’s CALL, COBOL’s PERFORM, and JSR in assembler. The design of these operators was influenced by the underlying machine architectures. The function call operator is low level, machine and execution oriented, and in the ‘how’ domain. True high level languages require no such operator; the compiler realises from the declaration that the entity referenced is a function and automatically generates the machine call operator.

This is opposite to most Unix shells, where invocation operators such as ‘run’ and ‘exec’ are not needed. One of the nice things about Unix shells is that the set of in-built commands is extensible. The ability to execute file names as commands extends the command repertoire. The shell runs executables and interprets shell scripts. Unix shells do not distinguish between in-built commands, shell scripts and executable programs. This is widely accepted as an elegant and effective convenience. C’s () operator introduces the equivalent of a run command into the language.

No invocation operator exists in the problem oriented domain of high level languages. This is because the semantics of a function is to return a value of a given type. How this value is computed is unimportant: it could be computed by a routine invocation; by sending a message across a network; by forking an asynchronous process; or by retrieving a precomputed result from a memory location, i.e., a variable.

Languages that have an invocation command or operator have an unnecessary distinction between value returning routines and constants and variables.

It is trivial for a compiler to provide transparency of view for constant and variable access and function invocation. In ALGOL style languages, the compiler automatically deduces invocation when it sees a name that was declared as a routine, rather than a variable. The compiler knows that the identifier refers to a routine because the compiler stores much information about an entity. A compiler can check that the programmer uses the entity consistently with the declaration. A compiler can generate correct code, without burdening the programmer with having to use an explicit invocation operator. This enhances flexibility and implementation independence.

Variables and functions should be interchangeable for optimisation. ‘()’ is a good example of where the operator approach of low level languages adversely affects flexibility as opposed to the declarative approach of high level languages. In C, it is not possible to change a function to a variable without removing all the ‘()’, or a variable to a function without adding ‘()’ to all the invocations. This might be spread over many files, and the programmer might not bother with optimisation to avoid the tedium of the task. So the () operator reduces flexibility. The () operator is another bookkeeping task imposed on the C programmer. The C++ recommended style is to code superfluous accessor functions to blur the distinction. Pure functional languages such as SML remove the variable/function distinction altogether, by not having variables at all.

Java has made no improvement here. The visible implementation difference between variables and functions remains. Eiffel removes this distinction as constants and variables are accessed functionally. A programmer can flexibly change a variable to a function in a class interface and vice versa for optimisation or extension, without the need for all clients to change their code. Thus even though changes have been made, the class interface remains unchanged.

C also has pointers to functions. Function pointers are analogous to the call by name facility in ALGOL, and this was recognised as having pitfalls. Consistent application of the object-oriented paradigm avoids the need for function pointers. A common use of function pointers is to explicitly set up jump tables. Jump tables are the mechanism behind virtual functions. The design of a program can take advantage of this fact, without resorting to explicit jump tables. Another use is to jump to a function in a table that is indexed by an input character. A switch statement can cater for this mechanism in a way that makes what is meant explicit, while keeping underlying mechanisms (and possibly optimisations) transparent. C++ allows function pointers to member functions to be stored in tables (via the .* and ->* operators).
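The contrast between an explicit jump table and virtual functions can be sketched as below. All names (add, sub, jump_table, Operation and so on) are invented for this illustration.

```cpp
#include <cassert>

// Explicit jump table: the programmer builds and indexes the table.
int add(int a, int b) { return a + b; }
int sub(int a, int b) { return a - b; }

using BinaryOp = int (*)(int, int);
const BinaryOp jump_table[] = { add, sub };

int apply_by_table(int code, int a, int b) {
    assert(code >= 0 && code < 2);   // the index must be checked by hand
    return jump_table[code](a, b);
}

// Virtual functions: the compiler builds the table behind the scenes.
struct Operation {
    virtual ~Operation() = default;
    virtual int apply(int a, int b) const = 0;
};
struct Add : Operation {
    int apply(int a, int b) const override { return a + b; }
};
struct Sub : Operation {
    int apply(int a, int b) const override { return a - b; }
};

int apply_by_dispatch(const Operation& op, int a, int b) {
    return op.apply(a, b);
}
```

In the second form there is no index to get wrong: the object itself selects the routine, and the table remains an implementation detail of the compiler.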



4.7 fn (void)

In C f() means the function f can take any number of arguments of any type without type check. ANSI C has adopted f(void) to mean a function that really has no arguments. C++ sensibly differs from this in that f() now means a function that has no arguments [Stroustrup 94].

4.8 Metadata in Strings

The implementation of strings in C mixes metadata with data. Metadata is information about an object, but is not part of the data itself. Examples of metadata are addresses, size and type information. Such metadata is often referred to as data descriptors, and can be kept independently of the data, with the advantage that the programmer cannot mistakenly corrupt the metadata.

In C strings, metadata about where strings terminate is stored in the string data as a terminating null byte. This means that the distinction between data and metadata is lost. The value chosen as the terminator cannot occur in the data itself. Since inserting a null is often the responsibility of the programmer, not the run-time environment, there is the potential for more undetected inconsistencies resulting in obscure failures.

A common alternative is to store a length byte in a fixed location preceding the string as Pascal does. The advantage is that the length of a string is easily obtained, without having to count the number of elements up to the terminating null. Another advantage is that 0 is a valid value in a string. This implementation is hidden from the programmer and other methods could be used without the programmer having to change the program. C’s null terminator makes the implementation visible to the programmer.
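The two points above (length obtained without scanning, and 0 as a valid value) can be illustrated with a string type that keeps its length separately from the data. The sketch uses the standard C++ std::string as a stand-in for such a representation; the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Builds a five character string that contains a zero byte in the middle.
// The length is supplied explicitly, not found by scanning for a null.
std::string make_with_embedded_null() {
    std::string s("ab\0cd", 5);
    assert(s.size() == 5);
    return s;
}

// Length taken from the metadata kept alongside the data.
std::size_t length_from_metadata(const std::string& s) {
    return s.size();
}

// Length found the C way: count up to the first null byte.
std::size_t length_by_scanning(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string built here the two routines disagree: the metadata reports five characters, while scanning stops at the embedded zero and reports two.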

Java’s strings are first class objects. You can’t determine the length of a string by scanning for a null. You use the string.length method (function). Eiffel’s strings are also first class objects.

4.9 ++, --

The increment and decrement operators are often used as an example that C was designed as a high level assembler for DEC PDP machines. These operators provide a shorthand convenience, but are unnecessary. There are no less than four ways to perform the same thing -

a = a + 1
a += 1
a++
++a

For full generality, only the first form is required; the last two forms a++ and ++a are the postfix and prefix forms, which can be used in the context of another expression. Thus several updates can be performed in one expression. This is a very powerful and convenient feature, but introduces side effects into an expression that sometimes have surprising effects, and can lead to program errors. The following example is given on p.46 of the C++ ARM -

i = v[i++]; // the value of ‘i’ is undefined

The ARM points out that compilers should detect such cases, but the exact interpretation appears to be left to the implementation, which contributes to non-portability. If this can’t be defined for a sequential processor, then it is even worse for a concurrent environment.

The shorthand += and -= are more powerful as values other than 1 can increment the variable. It has been suggested that there should also be &&= and ||= operators.

If it is believed that a multiplicity of operators is required to produce more optimal code, then it should be pointed out that code generators, especially for expressions, can produce the best code for a target architecture. A plethora of operators complicates the task of an optimiser. A compiler can optimise well beyond what a programmer can do. An optimising compiler will analyse the surrounding code, and if an entity is used several times in a local scope, it will keep the value of that entity handy locally at the top of a stack, or in a register, rather than retrieve it from slow main memory several times. The nature of such optimisations depends on the machine’s architecture, which a programmer should not need to be aware of. Open systems demand that programs can be ported amongst diverse architectures and environments, very different to the original machine, and not only run, but run efficiently. Optimisers work best with simple, well defined languages.

In fact constructs such as:

while (*s1++ = *s2++);

might look optimal to C programmers, but are the antithesis of efficiency. Such constructs preclude compiler optimisation for processors with specific string handling instructions. A simple assignment is better for strings, as it will allow the compiler to generate optimal code for different target platforms. If the target processor does not have string instructions, then the compiler should be responsible for generating the above loop code, rather than requiring the programmer to write such low level constructs. The above loop construct for string copying is contrary to safety, as there is no check that the destination does not overflow, again an undetected inconsistency which could lead to obscure failures. The above code also makes explicit the underlying C implementation of strings, which are null terminated. Such examples show why C cannot be regarded as a high level language, but rather as a high level assembler.
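To make the missing overflow check concrete, here is one way the copy loop could be written with a bound on the destination. It is only a sketch under the assumption that the caller knows the destination capacity; bounded_copy is a hypothetical helper, not a standard library routine.

```cpp
#include <cassert>
#include <cstddef>

// Copies the null terminated string src into dest, which has room for
// 'capacity' characters including the terminator. The result is always
// null terminated. Returns true if the whole of src fitted, false if it
// had to be truncated (or if capacity is zero).
bool bounded_copy(char* dest, std::size_t capacity, const char* src) {
    assert(dest != nullptr && src != nullptr);
    if (capacity == 0) {
        return false;
    }
    std::size_t i = 0;
    while (i + 1 < capacity && src[i] != '\0') {
        dest[i] = src[i];
        ++i;
    }
    dest[i] = '\0';
    return src[i] == '\0';
}
```

With a first class string type none of this is written by hand: a plain assignment such as destination = source leaves both the bound and the copying strategy to the implementation.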

Memory update is a problematic, but necessary part of programming. A language should provide it in a consistent and expected way. Many languages recognise that memory update is problematic, and typically only provide the assignment operator as a sufficient update mechanism. (Many languages have block memory copies as well, but assignment can provide block copy.) Furthermore, many languages avoid side-effects by limiting updates to only one per statement. C provides too many ways to update memory. These add nothing to the generality of the language, increase the opportunity for error, and complicate automatic optimisation. Restrictive practices are justifiable in order to accomplish correctly functioning and efficient software.

Java retains the ++ and -- operators, although with the removal of pointers and the addition of a decent string class, they are less necessary for idioms such as string and array manipulation. It is not clear whether they could cause side effects and subsequent problems as in C.

Eiffel has no such operators. They would merely be an unnecessary shorthand in Eiffel.

4.10 Defines

The define declaration -

#define d(<parameters>)

has a different effect to -

#define d (<parameters>)

The second form defines d as ‘(<parameters>)’. Extra white space between tokens should not affect semantics of constructs.

#defines are poorly integrated with the language. The ‘#define’ must be in column 1, and is not subject to scope rules. Defines can lead to obscure errors, as the preprocessor does not detect them, but leaves them for the compiler. Programmers must be familiar with the particular preprocessor implementation on their system, as preprocessor implementations are different, particularly between Classic C and ANSI C.

#define also exhibits a multiple update problem:

#include <stdio.h>
#include <string.h>

#define dfn(x,y) ((x)<(y)?(x):(y))

int main ()
{
    int i = 0, j = 0, k;   /* initialised so that the first call is well defined */

    k = dfn (i++, j);
    printf ("i = %d j = %d k = %d\n", i, j, k);

    i = 0; j = -1;
    k = dfn (i++, j);
    printf ("i = %d j = %d k = %d\n", i, j, k);

    i = 0; j = 5;
    k = dfn (i++, j);
    printf ("i = %d j = %d k = %d\n", i, j, k);

    return 0;
}

The results are as follows:

i = 1 j = 0 k = 0
i = 1 j = -1 k = -1
i = 2 j = 5 k = 1

This is even worse if the actual parameter you pass is a function that updates other variables. All the variables will be updated the number of times the formal argument appears in the body of the define.
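The multiple update can be isolated by comparing the define with an inline function that takes its arguments by value. This is a sketch only; MIN_MACRO, min_inline and the two counting routines are names invented for the comparison.

```cpp
#include <cassert>

// Same shape as the dfn define: an argument may be expanded twice.
#define MIN_MACRO(x, y) ((x) < (y) ? (x) : (y))

// Inline function: each argument is evaluated exactly once at the call.
template <typename T>
inline T min_inline(T x, T y) {
    return x < y ? x : y;
}

// Returns the final value of i after one "minimum" through the define.
int increments_with_macro() {
    int i = 0;
    int j = 5;
    int k = MIN_MACRO(i++, j);   // expands to ((i++) < (j) ? (i++) : (j))
    assert(k == 1);              // the second i++ supplies the result
    return i;
}

// Returns the final value of i after one "minimum" through the function.
int increments_with_inline() {
    int i = 0;
    int j = 5;
    int k = min_inline(i++, j);  // i++ is evaluated once, before the call
    assert(k == 0);
    return i;
}
```

With the define, i ends up as 2 and the result is 1; with the inline function, i ends up as 1 and the result is 0, which is what a reader of the call would expect.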

C++ at least reduces the need for defines by having inline functions. The problems with inlines have been discussed in their own section.

Java and Eiffel have no such preprocessing facilities. Where #defines are used as ‘cheap’ functions, i.e., the code of the define is expanded inline in the invoking code, Eiffel and Java inline routines that meet certain criteria, without the side effects of #define.

#defines have often been used to provide a form of unrestricted genericity. In languages where genericity and templates are provided, this use for #defines disappears.

[Stroustrup 94] says he would like to see the preprocessor abolished: “The character and file orientation of the preprocessor is fundamentally at odds with a programming language designed around the notions of scopes, types, and interfaces.”

4.11 NULL vs 0

[Ellemtel 92] recommends that pointers should not be compared to, or assigned to NULL, but to 0. Stylistically, NULL would be preferable. It would also allow for environments where null pointers have a value other than 0. ANSI-C, however, has subtle problems with the definition of NULL.

[Stroustrup 94] adds that “nothing seems to create more heat than a discussion of the proper way to express a pointer that doesn’t point to an object, the null pointer.” And, “The ARM further warns “Note that the null pointer need not be represented by the same bit pattern as the integer 0.”” Continuing on: “The warning reflects the common misapprehension that if p=0 assigns the null pointer to the pointer p, then the representation of the null pointer must be the same as the integer zero, that is, a bit pattern of all zeros. This is not so. C++ is sufficiently strongly typed that concept such as the null pointer can be represented in whichever way the implementation chooses, independently of how that concept is represented in the source text.” No wonder people are confused, and there is much heated debate.

In Java null is a reserved word. Eiffel uses Void, the single value of type NONE, to indicate no object is referenced.

4.12 Case Sensitivity

It is good to adopt typographic conventions for names, which make a program more readable, but these should not affect semantics. Distinguishing between upper and lower case in names can cause confusion, which leads to errors and systems that are difficult to maintain and modify. Case distinction is based on the implementation paradigm of how character codes work. Why do we have names? To give entities identity, and aid our memory of that identity. Philosophically, case distinction is contrary to the fundamental purpose of names: it introduces another form of overloading, the disambiguating mechanism being the underlying character codes.

Case distinction makes names harder to remember so is contrary to the purpose of a memory aid. Remembering command mnemonics or file names is difficult enough, let alone exactly the letter case. Your brain remembers the sound fred, not the characters used in spelling. In a case sensitive system, you must remember the letter case, whether it was fred, Fred or fREd, etc., greatly complicating the memory process.

Names are easier to remember than addresses. If we did not have names, we would have to retrieve files by addresses, access all machines on the Internet by their TCP address instead of host name, or call people by their social security number.

Case distinction in interactive systems is a poor user interface, as it is clumsy to continually use the shift key, which slows typing. Case sensitivity is one of the worst features of the Unix interface.

Consider the paradigm of letters and words. Words are spelt by assembling letters in order. There are 26 distinct letters. With the addition of digits 0 to 9, and the underscore character, we have a complete lexical definition for identifiers. Letters can be written in a number of styles. They can be bold, italic, upper or lower case. Such typographic representations, however, do not change the meaning of a word. Thus if we write ALGOL, Algol, algol, Algol or Algol (or maybe a star), we recognise the word to represent a computer language. The case of the letters or type style does not change the semantics.

Case distinction is based on the low level paradigm of character codes such as ASCII used internally in the computer. This weakens the purpose of using names to replace addresses, as names are reduced to a string of character codes.

Case distinction also contributes to errors, introducing ambiguity, which as has already been mentioned, weakens the purpose of names, as identity is lost. As every programmer will have experienced, one character errors are more difficult to find than one would think. For example, if an identifier is declared Fred, another one can be declared fred, and the two are easily mistyped and confused.

We are generally poor proof-readers. The psychological reason for this is that the brain tends to straighten out errors for our perception automatically. The human brain is an excellent instrument for working out what was intended, even in the presence of radical error. (This makes us good at difficult tasks like speech recognition.) Programmers must use their powers of concentration to override the natural tendency of the brain.

Case distinction adds cognitive difficulty. Good language design takes into account such psychological considerations in these small but important details, being designed towards the way humans work, not computers. Such considerations of cognitive science make a big difference to the effectiveness of people, but do not have any impact at all on the efficiency of code generated for the computer. What is more important, people or computers? With C the answer is often computers, as case distinction saves compiler processor cycles.

Case distinction provides a form of name overloading which is a double-edged sword as it leads to ambiguity, confusion and error. Name overloading, as has been suggested in the section on name overloading, should only be provided in controlled and expected ways, where overloading provides a useful function such as module independence or polymorphism. Where a name is overloaded in the same scope the compiler should report an error.

Another example of name overloading error is:

class obj
{
    int Entry;

    void set_entry (int entry)
    {
        entry = Entry;
    }
}

If you have not spotted the error in the above example, what was it supposed to mean?

A common practice in C is to represent constants in upper case. This is actually bad practice, as a calling programmer should invoke a constant as a function that returns a value. The calling programmer does not need to know whether a class has implemented a feature as a constant, variable or value returning routine. This means that the class is free to change the implementation of the feature later, without having to bother all programmers to change the case of all occurrences of the identifier in order to follow some style rule.



It is amazing the passion that comes from those who defend case sensitivity. In fact, since I have argued for case insensitivity, some have said that this invalidates the whole of my critique of C++ because I don’t agree with them on this point. The only point that is close to being valid for case sensitivity is that it forces all programmers to follow the same typographic convention for identifiers. This assumes that the burden of typographic considerations must be on programmers. I don’t think it should be. This burden should be on the presentation medium, that is the editor or print formatter. For example, a program editor will know what an identifier is, and present it in lower case. Or it could even do this optionally, as some programmers might like to see identifiers in upper case, while others in lower case. This gives the best environment, where each programmer can tailor to their individual taste, and silly fights over style rules are forgotten.

Java has not improved this situation. In fact it is even worse, as Java uses Unicode instead of ASCII. The typographic form ‘a’ and ‘a’ could be different identifiers if one represents LATIN small letter a, and the other CYRILLIC small letter a. In Eiffel all words are case insensitive.

4.13 Assignment Operator

Using the mathematical equality symbol for the assignment operator is a poor choice of symbols; assignment is not equality (:= != =). Designers of ALGOL style languages realised they were semantically different, so took the care to distinguish, only using ‘=’ in the sense of mathematical equality assertion. In C the confusion of notation leads to error, it being easy to use = (assignment) where == (equality) is intended.

This leads to a more general criticism of C, in that it has a pseudo mathematical appearance. But then C is not very mathematical at all, as ‘=’ does not represent equality, and C functions are not really functions. Few people are proficient at interpreting mathematical theorems, most passing over such sections in text, making the assumption that the mathematics proves the surrounding text. The pseudo-mathematical appearance of C is difficult to read, while lacking the semantic consistency and precision of mathematical notation. One of the keys of reusability is readability.

Java also uses the = symbol to mean assignment, so this has not improved. However, the = vs == confusion has been improved as in the syntax:

if ( Expression ) Statement

the Expression must have type boolean, or a compile-time error occurs.

Eiffel makes the clear distinction between the assignment operator, choosing the ‘:=’ symbol, and mathematical equality ‘=’.

4.14 char; signed and unsigned

What is the meaning of +'a', -'b', etc.? There is simply no real world equivalent. In C char, unsigned char, and signed char yield three distinct types all occupying 8 bits. These types are integers rather than characters. The definition is highly platform dependent, and the semantics is nonsense. Pascal’s technique of specifying integer subranges: 0..255, -127..+127, -63..+154, and so forth is far superior.

4.15 Semicolons

As with case sensitivity any discussion of this topic arouses passions that you wouldn’t believe. Bjarne Stroustrup makes a very good observation on such debates: “Curiously enough, the volume of interest and public debate is often inversely proportional to the importance of a feature. The reason is that it is much easier to have a firm opinion on a minor feature than on a major one; minor features fit directly into the current state of affairs, whereas major ones - by definition - do not.”

I am not overly concerned whether the semicolon is defined as a terminator or separator. Arguments that languages which define the semicolon as terminator are superior to those that define it as separator are, however, baseless. The semicolon as separator is really quite logical, viewing the semicolon as a statement sequencing or concatenation operator. It is therefore a binary operator, requiring both a left and a right hand side. Some people claim to find this concept difficult to understand, but if we consider it in the context of a mathematical expression, it would be silly to expect that an addition be written as:

a + b +

Another way to look at a separator is to consider the structure of a program. A program is a list of elements. The executable part of a program is a list of sequentially executed instructions. Elements in a list must be separated, and the semicolon is syntax to separate elements in a list. The semicolon is therefore part of the syntax of the list, not part of the syntax of the individual instructions. Languages such as FORTRAN separated instructions by requiring that they be placed on different lines or cards. If an instruction overflowed a line, a continuation character was required, like the backslash in C. Well defined languages do not require continuation characters, as line breaks are unimportant, and have no effect on semantics. Languages should have very regular grammars, so that the semicolon could be an entirely optional typographic separator.

In natural language both the comma and semicolon are separators, only the full stop is a terminator. If the comma were an expression terminator rather than separator, function invocations would look like:



fn (a, b+c, d, e,);

It is often argued that the semicolon as separator leads to irregularities. C’s handling of the grammar of semicolons, however, leads to an irregularity in if/else’s:

if (condition)
    statement1;    /* Semicolon required */
else
    statement2;

if (condition)
{
    statement1;
}                  /* Semicolon must be omitted */
else
    statement2;

This is an irregularity, as a parser will reduce both of the above to the grammatical form:

“if” <condition> <statement> “else” <statement>

In fact why do conditions in C if and while statements have to have parentheses around them? Why also must a semicolon follow the closing brace of a class, but must not follow the closing brace of a function?

Java being C based retains the semicolon as terminator. Eiffel views the semicolon as a separator, but has one advantage: semicolons are optional. The semicolon can be used to visually emphasise the separation between two commands, for example, where two commands are placed on one line.

4.16 Booleans

A serious omission from C was the boolean type. Booleans are fundamental to programming as conditions in if...then and loop constructs. C++ also has no built in boolean. It is interesting to see long Internet discussions on how booleans should be built, and how to represent the values, true and false. Using 0 to mean false, but any other value to mean true is unsatisfactory.

Java includes the basic type boolean, and so has rectified this situation. To accomplish C-style conversions you can use the expressions:

b = (i != 0);
i = (b)?1:0;

Eiffel takes a slightly different approach. As a language, Eiffel provides the mechanisms for building types. It has no assumptions about particular types built into the language. Types like BOOLEAN are defined as classes in the Eiffel Kernel Library, as are other basic types such as INTEGER, REAL, STRING, ARRAY, etc. This view is very similar to Smalltalk. These types are not built into the language, but they are usually built into an Eiffel compiler so that there is no run-time performance penalty. This illustrates Eiffel’s philosophy of keeping the language as small as possible, and as open as possible, so that programmers can build their own powerful types.

Recently the ANSI/ISO C++ committee has accepted bool as a distinct integral type. Before this, the definition of a boolean type in C/C++ could be any of a number of definitions which had slightly different semantics. If you were combining libraries that used these slightly different definitions, life could be difficult. This is probably a fundamental reason why libraries have not been as successful in C++ as they should be in an OO environment. Not all compiler implementations have implemented bool yet, so you can expect it to be years before this mess is cleaned up.

4.17 Comments

The following example comes from [GWS 94].

    main ()
    {
        int *i, *j;
        int k;

        k = *i/*j;
    }

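As they point out: what a good character combination ‘/*’ was for delimiting comments. The ‘/*’ that was meant as a division followed by a pointer dereference instead opens a comment, which swallows the rest of the statement.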

4.18 Cpaghe++i

There are three kinds of spaghetti that occur in programs: gotos, globals, and pointers.

4.18.1 Cpaghe++i Gotos

Most people know about spaghetti code that is present in programs which use gotos in an undisciplined fashion. As Donald Knuth has pointed out it is entirely possible to produce well structured programs with gotos. The well tempered goto emulates high level structured statements such as conditionals, loops, switch or case statements in higher level languages.

Where a language provides the correct control structures, and the programmer programs into that paradigm, gotos are not needed. The reverse argument could also be made: if gotos cover all uses of high level control structures and even more, why have the high level control structures at all; why not just use gotos? The problem with gotos is that they are too powerful. They are too powerful in the same way assembler language is too powerful.

You can do everything with assembler or gotos, but it takes more work, and the result is often less than structured, difficult to understand and unmaintainable. The more work you do, the less efficient you are. It is not working harder that makes you more efficient, it is working smarter. I’m a great fan of laziness!

Consider what you must do to construct a loop with gotos: you must declare a label, then place the label and the goto somewhere; you also have to think about identifiers for labels that are non-ambiguous. For label identifiers, some languages use names, others numbers. With a high level loop construct, labels are implicit, meaning the programmer does not have this extra bookkeeping overhead. With gotos, making changes also becomes a lot more difficult, as you must create new labels, move them around, and delete others.

One legitimate use for gotos is to avoid overly complex nesting. Complex nesting usually occurs where there are many checks that result in multiply nested if...thens, which often arise because of error checking. Proponents of gotos legitimately defend them for this situation. However, where the control structures are right, even this use of gotos is not needed.

Both Java and Eiffel abandon gotos. Java provides an extension to control structures which allows control structures to be named, and multi-level break and continue statements can be used to jump to an outer level conditional or loop.
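The named control structures described above might be sketched as follows. The class LabelledBreak, the method find and the label search are hypothetical names used only for this illustration.

```java
// Hypothetical example of a labelled (multi-level) break in Java.
final class LabelledBreak {
    // Returns {row, column} of the first occurrence of target, or null if it is absent.
    static int[] find(int[][] grid, int target) {
        int[] found = null;
        search: // the outer loop is named so that it can be left from the inner loop
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    found = new int[] {r, c};
                    break search; // leaves both loops at once, with no goto
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = { {1, 2, 3}, {4, 5, 6} };
        int[] pos = find(grid, 5);
        System.out.println(pos[0] + "," + pos[1]); // prints 1,1
    }
}
```

Unlike a goto, the labelled break can only leave an enclosing structure; it cannot jump to an arbitrary point in the routine.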

In Eiffel the philosophy is to program in sufficiently small atomic routines, so that multi-level control structures are avoided. Thus Eiffel’s solution to the nesting problem is integrated with its routine mechanism and the way programmers are expected to use routines. In object-oriented programming, it is good practice to keep routines small, with only one operation in a routine, as this enhances the possibility of reuse. Some programmers will object to small routines, as there is an overhead to routine calls, particularly in register based machines, where environments and registers must be saved. However, an Eiffel compiler will automatically inline small, non-polymorphic routines.

The high level language concept to remove the need for gotos altogether for error checking is exception handling. In this mechanism, the error condition triggers an exception. When an exception is raised, a search for its handler occurs. This search progresses down the run-time stack until an embedded exception handler is found. In Eiffel, exception handlers are specified in rescue clauses. Note that in an environment where exceptions can interrupt the flow of the code, garbage collection is even more important, as in a system with manual memory management, it is even more difficult to determine where to clean up, and which objects to dispose.

If exception raising and handling sounds expensive, then it should be realised that it often works out cheaper. Most of the time, the code runs normally; an exception being raised is the exception. Only then is the stack search for the handler performed. Consider divide by zero. In most systems, this exception is detected by the processor. If you don’t have exception handling, you must test that the divisor is not zero before a divide operation. With exception handling, you assume that the division will work in most cases, and so do not have to test. If the divisor is zero, you simply clean up in the exception handler. Only if there is no exception handler does the software fail.
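The divide by zero case can be sketched in Java, where integer division by zero raises ArithmeticException. The class DivideWithHandler and its methods are hypothetical names for this illustration, and how a given Java implementation detects the zero divisor internally is not assumed here.

```java
// Hypothetical example: the division is written without a test of the divisor,
// and the exceptional case is handled further up the call chain.
final class DivideWithHandler {
    // No explicit check that count is non-zero.
    static int ratio(int total, int count) {
        return total / count; // throws ArithmeticException when count is 0
    }

    // The handler is found by searching back through the callers of ratio.
    static int ratioOrDefault(int total, int count, int fallback) {
        try {
            return ratio(total, count);
        } catch (ArithmeticException e) {
            return fallback; // clean up and recover only in the exceptional case
        }
    }

    public static void main(String[] args) {
        System.out.println(ratioOrDefault(10, 2, -1)); // prints 5
        System.out.println(ratioOrDefault(10, 0, -1)); // prints -1
    }
}
```

If ratio were called with a zero divisor and no handler existed anywhere in the call chain, the program would terminate with the uncaught exception, which matches the last sentence of the paragraph above.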

The bottom line is that with the common high level language constructs of if ..then, loops, cases, you can avoid most uses of goto. Add a high level construct for exception handling, and you can avoid gotos altogether.

4.18.2 Cpaghe++i Globals

The second kind of spaghetti is globals. Where two or more objects access the same set of globals, interdependencies arise between those objects. This makes it far more difficult to determine the correctness of a program, even more so in concurrent environments. These interdependencies should be viewed as strands of spaghetti worming their way through a system, which are going to make maintenance, extension, and reuse difficult in the future.

Globals can be abandoned. Objects are to globals as control structures are to gotos.

Again Java and Eiffel abandon globals, and thus ease the problems of maintenance, extension and reuse. Note that I use the word ease, not solve. Even though Java and Eiffel make significant improvements, there are no silver bullets to solve the problems involved in programming.
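As a rough sketch of what it means for an object to take the place of a global, the state below can be reached only through the routines of the object that owns it. The class Account and its members are hypothetical names invented for this illustration.

```java
// Hypothetical example: state that might have been a global variable in C
// is owned by an object and changed only through its published interface.
final class Account {
    private long balanceInCents; // not visible to any other class

    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceInCents += cents;
    }

    long balance() {
        return balanceInCents;
    }

    public static void main(String[] args) {
        Account account = new Account();
        account.deposit(250);
        System.out.println(account.balance()); // prints 250
    }
}
```

Because every change goes through deposit, a reader checking the correctness of the balance has one routine to inspect rather than every place in the program that could touch a global.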

4.18.3 Cpaghe++i Pointers

The third kind of spaghetti is pointers. The problems with pointer based programming are well known. The kind of spaghetti you get worming through the system is undisciplined pointers pointing to other elements, by-passing the whole concept of interfaces and object-orientation. Pointers introduce dependencies that would not otherwise be there. Furthermore, this can of worms results in dangling references and memory leaks. In order to do away with the problems of pointers, garbage collection is necessary. In order to implement good garbage collection, pointers must be abandoned. C++ is caught in this Catch-22.

Neither Eiffel nor Java has pointers. Both have garbage collection built in from scratch.

While C++ overlays object-oriented concepts onto C, it is one of its greatest weaknesses that it overlays OO on top of the spaghetti of a now old, low-level and flawed language. C++ does not enforce the advantages of the OO approach to remove these problems by programming only using published interfaces. The advantages of the OO paradigm are so effectively undermined in C++ as to be worse than useless. Many C programmers have thus stuck to C, and people like P.J. Plauger have been motivated to write papers such as “Programming Language Guessing Games: If C++ is the answer, what’s the question?” [Plauger 93].

5. Conclusions

C++ is complex, including too many constructs to overcome problems with itself and C, while lacking sophisticated mechanisms such as garbage collection, global analysis and automatic optimisations. C is thought of as being a simple language; but this is doubtful, as it has many operators, and a difficult precedence system. C’s pointer style of programming is low level and difficult. Overall, C has many traps that lead to difficult to detect errors in software. Now C++ as a language is looking like the equivalent of computers of the 1950s, with large knobs, dials and patch panels; the C++ equivalents being pointers, structures, unions, #defines, etc., all of which have no place in a modern OO language, and are not in Java and Eiffel.

Compared to other OO languages, C++ looks more and more like an anachronism. C++ is now impeding the progress of programming technology.

Object-oriented languages should provide sophisticated concepts in the simplest possible framework. In C++ the framework is not simple and the concepts are obscured. OOP addresses many issues in order to facilitate the production of complex and sophisticated programs. Many of these issues are addressed in implicit and subtle ways, but are lost in C++. Subtle errors can be introduced into C++ software in many ways; the combination of these causes further problems. C++ has devices for petty convenience, even the ‘++’ itself, while sacrificing major conveniences, long-term correctness and safety, and the convenience of declarative programming, rather than operators. C++ forces the programmer to perform many administrative bookkeeping tasks that a compiler should automate.

One may ask: what application domain is C++ relevant for? The answer to this is that C++ might be used as a better C. But for what applications is C relevant? C is relevant for low level Unix style programming, and is not an ideal language in view of its low level nature, and flaws. C is not applicable for large project organisation: hence C++’s attempt to improve it. C++, however, has not solved C’s flaws, as I once hoped it would, but painfully magnified them.

Better languages exist for higher level functions such as communications and networks, scientific work, compilers, etc. I envisage that C has a place as a high level assembler that can be used to implement small pieces of code, where efficiency is of prime importance, on suitable platforms. Thus the use of C would be limited and well controlled, rather like small assembler routines are currently used in some systems. Indeed the move to C++ should only be considered in the case of upgrading a body of C programs for backwards compatibility. In the case of new projects alternatives to C and C++ should seriously be considered.

A programming language should embody the collective wisdom of common sense practices that have been learnt over many years, by common and painful experience. C++ does not implement much of this wisdom. [Sakkinen 92] observes that much of the C++ literature has few references to external work or research. It fails to draw on the insights and progress made by many researchers. This leads me to believe that C++ is parochial and removed from the many advances that will make production of systems easier and more cost effective.

C encourages gurus who spout false wisdom on obscure subjects. Writing programs in C is often called ‘coding’. Coding is writing obscure encryptions that will later have to be decoded, by none other than a guru! C also encourages programming by guesswork. C programmers often solve ‘bugs’ by adding extra ()s, *s and &s, without understanding the problem, but then ‘test’ the change to see if it miraculously ‘cures’ the problem. People who attain proficiency at this guesswork are known as, well you guessed it, gurus!!

The view that correctness checks are training wheels for students, which gurus don’t need, must be dispelled. Many disciplines have techniques to ensure correctness. For example, the metronome in music is not just for students, but will help an advanced musician ensure that the tempo of a piece is correct, and since playing with a metronome is more difficult it will help sharpen the musician’s performance of the piece. The musician does not just view the metronome as an aid for beginners, or as something that restricts him to a set beat, but as a tool that helps produce a polished and professional performance. C should not be seen as a language to which you graduate after you have learnt to program in languages with safety checks. In fact changing to C or C++ is a great step backwards. Languages with consistency and semantic checks are essential aids to the production of professional software.

A programming language cannot be seriously viewed as some authoritarian that stops us doing what we want or need to do. This view is still quite prevalent about languages with type safety and consistency checks.

This paper has shown many cases where C++ uses old C mechanisms to provide things that can and should be expressed consistently within the object-oriented paradigm, for example type casting. The move to pure object-oriented languages will facilitate more consistent programming and avoid many typical errors that occur in software production. C++ also makes distinctions that belong in the ‘how’ implementation domain, for example ‘.’ vs ‘->’, and variables vs functions. These distinctions make bookkeeping work for programmers, which a compiler should handle. But then C++ fails to make distinctions that belong in the ‘what’ problem domain, for example procedures vs functions. Making distinctions in the ‘how’ domain adds inconvenience to the language. Failing to make distinctions in the ‘what’ domain limits the expressiveness of the language. The amount of change required in C++ to address the issues raised in this paper is seen as largely insurmountable, and Sun agrees with this.

A programming language is just a tool, in the same way that an axe is a tool. If the axe is blunt when chopping down a tree, then procedures, processes and methodologies could be invented to make it as effective as possible; but that leaves the real problem unsolved: that the axe that does the real work is blunt. So it is with programming languages. To develop a system, it must be implemented, and a programming language is the tool to do the real work. If the language is blunt, then procedures, processes and methodologies might alleviate the situation, but they do not solve the problem. Once the axe is sharpened, then real progress is made, and the procedures, processes and methodologies might become more effective, although the need for many of them will disappear. A good axeman will have good axe wielding technique, but given a choice of axes will choose the sharpest implement. A poor axeman could be ineffective with even a sharp axe, but the axe maker will still strive to produce the sharpest axe for the good axeman. The argument that poor programmers will produce bad programs in any language so we shouldn’t bother with better languages is fallacious.

As mentioned in the introduction, both sides of the analysis/design vs implementation debate need to compromise in order to bridge the semantic gap. The perpetuation of low level languages such as C into OOP is proof that the implementation community has not compromised, or sharpened its axe to bridge this costly gap. On the other hand the analysis/design community must realise that what they do is part of the general practice of programming.

It has been four years since the 2nd edition of this critique. The criticisms are still valid, but now many people have had first hand experience of being burnt by the OO hype and trying to implement systems in C++.

The work on languages such as Java and Eiffel has vindicated the criticisms previously made in the critique. [Stroustrup 94] lists as current C++ problems many of the criticisms I have also made in the critique. Java has recognised many shortcomings in C++ and rectified them. Many of the problems that Java fixes are the same problems as addressed in the original critique.

Eiffel serves as another example of better language design than C++. It has none of the problems of C++. In Java there still remain a few deficiencies, but it is a major advance.

Since the last edition of the critique, many people have asked what I recommend. What should people choose then? Certainly Eiffel is the best out of these three languages. If you are doing large scale system software and application development, then the choice is Eiffel, although Eiffel is also simple and elegant enough for small applications development. Eiffel is a language for the serious software engineer who wants to get on with the job, not be bogged down in syntactic and machine-oriented obscurities, weird ‘bugs’ and endless maintenance cycles to get things right.

Java is still an unproven entity for large projects, and the byte code is interpreted. Eiffel and C++ are roughly equivalent in performance. Interpreted Java will be around 10 times slower. But Java byte codes could be compiled into native code.

For small applets and other Internet loaded applications, Java is a good choice. Some people have predicted that Java will sweep all away, and that even Eiffel will die because of this. I cannot see this, as Eiffel and Java are really significantly different tools. Java has still to be tested in the large scale Eiffel league.

I have not yet mentioned languages such as BETA, Ada 95 or Smalltalk. BETA is still really in academia. It might make a stronger presence in the market place in the coming years. If not, BETA might have the same profound influence as Simula. It is certainly something to be watched. Ada 95 is certainly aimed at serious software engineering.

Smalltalk is already firmly in the market place, and there are a significant number of systems that it is used for. Smalltalk is still a language for serious consideration. The biggest question here is do you want the development speed and flexibility of a dynamically typed system as opposed to the robustness and run-time speed of a statically typed system? Having answered this question for yourself the choice between Smalltalk and Eiffel should be easier.

The most important aspect of C++ that the industry must realise is that the definition of C++ is unstable. As the X3J16 committee works on C++, more problems are uncovered. It will be years before a stable standard is reached, and probably years after that before compiler vendors are compliant with the standard.

Today’s C++ programs will be tomorrow’s unmaintainable legacy code. As [GWS 94] says of C++: “The seeds of software disasters for decades to come have already been planted and well fertilised.” They compare C++ to COBOL in terms of unmaintainable legacy code which we have now in COBOL’s case, and we will have in the future for C++.

Perhaps the most important realisation I had while developing this critique is that high level languages are more important to programming than object-orientation. That is, languages which have the attribute that they remove the burden of bookkeeping from the programmer to enhance maintainability and flexibility are more significant than languages which just add object-oriented features. While C++ adds object-orientation to C, it fails in the more important attribute of being high level. This greatly diminishes any benefits of the object-oriented paradigm.

In a nutshell, an object-oriented language that lacks the qualities of a high level language entirely misses the point of why we have progressed from machine coding to symbolic assembler and beyond. Without the essential high level qualities, OO is nothing but hype. Eiffel shows that it is important to be high level as well as OO, and I hope that the lesson to be learned by any programming paradigm, not just OO, is that the fundamental is to make the task of programming (that is system development as a whole) easier by the removal of the burden of bookkeeping.

C++ adds object-orientation to a low level language, so you still have all the bookkeeping burden of C. Java improves this situation by removing many of the low level features that have a known bad track record. Eiffel provides a true high level base for object-oriented programming.

The concluding advice of this critique is clear. Be wary of C++. Seriously consider the alternative languages.

Bjarne Stroustrup writes “My hope is that it will help C++ become accepted into areas that C failed to penetrate, and thus support programmers who have not been represented in the C and C++ culture.” [Stroustrup 94] 6.5.3.1. My hope is that the industry establishes a professional software engineering culture, not a programming language culture based on seriously flawed and arcane languages. The software engineering culture is not well represented in C++.

Ian Joyner
October 1996

6. Bibliography

[C++ ARM] ELLIS and STROUSTRUP The Annotated C++ Reference Manual, AT&T 1990.

[Adams 96] SCOTT ADAMS The Dilbert Principle, Harper Collins 1996.

[Aho 92] AHO and ULLMAN Foundations of Computer Science, Computer Science Press 1992.

[Brooks 95] FREDERICK P. BROOKS The Mythical Man-Month, 20th Anniversary Edition, Addison Wesley.

[Bruce 96] KIM B. BRUCE Progress in Programming Languages, in ACM Computing Surveys, Vol. 28, No. 1, March 1996.

[Capretz 87] PIERRE J. CAPRETZ French in Action, A Beginning Course in Language and Culture, Yale University Press.

[Cline] MARSHALL CLINE C++ Frequently Asked Questions, comp.lang.c++ newsgroup.

[DDH 72] DAHL, DIJKSTRA, HOARE Structured Programming.

[Deming 82] W. EDWARDS DEMING Out of the Crisis, Cambridge University Press 1982.

[Dijkstra 76] E.W. DIJKSTRA A Discipline of Programming, Prentice Hall 1976.

[DM&L 87] TOM DE MARCO and TIMOTHY LISTER, Peopleware: Productive Projects and Teams, Dorset House 1987.

[Ege 96] STUART HIRSHFIELD and RAIMUND K. EGE Object-Oriented Programming, in ACM Computing Surveys, Vol. 28, No. 1, March 1996.

[Ellemtel 92] Programming in C++: Rules and Recommendations, Ellemtel Telecommunication Systems Laboratories, Sweden.

[Flan 96] DAVID FLANAGAN Java in a Nutshell, O’Reilly & Associates 1996.

[GWS 94] GARFINKEL, WEISS, STRASSMANN The Unix-Haters Handbook, IDG books 1994.

[Hext 90] J.B. HEXT Programming Structures: Machines and Programs. Volume I, Prentice Hall of Australia 1990.

[Ince 92] D.C. INCE Arrays and Pointers Considered Harmful, ACM SigPlan Notices, January 1992.

[Kilov and Ross 94] HAIM KILOV and JAMES ROSS, Information Modelling: An Object-oriented Approach, Prentice Hall 1994.

[L&S 95] WILLIAM J. LATZKO and DAVID M. SAUNDERS, Four days with Dr. Deming: A strategy for modern methods of management, Addison Wesley 1995.

[Madsen 93] MADSEN, MØLLER-PEDERSEN, NYGAARD, Object-Oriented Programming in the BETA Programming Language, Addison Wesley 1993.

[Meyer 88] BERTRAND MEYER Object-oriented Software Construction, Prentice Hall 1988. (2nd edition soon to appear.)

[Meyer 92] BERTRAND MEYER Eiffel: The Language, Prentice Hall 1992.

[Meyer 94] BERTRAND MEYER Reusable Software: The Base Object-oriented Component Libraries, Prentice Hall 1994.

[Meyer 95] BERTRAND MEYER Object Success, Prentice Hall 1995.

[Meyer 96a] BERTRAND MEYER A Taxonomy of Inheritance, IEEE Computer vol 29 No 5 May 1996.

[Meyer 96b] BERTRAND MEYER Using Inheritance Well, Chapter 25 of forthcoming Object-oriented Software Construction, 2nd edition Prentice Hall. Draft available on internet at http://www.eiffel.com/doc/manuals/technology/oosc/inheritance-design/

[Meyer 96c] BERTRAND MEYER Concurrency, Distribution and the Internet, Chapter 28 of forthcoming Object-oriented Software Construction, 2nd edition Prentice Hall. Draft available on internet at http://www.eiffel.com/doc/manuals/technology/concurrency/CONCURRENCY.html

[Mody 91] R.P. MODY C in Education and Software Engineering, ACM SIGCSE Bulletin Vol. 23 No. 3 September 1991.

[Morgan 90] CARROLL MORGAN Programming from Specifications, Prentice Hall 1990.

[P&S 94] JENS PALSBERG and MICHAEL I. SCHWARTZBACH Object-oriented Type Systems, Wiley 1994.

[Plauger 93] P.J. PLAUGER Programming Language Guessing Games: If C++ is the Answer, what’s the question?, Dr Dobb’s Journal, October 1993.

[Reade 89] CHRIS READE Elements of Functional Programming, Addison-Wesley, 1989.

[RBPEL91] RUMBAUGH, BLAHA, PREMERLANI, EDDY, LORENSEN Object-Oriented modelling and Design, Prentice-Hall, 1991.

[Sakkinen 92] MARKKU SAKKINEN Inheritance and Other Main Principles of C++ and Other Object-oriented Languages, University of Jyväskylä, 1992. (Also published as selected papers in ECOOP ‘88, Computing Systems Vol. 5 No. 1, and Structured Programming Vol. 13 (1992).)

[Shaw 96] MARY SHAW and DAVID GARLAN Software Architecture: Perspectives on an emerging discipline, Prentice Hall 1996.

[SJE 91] SAAKE, JUNGCLAUS, EHRICH Object-Oriented Specification and Stepwise Refinement, in IFIP Workshop on Open Distributed Processing Berlin, 1991.

[Stroustrup 94] BJARNE STROUSTRUP The Design and Evolution of C++, Addison Wesley 1994.

[Sun 95] The Java Language Environment: A White Paper, Sun 1995. (http://java.sun.com)

[Sun 96] The Java Language Specification, Sun 1996. See WEB address.

[Weg 91] PETER WEGNER Concepts and Paradigms of Object-Oriented Programming, ACM SIGPLAN OOPS Messenger Volume 1 no. 1 August 1990.

[Wiener 95] RICHARD WIENER Software Development Using Eiffel: There can be life other than C++, Prentice Hall 1995.

[X3J16 92] Members of the X3J16 working group on extensions How to write a C++ Language Extension Proposal for ANSI-X3j16/ISO-WG21, ACM SIGPLAN Notices Vol. 27 No. 6 June 1992.

[Yoshida 92] KOICHIRO YOSHIDA Title and book in Japanese.

7. Webliography

Ada 95:

Home page

http://lglwww.epfl.ch/Ada/

Ada 95 Guide for C/C++ Programmers

http://lglwww.epfl.ch/Ada/Ammo/Cplpl2Ada.html

Contrast to C++ by Edmond Schonberg

http://www.csci.unt.edu/faculty/ryan/languages/ada/9x-cplus.txt

Beta:

http://www.daimi.aau.dk/~beta/

C++:

FAQ

http://www.cs.bham.ac.uk/~jdm/CPP/cppfaq.html

ISO SC22/WG21 standards

ftp://research.att.com/dist/c++std/WP

ftp://ftp.maths.warwick.ac.uk/pub/c++/std/WP

http://www.cygnus.com/misc/wp/index.html

http://reality.sgi.com/employees/austern_mti/std-c++/faq.html#B8

STL

http://www.cs.rpi.edu/~musser/stl.html

Comments on Critique

http://www.cs.oberlin.edu/students/jbasney/critique/critique.html

Dilbert :

The Dilbert Zone

http://www.unitedmedia.com/comics/dilbert/

Eiffel :

EiffelWorld magazine

http://www.eiffel.com/doc/eiffelworld/

Downline load site for SmallEiffel

ftp://ftp.loria.fr/pub/loria/genielog/SmallEiffel/

Interactive Software Engineering

http://www.eiffel.com/

Dynamic Linking in Eiffel

http://www.eiffel.com/doc/manuals/dle/book

Vendor independent home page

http://arachnid.cs.cf.ac.uk/CLE/

Books on Eiffel

http://www.eiffel.com/doc/documentation.html

SIG computer and Visual Eiffel

http://www.sigco.com/

Tower Technology

http://www.twr.com/

Eiffel locater page

http://www.progsoc.uts.edu.au/~geldridg/stop-press.html

Java:

Main page

http://java.sun.com/

Demonstration Applets

http://www.gamelan.com/index.shtml

Java Language Specification

http://java.sun.com/doc/language_specification/index.html

Oberon:

The Oberon Reference Site

http://www.math.tau.ac.il/~laden/oberon/

Sakkinen, Markku :

http://www.cs.jyu.fi/~sakkinen/

References to other papers on C++ and other topics by Dr. Sakkinen.

X3J16 C++ ISO standardisation:

http://www.x3.org/tc_home/x3j16.html
