Comparison of C++ and Fortran 90 for object-oriented scientific programming

ELSEVIER Computer Physics Communications 105 (1997) 20-36

Computer Physics Communications

Comparison of C++ and Fortran 90 for object-oriented scientific programming

John R. Cary a, Svetlana G. Shasharina a, Julian C. Cummings b, John V.W. Reynders b, Paul J. Hinker c

a Center for Integrated Plasma Studies and Department of Physics, University of Colorado, Boulder, CO 80309-0390, USA b Advanced Computing Laboratory, Los Alamos National Laboratory, Mail Stop B287, Los Alamos, NM 87545, USA

c Dakota Scientific Software, Inc., 501 E. Saint Joseph St., Rapid City, SD 57701, USA

Received 12 November 1996

Abstract

C + + and Fortran 90 are compared as object-oriented languages for use in scientific computing. C + + is a full-featured, object-oriented language that provides support for inheritance and polymorphism. Fortran 90 can mimic some object-oriented features through combinations of its TYPE and MODULE syntax elements, but it lacks inheritance and thus does not permit code reuse to the same extent as C÷+. Each language has other useful features unrelated to object-oriented programming, but the additional features of Fortran 90 can be included in C + + through the development of class libraries. In contrast, including the additional features of C + + in Fortran 90 would require further development of the Fortran 90 syntax. A critical feature missing in Fortran 90 is the template, which allows C + + programmers to build portable, reusable code and to dramatically improve the efficiency of the evaluation of complex expressions involving user-defined data types. ~ 1997 Elsevier Science B.V.

Keywords: C++; Fortran 90; Object-oriented programming; Scientific computing

1. Introduct ion

There is currently considerable debate in the scientific computing community about how to change the programming paradigm to take advantage of modern software design principles, particularly o b j e c t - o r i e n t e d

p r o g r a m m i n g [ 1-3] . It is perceived that the community will benefit from these principles at both the top and bottom of their computing applications. Well-structured application codes (the top) can be more easily modified through replacement of internal pieces as needed for the changes in scientific models. New computer architectures will require changes in the subroutines for processing large arrays (the bottom). It is recognized that we must move beyond Fortran 77 to implement modern software principles, but in which direction shall we go?

This paper compares Fortran 90 [4,5] with C + + [6,7] for these purposes. We look at the degree to which the syntax of these two languages supports important object-oriented programming (OOP) features, as well as some useful non-OOP features, and how these features can strengthen and simplify the programming of

0010-4655/97/$17.00 (~) 1997 Elsevier Science B.V. All rights reserved. Pll S001 0-4655 (97) 00043-X

J.R. Cary et al./Computer Physics Communications 105 (1997) 20-36 21

scientific application codes. (Here we shall assume little familiarity with either F90 or C + ÷ and introduce some basic syntax features of each language; the reader is encouraged to consult the reference materials for more details.) Object-oriented programming allows one to have the modularity and data integrity needed for ease of code expansion. The capabilities of C + + in this area are fairly well established, while the use of Fg0 for OOP is relatively new [8]. We will introduce examples from physical problems to illustrate how object methods can be useful for scientific programming. In addition, we will provide performance comparisons of these languages and the procedural languages F77 and C on several typical computational kernels from scientific application codes.

We shall see that all the desired object-oriented features, including the important notion of inheritance, are present in C + + . In contrast, F90 does not have inheritance, which is invaluable in providing opportunities for code reuse, and has no single construct for representing an abstract object. The template feature of C + ÷ , which similarly allows for code reuse and is an important means of optimizing the evaluation of complex expressions involving user-defined types, is also absent from F90. On the other hand, F90 offers a built-in array type with simple array syntax for mathematical operations, plus the ability to specify numerical precision in computations. Performance comparisons indicate that, contrary to popular myth, there is no significant performance advantage gained by using a procedural language instead of one of these newer languages with object-oriented features.

2. Object-Oriented Programming (OOP) features

There are four major elements in Booch's object model: abstraction, encapsulation, modularity, and hierarchy [9]. In object-oriented programming, one develops objects that represent certain concepts or entities from the problem domain and then embodies these abstractions in computer code. Each of these objects consists of both the internal data representation and methods (both for internal use by the object itself and for external Itse) for interacting with that data. By hiding the data and internal methods from the user, a technique known as encapsulation, the author of an object can change the internal representation without breaking software that uses the object. By providing a fixed interface between objects, one achieves code modularity and flexibility, and one greatly simplifies the task of building a program in stages or programming in teams, since program components are naturally separated.

Object-oriented languages have the property of inheritance. This means that an object can inherit the data and methods of its "parent" object(s), and it can have additional properties defined. One can think of inheritance as a means of taking a general sort of object and tailoring it to fit specific needs. Inheritance offers tremendous t~otential tbr code reuse, and it is helpful in organizing the modules that compose a particular application into a hierarchy that indicates their relation to one another. A sensible hierarchy can be a great aid in managing the ,:omplexity of modern scientific computing application codes.

Object-oriented languages also often allow for polymorphism, the ability of a function to have different behavior depending on the type of object invoking the function or the specific parameter list provided to the function. The former case involves objects that redefine the behavior of an inherited method. The function is dynamically bound (at run time) to the object, with each object implicitly knowing its type and thus its defined behavior. In the latter case, functions utilize knowledge of the parameter list to infer which behavior is desired. Polymorphism enables programmers to avoid writing inflexible, high-maintenance code in which objects must contain every possible behavior and then use large I F - E L S E or switch code blocks to determine the desired behavior at run time.

Before examining how well these features are supported in Fortran 90 and C + + , a brief discussion of the difference between abstract and concrete objects is in order. By an abstract object, we mean a set of data structures and methods that describe the properties of the thing we wish to model with our object. An example is an abstraction for a particle. Perhaps it would be described by its mass and a pair of 3-vectors for its position and velocity. Among its methods might be functions for advancing the particle in phase space or computing

22 J.R. Cary et aL /Computer Physics Communications 105 (1997) 20-36

the particle's kinetic energy. In addition, we may require a construction or initialization function that describes how to create an instance of one of these objects. Such an instance is what we refer to as a concrete object, since it actually exists, has specific values for its data, and is not merely a conceptual description.

3. Language support for OOP features

3.1. Abstract and concrete objects

The C + + class provides the concept of an abstract object. A class defines a type of object by specifying the data it contains and methods that interact with the data. Thus, it describes the contents and behavior of the concrete objects that are its instances. Member data and internal functions are typically declared private, so that only objects of that class can access them. Meanwhile, interfaces with other objects or pieces of code are labeled as public. The code fragment

dass Particle { private:

double mass; double position [ 3 ], velocity [ 3 ] ;

public: Particle(double imass = 0.0) {

mass = imass; for (int i = 0; i < 3; i + + ) {

position[i] = 0.0; velocity [i] = 0.0;

} } virtual ~Particle() { } virtual double Position(int i) const { return position[i]; } virtual double Velocity(int i) const { return velocity[i]; } virtual double Kinetic_Energy () const {

double ke = 0.0; for (int i = 0; i < 3; i + + ) ke+ = velocity[i] * velocity[i]; return mass • ke;

) virtual double Charge() const { return 0.0; }

};

defines a class Particle containing the data described earlier. The constructor, Particle(double), initializes this data and includes a default value of 0.0 for the mass, in case no argument is provided. (We also include the destructor, ~Particle(), which in this case does nothing, but is typically used to free up storage space allocated for internal data by the constructor, once the object is no longer needed.) Because the member data is declared private, code outside of the class cannot access it directly. Instead, the ith component of the velocity, for example, is obtained using the method Velocity(int i). Similarly, the particle's kinetic energy is obtained via a member function that performs a simple computation.

This is an illustration of encapsulation: the internal data is accessed only through fixed interfaces. If later the class Particle is redesigned to use momentum instead of velocity for its internal representation, no code calling the member functions of this dass need be changed. Only the appropriate member function definitions would require alteration. In this case, they would become


class Particle {

:

private:

double position [ 3 ], momentum [ 3 ] ; public:

virtual double Velocity(int i) const { return momentum [ i] /mass;

} virtual double Kinetic_Energy( ) const {

double k e = 0.0; for (int i = 0; i < 3; i + + ) ke+ = momentum[i] * momentum[i]; return ke /mass;

}

Notice that class Particle also provides a member function Charge(), which in this case simply returns 0, since Particle has no data describing its charge state• This function, along with others in the class, has been declared as virtual, which means that a class inheriting from Particle may override, or redefine, the behavior of this function while maintaining the same interface (i.e., the same function name and parameter list)• In this ~ay, we provide a common, fixed interface for obtaining a particle's charge from objects of class Particle and any class that may be derived from it in the future.

To use class Particle, we declare instances of it as shown below:

Particle p l, p2;

This statement constructs two concrete objects, and the data values for each object are independent. Further- more, p l and p2 are only known locally in the routine in which they are declared. (The practice of avoiding the use of global variables, which are accessible anywhere in the program, is another rule that modern software designers adhere to in order to keep codes modular.) To obtain the kinetic energy, for example, we use the member function through the following dot syntax:

double pl _ke = p l .Kinetic_Energy( ) ;

Now let us discuss the implementation of abstract and concrete objects in Fortran 90. F90 has two language elements that have object features: TYPE and MODULE. TYPE allows one to group data together but does not associate methods with the data. The code fragment

TYPE Particle REAL mass REAL, DIMENSION(0:2) :: position, velocity

END TYPE Particle

defines a data type corresponding to the member data of the C + + class Particle introduced earlier. One can declare an instance of a TYPE and access its data components as follows:

TYPE(Particle) :: p REAL :: pmass = p%mass

By itself, a TYPE does not provide encapsulation, since any section of code that defines a TYPE and declares an instance of it may directly access its data components. Encapsulation is achieved by placing the TYPE

24 J.R. Cary et al./Computer Physics Communications 105 (1997) 20-36

definition inside of a MODULE. The MODULE construct is a code block containing data or TYPEs, as well as FUNCTIONs and SUBROUTINEs for manipulating this data. A MODULE may declare its contents to be PRIVATE, disallowing direct access to those contents by program units outside the MODULE. If data elements are declared PRIVATE, the MODULE may provide FUNCTIONs that are PUBLIC for accessing the data. The following code illustrates how an abstract object similar to the C + + class Particle might be built in F90:

MODULE ParticleModule TYPE Particle

PRIVATE REAL mass REAL, DIMENSION(0:2) :: position, velocity

END TYPE Particle CONTAINS

SUBROUTINE Initialize(p, imass)

TYPE(Particle), I N T E N T ( I N O U T ) :: p REAL, OPTIONAL :: imass I N T E G E R i IF ( P R E S E N T ( i m a s s ) ) THEN

p % m a s s = imass

ELSE p % m a s s = 0.0

ENDIF DO i = 0 , 2

p%pos i t ion ( i) = 0.0

p % v e l o c i t y ( i) = 0.0 END DO

END SUBROUTINE Initialize REAL FUNCTION Position(p, i)

TYPE(Particle), I N T E N T ( I N ) :: p INTEGER, I N T E N T ( I N ) :: i Position = p %posi t ion ( i )

RETURN END FUNCTION Position REAL FUNCTION Velocity(p, i)

TYPE(Particle), I N T E N T ( I N ) :: p INTEGER, I N T E N T ( I N ) :: i Velocity = p % v e l o c i t y ( i ) RETURN

END FUNCTION Velocity REAL FUNCTION KineticEnergy(p)

TYPE(Particle), I N T E N T ( I N ) :: p I N T E G E R i REAL :: ke = 0.0 D O i = 0 , 2

ke = ke + ( p % v e l o c i t y ( i ) ) * *2 END DO KineticEnergy = p % m a s s * ke

J.R. Cary et aL / Computer Physics Communications 105 (1997) 20-36 25

RETURN END FUNCTION KineticEnergy REAL FUNCTION Charge(p)

TYPE(Particle), I N T E N T ( I N ) :: p Charge = 0.0 RETURN

END FUNCTION Charge END MODULE ParticleModule

The PRIVATE statement resets the default access for items within the current code block, which is normally PUBLIC. Because of this statement in TYPE Particle, the data of Particle cannot be accessed except through the FUNCTIONs defined in the enclosing MODULE ParticleModule, such as Position(p, i).

The above construction provides for an abstract object, because once this MODULE has been included in a code segment, one can define multiple instances of TYPE Particle. For example, consider the code fragment

USE ParticleModule TYPE(Particle) :: p l , p2 Initialize ( p l ) Initialize (p2)

The USE statement indicates that the contents of ParticleModule are available for use within the current code block, p l and p2 are separate instances (concrete objects) of TYPE Particle with encapsulated data accessed via the FUNCTIONs of ParticleModule. We have.included an Initialize SUBROUTINE in ParticleModule that mimics the behavior of the C + ÷ constructor. F90 does not automatically call object constructors or destructors, so FUNCTION Initialize must be called explicitly for each Particle after it is created.

In summary, Fortran 90 does allow one to build an abstract object by combining TYPEs and PROCEDUREs ~ i.e., SUBROUTINEs and FUNCTIONs) in a MODULE. The TYPE contains the internal data, which can be declared PRIVATE inside of a MODULE and thus encapsulated. The MODULE provides an interface via PROCEDUREs that are PUBLIC and operate on the contained TYPE.

3.2. Inheritance and polymorphism

Suppose one has developed an abstract object with a data structure and methods well suited to a given problem. Now imagine that a new problem comes along in which one needs a new data structure identical to the old one except for some additional data elements and the methods needed to work with this new data. Rather than developing a completely new abstract object, one would like to define a new abstract object that ~nherits the old data and methods, can alter the behavior of some functions to match the semantics of the new object, and can add new data or methods as needed.

This concept of inheritance was explicitly included in the C + + syntax. For example, the class definition

class Nucleus : public Particle { private:

Jut numProtons , numNeutrons; static double elemCharge;

public: Nucleus(Jut inumProtons, Jut inumNeutrons , double imass = 0.0) : Part ic le( imass) {

numProtons = inumProtons; numNeu t rons = inumNeutrons;

}


~Nucleus() { } double Charge() const {

return numProtons * elemCharge; }

};

specifies that class Nucleus inherits from class Particle. We refer to Particle as the base class and Nucleus as a derived class. Nucleus contains all the data and methods of Particle, and it has added the new data members numProtons, numNeutrons, and elemCharge. Note that elemCharge (which represents the value of the electronic charge e) has been declared static, which means that it is a single piece of data shared by all objects of class Nucleus. The constructor for Nucleus initializes the new data members and can pass other parameters back to the base class constructor for the initialization of inherited data members. Also, the virtual member function Charge() has been redefined to compute the charge based upon the internal structure of the derived class, namely by multiplying elemCharge (which is presumed to be initialized elsewhere) by the number of protons in the nucleus, numProtons.

If we declare two concrete objects,

Particle p; Nucleus n;

then p.Charge() returns the charge according to the definition of Particle (i.e., returns 0.0), while n.Charge() returns the charge as dictated in Nucleus. This is an example of polymorphism: the function Charge() acts differently depending on what kind of object is invoking it.

C + + also has a means of invoking polymorphic behavior for a collection of base class objects using pointers. Via the syntax

Particle .pptr;

one declares a pointer to a Particle object. That is, pptr is used to contain the memory location of a Particle. It could be assigned a value with the address-of symbol &, as in the statement

pptr = &p;

which gives pptr the memory address of Particle object p. The Particle referred to by pptr is obtained with the dereferencing opera tor . , so that

(*pptr) .Charge( ) ;

returns the same value as p. Charge(), since pptr points to p. An alternate notation for accessing members of an object by pointer is the arrow syntax:

pptr ---+ Charge( ) ;

A powerful type of polymorphism occurs when the assignment

pptr = &n;

is made. The assignment of the address of a derived class object (such as the Nucleus n) to a base object pointer is allowed because a Nucleus is a kind of Particle. Since the method Charge() was declared virtual, the statement

pptr ~ Charge( ) ;

now performs the member function as defined in class Nucleus! All virtual functions are dynamically bound to their objects, so the object referred to by the pointer pptr chooses the correct definition of the function at

J.R. Cary et al./Computer Physics Communications 105 (I 997) 20-36 27

run time. This mechanism is very useful. One can refer to a collection of objects that may be either Particle or

Nucleus objects (or objects of any other class that may be derived from Particle, such as an Atom class or a lVlolecule class) with an array of pointers to Particle objects. The Charge() member function may be called orl each object, and the correct version is invoked for each one without need for any special coding. This gives programmers a tremendous flexibility in reusing previously designed objects, inheriting much of the needed c~de, modifying only what they must to solve a new problem, and writing general code that can operate on a wtriety of objects.

To create a Nucleus object in Fortran 90, one can first define a new T Y P E that has as one of its components a Particle TYPE, plus the additional data needed to describe a nucleus. Fortran 90 allows for one M O D U L E to c,)ntain another via the USE statement. Hence, the M O D U L E representing a nucleus can include ParticleMod- uie, so that when it comes time to define a Nucleus TYPE, the definition of T Y P E Particle is known. One then d_~fines the P R O C E D U R E s needed for manipulating a Nucleus, as illustrated in the following code fragment:

M O D U L E NucleusModule USE ParticleModule T Y P E Nucleus

PRIVATE TYPE(Part ic le) p I N T E G E R numProtons, numNeutrons

END T Y P E Nucleus R E A L , PRIVATE, P A R A M E T E R :: e lemCharge = 1.6e - 19 SAVE I N T E R F A C E Initialize

M O D U L E P R O C E D U R E Initialize, NucInitialize END I N T E R F A C E Initialize I N T E R F A C E Position

M O D U L E P R O C E D U R E Position, NucPosition END I N T E R F A C E Position

' Similar interfaces for Velocity, KineticEnergy, Charge

C O N T A I N S S U B R O U T I N E Nuclnitialize(n, inumProtons, inumNeutrons , imass )

TYPE(Nuc leus ) , I N T E N T ( I N O U T ) :: n I N T E G E R , I N T E N T ( I N ) :: inumProtons, inumNeutrons

R E A L , O P T I O N A L , I N T E N T ( I N ) :: imass

Initialize(n%p, imass ) n%tzumProtons = inumProtons t z%numNeutrons = inumNeutrons

END S U B R O U T I N E Nuclnitialize R E A L F U N C T I O N NucPosition(n, i)

TYPE(Nuc leus ) , I N T E N T ( I N ) :: n I N T E G E R , I N T E N T ( I N ) :: i NucPosition = Position(n%p, i) R E T U R N

END F U N C T I O N NucPosition

! Similar functions for NucVelocity, NucKineticEnergy


REAL FUNCTION NucCharge(n) TYPE(Nucleus), INTENT(IN) :: n NucCharge = n%numProtons * elemCharge RETURN

END FUNCTION NucCharge END MODULE NucleusModule

As with the Nucleus class in C ÷ + , we have added integers for the number of protons and neutrons, and a parameter for the electronic charge. The SAVE qualifier indicates that only one copy of the parameter elemCharge is used for multiple invocations of the MODULE. (This is similar to the behavior of a static member in C + + . ) We have included a SUBROUTINE Nuclnitialize that calls Initialize on the contained Particle TYPE and then initializes the new data. We have also introduced the NucCharge FUNCTION to calculate the charge based on the number of protons.

In order to access the position and velocity data of TYPE Particle, we must define new FUNCTIONs in NucleusModule that take a Nucleus as an argument, even though all they do is turn around and call the respective FUNCTIONs from ParticleModule! The basic reason for this is that Fortran 90 PROCEDUREs act upon a TYPE (take it as an argument), whereas C + + member functions are invoked on behalf of an object. Hence, a TYPE can contain data from another TYPE by including it as a component, but there is no way to inherit PROCEDUREs because the two TYPEs are not interchangeable as arguments to a PROCEDURE. In C + + , the specification of member functions that are unchanged is not necessary. The derived class inherits all of the member functions of its base class. New code is written only for those member functions that must change behavior.

Prior to listing the contained PROCEDUREs of NucleusModule, we can provide INTERFACE blocks Ibr those PROCEDUREs common to both ParticleModule and NucleusModule. The purpose of these blocks is to allow either version of a PROCEDURE to be called using one generic name. For instance, INTERFACE Position allows the name Position to be used for either the Position or NucPosition FUNCTION. Thus, if we first USE NucleusModule and then declare (and initialize) a Particle p and a Nucleus n, a piece of code such as

REAL :: px = Position(p, 0) REAL :: nx = Position(n,0)

is perfectly valid. Using these generic interfaces, one can very roughly approximate the notions of inheritance and virtual member functions from C + + , although the mechanism is clearly rather clumsy. Since no PRO- CEDUREs are actually inherited, one must repeat the definitions using new names, and then link these new names back to the previous names with INTERFACEs.

In OOP parlance, we have used a has-a construct to represent what is really an is-a relationship. The Nucleus is a Particle in the sense that, like a Particle, it has a position, velocity, and mass. In addition, it has an internal structure described by the number of protons and neutrons. To represent this in Fortran 90, we had to use a has-a structure in which a Particle was one of the data components contained inside a Nucleus.

This construction has two negative practical consequences. The first has already been mentioned: we must recode FUNCTIONs of ParticleModule for NucleusModule, even if there is no change in behavior. Rewriting this code is time-consuming and can introduce bugs. Moreover, it is this recoding of PROCEDUREs, along with the rather cumbersome INTERFACE structure, that accounts for the disparity in the length of the C ÷ + and F90 code fragments needed to define a derived abstract object, as demonstrated in our example. Inheritance allows code to be far more compact and easier to maintain. (This problem could be sidestepped if we did not introduce the secondary TYPE Nucleus, but then we could not create separate instances of a Nucleus!) Secondly, we do not have the powerful polymorphism of C + + discussed earlier. Unlike in C ÷ + , one cannot


have a Particle pointer refer to a Nucleus, since in Fortran 90 a Nucleus has a Particle but is not itself a Particle. Thus, there is no way to refer to a group of Particle and Nucleus objects collectively and have FUNCTION Charge return what is appropriate for each object.

4. Non-OOP language features

In the previous sections, we have highlighted some differences in C + + and Fortran 90 support of what we consider to be important object-oriented programming features. There are a few other language features that have nothing to do with the OOP model, but nevertheless can be very useful to the scientific applications programmer. Here we examine some of these capabilities, in order of importance to scientific programming.

4.1. C++ features missing in Fortran 90

4.1.1. Templates Class and function templates allow for parameterization, whereby a class or function can be implemented in

a manner that is independent of the types of objects it contains or operates on. One example is a Stack class that maintains a stack of objects of some arbitrary type. The definition of such a class might look like the l~llowing:

template ( class T ) class Stack {

private: T* theStack; T, Top;

public: Stack(int size) { theStack = Top = new T[ size] ; } ~Stack() { delete[ ]theStack; } void push(T newltem) { *Top++ = newltem; } T pop() { return . ( - - T o p ) ; }

};

The unspecified type T will be replaced by an actual type by the compiler whenever a Stack object with a specific element type is requested by the programmer. The Stack dass may rely on dass T having certain data or methods available (e.g., in this case, the assignment operator). If they are not found, a compile-time error will result when the compiler attempts to instantiate the template.

Another common use for the template is to generalize code on the basis of some numerical parameter, rather than by type. For example, we could create a template for class Particle, with the number of dimensions of our simulation domain as a template parameter.

template ( unsigned int Dim ) class Particle {

private:

double position [ Dim], velocity [ Dim ] ; public:

Particle(double imass = 0.0) { mass = imass; for (int i = 0; i < Dim; i + + ) {


position[i] = 0.0; velocity [i] = 0.0;

} }

virtual double Kinetic_Energy() const { double ke = 0.0; for (int i = 0; i < Dim; i + + ) ke+ = velocity[i] * velocity[i]; return mass * ke;

}

};

This would allow Particle objects instantiated from this dass to function in the same way for any dimensionality. Furthermore, by making array sizes and loop bounds template parameters of a function or class, they become available at compile time and enable aggressive optimization and function inlining.

The template provides another level in abstraction• While thought of primarily as a convenience by some, templates have become increasingly important because of the opportunity they offer for code reuse. A Standard Template Library (STL) is now available, which contains templates for many useful classes such as vectors, queues, and maps [ 10]. As with any set of C + + classes, these generalized data structures are packaged along with the commonly used algorithms for each, thus providing the software development community with a "standard toolkit" on which a wide variety of applications may be based.

The true power of the template mechanism has only recently begun to be fully recognized. Because C + ÷ templates involve replacement of code at compile time, one can use the compiler to do fairly elaborate "computations"• One example of such computation is the use of expression templates [11 ]. The expression template technique addresses the problem of pairwise evaluation of expressions involving user-defined types.

For instance, let us imagine that you have a Matrix class and you have overloaded mathematical operators such as ÷ and * to do matrix operations element by element• If A, B, C, and D are Matrix objects and you write the expression

D = A , ( B + C ) ,

the C-q-q-÷ compiler will evaluate this by first applying the overloaded operator ÷ to B and C (generating a temporary Matrix object), then applying the overloaded operator • to A and the intermediate result (generating a second temporary Matrix), and finally applying the overloaded operator = to assign the result to D. Since each operation involves a loop over all the elements of the two Matrix operands, this pairwise evaluation destroys code efficiency by going through several consecutive loops where one would suffice and by generating lots ol intermediate temporary objects.

With the expression template technique, one creates class templates that act as wrappers around expression operands, operators, and elements• Operand wrappers know how to access the data inside the operand, and operator wrappers simply apply the operation. Expression element wrappers generate a new type that describes a combination of operators and operands, thus allowing one to build (through the nesting of class template definitions) a type for the entire right-hand side of a complex expression. This type is then evaluated and assigned to the object on the left-hand side using inlined, highly optimized code that has been built by the compiler in the process of instantiating the templates. In effect, the compiler automatically generates code for efficient evaluation of complex expressions involving user-defined objects and operators! Tests of this technique indicate that one can expect performance that is up to 95% of hand-coded C kernels [ l l ] . Of course, one receives an enormous benefit for this small cost in performance; namely, the ability to code scientific equations using high-level objects in an easily understandable manner.


4.1.2. Pointer casting Pointer casting allows one to change the type of object that a pointer points to. An example is to cast a

pointer to a user-defined complex number type to a pointer to double. In this case, dereferencing might give the real part of the complex number (depending upon how the conversion from complex number to double was defined). Casting pointers is often frowned upon because it can lead to invalid memory access, although recent additions to the C + + standard have made casting safer and more explicit (see Appendix A of [6] ). In any ase, pointers are indispensable in using general callbacks, as one must with the Motif windowing system [ 12].

In addition, pointer casting may be used in order to access polymorphic object behavior, by casting a pointer 1o an object of some base class to be a pointer to an object of a derived class.

4.1.3. Exception handling Exception handling allows convenient transfer of program control when an error occurs, by using the tO'-

ihrow-catch mechanism. In C + + , one can try execution of a particular piece of code by placing it inside a try block. This code may then throw an exception if a certain sort of run-time error is detected by using a throw ~tatement. The try block is followed by a catch block that will catch a particular exception that may have been hrown and respond by executing the specified code. This provides a very clean, extensible way in which the igrogrammer can deal with possible errors.

4.2. Fortran 90features missing in C++

4.2.1. A rrays Mathematical arrays are present as elementary types in Fortran 90, but must be provided as a class library

in C + + . (In C + + , as in C, one has the ability to allocate arrays of objects of elementary types such as int or float, but one cannot utilize array syntax to request mathematical array operations.) The operations defined by such a class library may not be as efficient as those provided for the intrinsic Fortran 90 arrays. The Fortran 90 arrays have another nice feature as well, which is that when passed into a PROCEDURE, they can then be queried to determine their shape (the lengths of the axes). This feature is typically also provided in C + + array class libraries.

4.2.2. Specifiable precision Specifiable precision of floating-point numbers is possible in Fortran 90 through use of the KIND syntax.

One can specify how many significant digits in a floating-point number must be retained, rather than just declaring a variable to be REAL or DOUBLE PRECISION. This makes it easier to ensure that the numbers one is using are sufficiently accurate for the calculation at hand. A similar capability could be provided by a class library in C + + .

4.3. Addition of missing features

In summary, the most important features of Fortran 90 that are missing in C + + can be replicated by employing class libraries (see [ 13] for discussion of an array class library implementation). On the other hand, for Fortran 90 to add some of the important C + + features, most notably templates and inheritance, the syntax of the language will have to be enhanced. It is questionable whether Fortran 90, in its current form, can deliver efficient code written in an object-based manner, given the problem of evaluating complex expressions involving user-defined types and operations.

32

5. Non-syntax issues

J.R. Cary et al./Computer Physics Communications 105 (1997) 20-36

Non-syntax issues can also be of importance in deciding which route to take to object-oriented programming. Three non-syntax issues worthy of examination are availability, usage, and performance. For each of these issues, customer pressure can help ensure that one language or the other is better. Below we briefly discuss each of these issues.

5.1. Availability

For the vast majority of computers, C + ÷ compilers are more widely available than F90 compilers and are likely to remain so, primarily because more commercial tools and software libraries are currently being developed in C ÷ ÷ . Fortran 90 compilers may become more widely available on supercomputing platforms because of pressures applied by the scientific computing community. However, the supercomputer industry is no longer a stand-alone market and has become highly leveraged with the commercial sector. Developing scientific codes in C-l-I-÷ will enable researchers to benefit from the extensive commercial efforts in C ÷ ÷ software development. In contrast, the F90 code developer runs the risk of writing codes in a language which is considered to serve only a niche market.

5.2. Usage

This issue is linked with availability, so many of the same arguments apply. For now (and we believe for the foreseeable future), C ÷ + is the most popular object-oriented language in the software engineering community. Consequently, if one might ever by employed outside of the scientific sector, then it is better to have C ÷ + programming skills. Learning C ÷ ÷ gives one the added advantage of drawing on a much larger developer community, which is producing not only a large amount of literature on its use (e.g., see [ 14] ), but also a large body of reusable code modules. For example, the Standard Template Library [ 10] has been proposed to be part of the ANSI C-l-t-÷ standard. Because of the much smaller user base, comparable efforts in F90 are bound to lag behind.

5.3. Performance

If two languages can represent the same concepts, both in terms of constructs and actions, then there is no reason that they cannot perform equally well. Hence, in language performance comparisons, one is mostly comparing the skill of compiler writers, not the languages themselves. On the other hand, one language may almost demand that certain performance features be written, while the other may not. For example, Fortran 90 has its array object built into the language. Thus, every F90 compiler writer should ensure that arrays perform optimally for the given architecture, whether it be a massively parallel, single-processor, or modest multi-processor machine. In C ÷ + , "smart array" objects (knowing their sizes and other attributes) must come from a class library. The operations for such objects might not be optimally written for each architecture. Thus, Fortran 90 users may get better array performance, especially on leading-edge machines [ 15].

Nevertheless, for a wide variety of computational kernels running on several different computing platforms, one finds that there is little or no inherent performance advantage acquired by sticking to a procedural language such as F77 or C vs. an object-oriented language, or by choosing Fortran 90 over C++ . To verify this statement, we have thoroughly examined the Livermore Fortran Kernels (LFK). The LFK is a well-known set of computational kernels that exhibit a wide range of performance. The computational algorithms represented range from I-D hydrodynamics in Kernel 1 to 2-D particle-in-cell in Kernel 13 to general linear recurrence equations in Kernel 19. The performance range extends from nearly 100% to less than 1% of theoretical machine peak speed.


o L I -

500

450

400

350

300

250

200

150

100

50

0

Se lec ted L lvermore K e r n e l s

C÷÷ SGI R10K I~ C SGI R10K ~ F77 SGI R IOK

R

i T I I

10 15 20 Kernel

Fig. 1. Language performance comparison on SGI RI0000 processor.

t

5

Minor modifications to the 24 computational kernels included separating each kernel into a stand-alone bench- mark, converting the Fortran source to C source (this entails loop re-ordering in some cases), and compiling with standard compiler options for best performance. As indicated in Figs. 1-4, the computational efficiency of the code produced by the four languages is consistent across the platforms examined. (The one Clear exception is the F90 compiler on the Sun Ultra-2 platform; however, this product was provided to Sun by a third-party ven- dor and is likely not as well tuned for this computer architecture as the other compilers provided by Sun. An F90 compiler from Sun for the Ultra-2 was not available at the time of this writing.) There are noticeable exceptions for some kernel/platform/language combinations. These differences can be understood to result from a particular compiler recognizing a computational pattern and either making clever internal optimizations that reduce overall work or calling upon a tuned assembly-language routine. (For a complete discussion of each of these exceptions, the reader is directed to the WWW page at h t t p : / / w - w , a c l . l a n l . g o v / H i n k e r / K e r n e l s , html.) Iqae overall result is that computational efficiency shows no direct correspondence to source language for a wide range of computational kernels.

One possible objection to these results might be that only the procedural aspects of the four languages are examined by the LFK, and that once the object-oriented language features of F90 or C + + are utilized, performance will suffer. While it is true that there is a certain amount of overhead involved when taking advantage of an object-oriented programming style, by making the proper level of abstraction of the problem domain, one can assure that typical large-scale scientific computing applications spend a vast majority of their computational time actually crunching numbers. Determining the best means of assessing and minimizing this data abstraction penalty is a topic of current research (e.g., see [ 16] ).

6. Summary and discussion

C + + is a full-featured, object-oriented programming language. It supports the notions of abstraction, encapsulation, modularity, and hierarchy. Moreover, inheritance and dynamic binding provide C + + with powerful polymorphism features. Fortran 90 allows object-oriented programming to some degree. However, in Fortran


250

2OO

1 5 0

IO0

5O

S e l e c t e d L l v e r r n o r e K e r n e l s

C + + I B M R S 6 K - - e - - C I B M R S 6 K

F 7 7 I B M R S 6 K F 9 0 I B M R S 6 K

i 5 10 15 2 0

K e r n e l

Fig. 2. Language performance comparison on IBM RS6000 processor.

Selected Livermore K e r n e l s 3 5 0 I ' " ' ' '

| 1 C + ÷ S u n U l t r a - 2 | C S u n U l t r a - 2 ,

] ~ F 7 7 S u n U l t r a - 2 cl

° I r r

150

1 O0

5 1 0 15 2 0 K e r n e l

Fig. 3. Language performance comparison on Sun Ultra-2 workstation.

90 the concept of an abstract object must be constructed using two distinct syntax elements, the TYPE and the MODULE. With this approach, one can represent the has-a relationship of two objects, but not the is-a relationship needed to construct an object hierarchy. This implies that code reuse will come more easily in C++.

Certain useful non-OOP features are found exclusively in both languages. However, all of the additional features of Fortran 90 can be readily included in C + + by the development of class libraries; that is, without any language extensions. In contrast, the inclusion of the important features exclusive to C + + in Fortran 90

ZR. Cary et al./Computer Physics Communications 105 (1997) 20-36 35

S e l e c t e d L i v e r m o r e K e r n e l s 1 0 0 ~ w ~

C + * - T 3 d

9 0 T 3 d El T 3 d ~1

8 O

7 O

6 O

§ 50

4 0

3 0

2O

1 0

0 I I I 5 1 0 1 5 2 0

K e r n e l

Fig. 4. Language performance comparison on T3D (DEC Alpha processor).

will require new language syntax. The most critical feature missing in Fortran 90 is the template, which allows C + + programmers to build portable, highly reusable libraries and to dramatically improve the efficiency of the evaluation of expressions involving user-defined types.

With regard to non-syntax issues, the languages are comparable until one considers the broader software ,tevelopment community. For high-performance computing, one may find either Fortran 90 or C + + to be more mature. On Symmetric Multi-Processors (SMPs), C + + is likely to be further advanced, while on Massively Parallel Processors (MPPs), Fortran 90 compilers may be more highly optimized. In the general software engineering community, however, C + + is orders of magnitude more common than F90. Free software and libraries useful for physics modeling as well as other problem domains will be more rapidly developed in the C + + community. Thus, software developers in the physics community who hope for greater mobility or ,ncreased leveraging with the commercial sector would be better served learning C + + .

~eknowledgements

JRC and SGS acknowledge useful discussions with Peter Stoltz and Scott Hendrickson. Their efforts were supported by the U.S. Department of Energy under grants DE-FG03-95ER54297 and DE-FG03-95ER40926. JCC and JVWR acknowledge the use of computational resources at the Advanced Computing Laboratory, Los Alamos National Laboratory, and support from the U.S. Department of Energy under Contract No. W-7405- ENG-36.

References

I I I A. Eliens, Principles of Object-Oriented Software Development (Addison-Wesley, Reading, MA, 1994). [21 G. Booch, Object-Oriented Development, IEEE Transactions on Software Engineering 12 (1986) 211. 13] G. Buzzi-Ferraris, Scientific C++: Building Numerical Libraries the Object-Oriented Way (Addison-Wesley, Reading, MA, 1993). 141 T.MR. Ellis, I.R. Phillips and T.M. Lahey, Fortran 90 Programming (Addison-Wesley, Menlo Park, CA, 1994). I 51 M. Metcalf and J. Reid, Fortran 90 Explained (Oxford Univ. Press, Oxford, England, 1990).


[ 6 ] B. Stroustrup, The C++ Programming Language (Addison-Wesley, Reading, MA, 1991 ). [7] S. Lippman, C++ Primer (Addison-Wesley, Reading, MA, 1991 ). [ 8] V.K. Decyk, C.D. Norton and B.K. Szymanski, Introduction to Object-Oriented Concepts using Fortran 90, Institute of Plasma and

Fusion Research Report No. PPG-1559 (1996). 19] G. Booch, Object-Oriented Analysis and Design with Applications (Benjamin-Cummings, Redwood City, CA, 1994).

[ 10] D.R. Musser and A. Saini, STL Tutorial and Reference Guide: C++ Programming with the Standard Template Library (Addison- Wesley, Reading, MA, 1996). [Note: Code for the Standard Template Library may be downloaded from the WWW via the URL http :/lisx. comlNdsupel/wwwlsponge/stl or via anonymous FFP at ftp. cs. rpi. edu in directory Ipublstllbook. ]

[ I I ] T. Veldhuizen, Expression Tcmplatcs, C++ Report, 7 (Junc 1995), [ 12] D.A. Young, Object-Oriented Programming with C++ and OSF/Motif (Prentice Hall, Englcwood Cliffs, NJ, 1992). [ 13] R. Parsons and D. Quinlan, A++/P++ Array Classes for Architccture-lndepcndent Finite-Difference Calculations, Proc. of the 2nd

Annual Object-Oriented Numerics Confcrencc (1994). 114] E. Gamma, R. Helm ct al., Design Pattcrns (Addison-Wesley, Reading, MA, 1995). [15] C.D. Norton, B.K. Szymanski and V.K. Dccyk, Object-oriented parallcl computation for plasma simulation, Communications

of the ACM, 38 (1995). [Note: also available on thc WWW via thc URL http://www.acm, org/pubs/cacm/ previous/oct95_toc, html. ]

[16] A.D. Robison, The abstraction penalty for small objects in C++, POOMA '96 Conference (1996). [Note: also available on the WWW via thc URL http://www, acl. lanl. gov/Pooma96/abstract s/robi son. html. ]

Comparison of C++ and Fortran 90 for object-oriented scientific programming

Documents