Parallel object-oriented programming for parallel simulations


Informatics and Computer Science

Parallel Object-Oriented Programming for Parallel Simulations

FRANÇOISE BAUDE, FABRICE BELLONCLE, DENIS CAROMEL, NATHALIE FURMENTO, YVES ROUDIER

I3S-CNRS, University of Nice, Rte. des Colles, B.P. 145, 06903 Sophia Antipolis Cedex, France

and

PHILIPPE MUSSI, GÜNTHER SIEGEL

INRIA Sophia Antipolis, 2004 route des Lucioles, B.P. 93, 06902 Sophia Antipolis Cedex, France

ABSTRACT

This paper presents the development of a parallel object-oriented language called C++//, an extension of C++. C++// offers reusability, flexibility, and extensibility in concurrent programming through a set of language primitives (indeed a Meta-Object Protocol), independent of any parallel paradigm. It permits us to build libraries of nearly all concurrent programming models. One of them, presented here, is an MIMD model based on data-flow synchronizations (wait-by-necessity). The main concern of the C++// runtime is versatility, so its implementation is in fact an interface to any low-level runtime support.

C++// is used to define and implement PROSIT, an object-oriented framework for distributed discrete event simulation. In the paper, we highlight the features which are crucial for the development of this generic simulation environment, mainly expressiveness, end-user adaptation, and a strong reuse potential. All those important features are directly provided by the use of C++//.

PROSIT and C++// are part of the SLOOP project developments.

1. INTRODUCTION

Discrete event simulation and object-oriented languages have a long common history; they probably started back in 1967 with the Simula

INFORMATION SCIENCES 93, 35-64 (1996) © Elsevier Science Inc. 1996 0020-0255/96/$15.00 655 Avenue of the Americas, New York, NY 10010 PII S0020-0255(96)00060-6


language [26]. If simulation somehow triggered or favored the development of the object-oriented paradigm, we do not believe it was by mere chance, but rather because programming simulations places heavy demands on programming languages, leading to breakthroughs in the technology. For some years now, simulation has been challenging another aspect of programming languages: parallelism and distribution. Again, the technology is pushed to its limits, and we believe that this cross-fertilization will be beneficial for both domains.

The work described here covers these two aspects: parallel object-oriented programming and parallel simulations. Our group (the SLOOP INRIA-I3S/CNRS-Univ. of Nice project) works on three main domains:

(i) parallel and distributed discrete event simulations,

(ii) parallel object-oriented languages,

(iii) interconnection networks.

These three research goals fit together in the construction of the SLOOP system: each level uses the primitives and possibilities of the layer just underneath. We developed a parallel object-oriented language, and we are using it to define and implement a generic environment for parallel simulations. As we will detail in the paper, simulation raises many crucial questions regarding parallel object-oriented programming.

The next section presents the overall structure of our system. Section 3 introduces C++// (a parallel extension of C++, pronounced "C++ parallel"): the parallel object-oriented model and language we use for developing our simulation platform. In this section, we highlight the features which are crucial for the development of distributed simulations, mainly expressiveness, end-user adaptation, and a strong reuse potential. This leads us to Section 4, where the generic environment for distributed simulations is described. We show how the customization capabilities of C++// are put to work in order to obtain a very general and versatile environment. Finally, Section 5 is dedicated to interconnection network aspects, mainly the actual communications over the network.

2. STRUCTURE OF THE SLOOP SYSTEM

Figure 1 summarizes the system structure. The bottom layer deals with communication algorithms, mapping strategies (both static and dynamic), and overlapping communication/computations.


Fig. 1. The SLOOP system.

These primitives are used to program the middle layer, a parallel object-oriented language (C++//, an extension of C++), which in turn offers reusability, flexibility, and extensibility in concurrent programming. We achieve these features with polymorphism between objects and processes: a variable statically defined with an object type (not a process) can dynamically reference a process.

Finally, the third level is devoted to the development of a new discrete event simulation system based on the object paradigm. This system is currently being designed and implemented with two main concerns in mind: (1) performance on conventional, networked, and multiprocessor machines, and (2) versatility and ease of use. The distributed version of the simulator uses the C++// language.

Altogether, SLOOP is not a self-contained system since we are using external components in order to program each level. At the bottom, the interconnection network layer uses communication primitives such as PVM [37]. The simulations environment layer requires a statistics library in order to analyze results [3], and needs some persistence to deal with the large amount of data being manipulated. The parallel object-oriented language level would benefit from resilience; dealing with long and distributed simulations requires the ability to handle the failure of one or several machines without restarting the entire execution.

Our system builds and uses primitives, components, or libraries for its implementation, but in order to be as flexible as possible, we try to give the final user some control over the building blocks and algorithms being used, for instance in order to map the system onto a specific architecture. This trend is sometimes referred to as open implementation [24].

3. PARALLEL OBJECT-ORIENTED LANGUAGE

We defined an extension of C++ called C++//; this work continues the work done on the design of Eiffel// [14]. While Eiffel// offers a specific model of parallel programming, with C++// we defined a first layer which is a set of language primitives, independent of any parallel paradigm, and which permits us to build libraries of nearly all concurrent programming models. Indeed, such language primitives constitute a meta-object protocol (MOP) [24]. Once a MOP mechanism for C++ is defined, various libraries of concurrent programming models can be designed and implemented; the model presented here is an MIMD model based on data-flow synchronizations (wait-by-necessity), but other paradigms are possible, and likely to be defined in the future.

In the remainder of this section, we first present the parallel model and language we use for most of our applications, and especially for programming the simulation classes and their parallel execution. Then, we present our environment, and, without going into the details of the implementation, we give an overview of the MOP.

3.1. BASIC MODEL OF CONCURRENCY

The following subsections deal with dimensions of the programming model that are quite recurrent in parallel programming: parallel activities (processes), communication between processes, synchronization, and control programming. We shall adopt an MIMD model without shared memory, which means no shared objects in an object-oriented framework.

3.1.1. Processes

One of the object-oriented breakthroughs is the unification of module and type aspects into one construct: the class. When it comes to adding parallelism, another unification is to bring together the concepts of class and process:

Model: the process structure is a class; a process is an object executing a prescribed behavior.

However, not all objects are processes. At run-time, we find two kinds of objects:

- process (process object, or active object): active by itself, with its own thread of control,

- object (passive object): a normal object waiting for a call to execute its routines.

An example of objects and processes at run-time is given in Figure 2.

[Figure omitted. Legend: sub-system; process/active object; object; pointer; void pointer; member.]

Fig. 2. Processes and objects at run-time.


At the language level, there are two ways to generate active objects. In the first one, an active object is obtained from the instantiation of a standard sequential C++ class:

    A* p;                                       // A is a normal sequential class
    p = (A*) new Process_alloc(typeid(A), ...);

In that case, a standard sequential class is instantiated as an active object and is given a FIFO synchronization. The Process_alloc class is part of the C++// library, while typeid is the standard C++ run-time type identification (RTTI) operator. We refer to this technique as allocation-based; it produces an allocation-based process, or allocation-based active object. The allocation style is convenient but limited, because it only allows us to create processes with a FIFO behavior.

The second technique, which we call class-based, is more general:

Model: all objects which are instances of a class that publicly inherits from the class Process are processes.

The Process class is also part of the C++// library. Processes are objects which are instances of subclasses of the Process class. In that case, the programmer has to define a specific class in order to obtain a process. As with the allocation-based technique, the Process class gives a default behavior which is FIFO; however, as we will see in the following sections, it is possible to change this default activity. The class-based technique generates class-based processes, or class-based active objects. A class that inherits from Process is called a process class:

    class Parallel_A: public A, public Process {
        ...
    };

    Parallel_A* p;
    p = new Parallel_A(...);

A process is sequential, that is, single-threaded. We believe that single-threaded processes are better adapted to reuse, not to mention the correctness problems of multithreaded processes. Accordingly, a process has a unique thread of control for the definition of its activity.

The model does not allow multithreaded processes at the language level, but this does not prevent multithreading at the implementation level: sequential processes can be implemented with multithreaded operating system processes for the sake of lightweightness.
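To make the notion of a single-threaded process with a default FIFO behavior more concrete, the following is a minimal sketch in standard C++ (threads and a mutex-protected queue). It is not the C++// implementation: the names ActiveObject, send, live, and serve_in_order are hypothetical, and requests are modeled as plain closures rather than reified member function calls.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a process: one thread that serves its
// pending requests one at a time, in FIFO order.
class ActiveObject {
public:
    ActiveObject() : worker_([this] { live(); }) {}

    ~ActiveObject() {
        {
            std::lock_guard<std::mutex> lock(m_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();  // requests still pending are served before the thread ends
    }

    // Asynchronous call: the request is queued and the caller returns at once.
    void send(std::function<void()> request) {
        {
            std::lock_guard<std::mutex> lock(m_);
            pending_.push_back(std::move(request));
        }
        cv_.notify_one();
    }

private:
    // The process body: default FIFO policy.
    void live() {
        for (;;) {
            std::function<void()> req;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stop_ || !pending_.empty(); });
                if (pending_.empty()) return;  // stop requested, nothing left
                req = std::move(pending_.front());
                pending_.pop_front();
            }
            req();  // served outside the lock, never concurrently with another request
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> pending_;
    bool stop_ = false;
    std::thread worker_;  // declared last: the other members exist before the thread starts
};

// Sends 1..n to an active object and returns the order in which they were served.
std::vector<int> serve_in_order(int n) {
    assert(n >= 0);
    std::vector<int> served;
    {
        ActiveObject p;
        for (int i = 1; i <= n; ++i)
            p.send([&served, i] { served.push_back(i); });
    }  // the destructor joins the thread: every request has been served here
    return served;
}
```

Because a single thread drains the queue, the state touched by the requests needs no further locking, which is the reuse argument made above for single-threaded processes.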

3.1.2. Communication

A process is an object; it has member functions. When an object owns a reference to a process, it is able to communicate with it: call one of its public members. This basic object-oriented mechanism is the inter-process communication (IPC) mechanism:

Model: communications between active objects are syntactically programmed as member function calls.

Using this idea, introduced in particular by the Actors model [20, 1], the syntax of an IPC is unified with a standard call; what is sometimes called a process entry point is identical to a normal routine or member function:

    p->f(parameters);

While this principle is widely recognized and used, many divergences appear when it comes to defining the semantics of such IPCs. In C++//, the calls are implicitly (by default) asynchronous:

Model: communications are asynchronous between processes.

Function calls between passive objects remain synchronous, as in standard sequential C++. This choice allows and encourages parallel execution of objects, and makes the code of each process more independent and more self-contained, mainly because of the asynchrony. A synchronous function call is also possible, but it must be stated explicitly in the function call itself or in the process class definition.

In Figure 2, we can observe that a system is structured into independent asynchronous subsystems labelled (i) to (v): all the communications between subsystems are asynchronous [labelled (1) and (2)], while others remain synchronous [(3) to (6)].

If two processes refer to the same object, routine calls to this object may overlap. To address this issue, the choice has been made that each nonprocess object is a private object, accessible to exactly one process; there are no shared passive objects. Only one thread of control has access to a passive object: the process that directly or indirectly refers to it. We say that the object belongs to the subsystem of this process.

The programming model ensures the absence of sharing: the semantics of communication between processes is a copy semantics for passive objects. All parameters are automatically transmitted by copy from one process to another (deep copy of the object). Of course, active objects remain subject to the reference semantics: all processes are always transmitted by reference. The implementation automatically and transparently handles the marshalling of data and pointers implied by this strategy. Figure 2 demonstrates the absence of shared objects: each passive object is accessible by one and only one active object; a subsystem is one active object and all the passive objects it can reach.
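The difference between copy semantics for passive objects and reference semantics for processes can be sketched in plain C++ as follows. This is only an illustration under simple assumptions: Passive, Active, Message, and make_message are hypothetical names, and the automatic marshalling performed by C++// is reduced here to an explicit copy.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Passive { std::vector<int> data; };  // hypothetical passive object
struct Active  { int served = 0; };         // hypothetical process

// What travels with a request: a private deep copy of each passive
// argument, and a shared reference to each process argument.
struct Message {
    std::unique_ptr<Passive> arg;   // copy, owned by the receiving subsystem
    std::shared_ptr<Active>  peer;  // reference semantics: same process on both sides
};

Message make_message(const Passive& arg, std::shared_ptr<Active> peer) {
    assert(peer != nullptr);
    return Message{std::unique_ptr<Passive>(new Passive(arg)), std::move(peer)};
}
```

After the call, the receiver can modify its copy freely without affecting the sender, whereas any change made through peer is visible to every holder of the reference.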

3.1.3. Synchronization

A simple rule permits us to address synchronization: wait-by-necessity.

Model: a process is automatically synchronized, i.e., it waits, when it attempts to use the result of a member function call that has not been returned yet.

When starting an asynchronous function call, the caller does not wait for the return value until that value is explicitly used in some computation. If the value has not been returned at that point, a wait is automatically triggered until it has been. This mechanism implicitly adds synchronization between processes. In addition, two primitives (Wait and Awaited) provide explicit synchronization:

    v = p->f(parameters);
    ...
    v->foo();               // Automatically triggers a wait
                            // if v is awaited
    ...
    if (Awaited(v)) ...     // Test the status of v
    ...
    Wait(v);                // Explicitly triggers a wait
                            // if v is awaited
    ...
    obj->g(v);              // No wait if pointer access
    v2 = v;

The result of a function call that has not yet returned is called an awaited object. A wait occurs only when one needs to access an awaited object itself, syntactically a pointer access to the object, or a transmission (a copy) to another process.

Wait-by-necessity is an implicit, user-transparent, future mechanism; it can be related to the future concept found in several languages: Act1 [27] and the primitive Hurry, ConcurrentSmalltalk [40] with the CBox objects, ABCL/1 [41] with the future type message passing. However, the important difference is that the mechanism presented here is systematic and automatic, reflected in the absence of any special syntactic construction. A quite close approach is used in the Mentat language [19]. However, data-driven synchronizations, which are based on information generated by the Mentat compiler, seem to be limited to user-defined Mentat classes.
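For readers more familiar with explicit futures, the following sketch approximates wait-by-necessity with the standard C++ std::future and std::async. It is an analogy only: in C++// the mechanism is transparent and needs no wrapper type, whereas the Awaitable class and the square_async_demo function below are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <utility>

// Hypothetical wrapper around the result of an asynchronous call.
// The value is "awaited" until the call has returned.
template <typename T>
class Awaitable {
public:
    explicit Awaitable(std::future<T> f) : fut_(std::move(f)) {}

    // Counterpart of Awaited(v): true while the result is not available.
    bool awaited() const {
        return !ready_ &&
               fut_.wait_for(std::chrono::seconds(0)) != std::future_status::ready;
    }

    // Counterpart of Wait(v): explicitly block until the result is available.
    void wait() { fetch(); }

    // Access to the value itself: blocks only if the value is still awaited.
    T& operator*()  { fetch(); return value_; }
    T* operator->() { fetch(); return &value_; }

private:
    void fetch() {
        if (!ready_) {
            value_ = fut_.get();
            ready_ = true;
        }
    }

    std::future<T> fut_;
    T value_{};
    bool ready_ = false;
};

// The caller continues after issuing the call and waits only at the first use.
int square_async_demo(int x) {
    Awaitable<int> v(std::async(std::launch::async, [x] { return x * x; }));
    int unrelated = x + 1;  // no wait: v is not used here
    return *v + unrelated;  // first use of the value: wait if necessary
}
```

Moving an Awaitable around (for example, storing it in a container) does not block; only dereferencing it does, which mirrors the rule that a wait occurs only on access to the awaited object itself.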

3.1.4. Control Programming

This section deals with the programming of the processes: the definition of the behavior of active objects and the synchronization between the services (public members) of a process. We call this control programming, as it consists of programming their thread of control.

The basic mechanism to program the process behaviors in the C++// language is:

Model: a centralized and explicit control.

The explicit control programming consists in the definition of the Live routine of the Process class and its heirs (Fig. 3), using all the classical sequential control structures of the language (for, while, if, ...); all the expressive power of C++ is available, without any limitation.

    class Process {
    public:
        Process(...) {              // Process creation
        }
    protected:
        virtual void Live() {       // The process body
                                    // Default behavior: FIFO policy
        }
    };

Fig. 3. The Process class.

Besides explicit control, other features are needed in order to effectively achieve the programming of process control.

First, the programming of a process thread of control often consists in defining the synchronization of its public member functions. Such an activity implies a dynamic manipulation of C++ functions. Accordingly, we need:

Model: member functions as first class objects.

A full-fledged first class mechanism is not needed, but some limited features are necessary, such as the ability to use routines as parameters, systemwide valid function identifiers, etc. At the language level, we provide the primitive mid(), which returns function identifiers, with the following usage:

    member_id f;

    f = mid(put);
    f = mid(A::put);
    f = mid(A::put, A::get);
    f = mid(A::put(int, P*));

In order to deal with overloading, this function returns either a single identifier, or a representation of all adequate functions.

Because we need to explicitly program the service of requests, we must be able to manipulate them as objects (passing them as parameters of other functions, assigning them to variables, ...), which in turn requires:

Model: requests as first class objects.

In C++//, a particular class (Request) models the requests; every request is an instance of this class. Finally, to be able to fully control and program the service of requests, either for programming abstractions or service routines, we need a complete:

Model: access to the list of pending requests.

With these three basic features, it is possible to program a complete library of service routines [9]. Part of the library is shown in Figure 4, where f and g are member identifiers obtained from the function mid() introduced in the previous section. Most of the programming of service routines actually requires accessing the list of pending requests.

    Non-blocking services:
        serve_oldest();             // Serve the oldest request of all
        serve_oldest(f);            // The oldest request on f
        serve_flush(f);             // The oldest on f, wipe out the others

    Blocking services: wait until there is actually a request to serve
        bl_serve_oldest();          // Serve the oldest request of all
        bl_serve_oldest(f, g, ...); // Serve the oldest request on f or g
        bl_serve_flush(f);          // ...

    Timed blocking services: blocking, but wait for a limited time only
        tm_serve_oldest(f, t);      // The oldest request on f
        tm_serve_flush(t);          // ...

    Information retrieval:
        exist_request();            // Return true if a pending request exists
        exist_request(f);           // True if a pending request exists on f

    Waiting primitives:
        wait_a_request();           // Wait until there is a request to serve
        wait_a_request(f);          // Wait for a request to serve on f

Fig. 4. A library of service routines.

Service routines are defined in the class Process, and are ready for use when programming the Live routine. There is actually no limitation on the range of facilities that can be encapsulated (e.g., selection based on parameters of the requests, rescheduling of requests). Moreover, if the programmer does not find the particular selection function he needs, he is able to program it by directly scanning the list of pending requests, which is made available through a member of Process. Thus, libraries of service routines, specific to programmers or to a particular application domain (for instance, simulation), can be defined. Part of these possibilities come from the MOP (see Section 3.2.3).

    class Buffer: public Process, public List {
    protected:
        virtual void Live() {       // The process body
            while (!stop) {
                if (!full)
                    serve_oldest(mid(put));
                if (!empty)
                    serve_oldest(mid(get));
            }
        }
    };

Fig. 5. A bounded buffer.

As an illustration, within the framework of an explicit control programming, a bounded buffer will be defined as in Figure 5.
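The explicit control style of Figure 5 can be imitated in standard C++ with a small, single-threaded simulation of the pending request list. This is a sketch under stated assumptions rather than C++// code: BufferSketch, Request, post, and step are hypothetical names, member identifiers are replaced by strings, and a request on get simply records the delivered value instead of returning it to a caller.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical reified request: the member to call and its argument.
struct Request {
    std::string member;  // stands in for a member_id such as mid(put)
    int arg;             // only meaningful for "put"
};

class BufferSketch {
public:
    explicit BufferSketch(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Asynchronous call from a client: the request is only queued.
    void post(const Request& r) { pending_.push_back(r); }

    // One iteration of the loop body of Figure 5.
    // Returns false when no request could be served.
    bool step() {
        bool served = false;
        if (!full())  served = serve_oldest("put") || served;
        if (!empty()) served = serve_oldest("get") || served;
        return served;
    }

    const std::vector<int>& delivered() const { return delivered_; }
    std::size_t pending() const { return pending_.size(); }

private:
    bool full() const  { return items_.size() == capacity_; }
    bool empty() const { return items_.empty(); }

    // Non-blocking service: serve the oldest pending request on `member`, if any.
    bool serve_oldest(const std::string& member) {
        for (auto it = pending_.begin(); it != pending_.end(); ++it) {
            if (it->member != member) continue;
            Request r = *it;
            pending_.erase(it);
            if (r.member == "put") {
                items_.push_back(r.arg);
            } else {
                delivered_.push_back(items_.front());
                items_.pop_front();
            }
            return true;
        }
        return false;
    }

    std::size_t capacity_;
    std::deque<int> items_;
    std::deque<Request> pending_;
    std::vector<int> delivered_;
};
```

A get that arrives while the buffer is empty stays in the pending list and is served only once a put has been accepted, which is the selective service that access to the list of pending requests makes possible.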

3.1.5. Library of Abstractions

Using the basic features defined in the previous section, it is possible to program what is usually a built-in feature in parallel languages [10-12]: abstractions for concurrency control.

An abstraction is a specific framework for expressing concurrency control. A lot of implicit control frameworks exist (path expressions [8], synchronization counters [36, 30], behavior abstractions [22], Synchronizers [2], just to name a few), each one having a different expressive power and different properties. Here, a decisive advantage is the possibility within C++// to design and program implicit control frameworks. Such abstractions can be put into libraries to offer a wide choice of styles for programming the synchronization of processes.

In order to define a new abstraction, one inherits from the Process class, and sets up a specific framework. In order to exploit the abstraction, the final user inherits from it instead of the Process class when defining an active object.

For instance, we can program a simple abstraction whose principles are (1) a blocking condition (a function returning true or false) is associated with each public function, (2) a function is not served when its blocking condition is true. A class named Abst_Process can define such a framework, with a function associate to permit the user to specify a blocking condition for a public function. Using this abstraction, the synchronization of a process will be defined within the synchronization function: the process body, replacing the Live routine.

Using the abstraction above, a bounded buffer can be programmed in an implicit style:

    class Buffer: public Abst_Process, public List {
    protected:
        virtual void synchronization() {    // The process body
            associate(mid(put), mid(full));
            associate(mid(get), mid(empty));
        }
    };

In that case, the definition of the buffer is much more abstract than the previous one; here, the programming is made in a declarative manner, and is nondeterministic.
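One possible way to realize such a blocking-condition abstraction in standard C++ is sketched below. It is an assumption-laden illustration, not the Abst_Process class of the C++// library: AbstProcessSketch, ImplicitBuffer, define, post, and serve_one are hypothetical names, member identifiers are strings, and the generic process body is driven by explicit calls instead of running in its own thread.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical abstraction: each service may be given a blocking condition,
// and a request is not served while its condition is true.
class AbstProcessSketch {
public:
    virtual ~AbstProcessSketch() = default;

    void post(const std::string& member, int arg = 0) {
        pending_.push_back({member, arg});
    }

    // Generic process body: serve the oldest request that is not blocked.
    bool serve_one() {
        for (auto it = pending_.begin(); it != pending_.end(); ++it) {
            auto cond = blocked_.find(it->first);
            if (cond != blocked_.end() && cond->second()) continue;  // blocked
            std::pair<std::string, int> r = *it;
            pending_.erase(it);
            services_.at(r.first)(r.second);
            return true;
        }
        return false;
    }

protected:
    void define(const std::string& member, std::function<void(int)> body) {
        services_[member] = std::move(body);
    }
    void associate(const std::string& member, std::function<bool()> blocking) {
        blocked_[member] = std::move(blocking);
    }

private:
    std::deque<std::pair<std::string, int>> pending_;
    std::map<std::string, std::function<void(int)>> services_;
    std::map<std::string, std::function<bool()>> blocked_;
};

// The buffer only declares its services and their blocking conditions.
class ImplicitBuffer : public AbstProcessSketch {
public:
    explicit ImplicitBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
        define("put", [this](int x) { items_.push_back(x); });
        define("get", [this](int) {
            out_.push_back(items_.front());
            items_.pop_front();
        });
        associate("put", [this] { return items_.size() == capacity_; });  // full
        associate("get", [this] { return items_.empty(); });              // empty
    }
    const std::vector<int>& delivered() const { return out_; }

private:
    std::size_t capacity_;
    std::deque<int> items_;
    std::vector<int> out_;
};
```

The service policy lives entirely in the base class, so a different abstraction (for example, one with priorities) could be swapped in without touching the declarations of put and get.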

Figure 6 illustrates the possibility to define and program a library of abstractions. Such a mechanism is of first importance regarding simulation: specific frameworks, dedicated to the synchronization of distributed simulations, are necessary and can be programmed.

3.2. C++// ENVIRONMENT

This section describes the set of facilities supporting the development of C++// applications. Currently these are the compilation of source code together with executable generation, a mechanism for mapping active objects onto machines, and the set of MOP techniques which make the system open and user-extensible.

3.2.1. Compilation

Compilation is achieved solely by a preprocessing of the source files. The preprocessor does not modify the user classes, but only generates extra code, in separate files.


[Figure omitted. Legend: class; inherit from.]

Fig. 6. Library of abstractions.

For each user file, a corresponding C++ file is generated; it contains stub and proxy classes for the user classes. Then, these generated files and the original user files are compiled with a standard C++ compiler (preprocessing, compilation, assembly), and finally all files are linked together with a specific C++// library; Figure 7 illustrates this compilation scheme.

In order to process his source files, the user applies the command c++ll. For each source file (e.g., named file), code generation is actually achieved in two phases which are transparent to the user. The first one analyzes the source and generates an information file (named file-ll) in a directory called .c++ll underneath the directory of the original file. The second phase generates a C++ file (file-ll.cc).

3.2.2. Mapping

The mapping assigns each active object created during the execution of a C++// program to (1) an actual machine (or processor), and (2) an operating system process.


[Figure omitted. Labels: C++// classes (files); generated C++; standard C++ compilation.]

Fig. 7. Compilation of a C++// system.

In order to avoid confusion, the active object (actually, one active object and all the passive objects that belong to it, i.e., a subsystem) is called here a language process (a concept of the language), while we term OS process the usual notion of an operating system process.

The mapping of a language process to a pair (actual machine, OS process) is controlled and defined by the programmer through the association of two criteria:

Model:

(1) the machine where the language process is to be created,
(2) its lightweight or heavyweight nature.

The machine itself can be specified in two ways. The first method is to specify a virtual machine name, which is simply a character string. This name is related to an actual machine name through a specific file named .c++ll-mapping.

The alternative technique used to specify the machine makes use of an already existing language process: the new process is created on the same machine where that language process is running. With this technique, processes can be linked together, ensuring locality. A process is anchored to another one: its mapping will automatically follow that specified for the process it is anchored to. An anchor can transitively reference another anchored process.

The lightweight switch permits creation of several language processes inside a single OS process. In the case of heavyweight, only one language process is mapped into an OS process.

The user accesses these mapping possibilities through a special class named Mapping:

    class Mapping {
    public:
        on_machine(const String& m);    // Set a virtual machine name
        with_process(Process* p);       // Set the machine to be the same as
                                        // for the already existing process p

        set_light();                    // Set to lightweight process
        set_heavy();                    // Set to heavyweight process
                                        // (OS process)
    };

When creating a language process, an object of type Mapping can be passed to the allocator (new) in order to specify the desired mapping of the process to be created.

From the basic functionalities, more sophisticated mapping strategies can be developed, deriving heirs from the class Mapping; for instance, cluster classes allowing the gathering and managing of processes in a more abstract manner, especially in the framework of distributed simulations. In the longer term, we aim to develop automatic or semiautomatic load balancing classes, for instance, through the modeling and evaluation of machine and network load.
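The resolution of a mapping, including transitive anchors, might look like the following sketch. It makes several assumptions that are not in the text: the Placement structure and the resolve_machine function are hypothetical, the content of the mapping file is represented by an in-memory table from virtual to actual machine names, and cycles among anchors are assumed not to occur.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical placement record for one language process.
struct Placement {
    std::string virtual_machine;        // used only when there is no anchor
    const Placement* anchor = nullptr;  // "same machine as this existing process"
    bool lightweight = false;           // several language processes per OS process
};

// Follows anchors transitively, then translates the virtual machine name
// with the table that a mapping file would provide.
std::string resolve_machine(const Placement& p,
                            const std::map<std::string, std::string>& table) {
    const Placement* cur = &p;
    while (cur->anchor != nullptr) cur = cur->anchor;  // assumes no cycle
    auto it = table.find(cur->virtual_machine);
    assert(it != table.end());
    return it->second;
}
```

Keeping the virtual name separate from the actual machine name means that a program can be moved to another set of machines by editing the table only, without recompiling.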

3.2.3. A reflection-based system

The C++// system is based on a meta-object protocol [24]. There are various MOPs, for different languages and systems, with various goals, compilation and run-time costs, and various levels of expressiveness. Within our context, we use a reflection mechanism based on reification. Reification is simply the action of transforming a call issued to an object into an object itself; we say that the call is "reified." From this transformation, the call can be manipulated as a first class entity: stored in a data structure, passed as parameter, sent to another process, etc.

MOP techniques have been used in many works in order to have an elegant modeling of various language concepts, and an extensible design and implementation [6, 7, 9, 13, 16, 28, 38, 39]. Works using more traditional methods such as the so-called proxy generators are sometimes very close techniques [5].

Within C++//, the first principle consists in giving access to the calls issued to an object through a special class, called Reflect, which presents the following behavior:

Model: all classes inheriting publicly, directly or indirectly, from Reflect are called reified classes: a reified class has reified instances, and all calls issued to a reified object are reified.

The fact that all calls issued to a reified object are reified is important regarding reusability: it permits one to take a normal class, and to globally modify its behavior, to transform it into a process, for instance.

The Reflect class implements the reflection mechanism with reification:

    class Reflect {
    protected:
        virtual void reify(Call* c) {   // A call reification
            c->execute();
        }
    public:
        Reflect(type_info t, ...) {
        }
    };

Creating a Reflect object returns a meta-object (a proxy) for the type passed as the first parameter of the constructor; type_info is the standard RTTI class of C++.


All the calls issued to this object will trigger the execution of the member function reify with the appropriate object of type Call as a parameter:

    class Call {
    public:
        virtual void execute();
        List<Any>* eff_params;      // effective parameters
        member_id m;                // member to be called
        Any object;                 // target object
        Any result_place;           // Result address
    };

The instances of the Call class are the reified calls, the objects which represent the reification of calls.

From these elementary mechanisms, we implement the basic classes of the programming model presented in Section 3.1 (Process_alloc, Process). For instance, the class Request modeling remote calls between processes is defined as an heir of Call.

Such a system is an open system, sometimes called open implementation [24], extensible by the end-user, and adaptable to various needs and situations (defining libraries of various concurrent programming models, debugging environment and tracing, etc.). In our case, it is important regarding the various customizations needed for distributed simulations (Section 4), and efficient implementation of communications (Section 5).
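The idea of reification, and the way an heir can change what happens to a call, can be illustrated with a short sketch in standard C++. This is not the C++// MOP: CallSketch, ReflectSketch, DeferringSketch, and invoke are hypothetical names, and the proxy generation done by the preprocessor is replaced here by an explicit invoke function that receives the call as a closure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reified call: the member name and a closure that performs it.
struct CallSketch {
    std::string member;
    std::function<void()> execute;
};

class ReflectSketch {
public:
    virtual ~ReflectSketch() = default;

    // Every call made through the proxy goes through this single entry point.
    void invoke(std::string member, std::function<void()> body) {
        assert(static_cast<bool>(body));
        reify(CallSketch{std::move(member), std::move(body)});
    }

protected:
    // Default meta-behavior: execute the call immediately.
    virtual void reify(CallSketch c) { c.execute(); }
};

// A meta-object that stores the reified calls instead of executing them,
// as a process does with its pending requests.
class DeferringSketch : public ReflectSketch {
public:
    std::size_t pending() const { return calls_.size(); }

    void serve_all() {
        std::vector<CallSketch> batch;
        batch.swap(calls_);
        for (auto& c : batch) c.execute();
    }

protected:
    void reify(CallSketch c) override { calls_.push_back(std::move(c)); }

private:
    std::vector<CallSketch> calls_;
};
```

The caller uses the same invoke in both cases; only the meta-object decides whether the call runs at once or is queued, which is what allows a normal class to be turned into a process without modifying it.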

4. PROSIT: AN OBJECT-ORIENTED FRAMEWORK FOR DISCRETE EVENT SIMULATION

C++// development is driven by reusability and flexibility. These goals are tested again in the development of PROSIT, a discrete event simulation application. Since the programming of simulation applications is very demanding, the C++// definition substantially benefits from this challenging testbed.

4.1. OBJECTIVES

PROSIT is a new event simulation framework, designed from the ground up with distributed simulation in mind. Its design is based on the object paradigm, from which naturally derive several interesting features:

(i) Modularity and reusability: these allow both PROSIT programmers and end-users to develop high-level model libraries. These libraries may include submodels for high-level subsystems, or highly optimized simulation classes for commonly used subsystems.

(ii) Target independence: distributed simulation, in both optimistic [21] and conservative [15] variants, and parallel replication, is implemented in such a way that application programmers do not have to take it into account. Their simulation classes inherit parallel methods only if needed, and the choice of sequential or parallel implementation will only be made at the final compilation stage. Furthermore, simulation classes and user programs are independent of the simulation method used.

(iii) Extensibility: various tools may be incorporated, with little or no changes in basic or user written classes. These tools include statistics gathering and processing, automatic load-balancing, animation, submodel aggregation, analytical solvers, etc.

4.2. MODELING PROCESS

In PROSIT, the user builds his model by initializing and assembling class instances (the design philosophy and the modeling process have been presented in [29]). The whole process is described by Figure 8. The PROSIT framework defines the following base classes:

- simulation classes used to build the simulation engine,

- modeling classes used to program models.

[Figure omitted. Labels: base classes of the PROSIT simulator; classes library for a given problem; model; executable. Roles: PROSIT system; library programmer; final user (simulationist).]

Fig. 8. Modeling process.


To mask the simulation paradigm from the final user of the simulator, a library programmer will define a set of classes for a specific field of application. These classes, gathered in a library, will allow one to build a model at a higher description level than the one corresponding to the simulation paradigm, e.g., by manipulating cashiers instead of FIFO servers in a supermarket simulation. This possibility to define high-level libraries is crucial if we want the simulator to be usable by engineers of the simulation domain. In fact, the simulation paradigm itself is contained in the description and code of the modeling classes, and hence hidden.

4.2.1. Control of execution

In many simulation systems, the user describes a simulation model as being a set of servers which (1) execute some kind of service on the customers they receive, and then, (2) forward the customers to another server. Active entities in the model are the servers; they decide what to do with the customer they are processing. The path of customers inside the model can be obtained only by analyzing all server descriptions to retrieve information about customer transit; this structuring can be named the server architecture.

In our system, we decided to invert the control of execution from the server to the customer: the customers are the active entities in the model, and they decide their own path of transit through the set of servers. In a more object-oriented fashion, the user programs the customer with a body method, which describes the behavior of a customer of that type. We call this structuring a customer architecture, as opposed to the usual server-oriented architecture.

For instance, in a supermarket simulation, the objects entrance, shelving1, ..., shelvingN, exit being servers, the body method of a customer (a supermarket client in that case) could be defined as in Figure 9.

We believe this inversion allows for more reusability. For instance, the servers do not have the path followed by the customers embedded within their code, making both customer and server classes more self-contained, and thus more reusable. Another important issue in simulation being statistics collection (server, end-to-end, and client statistics), the customer architecture provides the user with a frame where client measurement can be placed: the client itself. With the server architecture, these measure points have to be placed into the code of service methods of server objects, which is again incompatible with self-contained clients and servers. Let us note that both architectures may be easily mixed, and are supported in PROSIT.


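class Client : public Active_Customer {
public:
    // The customer itself encodes its path through the servers.
    virtual void body() {
        entrance->enter(...);
        shelving1->serve(...);
        // ... intermediate shelving servers ...
        shelvingN->serve(...);
        exit->exiting(...);
    }
};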

Fig. 9. Partial implementation of a customer class.
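The customer architecture of Fig. 9 can be illustrated by a small self-contained sketch in plain sequential C++. The Server class, its serve() signature, and the path log are hypothetical stand-ins introduced for illustration, not the PROSIT API; only the convention of coding the route in body() follows the figure.

```cpp
#include <string>
#include <vector>

// Hypothetical passive server: it offers a service and knows nothing
// about where the customer goes next.
class Server {
public:
    explicit Server(const std::string& name) : name_(name), served_(0) {}
    void serve(std::vector<std::string>& log) {
        ++served_;              // server-side statistic
        log.push_back(name_);   // client-side trace of the path
    }
    int served() const { return served_; }
private:
    std::string name_;
    int served_;
};

// Customer in the spirit of Fig. 9: the route is coded in body().
class Client {
public:
    Client(Server& entrance, Server& shelving, Server& way_out)
        : entrance_(entrance), shelving_(shelving), way_out_(way_out) {}
    void body() {
        entrance_.serve(path_);
        shelving_.serve(path_);
        way_out_.serve(path_);
    }
    const std::vector<std::string>& path() const { return path_; }
private:
    Server& entrance_;
    Server& shelving_;
    Server& way_out_;
    std::vector<std::string> path_;   // client measurements live in the client
};
```

Because the route and the client measurements are held by Client, the Server class carries no routing code and can be reused unchanged in another model.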


4.2.2. Component specification

A Prosit simulation can be thought of as a collection of concurrently active simulation objects (active simulation objects derive from the Process class) interacting, via service calls, in simulated time. An active simulation object encapsulates both the specification and the behavior (see Section 3.1.4) of the modeled entity. Each simulation entity is characterized by its behavior and by the services it offers. The modeling classes are used to create structurally and functionally identical entities. When coding a simulation, three situations can arise:

(i) The modeling class already exists in the library.

(ii) The modeling class is not in the library, but a base class with the corresponding behavior exists. The simulationist simply has to code a class, inheriting from the simulation class, and having new members (data and function) implementing the services.

(iii) The behavior of the component is specific, therefore the simulationist has to code it. This can be done in a systematic way thanks to the adopted architecture and the facilities provided by object-oriented languages (mainly overloading and redefinition). This new class must define the internal management of arriving requests and the policy for creating and scheduling activities.


4.3. AN OBJECT-ORIENTED CONCURRENT EXECUTION MODEL

4.3.1. Active simulation objects

Active simulation objects are the basic components that make up the model (e.g., in a queuing network model, queues and customers are active simulation objects). An active simulation object executes its main activity in an autonomous way, independently of, and concurrently with, other active simulation objects. Active simulation objects can also have secondary activities. The activities are all running concurrently in the simulated time (Fig. 10).

We have extended the basic concurrency model of C++// (see Section 3.1) by introducing multiactive objects specially designed for simulation. Multiactive objects are made up of activities, implemented using a coroutine-like facility developed at the University of Washington [23].

When created (using standard C++// creation rules), the active simulation object is automatically managed by the simulation kernel, and is ready to be activated. Its activation time is indicated at creation as a parameter of the constructor. When activated, the object begins to execute its main activity.

During its life (Fig. 11), the object can either be in a running mode (currently executing or consuming time), in a blocked mode (the main activity is blocked due to a synchronous request), or in a sleeping mode (it has put itself in idle-wait state, waiting to be reactivated by another object).

Fig. 10. Basic active object architecture.

Fig. 11. Possible states for an active simulation object.

The object is considered to be finished when its main activity has terminated. We distinguish normal termination, abort (the object decided to terminate prematurely), and external abort (another object has used the termination primitive).

To be active, an object must be an instance of a class inheriting directly or indirectly from the simulation class Sim_Process, a process class that itself inherits from the C++// Process class. Such a class-based process must at least define the main activity of its instances.
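The life cycle described above can be written down as a small state machine. This is a sketch only: the names Mode, Termination, and SimObjectState are hypothetical, and the set of allowed transitions is our reading of the prose (the content of Fig. 11 is not reproduced here), not the PROSIT kernel.

```cpp
#include <stdexcept>

// Modes named in the text; Finished covers the three termination kinds.
enum class Mode { Ready, Running, Blocked, Sleeping, Finished };
enum class Termination { None, Normal, Abort, ExternalAbort };

class SimObjectState {
public:
    SimObjectState() : mode_(Mode::Ready), how_(Termination::None) {}

    // The kernel activates a freshly created object at its activation time.
    void activate() { require(mode_ == Mode::Ready);   mode_ = Mode::Running; }
    // A synchronous request blocks the main activity.
    void block()    { require(mode_ == Mode::Running); mode_ = Mode::Blocked; }
    // The object puts itself in an idle-wait state.
    void idle()     { require(mode_ == Mode::Running); mode_ = Mode::Sleeping; }
    // Service completed (Blocked) or reactivation by another object (Sleeping).
    void resume() {
        require(mode_ == Mode::Blocked || mode_ == Mode::Sleeping);
        mode_ = Mode::Running;
    }
    // Assumption: termination is accepted from any state except Finished.
    void finish(Termination how) {
        require(mode_ != Mode::Finished && how != Termination::None);
        mode_ = Mode::Finished;
        how_ = how;
    }

    Mode mode() const { return mode_; }
    Termination termination() const { return how_; }

private:
    static void require(bool ok) {
        if (!ok) throw std::logic_error("illegal state transition");
    }
    Mode mode_;
    Termination how_;
};
```

Rejecting illegal transitions with an exception keeps the sketch short; a real kernel would more likely treat them as internal errors of the scheduler.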

4.3.2. Activities

Activities are used to execute member functions of active simulation objects concurrently. The main activity executes the behave() function, and secondary activities are attached to other functions. For example, in a queuing network simulation, customer behaviors and service executions are activities.

An activity has a duration in the simulated time and can halt and be later reactivated (Fig. 12). Between the suspension and the reactivation, some predefined or random time may have elapsed. An activity will halt either when explicitly consuming time (wait()) or when making a synchronous request to an active simulation object.

An activity terminates when the corresponding function finishes. We have also implemented primitives allowing an activity to terminate itself explicitly or to kill another activity.
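The suspension and reactivation of activities in simulated time can be sketched without coroutines by cutting an activity into steps. In this minimal sketch, which is not the PROSIT implementation, each step returns the simulated delay to consume before the next step, or a negative value to terminate; Step and Scheduler are hypothetical names.

```cpp
#include <functional>
#include <queue>
#include <vector>

// One step of an activity: runs up to the next suspension point and
// returns the simulated delay to wait, or a negative value to terminate.
typedef std::function<double()> Step;

class Scheduler {
public:
    Scheduler() : now_(0.0), seq_(0) {}

    // Registers an activity whose first step runs at simulated time `at`.
    void start(Step step, double at) { push(at, step); }

    // Runs until no activity is left; returns the final simulated time.
    double run() {
        while (!agenda_.empty()) {
            Entry e = agenda_.top();
            agenda_.pop();
            now_ = e.time;             // jump to the reactivation time
            double delay = e.step();   // execute up to the next suspension
            if (delay >= 0.0) push(now_ + delay, e.step);
        }
        return now_;
    }

    double now() const { return now_; }

private:
    struct Entry {
        double time;
        unsigned long seq;   // insertion order breaks ties between equal times
        Step step;
        bool operator>(const Entry& o) const {
            return time != o.time ? time > o.time : seq > o.seq;
        }
    };
    void push(double t, Step s) {
        Entry e = { t, seq_++, s };
        agenda_.push(e);
    }
    double now_;
    unsigned long seq_;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > agenda_;
};
```

A coroutine facility such as the one cited in the text removes the need to cut the activity by hand: the function simply calls wait() and is resumed at the same point.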

An activity is an instance of the simulation class Sim_Activity. This class is specially defined for simulation purposes, and is used for the definition of the Sim_Process class.

Fig. 12. Noncontinuous execution of an activity. [Figure labels: activity execution; suspension point; reactivation.]

4.3.3. Interaction

The basic interaction between active simulation objects is the service call; C++// unifies member function calls and service calls. A service call can be seen as sending a request to an active simulation object. The call can either be synchronous (the calling activity blocks until termination of the service) or asynchronous (the calling activity carries on immediately).

The request management policy and the service progress depend on the receiving object characteristics. The receiver decides whether, when, and how to execute the service. This means that a request can be fully or partly served or even not served at all.

We also distinguish two levels of parameters in a service call: (1) the service parameters which are related to the model semantics, (2) the control parameters related to the execution, by the active simulation object, of the service (priority, maximum execution time, etc.).

With regard to interaction between simulation objects, the C++// capability to manipulate requests as first-class objects is very important. It makes it possible to define specific request classes, inheriting from the basic C++// one (Request), that add the various control parameters needed for the simulation.
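As an illustration of requests handled as first-class objects, the sketch below derives a simulation request that carries control parameters, and a receiver that decides the order of service and may refuse a request. The Request class here is a stand-in written for this example, not the actual C++// class, and SimRequest, Receiver, and the drop rule are hypothetical.

```cpp
#include <queue>
#include <string>
#include <vector>

// Stand-in for the basic request class; only the service name is modelled.
class Request {
public:
    explicit Request(const std::string& service) : service_(service) {}
    virtual ~Request() {}
    const std::string& service() const { return service_; }
private:
    std::string service_;
};

// Simulation request: adds control parameters (priority, maximum
// execution time) to the request.
class SimRequest : public Request {
public:
    SimRequest(const std::string& service, int priority, double max_time)
        : Request(service), priority_(priority), max_time_(max_time) {}
    int priority() const { return priority_; }
    double max_time() const { return max_time_; }
private:
    int priority_;
    double max_time_;
};

// The receiver decides whether, when, and how to execute the service.
class Receiver {
public:
    void post(const SimRequest& r) { pending_.push(r); }

    // Serves pending requests by decreasing priority. A request whose time
    // budget is smaller than `cost` is dropped, i.e., not served at all.
    std::vector<std::string> serve_all(double cost) {
        std::vector<std::string> served;
        while (!pending_.empty()) {
            SimRequest r = pending_.top();
            pending_.pop();
            if (r.max_time() >= cost) served.push_back(r.service());
        }
        return served;
    }

private:
    struct ByPriority {
        bool operator()(const SimRequest& a, const SimRequest& b) const {
            return a.priority() < b.priority();
        }
    };
    std::priority_queue<SimRequest, std::vector<SimRequest>, ByPriority> pending_;
};
```

Keeping the control parameters in the request rather than in the service signature leaves the service parameters purely model-related.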

4.4. SEQUENTIAL AND DISTRIBUTED IMPLEMENTATION

The PROSIT simulation and execution models are now completely specified. We have also developed a sequential C++ version of the simulator [33, 34]. In order to illustrate possible applications and to validate our propositions, we have designed a library for the queuing network paradigm and we are starting to work on a Petri-net library.


Regarding the distributed execution, we focus our investigation in two directions. First, we are extending our execution model in order to resolve the problems arising when distributing the execution of the active simulation objects (modification of the synchronization algorithms, model partitioning, active simulation object migration, load balancing, etc.). Second, we are actively collaborating in the development of the C++// language, in order both to validate its implementation and to ensure that distributed simulation in C++// will get optimal performance and ease of use.

5. DISTRIBUTED IMPLEMENTATION OF C++//

The implementation of the C++// language on distributed architectures requires an appropriate run-time for the creation of active objects and the communications between them. Moreover, we require this run-time to be easily portable to various targets, including networks of heterogeneous workstations. This implies that all the run-time specific parts dedicated to interprocess communications and process creation have to be achieved through the use of standard and portable communication libraries, e.g., PVM.

We will first present the various asynchronous communication protocols used for C++//, then describe their implementation as a library, and finally point out how this library can be used in other contexts.

5.1. COMMUNICATION PROTOCOLS

Data exchange between active objects (instances of classes deriving from the system class Process, so-called language processes) can follow three different protocols, all being asynchronous:

(i) communication with rendezvous: the sender is blocked until (1) a rendezvous is made with the receiver, which is interrupted in order to enter the rendezvous, and (2) the request is transmitted into the receiver context.

(ii) reactive communication: the communication is nonblocking for the sender, but the receiver is also interrupted when a new request arrives (no rendezvous).

(iii) proactive communication: the communication is nonblocking for the sender, and the receiver is not interrupted; some explicit receiving action has to be undertaken by the receiver in order to get the newly arrived request(s).

The location of an object is provided by means of a parameter for the allocator of the object class. At the present time, this location is completely settled by the programmer, using the Mapping class, which defines the location for an active object. It is conceivable to compute this location at run-time, using a load regulation tool suited to the target application domain, for instance discrete event simulation.

5.2. C++// COMMUNICATION LIBRARY

Each of the communication protocols is implemented in the form of a communication port using a C++ class. The organization of these classes using an inheritance graph (see Fig. 13) enables one to easily add a new communication protocol.

The interface of each type of port has a sending function and a receiving function (see Fig. 14).

In order to block a sender, a classical acknowledgment mechanism is used. The implicit waiting for a message by the recipient object (which is the case in the rendezvous and reactive communication modes) requires one to use a signal mechanism and an associated handler. The main role of the handler is to invoke the member function receive(...) of the target object. The receiver never calls its receive function. On that account, the object is never blocked while waiting for a message, because messages are put in its context by the handler, in a transparent way.
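The difference between the proactive and the reactive modes can be sketched inside a single address space. This is an illustrative simplification, not the C++// library: Message is reduced to a string, the signatures differ from those of Fig. 14, a callback stands in for the signal handler, the inheritance between the two ports is an assumption, and the blocking acknowledgment of the rendezvous mode is not modelled.

```cpp
#include <deque>
#include <functional>
#include <string>

typedef std::string Message;   // hypothetical payload type

// Generic port, reduced to one address space.
class Port {
public:
    virtual ~Port() {}
    virtual void send(const Message& m) = 0;   // called by the sender
    virtual bool receive(Message& out) = 0;    // fetches one pending message
};

// Proactive: messages wait in the port until the receiver asks for them.
class ProactivePort : public Port {
public:
    void send(const Message& m) { inbox_.push_back(m); }
    bool receive(Message& out) {
        if (inbox_.empty()) return false;
        out = inbox_.front();
        inbox_.pop_front();
        return true;
    }
private:
    std::deque<Message> inbox_;
};

// Reactive: delivery triggers a handler, which calls receive() on behalf
// of the receiver and places the message in its context.
class ReactivePort : public ProactivePort {
public:
    explicit ReactivePort(std::function<void(const Message&)> handler)
        : handler_(handler) {}
    void send(const Message& m) {
        ProactivePort::send(m);
        Message got;
        if (receive(got)) handler_(got);   // the receiver never calls receive()
    }
private:
    std::function<void(const Message&)> handler_;
};
```

With a proactive port the receiver polls receive() when it is ready; with a reactive port the handler runs at delivery time, so nothing is left to poll.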

Fig. 13. Class hierarchy.

class Port : ... {
public:
    virtual void receive(Message *o, Process_Id *p) = 0;   // Receive function
    virtual void send(Message *o, Process_Id *p) = 0;      // Send function
};

Fig. 14. Interface of the generic class Port.

The call to low-level functions, in charge of both intermachine communications and active object creations (heavy- or lightweight processes), is designed in such a way that any communication library can be interfaced.¹ Specific points related to the use of one or another library have to be taken into account in our run-time implementation in order to be as generic as possible. Currently, a run-time using heavy processes (a language process is implemented by a Unix process) and interfacing PVM has been developed; a run-time using threads (a language process can be implemented by a lightweight process) is currently being written.

5.3. REUSE OF THE COMMUNICATION LIBRARY

The definition of communication ports and the mapping parameters make this work reusable outside the context of the C++// run-time. In particular, it is possible to write C++ programs which create distant active objects, communicating through ports, and using the three different communication protocols of C++//.

Other types of communication libraries exist, for instance, to provide convenient access to the PVM functions while programming in a language other than C; we can mention an ADA to PVM interface [4] and Para++ developed at INRIA Lorraine [17]. They are somewhat similar to the one described here; however, their communication protocol is strictly the PVM one, while the library described here provides the specific communication semantics of C++//.

6. CONCLUSION

The top level of the SLOOP system offers a generic environment for distributed discrete event simulation. However, the two other parts of this three-layer system are also available as stand-alone tools: the parallel object-oriented language C++//, and a library of communication routines and mapping.

¹ We can mention PVM [18], MPI [31], extensions with threads like PT-PVM [25] and PM2 [35], and the combination of a communication library and a thread package like pthread [32].

In order to improve the performance, we are investigating the possibility of incorporating a computation/communication overlapping mechanism. The underlying idea is to send a list of objects using several messages in order to be able to start the work on the receiver side as soon as the first part of the data has been received (while the remaining part of the message is still on its way). This mechanism is based on physical and logical cutting up of the objects to be sent.

While developing the distributed simulations, important requirements were raised; the C++// flexibility greatly helped to meet them. First, interprocess communications (requests sent from one process to another, replies returned to callers) need specific treatment in simulation, especially the introduction of extra information (e.g., simulation time, priority). The C++// programmer is able to define extensions of the class Request in order to deal with this aspect. Second, coroutines, an important feature of simulations as reflected by their status in Simula, are very much needed to control the activation of simulation objects. The capability to derive active classes from the class Process permits us to offer this feature. The class Sim_Activity encapsulates coroutine capabilities, with specific behavior and interface. Finally, a crucial point is the scheduling of these activities. We are currently investigating this aspect, using and extending the Mapping class with dedicated mechanisms in order to give the simulation programmer some control over the scheduling policy.
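The logical cutting-up step can be sketched as follows. The sketch is sequential and only shows how a list is cut into several messages and consumed incrementally; a real overlap additionally needs an asynchronous transport, which is not modelled here. The names cut_up and IncrementalSum are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Cuts a list of objects into messages of at most `chunk` objects each.
template <typename T>
std::vector<std::vector<T> > cut_up(const std::vector<T>& objects,
                                    std::size_t chunk) {
    std::vector<std::vector<T> > messages;
    if (chunk == 0) return messages;   // no sensible cutting exists
    for (std::size_t i = 0; i < objects.size(); i += chunk) {
        std::size_t end = i + chunk < objects.size() ? i + chunk
                                                     : objects.size();
        messages.push_back(
            std::vector<T>(objects.begin() + i, objects.begin() + end));
    }
    return messages;
}

// Receiver that works on each message as soon as it arrives, instead of
// waiting for the whole list.
class IncrementalSum {
public:
    IncrementalSum() : total_(0), messages_(0) {}
    void on_message(const std::vector<int>& part) {
        total_ = std::accumulate(part.begin(), part.end(), total_);
        ++messages_;
    }
    long total() const { return total_; }
    int messages() const { return messages_; }
private:
    long total_;
    int messages_;
};
```

The chunk size trades per-message overhead against how early the receiver can start working.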

REFERENCES

1. G. Agha, Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge, MA, 1986.

2. G. Agha, S. Frølund, W. Y. Kim, R. Panwar, A. Patterson, and D. Sturman, Abstraction and modularity mechanisms for concurrent computing, IEEE Parallel and Distributed Technology--Systems and Applications, May 1993.

3. M. Badel, T. de Pretto, P. Mussi, and G. Siegel, Stat-Tool: An extensible and distributed object oriented statistic tool for discrete event simulation, presented at Object-Oriented Simulation Conference, 1996.

4. F. Baude, N. Furmento, and D. Lafaye de Micheaux, Managing true parallelism in ADA through PVM, in First European PVM User's Group Meeting, Oct. 1994. [http://www.netlib.org/pvm3/epvmug94/.]

5. A. Birrell, G. Nelson, S. Owicki, and E. Wobber, Network objects, Tech. Rep. SRC-RR-115, Digital Systems Research (DEC), 1995. [http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-115.html.]

6. D. G. Bobrow, L. G. DiMichiel, R. P. Gabriel, S. E. Keen, G. Kiczales, and D. A. Moon, Common Lisp object system specification: X3J13 document 88-002R, SIGPLAN Notices 23, Sept. 1988.


7. F. Buschmann, K. Kiefer, F. Paulish, and M. Stal, The meta-information-protocol: Run-time type information for C++, in Proceedings of the International Workshop on Reflection and Meta-Level Architecture, 1992, pp. 82-87.

8. R. H. Campbell and A. N. Haberman, The specification of process synchronization by path expression, in Colloque sur les aspects théoriques et pratiques des systèmes d'exploitation, Paris, 1974.

9. D. Caromel, Concurrency: An object oriented approach, in J. Bezivin, B. Meyer, and J.-M. Nerson, Eds., Technology of Object-Oriented Languages and Systems (TOOLS' 90), Angkor, June 1990, pp. 183-197.

10. D. Caromel, Programming abstractions for concurrent programming, in J. Bezivin, B. Meyer, J. Potter, and M. Tokoro, Eds., Technology of Object-Oriented Languages and Systems (TOOLS Pacific '90), TOOLS Pacific, November 1990, pp. 245-253.

11. D. Caromel, A solution to the explicit/implicit control dilemma, Object-Oriented Program. Syst. Mess. 2(2) (1991).

12. D. Caromel, Abstract control types for concurrency (Position Statement for the panel: How could object-oriented concepts and parallelism cohabit?), in L. O'Connet, Ed., International Conference on Computer Languages (IEEE ICCL'94), IEEE Computer Society Press, 1993, pp. 205-214.

13. D. Caromel and M. Rebuffel, Object based concurrency: Ten language features to achieve reuse, in R. Ege, M. Singh, and B. Meyer, Eds., Technology of Object-Oriented Languages and Systems (TOOLS USA '93), Prentice Hall, Englewood Cliffs, NJ, 1993, pp. 205-214.

14. Denis Caromel, Towards a method of object-oriented programming, CACM, 36(9) (1993).

15. K. Chandy and J. Misra, Asynchronous distributed simulation via a sequence of parallel computations, Commun. ACM 24(1):198-206 (1981).

16. S. Chiba and T. Masuda, Designing an extensible distributed language with meta-level architecture, in Proceedings of the 7th European Conference on Object-Oriented Programming (ECOOP'93), Kaiserslautern, July 1993, pp. 482-501.

17. O. Coulaud and E. Dillon, PARA++: C++ bindings for message passing libraries: User guide, Tech. Rep. 0174, INRIA Lorraine, June 1995.

18. A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam, PVM 3 User's Guide and Reference Manual, Engineering Physics and Mathematics Division, Oak Ridge National Laboratory, May 1993.

19. A. S. Grimshaw, Easy to use object-oriented parallel programming with Mentat, IEEE Computer, pp. 39-51 (1993).

20. C. Hewitt, Viewing control structures as patterns of passing messages, J. Artif. Intell. 8(3):323-364 (1977).

21. D. Jefferson, Virtual time, ACM Trans. Program. Lang. Syst. 7(3):404-425 (1985).

22. D. G. Kafura and K. H. Lee, Act++: Building a concurrent C++ with actors, J. Object-Orient. Program. 3(1) (1990).

23. D. Keppel, Tools and techniques for building fast portable threads packages, Tech. Rep. UWCSE 93-05-06, University of Washington, 1993.

24. G. Kiczales, J. des Rivières, and D. G. Bobrow, The Art of the Metaobject Protocol, MIT Press, Cambridge, MA, 1991.

25. O. Krone, M. Aguilar, and B. Hirsbrunner, PT-PVM: Using PVM in a multi-threaded environment, in J. Dongarra, M. Gengler, B. Tourancheau, and X. Vigouroux, Eds., EuroPVM'95, Parallélisme, réseaux et répartition, vol. 5, HERMES, Sept. 1995, pp. 83-88.

26. G. Lamprecht, Introduction to Simula 67, Vieweg, 1982.


27. H. Lieberman, Concurrent object-oriented programming in Act 1, in A. Yonezawa and M. Tokoro, Eds., Object-Oriented Concurrent Programming, MIT Press, Cambridge, MA, 1987.

28. P. Madany, P. Kougiouris, N. Islam, and R. H. Campbell, Practical examples of reification and reflection in C++, in Proceedings of the International Workshop on Reflection and Meta-Level Architecture, 1992, pp. 76-81.

29. L. Mallet and P. Mussi, Object oriented parallel discrete event simulation: The Prosit approach, in Modelling and Simulation, Lyon, June 1993; also in INRIA Res. Rep. 2232.

30. C. McHale, B. Walsh, S. Baker, A. Donnelly, and N. Harria, Extending synchronisation counters, Tech. Rep. TCD-Pub-0011, University of Dublin, Trinity College, Dublin 2, Ireland, July 1990; ESPRIT Project Comandos, nos. 834 and 2071.

31. Message Passing Interface Forum, Document for a Standard Message-Passing Interface, Feb. 1994.

32. F. Mueller, A library implementation of POSIX threads under UNIX, in Proceedings of Winter USENIX, San Diego, CA, Jan. 1993.

33. P. Mussi and G. Siegel, Sequential simulation in Prosit: Programming model and implementation, Tech. Rep. RR-2713, INRIA, November 1995; also in European Simulation Symposium, Erlangen, Germany, Oct. 1995.

34. P. Mussi and G. Siegel, The Prosit sequential simulator: A test-bed for object oriented discrete event simulation, in European Simulation Symposium, Erlangen, Germany, Oct. 1995, pp. 297-301.

35. R. Namyst and J. F. Méhaut, PM2: Parallel multithreaded machine: A computing environment for distributed architectures, in Proceedings of ParCo'95, Gent, Belgium, Sept. 1995.

36. P. Robert and J.-P. Verjus, Towards autonomous descriptions of synchronization modules, in B. Gilchrist, Ed., Proc. IFIP, Congress, North-Holland, Amsterdam, 1977, pp. 981-986.

37. V. Sunderam, PVM: A framework for parallel distributed computing, Concurrency: Pract. Exp. 2(4) (1990).

38. T. Watanabe and A. Yonezawa, Reflection in an object-oriented concurrent language, in ACM Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA), Sept. 1988.

39. Y. Yokote and M. Tokoro, The design and implementation of concurrent smalltalk, in Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications, 1986, pp. 331-340.

40. Y. Yokote and M. Tokoro, Concurrent programming in concurrent smalltalk, in A. Yonezawa and M. Tokoro, Eds., Object-Oriented Concurrent Programming, MIT Press, Cambridge, MA, 1987.

41. A. Yonezawa, E. Shibayama, T. Takada, and Y. Honda, Modelling and programming in an object-oriented concurrent language ABCL/1, in A. Yonezawa and M. Tokoro, Eds., Object-Oriented Concurrent Programming, MIT Press, Cambridge, MA, 1987.

Received 25 January 1996
