RMI versus CORBA

There has been a lot of talk on the 'Net about which technology to use when building distributed systems: Java RMI or CORBA. In this paper, we compare and contrast the two technologies.

Distributed systems consist of several different layers:

- Transport layer
- ORB architecture
- API

The transport layer is the basic network protocol that allows one piece of a distributed system to talk to another piece. The ORB architecture is the design of the system supporting the distributed object mechanism. Finally, the API is what a programmer needs to know in order to build a distributed system.

CORBA and RMI are two distributed object standards supported by the OMG (Object Management Group).

CORBA (Common Object Request Broker Architecture) is an architecture standard for building heterogeneous distributed systems. In other words, it supports the creation of distributed systems whose components are written in many different languages. An example of such a system might be a client/server application where the client is written in Smalltalk and the server is written in COBOL. CORBA is specifically concerned with the architecture and API layers, using IDL (Interface Definition Language) to provide the API. CORBA can use different transport layers, such as IIOP.

RMI (Remote Method Invocation) is an API standard for building distributed Java systems. In its current form, it uses its own proprietary architecture and transport layers, but neither of these is required. OMG has certified RMI using IIOP as the transport layer as the standard for doing distributed computing in Java. By using IIOP as the transport layer, RMI is capable of interfacing with CORBA systems.

By using RMI in a pure Java environment, you gain the benefits of distributed garbage collection and full Java semantics. For example, CORBA cannot support distributed garbage collection since not all languages support garbage collection.
In addition, because the API in CORBA uses IDL, programmers must leave the semantic context of the language in which they are developing in order to build distributed systems. Java RMI enables developers to write only Java code to build their distributed systems.

CORBA and RMI are not competing systems; they are complementary ones. They each suit different needs, and which one you use boils down to what your requirements are. The most basic requirement is whether you require homogeneity or heterogeneity. Systems made up of components in languages other than Java demand CORBA. Systems which must be deployed partially in Java may use a hybrid of RMI interoperating with CORBA. The hybrid solution makes sense if your delivery window is after IIOP support has been added to RMI (sometime next year) and your system needs Java-to-Java distributed computing in addition to interoperability with other languages.

Even in the 100% pure Java scenario, there is a place for CORBA. RMI is specifically not capable of handling very massively distributed systems. At this time it is missing some of the key services that CORBA supports. These services, such as transaction management, are a requirement of very large distributed object solutions. RMI will support the CORBA services in the future and eventually be a viable large-scale solution, but it is not there today.

Distributed Garbage Collection

Distributed systems typically require distributed garbage collection. If a client holds a proxy to an object in the server, it is important that the server does not garbage-collect that object until the client releases the proxy (at which point it can validly be garbage-collected). Most third-party distributed systems, such as RMI, handle the distributed garbage collection, but that does not necessarily mean it will be done efficiently.
The overhead of distributed garbage collection and remote reference maintenance in RMI can slow network communications by a significant amount when many objects are involved.

Question: How does the Distributed Garbage Collection algorithm work?

Answer: The RMI subsystem implements a reference counting-based distributed garbage collection (DGC) algorithm to provide automatic memory management facilities for remote server objects. Basically, DGC works by having the remote server keep track of all external client references to it at any given time. When a client obtains a remote reference, it is added to the remote object's referenced set. The DGC then marks the remote object as dirty and increases its reference count by one. When a client drops a reference, the DGC decreases its reference count by one, and marks the object as clean. When the reference count reaches zero, the remote object is free of any live client references. It is then placed on the weak reference list and subject to periodic garbage collection.

http://www.objs.com/workshops/ws9801/papers/paper015.html

Garbage Collection for Distributed Persistent Objects

1. Introduction

Distributed object technology provides efficient and safe distribution and reuse of software components in the network. Applications are composed of independent components which are distributed over the network and can communicate with other components. Thanks to this technology, developers can create reliable, robust, secure, and maintainable distributed applications more easily and quickly. Distributed object technology enables software components to plug-and-play, interoperate across the network, and coexist with legacy applications.

A distributed object consists of data and methods along with interfaces through which clients invoke the methods of the object. When a client requests a service from an object, the object is created and executed on its server process.
The object replies to the client with results after processing and performing the request. The object is reclaimed as garbage when it is not reachable from any clients or root objects. For this procedure, called garbage collection, CORBA [CORBA-A, 1997; CORBA-S, 1997] uses reference counting, DCOM [Chappell, 1996] uses reference counting and pinging, and Java RMI uses a lease.

The persistent object service enables objects to remain long after their server processes terminate. To be persistent, the state of an object should be stored in persistent storage such as disks so that the object can be activated later with the stored previous state. The state of an object is stored and loaded either transparently or explicitly: clients can make their objects persistent explicitly to reuse them later, or objects are stored and loaded transparently by the activation and de-activation mechanism.

The persistent object service and the activation mechanism can be used to conserve system resources [Wollrath et al., 1995]. Since a system may contain a considerable number of objects, it is unreasonable for objects to remain active and consume valuable system resources when no requests have been made on them for a long time. When persistent objects are not reachable from any clients or roots, they should be reclaimed as garbage. In this position paper, we discuss the issues of garbage collection for distributed persistent objects, and present a practical and efficient garbage collection mechanism.

2. The issues of garbage collection for distributed persistent objects

Garbage collection is a critical problem for administratively decentralized large-scale distributed systems which manage distributed persistent objects. Automatic collection is important because manual garbage collection is error-prone and it is difficult for clients to maintain the information about references correctly. Many solutions for distributed garbage collection have been suggested.
However, they are not suitable for large-scale distributed systems because their target systems are small in scale and they cannot collect cyclic garbage.

In the ideal distributed system, objects continue to exist as long as they are reachable from clients or root objects and should be reclaimed when unreachable. In practice, this is difficult to support in administratively decentralized large-scale distributed systems for the following reasons:

- Distributed objects and references are dynamically created, deleted, migrated, and shared across the network. Therefore, it is difficult to determine when an object is not reachable, and whether it is safe to reclaim it.
- Distributed systems are administratively decentralized, so clients' well-behaved cooperation should not be required.
- Distributed systems are very large in scale, so it is impossible to get a global view of clients, objects and their references.
- Servers and clients can crash during garbage collection related operations. Messages can be lost, and the network can be partitioned for a while.

Since persistent objects may remain after their clients and servers terminate, garbage objects might stay in persistent storage forever unless garbage collection is performed. When garbage collection is performed only manually by clients, the same problem arises, because manual garbage collection is error-prone and clients' well-behaved cooperation should not be required. If some clients terminate without deleting their references, storage is wasted, which can even cause a system crash.

CORBA supports the persistent object service and the activation mechanism. When active objects have not received any requests for a long time (e.g., 15 minutes in NEO), they are de-activated and stored on persistent storage. They are activated again when they are accessed. This process is performed transparently. Clients can store the state of their objects explicitly as well.
The persistent objects exist until they are explicitly deleted by clients, which means that garbage objects must be reclaimed by clients manually.

DCOM also supports the persistent object service. Objects can be stored into a Running Object Table (ROT), and they can be activated later. DCOM does not support a garbage collection mechanism for the ROT, so the client which created the objects must come back and delete them.

In [Pjama], root objects are created by clients, and the root objects and objects reachable from the root objects are assumed to be alive. Garbage objects which are unreachable from the root objects are reclaimed by the external garbage collector, opjgc.

The current garbage collection mechanisms for active objects in memory are not suitable for persistent objects. Reference counting is simple and scalable, but 1) it cannot collect cyclic garbage objects, and 2) it is impossible to maintain the referencing link counts correctly. Since clients might terminate without reducing their reference counts in a loosely synchronized distributed system, the reference count of an object can be greater than zero even though no other objects are referencing the object. Both pinging and leases are based on [Birrell et al., 1993], but they cannot collect cyclic garbage. That is, garbage objects in the same cycle will refresh each other, and they never expire.

Since persistent objects may remain long after their server processes terminate, all garbage, including cyclic garbage, should be reclaimed completely. Therefore, we need a new garbage collection mechanism which is efficient, scalable and complete.

3. Our approach

We have implemented a garbage collection mechanism for the Prospero directory service, which maintains distributed persistent objects [Neuman, 1992]. In our system model, all distributed objects are equal, and there are no special root objects. Objects which have a long lifetime (Time-to-Live) effectively serve as root objects.
Objects reachable from clients are assumed to be alive, and unreachable objects are assumed to be garbage. Each object may have a different reclamation method based on its characteristics. For example, temporary objects are removed directly once they become garbage. However, some important objects, such as bank accounts, are moved to a dormant account repository and the customers are notified.

Our mechanism consists of two major steps: 1) acyclic garbage is collected by timeouts; and 2) cyclic garbage is collected by last referenceable timestamp propagation and backward inquiry. We assume that all clocks in the network are synchronized to some degree, and that crashed networks and servers recover within a finite period of time.

3.1 Timeouts

Information about references is maintained by timeouts, which is the same concept as RMI's lease. When an object is created by a client, the client assigns a TTL (Time-To-Live) to the object. The new object also gets an expiration time, which is the creation time plus the TTL. The TTL is the period for which the object is guaranteed to exist after being refreshed (receiving a refresh message). When a link is made to the object, the TTL is added to the current time, and the resulting expiration time is stored in both the object and the link. Since each link contains the expiration time of the target object, clients can send a refresh message before the expiration time. To guarantee that a link will continue to work (the target object remains valid), it must be refreshed before its expiration.

Two optimization techniques are introduced in order to reduce the communication overhead incurred by the refreshing process. First, refresh messages are gathered and sent on a host-to-host basis rather than a client-to-object basis. Second, refreshing frequency is controlled by the likelihood of becoming garbage. [Lieberman & Hewitt, 1983] showed that the likelihood of becoming garbage for a newly created object is higher than that of an old one.
Therefore, garbage collection should be performed more frequently for newly created objects and less frequently for old objects. To control the frequency of refreshing, the TTL of an object starts from a small number and is increased as the object is accessed.

3.2 Last referenceable timestamp propagation and backward inquiry

Because timeouts do not collect cyclic garbage, last referenceable timestamp propagation and backward inquiry are introduced. In practice, distributed object spaces do not have special root objects which are always alive. For that reason, our garbage collection mechanism uses reachability from clients rather than from roots. Cyclic garbage is reclaimed in two steps: last referenceable timestamp propagation, which detects local garbage; and backward inquiry, which performs partial synchronization and confirms garbage.

Each object maintains an LRTS (Last Referenceable TimeStamp) which indicates the most recent time when the object was accessible (i.e., reachable) by clients. When an object is accessed by a client, the LRTS of the object is set to the access time, and the LRTS (the access time) is propagated to all reachable objects recursively when links are refreshed. All objects reachable from the accessed object will eventually get new LRTSs. The LRTS of inaccessible distributed cyclic garbage, however, will stabilize at a value that is less than or equal to the time at which the cycle became inaccessible.

Objects whose LRTSs are not more recent than a local threshold are local garbage. Since each system selects its own local threshold, thresholds differ from system to system. Moreover, it is possible that LRTSs might not arrive at target objects on time because they are sent only during the link refreshing process. Therefore, before local garbage is reclaimed, it should be confirmed to be garbage.
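
The LRTS propagation step described above can be sketched in Java roughly as follows. This is only an illustrative single-process sketch, not the Prospero implementation: the names LrtsSketch, ObjNode, access, propagate and isLocalGarbage are hypothetical, links are plain in-memory references instead of network links, and timestamps are propagated immediately rather than being bundled with refresh messages.

```java
import java.util.ArrayList;
import java.util.List;

public class LrtsSketch {

    /** Hypothetical stand-in for a distributed object with outgoing links. */
    static final class ObjNode {
        final String name;
        final List<ObjNode> links = new ArrayList<>();
        long lrts; // Last Referenceable TimeStamp

        ObjNode(String name, long createdAt) {
            this.name = name;
            this.lrts = createdAt;
        }

        /** A client accessed this object at time 'now'. */
        void access(long now) {
            propagate(now);
        }

        /**
         * Push the timestamp along outgoing links. Propagation stops at any
         * object whose LRTS is already at least as recent, so it terminates
         * even when the links form a cycle.
         */
        private void propagate(long ts) {
            if (ts <= lrts) {
                return;
            }
            lrts = ts;
            for (ObjNode target : links) {
                target.propagate(ts);
            }
        }

        /** Local garbage: LRTS is not more recent than the local threshold. */
        boolean isLocalGarbage(long threshold) {
            return lrts <= threshold;
        }
    }

    public static void main(String[] args) {
        // Reachable cycle a -> b -> c -> a, accessed by a client at time 10.
        ObjNode a = new ObjNode("a", 0);
        ObjNode b = new ObjNode("b", 0);
        ObjNode c = new ObjNode("c", 0);
        a.links.add(b);
        b.links.add(c);
        c.links.add(a);

        // Cycle x <-> y that no client can reach any more.
        ObjNode x = new ObjNode("x", 1);
        ObjNode y = new ObjNode("y", 1);
        x.links.add(y);
        y.links.add(x);

        a.access(10);

        long threshold = 5; // assumed local threshold
        for (ObjNode n : new ObjNode[] {a, b, c, x, y}) {
            System.out.println(n.name + " lrts=" + n.lrts
                    + " localGarbage=" + n.isLocalGarbage(threshold));
        }
    }
}
```

In this sketch the unreachable cycle keeps its old timestamps because no client access ever reaches it, which is what lets a threshold test flag it as local garbage even though its members still reference each other.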
The basic idea of backward inquiry is that, before a local garbage object is reclaimed, all referencing objects are examined to see whether they are garbage or not. If all referencing objects are garbage, then the object is also garbage and can be reclaimed. The best way to check referencing objects is to ask them directly to examine whether they themselves are garbage, and to reply with the results. Since referencing objects refresh their target objects before expiration times, objects can receive refresh messages from all referencing objects. This means that objects can receive information about all referencing objects (i.e., an implicit reference list) within their TTL times, even though they do not have an explicit reference list. Since the messages necessary for cyclic garbage collection are bundled with the messages used for timeouts, the overhead of cyclic garbage collection is minimized.

3.3 Evaluation and implication

Our garbage collection mechanism satisfies the following properties:

- The mechanism scales well in terms of the number of distributed objects due to the two optimization techniques of the refreshing mechanism.
- No extra messages are necessary for cyclic garbage collection because the messages needed for cyclic garbage collection are bundled with refresh messages.
- The latency to collect garbage increases in proportion to the length of the garbage chain, which is scalable.
- All garbage, including distributed cyclic garbage, is reclaimed.
- Our mechanism does not require global synchronization of all distributed systems.

We plan to apply our garbage collection mechanism to practical distributed persistent object systems such as CORBA, DCOM and RMI. We expect that our garbage collection mechanism will work well for distributed persistent object systems.
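
The backward inquiry idea from section 3.2 can be illustrated with a small Java sketch. It rests on several assumptions that are not in the paper: the names BackwardInquirySketch, Obj, link and confirmGarbage are hypothetical, the referrer list is stored explicitly instead of being learned from refresh messages, every object uses the same threshold, and a visited set is used so that the inquiry terminates on cycles.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BackwardInquirySketch {

    /** Hypothetical object that knows which objects reference it. */
    static final class Obj {
        final String name;
        long lrts; // Last Referenceable TimeStamp
        final List<Obj> referrers = new ArrayList<>();

        Obj(String name, long lrts) {
            this.name = name;
            this.lrts = lrts;
        }
    }

    /** Record that 'from' holds a link to 'to'. */
    static void link(Obj from, Obj to) {
        to.referrers.add(from);
    }

    /**
     * Confirm a local garbage candidate: it is garbage only if no object
     * found by walking the referrer links backwards has a recent LRTS.
     */
    static boolean confirmGarbage(Obj candidate, long threshold) {
        return inquire(candidate, threshold, new HashSet<>());
    }

    private static boolean inquire(Obj o, long threshold, Set<Obj> visited) {
        if (o.lrts > threshold) {
            return false; // recently reachable by a client: alive
        }
        if (!visited.add(o)) {
            return true; // already under inquiry: adds no evidence of liveness
        }
        for (Obj referrer : o.referrers) {
            if (!inquire(referrer, threshold, visited)) {
                return false; // some referrer is alive, so this object is too
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long threshold = 5; // assumed local threshold

        // Unreachable cycle x <-> y with stale timestamps.
        Obj x = new Obj("x", 1);
        Obj y = new Obj("y", 1);
        link(x, y);
        link(y, x);

        // z looks stale because the new LRTS has not arrived yet,
        // but it is referenced by w, which a client accessed recently.
        Obj w = new Obj("w", 10);
        Obj z = new Obj("z", 2);
        link(w, z);

        System.out.println("x confirmed garbage: " + confirmGarbage(x, threshold));
        System.out.println("z confirmed garbage: " + confirmGarbage(z, threshold));
    }
}
```

The second case shows why confirmation is needed: an object can look like local garbage simply because a fresh LRTS has not reached it yet, and asking its referrers prevents it from being reclaimed too early.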
References

[Birrell et al., 1993] Andrew Birrell, David Evers, Greg Nelson, Susan Owicki and Edward Wobber, Distributed garbage collection for network objects, Technical Report 116, Digital Equipment Corporation Systems Research Center, December 15, 1993.

[Chappell, 1996] David Chappell, Understanding ActiveX and OLE, Microsoft Press, 1996.

[CORBA-A, 1997] OMG, The Common Object Request Broker: Architecture and Specification, Revision 2.1, August 1997.

[CORBA-S, 1997] OMG, CORBAservices: Common Object Services Specification, July 1997.

[Lieberman & Hewitt, 1983] H. Lieberman and C. Hewitt, A real-time garbage collector based on the lifetimes of objects, Commun. ACM, vol. 26, June 1983, pages 419-429.

[Neuman, 1992] B. Clifford Neuman, The virtual system model: A scalable approach to organizing large systems, Ph.D. dissertation, University of Washington, 1992.

[RMI, 1997] Sun Microsystems, Remote Method Invocation Specification, 1997.

[Spence & Atkinson, 1997] Susan Spence and Malcolm Atkinson, A Scalable Model of Distribution Promoting Autonomy of and Cooperation between PJava Object Stores, Proceedings of the Hawaii International Conference on System Sciences, January 1997, Hawaii, USA.

[Pjama] The Forest Project, PJama: Orthogonal Persistence for Java, http://www.sunlabs.com/research/forest.

[Wollrath et al., 1995] Ann Wollrath, Geoff Wyant, and Jim Waldo, Simple Activation for Distributed Objects, USENIX Conference on Object-Oriented Technologies (COOTS), June 26-29, 1995, Monterey, California.

http://www.dcc.uchile.cl/~lmateu/NetObj/

Distributed Object Management
By Luis Mateu

The Distributed Garbage Collector

Ideally the DGC should be independent of the object management and the local garbage collector. A clear interface has been defined for the interaction of the three components.

- The DGC must be safe: it must not recycle reachable objects.
- The DGC should be live: it should recycle all objects not reachable by the distributed application.
Implementation of the distributed garbage collector

Tasks performed by the manager:

- The manager will create a single object of the class DistGC, which performs the distributed garbage collection.
- Any time that a process P receives a remote reference not currently present in the object table, the manager will invoke the method sendDirtyCall(wrep) on P. The parameter wrep is the wire representation of the reference received.
- Any time that a process P detects that a stub is no longer reachable locally, the manager will extract its wire representation wrep from the object table and will invoke the method sendCleanCall(wrep) on P.

Tasks performed by the collector:

Let P be a process where a sendDirtyCall or a sendCleanCall is invoked with a wire representation referring to an object held in Q.

sendDirtyCall(wrep):
- Open a socket connection with Q if none is available.
- Send (DIRTY, the identifier in wrep) through the socket to notify Q that P now references that object.

sendCleanCall(wrep):
- Open a socket connection with Q if none is available.
- Send (CLEAN, the identifier in wrep) through the socket to notify Q that P no longer references that object.

Each process Q will run a thread for reading dirty and clean messages from processes holding remote references to objects owned by Q. This thread must keep, for each concrete object, the set of spaces which reference that object. When the procedure clean detects that a concrete object has no remote references, it must invoke the method:

manager.removeFromObjTbl(concNetObj);

The manager will extract the concrete object from the object table, so it will be collected if it is not locally reachable.

In-transit references

When the wire representation of a network reference is sent through a socket, it will not arrive immediately. Flushing the buffer just ensures that the data have been sent, but does not guarantee that they have been received.
The references that have been sent but have not been received are called in-transit references. A race condition occurs when a network object has only in-transit references, because the DGC may incorrectly recycle it. The current implementation does not protect in-transit references from being collected.

Handling in-transit references

Before returning from a remote invocation:

- The stub must guarantee that all dirty messages corresponding to parameter references have been received.
- The stub must guarantee that all dirty messages corresponding to the returned value have been received.
- Clean messages can be sent later.

The easy way to protect in-transit references is to send the dirty calls synchronously.

Collecting client/server connections

The current implementation does not collect client/server connections. It is very important to collect connections, because the threads needed to dispatch remote calls prevent processes from terminating.

Connections can be collected as follows:

- Each object of the class ServerConnection must have a reference counter. That counter will keep the number of references to objects owned by the server process.
- The counter is increased in the method receiveNetObj and decreased in finalizeStub.
- When the counter reaches zero, the manager of the client process can close the connection.

The termination of a process

A Java process will terminate only when all threads have finished (except the daemon threads).

- The thread accepting socket connections from the port must become a daemon thread when there are no exported objects.
- A thread dispatching remote calls to local objects will finish when the client side closes the socket because the reference counter has reached zero.

Thus a process may terminate when there are no exported objects and no remote process has references to local objects.
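
The connection counting scheme described under "Collecting client/server connections" could look roughly like the following Java sketch. The method names receiveNetObj and finalizeStub and the class name ServerConnection come from the text; everything else is assumed for illustration, including the wrapper class ConnectionCounterSketch, the isOpen method, and the use of a boolean flag in place of a real socket.

```java
public class ConnectionCounterSketch {

    /**
     * Client-side view of one connection to a server process. The counter
     * tracks how many stubs for objects owned by that server are alive.
     */
    static final class ServerConnection {
        private int stubCount = 0;
        private boolean open = true; // stands in for a real socket (assumption)

        /** Called when a reference to an object of this server is received. */
        synchronized void receiveNetObj() {
            if (!open) {
                throw new IllegalStateException("connection already closed");
            }
            stubCount++;
        }

        /** Called when a stub becomes unreachable locally. */
        synchronized void finalizeStub() {
            if (stubCount == 0) {
                throw new IllegalStateException("no stubs to finalize");
            }
            stubCount--;
            if (stubCount == 0) {
                // No references to the server remain: close the connection,
                // which lets the server's dispatching thread finish.
                open = false;
            }
        }

        synchronized boolean isOpen() {
            return open;
        }
    }

    public static void main(String[] args) {
        ServerConnection conn = new ServerConnection();
        conn.receiveNetObj();
        conn.receiveNetObj();
        conn.finalizeStub();
        System.out.println("open after first finalize: " + conn.isOpen());
        conn.finalizeStub();
        System.out.println("open after last finalize: " + conn.isOpen());
    }
}
```

A real implementation would also have to decide how a closed connection is reopened when a new reference to the same server arrives later; the sketch simply rejects that case.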