Distributed Garbage Collection

advertisement
RMI versus CORBA
There has been lots of talk on the 'Net as to what technology to use when building
distributed systems: Java or CORBA. In this paper, we compare and contrast the two
technologies.
Distributed systems consist of several different layers:
 Transport layer
 ORB architecture
 API
The transport layer is the basic network protocol that allows one piece of a distributed
system talk to another piece. The ORB architecture is the design of the system supporting
the distributed object mechanism. Finally, the API is what a programmer needs to know in
order to build a distributed system.
CORBA and RMI are two distributed object standards supported by the OMG (Object
Management Group). CORBA (Common Object Request Broker Architecture) is an
architecture standard for building heterogeneous distributed systems. In other words, it
supports the creation of distributed systems whose components are written in many
different languages. An example of such a system might be a client/server application
where the client is written in Smalltalk and the server is written COBOL. CORBA is
specifically concerned with the architecture and API layers, using IDL (Interface Definition
Language) to provide the API. CORBA can use different transport layers such as IIOP.
RMI (Remote Method Invocation) is an API standard for building distributed Java systems.
In its current form, it uses its own proprietary architecture and transport layers, but neither
of these are required. OMG has certified RMI using IIOP as the transport layer as the
standard for doing distributed computing in Java. By using IIOP as the transport layer, RMI
is capable of interfacing with CORBA systems. By using RMI in a pure Java environment,
you gain the benefits of distributed garbage collection and full Java semantics. For example,
CORBA cannot support distributed garbage collection since not all languages support
garbage collection. In addition, because the API in CORBA uses IDL, this means
programmers must leave the semantic context of the language in which they are developing
in order to build distributed systems. Java RMI enables the developer to write only Java
code to build their distributed systems.
CORBA and RMI are not competing systems; they are complementary ones. They each suit
different needs, and which one you use boils down to what your requirements are. The most
basic requirement is whether you require homogeneity or heterogeneity. Systems made up
of components in languages other than Java demand CORBA. Systems which must be
deployed partially in Java may use a hybrid of RMI interoperating with CORBA. The hybrid
solution makes sense if your delivery window is after IIOP support has been added to RMI
(sometime next year) and you have a need within your system of Java-to-Java distributed
computing in addition to interoperability with other languages.
Even in the 100% pure Java scenario, there is a place for CORBA. RMI is specifically not
capable of handling very massively distributed systems. It is at this time missing some of
the key services that CORBA supports. These services such as transaction management are
a requirement of very large distributed object solutions. RMI will support the CORBA
services in the future and eventually be a viable large scale solution, but it is not there
today.
Distributed Garbage Collection
Distributed systems typically require distributed garbage collection. If a client holds a proxy to an object in the
server, it is important that the server does not garbage-collect that object until the client releases the proxy (and it
can be validly garbage-collected). Most third-party distributed systems, such as RMI, handle the distributed garbage
collection, but that does not necessarily mean it will be done efficiently. The overhead of distributed garbage
collection and remote reference maintenance in RMI can slow network communications by a significant amount
when many objects are involved.
Question: How does the Distributed Garbage Collection algorithm work?
Answer
The RMI subsystem implements a reference counting-based distributed garbage collection
(DGC) algorithm to provide automatic memory management facilities for remote server
objects.
Basically, DGC works by having the remote server keep track of all external client
references to it at any given time. When a client obtains a remote reference, it is addded to
the remote object's referenced set. The DGC then marks the remote object as dirty and
increases its reference count by one. When a client drops a reference, the DGC decreases
its reference count by one, and marks the object as clean. When the reference count
reaches zero, the remote object is free of any live client references. It is then placed on the
weak reference list and subject to periodic garbage collection.
http://www.objs.com/workshops/ws9801/papers/paper015.html
Garbage Collection for Distributed Persistent Objects
1. Introduction
Distributed object technology provides efficient and safe distribution and reuse of software components in the
network. Applications are composed of independent components which are distributed over the network and can
communicate with other components. Due to the technology, developers can create reliable, robust, secure, and
maintainable distributed applications easier and faster. Distributed object technology enables components of
software to plug-and-play, interoperate across the network, and coexist with legacy applications.
A distributed object consists of data and methods along with interfaces through which clients invoke the methods of
the object. When a client requests a service to an object, the object is created and executed on its server process. The
object replies to the client with results after processing and performing the request. The object is reclaimed as
garbage when it is not reachable from any clients or root objects. For this procedure, called garbage collection,
CORBA [CORBA-A, 1997; CORBA-S, 1997] uses reference counting, DCOM [Chappell, 1996] uses reference
counting and pinging, and Java RMI uses a lease.
The persistent object service enables objects to remain long after its server process terminates. To be persistent, the
state of an object should be stored in persistent storage such as disks so that the object can be activated later with the
stored previous state. The state of an object is stored and loaded either transparently or explicitly: clients can make
their objects persistent explicitly to reuse them later; or objects are stored and loaded transparently by the activation
and de-activation mechanism. The persistent object service and the activation mechanism can be used to conserve
system resources [Wollrath et al., 1995]. Since a system may contain a considerable number of objects, it is
unreasonable that objects remain active and consume valuable system resources although no requests are made on
them for a long time. When persistent objects are not reachable from any clients or roots, they should be reclaimed
as garbage.
In this position paper, we discuss the issues of garbage collection for distributed persistent objects, and present a
practical and efficient garbage collection mechanism.
2. The issues of garbage collection for distributed persistent objects
Garbage collection is a critical problem for administratively decentralized large scale distributed systems which
manage distributed persistent objects. This process is important because manual garbage collection is error-prone
and it is difficult for clients to maintain the information about references correctly. Many solutions for distributed
garbage collection have been suggested. However, they are not suitable for large scale distributed systems because
their target systems are small in scale and they cannot collect cyclic garbage.
In the ideal distributed system, objects continue to exist as long as they are reachable from clients or root objects and
should be reclaimed when unreachable. In practice, this is difficult to support in administratively decentralized large
scale distributed systems because of the following reasons:
 Distributed objects and references are dynamically created, deleted, migrated, and shared across the
network. Therefore, it is difficult to determine when an object is not reachable, and whether it is safe to
reclaim it.
 Distributed systems are administratively decentralized, so clients' well-behaving cooperation should not be
required.
 Distributed systems are very large in scale, so it is impossible to get a global view of clients, objects and
their references.
 Servers and clients can crash during garbage collection related operations.
 Messages can be lost, and the network can be partitioned for a while.
Since persistent objects may remain after their clients and servers terminate, garbage objects might stay in persistent
storage forever unless garbage collection is performed. When garbage collection is performed by clients manually
only, the same problem happens because manual garbage collection is error-prone and clients' well-behaving
cooperation should not be required. If some clients terminate without deleting their references, storage is wasted,
which even causes system crash.
CORBA supports the persistent object service and the activation mechanism. When active objects have not received
any requests for a long time (e.g., 15 minutes in NEO), they are de-activated and stored on persistent storage. They
are activated again when they are accessed. This process is performed transparently. Clients can store the state of
their objects explicitly as well. The persistent objects exist until they are explicitly deleted by clients, which means
that garbage objects should be reclaimed by clients manually. DCOM also supports the persistent object service.
Objects can be stored into a Running Object Table(ROT), and they can be activated later. DCOM does not support a
garbage collection mechanisms for the ROT, so the client which created objects should come back and delete the
objects. In [Pjama], root objects are created by clients, and the root objects and objects reachable from the root
objects are assumed to be alive. Garbage objects which are unreachable from the root objects are reclaimed by the
external garbage collector, opjgc.
The current garbage collection mechanisms for active objects in memory are not suitable for persistent objects.
Reference counting is simple and scalable, but 1) it cannot collect cyclic garbage objects, and 2) it is impossible to
maintain the referencing link counts correctly. Since clients might terminate without reducing their reference counts
in a loosely synchronized distributed system, the reference count of an object can be greater than zero, even though
no other objects are referencing the object. Both pinging and lease are based on [Birrell et al., 1993] but they cannot
collect cyclic garbage. That is, garbage objects in a same cycle will refresh each other, and they never expire. Since
persistent objects may remain long after its server process terminates, all garbage including cyclic garbage should be
reclaimed completely. Therefore, we need a new garbage collection mechanism which is efficient, scalable and
complete.
3. Our approach
We have implemented a garbage collection mechanism for the Prospero directory service which maintains
distributed persistent objects [Neuman, 1992]. In our system model, all distributed objects are equal, and there are
no special root objects. Objects which have a long lifetime (Time-to-Live) effectively serve as root objects. Objects
reachable from clients are assumed to be alive, and unreachable objects are assumed garbage. Each object may have
a different reclamation method based on its characteristics. For example, temporary objects are removed directly
once they become garbage. However, some important objects such as bank accounts are moved to dormant account
repository and notify the customers. Our mechanism consists of two major steps: 1) acyclic garbage is collected by
timeouts; and 2) cyclic garbage is collected by last referenceable timestamp propagation and backward inquiry. We
assume that all clocks in the network are synchronized to some degree, and crashed networks and servers recover
within a finite period of time.
3.1 Timeouts
Information about references are maintained by timeouts which is the same concept as that of RMI's lease. When an
object is created by a client, the client assigns a TTL (Time-To-Live) to the object. The new object also gets an
expiration time which is the creation time plus the TTL. The TTL is the period for which the object is guaranteed to
exist after being refreshed (receiving a refresh message). When a link is made to the object, the TTL is added to the
current time, and the resulting expiration time is stored in both the object and the link. Since each link contains the
expiration of the target object, clients can send a refresh message before the expiration time. To guarantee that a link
will continue to work (the target object remains valid), it must be refreshed before its expiration.
Two optimization techniques are introduced in order to reduce the communication overhead incurred by the
refreshing process. First, refresh messages are gathered and sent on a host-to-host basis rather than a client-to-object
basis. Second, refreshing frequency is controlled by the likelihood of becoming garbage. [Lieberman & Hewitt,
1983] showed that the likelihood of becoming garbage for a newly created object is higher than that of an old one.
Therefore, garbage collection should be performed more frequently for newly created objects and less for old
objects. To controlled the frequency of refreshing, the TTL of an object starts from a small number and is increased
as the object is accessed.
3.2 Last referenceable timestamp propagation and backward inquiry
Because timeouts do not collect cyclic garbage, last referenceable timestamp propagation and backward inquiry are
introduced. In practice, distributed object spaces do not have special root objects which are always alive. For that
reason, our garbage collection mechanism uses reachability from clients rather than from roots. Cyclic garbage is
reclaimed by two steps: last referenceable timestamp propagation which detects local garbage; and backward
inquiry which performs partial synchronization and confirms garbage.
Each object maintains an LRTS (Last Referenceable TimeStamp) which indicates the most recent time when the
object was accessible (i.e., reachable) by clients. When an object is accessed by a client, the LRTS of the object is
set to the accessed time, and the LRTS (the accessed time) is propagated to all reachable objects recursively when
links are refreshed. All reachable objects from the accessed object will get new LRTSs eventually. The LRTS of
inaccessible distributed cyclic garbage, however, will stabilize at a value that is less than or equal to the time at
which the cycle became inaccessible. Objects whose LRTSs are not more recent than a local threshold are local
garbage. Since each system selects its own local threshold, each has a different threshold. Moreover, it is possible
that LRTSs might not arrive at target objects on time because they are sent only during the link refreshing process.
Therefore, before local garbage is reclaimed, it should be confirmed to be garbage.
The basic idea of backward inquiry is that, before a local garbage object is reclaimed, all referencing objects are
examined to see whether they are garbage or not. If all referencing objects are garbage, then the object is also
garbage and can be reclaimed. The best way to check referencing objects is to ask them directly to examine
themselves whether they are garbage or not, and to reply with results. Since referencing objects refresh their target
objects before expiration times, objects can receive refresh messages from all referencing objects. This means that
objects can receive information about all referencing objects (i.e., an implicit reference list) within their TTL times,
even though they do not have an explicit reference list.
Since messages necessary for cyclic garbage collection are bundled with the messages used for timeouts, the
overhead of cyclic garbage collection is minimized.
3.3 Evaluation and implication
Our garbage collection mechanism satisfies the following properties:
 The mechanism scales well in terms of the number of distributed objects due to the two optimization
techniques of the refreshing mechanism.
 No extra messages are necessary for cyclic garbage collection because messages needed for cyclic garbage
collection are bundled with refresh messages.
 The latency time to collect garbage increases in proportion to the length of the garbage chain, which is
scalable.
 All garbage including distributed cyclic garbage is reclaimed.
 Our mechanism does not require global synchronization of all distributed systems.
We have a plan to apply our garbage collection mechanism on practical distributed persistent object systems such as
CORBA, DCOM and RMI. We expect that our garbage collection mechanism will work well for distributed
persistent object systems.
References
[Birrell et al., 1993] Andrew Birrel, David Evers, Greg Nelson, Susan Owicki and Edward Wobber, Distributed
garbage collection for network objects, Technical Report 116, Digital Equipment Cooperation Systems Research
Center, December 15, 1993.
[Chappell, 1996] David Chappell, Understanding ActiveX and OLE, Microsoft Press, 1996
[CORBA-A, 1997] OMG, The Common Object Request Broker: Architecture and Specification Revision 2.1,
August 1997.
[CORBA-S, 1997] OMG, CORBAservices: Common Object Services Specification, July 1997.
[Lieberman & Hewitt, 1983] H. Lieberman and C. Hewitt, A real-time garbage collector based on the lifetimes of
objects, Commun. ACM, vol. 26, June 1983, pages 419-429.
[Neuman, 1992] B. Clifford Neuman, The virtual system model: A scalable approach to organizing large systems,
Ph.D. dissertation, University of Washington, 1992.
[RMI, 1997] Sun Microsystems, Remote Method Invocation Specification, 1997.
[Spence & Atkinson, 1997] Susan Spence and Malcolm Atkinson, A Scalable Model of Distribution Promoting
Autonomy of and Cooperation between PJava Object Stores, Proceedings of the Hawaii International Conference on
System Sciences, January 1997, Hawaii, USA.
[Pjama] The Forest Project, PJama: Orthogonal Persistence for Java , http://www.sunlabs.com/research/forest.
[Wollrath et al., 1995] Ann Wollrath, Geoff Wyant, and Jim Waldo, Simple Activation for Distributed Objects,
USENIX Conference on Object-Oriented Technologies (COOTS) June 26-29, 1995 Monterey, California.
http://www.dcc.uchile.cl/~lmateu/NetObj/
Distributed Object Management
By Luis Mateu
The Distributed Garbage Collector



 Ideally the DGC should be independent of the object management and the local garbage collector.
A clear interface has been defined for the interaction of the three component.
The DGC must be safe: do not recycle reachable objects.
The DGC should be live: recycle all objects not reachable by the distributed application.
Implementation of the distributed garbage collector
Tasks performed by the manager:
 The manager will create an single object of the class DistGC which must perform the distributed garbage
collector.
 Any time that a process P receives a remote reference not currently present in the object table, the manager
will invoke the method sendDirtyCall(wrep) on P. The parameter wrep is the wire representation
of reference received.
 Any time that a process P detects that a stub is no longer reachable locally, the manager will extract its wire
representation wrep from the object table and will invoke the method sendCleanCall(wrep) on P.
Tasks performed by the collector:
 Let P be a process where a sendDirtyCall or a sendCleanCall is invoked with a wire representation refering
to an object held in Q.
 sendDirtyCall(wrep):
 open a socket connection with Q if none is available.
 send (DIRTY, the identifier in wrep) through the socket to notify Q that P references now that
object.
 cleanDirtyCall(wrep):
 open a socket connection with Q if none is available.
 send (CLEAN, the identifier in wrep) through the socket to notify Q that P no longer references
that object.



Each process Q will run a thread for reading dirty and clean message from processes having remote
references of objects owned by Q.
This thead must keep for each concrete object the set of spaces which reference that object.
When the procedure clean detects that a concrete object has no remote references then it must invoke the
method:
manager.removeFromObjTbl(concNetObj);
The manager will extract the concrete object from the object table, so it will be collected if it is not locally
reachable.
In transit references




When the wire representation of a network reference is sent through a socket, it will not arrive immediately.
Flushing the buffer just ensures that the data have been sent, but does not guarant that they have been
received.
The references that have been sent but have not being received are named in-transit references.
A race condition occurs when a network object has only in-transit references, because the DGC may
incorrectly recycle it.

The current implementation does not protect in-transit references to be collected.
Handling in-transit references
Before returning from a remote invocation:
 The stub must guarant that all dirty messages corresponding to parameter references have been received.
 The stub must guarant that all dirty messages corresponding to the returned value have been received.
 clean messages can be sent later.
 The easy way to protect in-transit references is sending the dirty calls synchronously.
Collecting client/server connections



The current implementation does not collect client/server connections.
It is very import to collect connections, because the threads needed to dispatch remote calls forbid the
termination of processes.
Connections can be collected as follows:
 Each object of the class ServerConnection must have a reference counter. That counter will keep
the number of references to objects owned by the server process.
 It increases the counter in the method receiveNetObj and decreases the counter in finalizeStub.
 When the counter reaches the value zero, the manager of the client process can close the
connection.
The termination of a process




A Java process will terminate only when all threads have finished (excepting the daemon threads).
The thread accepting socket connections from the port must be turned to be a daemon thread when there is
no exported objects.
A thread dispatching remote calls to local objects will finish when the client side will close the socket,
because the reference counter reaches zero.
Thus a process may terminate when there is no exported objects and no remote process has references to
local objects.
Download