Introduction of Middleware-CORBA

It is rare to find a computer, operating system, or programming language that does everything well. Each has its strengths and weaknesses. Companies tend to acquire diverse computers, operating systems, and programming languages because each one was chosen as the best at a specific task. The problem arises when (not if) they need to interact. Middleware exists to solve this problem. CORBA, along with DCOM and Java/RMI, is among the most popular middleware today.

CORBA is a standard architecture and infrastructure for connecting heterogeneous computers. It can be defined from different points of view. In general, we can say “CORBA is a software bus: it connects programs written in different languages on different platforms the way a hardware bus connects different peripherals implemented with different technologies.” Another definition is: “CORBA is a distributed architecture based on object-oriented principles: encapsulation of data and methods, with the interface separate from the implementation.”

There are many questions about CORBA: How does it integrate objects written in different languages to achieve a common task? How does it deal with different hardware and operating systems? How does it make these differences transparent to clients? How does it achieve communication between clients and servers? Also, what are the limitations of CORBA, and what advantages does it have over other middleware such as COM or RMI?

In this paper, a brief history of middleware is introduced first, followed by the basic architecture, components, and operation of CORBA. Later, some advanced topics around this popular middleware are discussed. Then a detailed comparison among CORBA, RMI, and COM is made.
Finally, the paper considers how CORBA integrates with other middleware to reach the common goal: complete transparency to users and free communication regardless of data formats, programming languages, and underlying platforms.

Development of Middleware History

CORBA is a specification standard for middleware that is implemented in about a dozen products. COM is Microsoft’s Component Object Model and provides support for invoking operations on remote objects. Java/RMI is the mechanism for remote method invocation that became available in version 1.1 of the language specification. These three middleware systems have competed with each other since they were born: “One will eventually ‘win’ and replace the other two.” [Engineering Distributed Objects, 88].

Microsoft presented the first version of COM during the OLE Professional Developer’s Conference in 1993. With Windows NT 4.0, Microsoft introduced distribution capabilities into COM. These mechanisms were initially referred to as Network OLE, DCOM, and Active/X in the (marketing) literature, but they are commonly subsumed under the term COM. They will be referred to as COM+ in future versions of the Windows operating system.

Before introducing RMI, a brief introduction to Java is necessary. Java is a platform-independent, object-oriented programming language developed by Sun Microsystems. Java is a hybrid language in that it is both compiled and interpreted. A Java compiler creates byte code, which is then interpreted by a virtual machine (VM). Java was designed to be highly portable. This portability is achieved by providing different VMs that interpret the byte code on different hardware and operating system platforms. Remote Method Invocation (RMI) is included in the Java API. RMI enables a client object that resides in one Java VM to invoke a method of a remote server object that resides in a different Java VM.
As these VMs may actually execute on different hosts, Java/RMI must be regarded as object-oriented middleware, and we can consider remote method invocations in Java to be object requests between Java objects that are distributed across different Java VMs.

The CORBA specifications are defined by the Object Management Group (OMG), a nonprofit organization that has more than 800 members. The first specification adopted by the OMG was the Object Management Architecture (OMA) [Soley, 1992]. In the more complex and more powerful world of objects, many of the same old problems lurk, but the same old tools don't quite work as solutions. In response to this gap, the OMG developed the Common Object Request Broker Architecture (CORBA) specification. CORBA encompasses a series of standards and protocols for interprocess communication in a heterogeneous environment. With CORBA, developers can connect processes running on different machines, with different operating systems, and with code written in different languages. The CORBA specification has quickly caught on as a standard method for interprocess communication.

Architecture

Understanding the architecture of CORBA is the key to writing CORBA programs. There are two specifications adopted by the OMG: the OMA (Object Management Architecture) and CORBA (Common Object Request Broker Architecture). The OMA is shown in Figure 1. It has five elements: Object Request Broker, Object Services, Common Facilities, Domain Interfaces, and Application Objects. Object Services include: Lifecycle Service, Persistence Service, Naming Service, Event Service, Concurrency Control Service, Transaction Service, Relationship Service, Externalization Service, Query Service, Licensing Service, Time Service, and Security Service.

Figure 1.

The CORBA specification defines the interface definition language (IDL) for CORBA and identifies a number of programming language bindings.
These bindings determine how client and server objects can be implemented in different programming languages. The CORBA specification also defines object adapters that govern object activation, deactivation, and the generation of object references. The CORBA architecture is shown in Figure 2. The functions on the client side include: Client IDL Stubs, Dynamic Invocation Interface, Interface Repository APIs, and ORB Interface. The functions on the server side include: Server IDL Stubs (skeletons), Dynamic Skeleton Interface, Object Adapter, Implementation Repository, and ORB Interface.

Figure 2. (Bob Cotter, UMKC)

Performance of CORBA

There are some basic concepts about CORBA we should know. The CORBA specification only defines a set of conventions and protocols that must be followed by CORBA implementations. It is left to vendors and developers to translate this specification into a working implementation. CORBA does not place any restrictions on language or underlying operating system; because of this, implementations of the CORBA specification have been created for a wide variety of OSs, including Unix, Windows, and AS/400, and for many languages, including C, C++, Java, Ada, LISP, Python, and even COBOL [Engineering Distributed Objects, 88]. Any CORBA implementation that matches the defined interfaces and adheres to the defined protocols can communicate with other CORBA implementations. Specifications such as TCP/IP, HTTP, and SMTP have taken off on the Internet largely because their language- and OS-neutral specifications give them a great deal of flexibility; CORBA's designers aimed for the same sort of success for their brainchild.

At the heart of every CORBA application are objects. Objects reside on various machines throughout the distributed environment and are tasked with performing duties defined by their implementation. The objects are often thought of as the servers in the system.
(The roles are not fixed: a client can act as a server for one event and provide a service to others, and a server can act as a client for another event and request a service from other servers.) However, unlike such standard servers, objects have the ability to move around if needed.

A client communicates with an object through an object reference. Each CORBA object has a unique identifier. There may be many object references to one CORBA object. CORBA object references are opaque to clients; that is, client objects do not know the content of object references. In order to make an object request, a client has to have a reference to the server object [Engineering Distributed Objects, 90]. We can think of the reference as a pointer to the object that allows requests for operations and data access to be sent from the client to the server via an object request broker (ORB). An ORB is best thought of as the traffic cop of the system, in line with the software-bus definition given in the introduction. It knows whether requests should be routed to implementations contained within itself or to another ORB running on another machine.

Every object on the ORB must have an implementation. This implementation is code written to perform tasks on the server machine. In other words, the implementation is what does the actual work of the object. An implementation can be written in any language. It is allowed to perform any task supported by the language, operating system, and underlying hardware. When a request reaches the ORB for which it is intended, the request is passed to an object adapter. The portable object adapter (POA) and its predecessor, the basic object adapter (BOA), form a link between an object's implementation and its presence on the ORB. During the creation of an object on the ORB, the developer must specifically link a newly created object reference and its implementation. The object adapter then informs the ORB that it wants any requests for the new object reference to be routed to it.
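
The relationship between object references, the ORB, and the object adapter can be sketched with a toy in-memory model. The classes ToyOrb and ToyObjectAdapter and the "ref:" prefix are invented for this illustration and are not part of any real CORBA API; a real ORB would also handle networking, marshalling, and typed interfaces.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Demo entry point; listed first so the file can be run directly.
class ToyOrbDemo {
    public static void main(String[] args) {
        ToyObjectAdapter adapter = new ToyObjectAdapter();
        ToyOrb orb = new ToyOrb(adapter);
        // The developer links a new object reference to its implementation.
        String ref = adapter.activate("greeter", name -> "Hello, " + name);
        // The client only holds the reference and sends its request via the ORB.
        System.out.println(orb.invoke(ref, "CORBA"));
    }
}

// Hypothetical object adapter: links object ids to their implementations.
class ToyObjectAdapter {
    private final Map<String, Function<String, String>> implementations = new HashMap<>();

    // Registers an implementation and returns a reference for clients to use.
    String activate(String objectId, Function<String, String> implementation) {
        implementations.put(objectId, implementation);
        return "ref:" + objectId;
    }

    // Passes a request on to the implementation that does the actual work.
    String dispatch(String objectId, String argument) {
        Function<String, String> implementation = implementations.get(objectId);
        if (implementation == null) {
            throw new IllegalArgumentException("no such object: " + objectId);
        }
        return implementation.apply(argument);
    }
}

// Hypothetical ORB: routes requests made on a reference to the object adapter.
class ToyOrb {
    private final ToyObjectAdapter adapter;

    ToyOrb(ToyObjectAdapter adapter) {
        this.adapter = adapter;
    }

    String invoke(String reference, String argument) {
        if (!reference.startsWith("ref:")) {
            throw new IllegalArgumentException("bad reference: " + reference);
        }
        return adapter.dispatch(reference.substring("ref:".length()), argument);
    }
}
```

In this sketch the ORB never looks at the implementations itself; it only forwards the request, which is the division of labour the object adapter is meant to provide.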
With object adapters, the ORB is no longer burdened with the task of keeping track of object implementations. It can now concentrate on its role as request traffic cop.

So, if you have a large distributed system running on hundreds of different machines, how does the client know where to make its request? There are two common ways in which a client can receive an object reference: using interoperable object references (IORs) and using the naming service. Every object on the ORB has an IOR. An IOR includes the type name of the object as a string and a sequence of profiles, which allow references to be annotated with the ORB in which they are valid [Engineering Distributed Objects, 146]. In other words, the IOR is a global identifier string that identifies the machine on which its associated object is located and the interface that the object supports. If given the IOR for an object, a client can use standard function calls on the ORB to turn it into an object reference. With the information contained in the IOR, the ORB knows what type of object is being referenced and the machine to which all requests should be routed.

The simplest way for a client to get the IOR of a server object on the ORB would be for the server object to write its IOR to a file. The client could then read the IOR from this file and have the ORB resolve it into an object reference. This method would work wonderfully for a system running on a single machine, where access to a common file between the client and the server is possible, but that would defeat the purpose of CORBA. Another way for a client to receive a reference to an object is through the naming service.

With a proper connection to an object, the client is now free to make requests. Requests come in two basic forms: a client can request data from an object through access to its attributes or through calls to its methods. Attributes of objects are similar to C++ public member variables.
Anyone is allowed to get or set these values. (Client objects use one operation to read the state of the attribute, while the other operation enables clients to modify the attribute [Engineering Distributed Objects, 92].) When access to an attribute is requested, either for reading the current value or for setting it to a new value, the client does not call one of the object's declared methods. The request is routed to the object adapter, and the current value of the attribute is retrieved or updated through the corresponding read or write operation.

The second form of access to an object is through method calls. Method calls are similar to C++ public member functions, except that they are allowed to have multiple return values without pointers. When a client requests that an object's method be called, the parameters are sent over the ORB to the implementation. The corresponding method in the object's implementation is then called with the parameters from the client. Any return values or exceptions are made available to the client by sending them from the implementation, through the object adapter, and over the ORB.

Interface of CORBA

Interfaces define the public face of the objects in the system. In an interface, a user can define publicly accessible attributes and member functions. Interface attributes are allowed to be any of the core or complex data types, and can be references to another interface as well. Interface member functions are allowed to have any number of parameters and return values, and can throw exceptions. The format of a method definition is:

    <First Return Value Type> methodName (<parameter definition list>) raises (<exception list>);

The method can be given any name you choose, with the exception of IDL keywords. The first return value is defined as a C return value would be: its type is written before the method name. If the method has no return values, the type is void.
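
As a rough illustration of the method format above, the sketch below shows how an operation such as "float get_price(in string symbol) raises (UnknownSymbol);" might look from the Java side. It is a hand-written approximation, not the output of any IDL compiler: the exception UnknownSymbol, the interface name StockMarketOperations, the class StockMarketServant, and the price data are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point; listed first so the file can be run directly.
class MethodFormatDemo {
    public static void main(String[] args) {
        StockMarketOperations market = new StockMarketServant();
        try {
            System.out.println(market.get_price("ACME"));
            market.get_price("NOPE"); // not in the table, so this raises
        } catch (UnknownSymbol e) {
            System.out.println(e.getMessage());
        }
    }
}

// Hypothetical user exception standing in for the IDL "raises" clause.
class UnknownSymbol extends Exception {
    UnknownSymbol(String symbol) {
        super("unknown symbol: " + symbol);
    }
}

// Hand-written Java view of the operation:
//   float get_price(in string symbol) raises (UnknownSymbol);
// The first return value type (float) comes before the method name.
interface StockMarketOperations {
    float get_price(String symbol) throws UnknownSymbol;
}

// Hypothetical implementation holding made-up prices.
class StockMarketServant implements StockMarketOperations {
    private final Map<String, Float> prices = new HashMap<>();

    StockMarketServant() {
        prices.put("ACME", 42.5f);
    }

    @Override
    public float get_price(String symbol) throws UnknownSymbol {
        Float price = prices.get(symbol);
        if (price == null) {
            throw new UnknownSymbol(symbol);
        }
        return price;
    }
}
```

The client sees only the interface: a return value on success, or the declared exception when the request cannot be served.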
Note that “the CORBA object model does not support private or protected operations or methods for the same reason that it does not support private or protected attributes” [Engineering Distributed Objects, 92]. Each parameter in the parameter list has the following format:

    parameter_attribute parameter_type parameter_name

The parameter name identifies the parameter within the implementation. The parameter type can be any core type, complex type, or interface you have previously defined. The parameter attribute defines the direction of the parameter. It has three possible values: in, out, or inout. All in and inout parameters are passed to your method's implementation as parameters. All out and inout parameters are returned from your implementation as results. An implementation is allowed any number of parameters and return values. The inout parameter attribute allows for the illusion of passing a value by reference. Later on, a detailed comparison of the IDLs of CORBA, Java/RMI, and DCOM is made.

Heterogeneity

Heterogeneity is a big issue in middleware design. We can consider it from different angles and domains:

Programming Language Heterogeneity - different programming languages are used to implement objects that communicate with each other in a distributed computing environment.
Middleware Heterogeneity - different middleware such as RMI, CORBA, or COM is involved in a single distributed object communication.
Data Representation Heterogeneity - different hardware and operating system platforms have different data representations, such as big-endian or little-endian notation.

It is the task of middleware to resolve these heterogeneities and to make them fully transparent to end users and clients. Therefore, understanding the basic concepts of the heterogeneities described above is key to a better understanding of middleware architecture and to using middleware more efficiently.
In the next few paragraphs, these three heterogeneities are discussed in detail.

Programming Language Heterogeneity: It is unrealistic to expect only one high-level programming language in the programming world. Each language is designed for a special purpose and has its own advantages. “The programming language that is best suited to one component is unsuitable for the other … therefore, the need to integrate components written in heterogeneous programming languages arises in many projects” [Engineering Distributed Objects, 92]. Typing, type construction, interfaces, method calls, implementations, and inheritance are defined in different ways in different programming languages. Also, the binary representation generated by compilers of the same language may differ, since machine code is hardware-platform dependent.

What is the problem? If we have n objects in a distributed computing system, and suppose these n objects are written in n different programming languages, we have to build n(n-1) interfaces so that each object can talk to every other. This is costly and complicated. The IDL was introduced to solve this problem. For example, the CORBA specification includes six programming language bindings. Integration in CORBA is simplified because the interface definition language was carefully chosen so that it maps relatively easily to a variety of programming languages; only n bindings, one per language, are needed for communication among n objects written in n different programming languages.

Middleware Heterogeneity: So far, we have discussed three kinds of middleware: CORBA, COM, and Java/RMI. The availability of different object-oriented middleware presents a selection problem, but sometimes there is no single optimal middleware, and multiple middleware systems have to be combined. The integration problem that is to be solved by object-oriented middleware may be such that different programming languages have to be used.
Not all middleware can be used with every programming language. Different middleware may be required due to integration with existing systems, due to availability on hardware platforms, or due to different performance profiles [Engineering Distributed Objects, 131]. Each of the three has its own advantages over the other two. For example, Java/RMI is based on Java and is platform independent through the Java VM, which the other two do not offer in the same way; on the other hand, Java/RMI can only be used with Java, which is a limitation, whereas CORBA has bindings for six different languages. It would be wonderful to have a single middleware that integrates all three to build distributed computing systems more efficiently.

Data Representation Heterogeneity: Each computer architecture provides its own definition for the representation of data. Some computers store the least significant byte of an integer at the lowest memory address, others store the most significant byte at the lowest address, and others do not store bytes contiguously in memory. All of these are considered a computer’s native data representation. Programmers who create client and server software must contend with data representation because both endpoints must agree on the exact representation of all data sent across the communication channel between them. If the native data representations on two machines differ, data sent from a program on one machine to a program on the other must be converted. This raises a problem: “If client-server software is designed to convert from the client’s native data representation directly to the server’s native data representation asymmetrically, the number of versions of the software grows as the square of the number of architectures” [TCP/IP, 233]. XDR (External Data Representation) was introduced to solve this problem.
Instead of converting directly from one machine’s representation to the other’s, both client and server perform a data conversion. Before sending data across the network, the sender converts it from its native representation into a standard, machine-independent representation; the receiver then converts from the standard representation into its own native representation. The conversion is discussed in detail in Comer’s TCP/IP book.

Interoperability

Interoperability denotes the ability of different implementations of middleware to work together. Requests are implemented by the middleware systems talking to each other via interoperability bridges. The fact that the server object is connected to middleware from a different vendor is, however, transparent to client programmers. Client and server objects communicate only with their own middleware, as they did before, and bridges between the different middleware systems are used to resolve the remaining heterogeneity.

How does a bridge work? It translates a request from the domain of one middleware implementation to that of another. There are two kinds of bridges: inline bridges and request-level bridges. Inline bridges are built into the core of the middleware and implement a standardized interoperability protocol of the middleware; therefore they cannot be built by application programmers but have to be provided by middleware vendors. I think this can raise the cost or overhead of constructing middleware, because extra code must be executed each time a request arrives from another domain, and it may take longer to get a service. Request-level bridges, on the other hand, are built on top of a middleware implementation, and they use the published specification of the middleware. Request-level bridges can be added by application programmers in order to implement interoperability for a middleware that does not itself implement a standardized interoperability protocol [Engineering Distributed Objects, 134]. I think this technique can increase the complexity of programs for both programmers and client users.
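
The idea of converting through a standard, machine-independent representation, raised above for XDR, can be sketched for a single 32-bit integer. The class and method names are assumptions made for this illustration; the only fixed fact relied on is that XDR puts the most significant byte first (big-endian).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class StandardRepresentationDemo {
    // Sender side: native int -> standard big-endian bytes.
    static byte[] toWire(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    // Receiver side: standard big-endian bytes -> native int.
    static int fromWire(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = toWire(0x01020304);
        // The most significant byte travels first, whatever the host's byte order.
        System.out.println(wire[0] + " " + wire[1] + " " + wire[2] + " " + wire[3]);
        System.out.println(fromWire(wire) == 0x01020304);
        System.out.println("native order of this host: " + ByteOrder.nativeOrder());
    }
}
```

Each architecture then needs only one pair of conversions, to and from the standard form, instead of one converter for every other architecture it might talk to.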
CORBA Interoperability protocols

To allow different ORB implementations to communicate with each other, the format of the messages must be well defined. CORBA uses the General Inter-ORB Protocol (GIOP) to define the format of these messages. Inside a GIOP message, there is information about the operation to be performed on the server, the parameters, and the byte order used by the sender, so that the receiver can convert the data if needed. Parameters and return values are sent in GIOP messages using the Common Data Representation (CDR). CDR defines how all of the IDL types are represented in a binary format. Any CORBA-aware application can receive and understand data in CDR format from any other CORBA application. The Internet Inter-ORB Protocol (IIOP) defines how GIOP messages are sent over TCP/IP. IIOP establishes a mapping of GIOP messages to TCP/IP connections, and the communication protocol to be used when sending GIOP messages over the Internet.

Detail of IDL’s comparison of CORBA, DCOM, JAVA/RMI (see Figure 3)

DCOM - The DCOM IDL file shows that our DCOM server implements a dual interface. COM supports both static and dynamic invocation of objects. This works a bit differently from how CORBA does it through its Dynamic Invocation Interface (DII) or Java does it with Reflection. For static invocation to work, the Microsoft IDL (MIDL) compiler creates the proxy and stub code when run on the IDL file. These are registered in the system registry to allow greater flexibility in their use. This is the vtable method of invoking objects. For dynamic invocation to work, COM objects implement an interface called IDispatch. As with CORBA or Java/RMI, to allow for dynamic invocation there has to be some way to describe the object's methods and their parameters. Type libraries are files that describe the object, and COM provides interfaces, obtained through the IDispatch interface, to query an object's type library.
In COM, an object whose methods are dynamically invoked must be written to support IDispatch. This is unlike CORBA, where any object can be invoked with DII as long as the object information is in the Interface Repository. The DCOM IDL file also associates the IStockMarket interface with an object class StockMarket, as shown in the coclass block. Also notice that in DCOM, each interface is assigned a Universally Unique IDentifier (UUID) called the Interface ID (IID). Similarly, each object class is assigned a unique UUID called a CLasS ID (CLSID). COM gives up on multiple inheritance to provide a binary standard for object implementations. Instead of supporting multiple inheritance, COM uses the notion of an object having multiple interfaces to achieve the same purpose. This also allows for some flexible forms of programming.

CORBA - Both CORBA and Java/RMI support multiple inheritance at the IDL or interface level. One difference between CORBA (and Java/RMI) IDLs and COM IDLs is that CORBA (and Java/RMI) can specify exceptions in the IDLs while DCOM does not. In CORBA, the IDL compiler generates type information for each method in an interface and stores it in the Interface Repository (IR). A client can thus query the IR to get run-time information about a particular interface and then use that information to create and invoke a method on the remote CORBA server object dynamically through the Dynamic Invocation Interface (DII). Similarly, on the server side, the Dynamic Skeleton Interface (DSI) allows a client to invoke an operation of a remote CORBA server object that has no compile-time knowledge of the type of object it is implementing. The CORBA IDL file shows the StockMarket interface with the get_price() method. When an IDL compiler compiles this IDL file, it generates files for stubs and skeletons.

Java/RMI - Notice that unlike the other two, Java/RMI uses a .java file to define its remote interface.
This interface ensures type consistency between the Java/RMI client and the Java/RMI server object. Every remotable server object in Java/RMI has to implement an interface that extends the java.rmi.Remote interface. Similarly, any method that can be remotely invoked in Java/RMI must declare that it may throw a java.rmi.RemoteException. java.rmi.RemoteException is the superclass of many more RMI-specific exception classes. We define an interface called StockMarket which extends the java.rmi.Remote interface. Also notice that the get_price() method throws a java.rmi.RemoteException.

Figure 3. DCOM IDL, CORBA IDL, and Java/RMI interface definition

DCOM IDL (File: StockMarketIDL)

    [
      uuid(7371a240-2e51-11d0-b4c1-444553540000),
      version(1.0)
    ]
    library SimpleStocks
    {
      importlib("stdole32.tlb");
      [
        uuid(BC4C0AB0-5A45-11d2-99C5-00A02414C655),
        dual
      ]
      interface IStockMarket : IDispatch
      {
        HRESULT get_price([in] BSTR p1, [out, retval] float * rtn);
      }
      [
        uuid(BC4C0AB3-5A45-11d2-99C5-00A02414C655)
      ]
      coclass StockMarket
      {
        interface IStockMarket;
      };
    };

CORBA IDL (File: StockMarketIDL)

    module SimpleStocks
    {
      interface StockMarket
      {
        float get_price( in string symbol );
      };
    };

Java/RMI interface definition (File: StockMarket.java)

    package SimpleStocks;
    import java.rmi.*;
    import java.util.*;

    public interface StockMarket extends java.rmi.Remote
    {
      float get_price( String symbol ) throws RemoteException;
    }

Conclusion

After a brief study of the history of middleware and an analysis of some major issues of the three main middleware systems, I have a better understanding of CORBA. I cannot say that CORBA is the best of the three, because each has its own advantages and disadvantages depending on the particular case and situation. As I mentioned earlier, “one of them will eventually win over the other two”. Is that statement true? I don’t think this is a good conclusion. I believe that, in the future, a new technique will emerge that combines CORBA with the other two to reach the single goal of solving the software engineering crisis. For now, the Catalysis approach seems a big opportunity to achieve this goal.
For a basic understanding of CORBA, DCOM, and Java/RMI, the following chart provides a detailed comparison of the three.

Figure 4. Comparison of DCOM, CORBA, and Java/RMI

DCOM: Supports multiple interfaces for objects and uses the QueryInterface() method to navigate among interfaces. This means that a client proxy dynamically loads multiple server stubs in the remoting layer depending on the number of interfaces being used.
CORBA: Supports multiple inheritance at the interface level.
Java/RMI: Supports multiple inheritance at the interface level.

DCOM: Every object implements IUnknown.
CORBA: Every interface inherits from CORBA.Object.
Java/RMI: Every server object implements java.rmi.Remote. (Note: java.rmi.UnicastRemoteObject is merely a convenience class which happens to call UnicastRemoteObject.exportObject(this) in its constructors and provide equals() and hashCode() methods.)

DCOM: Uniquely identifies a remote server object through its interface pointer, which serves as the object handle at run-time.
CORBA: Uniquely identifies remote server objects through object references (objref), which serve as the object handle at run-time. These object references can be externalized (persistified) into strings, which can then be converted back into an objref.
Java/RMI: Uniquely identifies remote server objects with the ObjID, which serves as the object handle at run-time. When you .toString() a remote reference, there will be a substring such as "[1db35d7f:d32ec5b8d3:-8000, 0]" which is unique to the remote server object.

DCOM: Uniquely identifies an interface using the concept of Interface IDs (IID) and uniquely identifies a named implementation of the server object using the concept of Class IDs (CLSID), the mapping of which is found in the registry.
CORBA: Uniquely identifies an interface using the interface name and uniquely identifies a named implementation of the server object by its mapping to a name in the Implementation Repository.
Java/RMI: Uniquely identifies an interface using the interface name and uniquely identifies a named implementation of the server object by its mapping to a URL in the Registry.

DCOM: The remote server object reference generation is performed on the wire protocol by the Object Exporter.
CORBA: The remote server object reference generation is performed on the wire protocol by the Object Adapter.
Java/RMI: The remote server object reference generation is performed by the call to the method UnicastRemoteObject.exportObject(this).

DCOM: Tasks like object registration, skeleton instantiation, etc. are either explicitly performed by the server program or handled dynamically by the COM run-time system.
CORBA: The constructor implicitly performs common tasks like object registration, skeleton instantiation, etc.
Java/RMI: The RMIRegistry performs common tasks like object registration through the Naming class. The UnicastRemoteObject.exportObject(this) method performs skeleton instantiation, and it is implicitly called in the object constructor.

DCOM: Uses the Object Remote Procedure Call (ORPC) as its underlying remoting protocol.
CORBA: Uses the Internet Inter-ORB Protocol (IIOP) as its underlying remoting protocol.
Java/RMI: Uses the Java Remote Method Protocol (JRMP) as its underlying remoting protocol (at least for now).

DCOM: When a client object needs to activate a server object, it can do a CoCreateInstance(). (Note: There are other ways that the client can get a server's interface pointer, but we won't go into that here.)
CORBA: When a client object needs to activate a server object, it binds to a naming or a trader service. (Note: There are other ways that the client can get a server reference, but we won't go into that here.)
Java/RMI: When a client object needs a server object reference, it has to do a lookup() on the remote server object's URL name.

DCOM: The object handle that the client uses is the interface pointer.
CORBA: The object handle that the client uses is the Object Reference.
Java/RMI: The object handle that the client uses is the Object Reference.

DCOM: The mapping of Object Name to its Implementation is handled by the Registry.
CORBA: The mapping of Object Name to its Implementation is handled by the Implementation Repository.
Java/RMI: The mapping of Object Name to its Implementation is handled by the RMIRegistry.

DCOM: The type information for methods is held in the Type Library.
CORBA: The type information for methods is held in the Interface Repository.
Java/RMI: Any type information is held by the Object itself, which can be queried using Reflection and Introspection.

DCOM: The responsibility of locating an object implementation falls on the Service Control Manager (SCM).
CORBA: The responsibility of locating an object implementation falls on the Object Request Broker (ORB).
Java/RMI: The responsibility of locating an object implementation falls on the Java Virtual Machine (JVM).

DCOM: The responsibility of activating an object implementation falls on the Service Control Manager (SCM).
CORBA: The responsibility of activating an object implementation falls on the Object Adapter (OA), either the Basic Object Adapter (BOA) or the Portable Object Adapter (POA).
Java/RMI: The responsibility of activating an object implementation falls on the Java Virtual Machine (JVM).

DCOM: The client side stub is called a proxy. The server side stub is called a stub.
CORBA: The client side stub is called a proxy or stub. The server side stub is called a skeleton.
Java/RMI: The client side stub is called a proxy or stub. The server side stub is called a skeleton.

DCOM: All parameters passed between the client and server objects are defined in the Interface Definition file. Hence, depending on what the IDL specifies, parameters are passed either by value or by reference.
CORBA: When passing parameters between the client and the remote server object, all interface types are passed by reference. All other objects are passed by value, including highly complex data types.
Java/RMI: When passing parameters between the client and the remote server object, all objects implementing interfaces extending java.rmi.Remote are passed by remote reference. All other objects are passed by value.

DCOM: Attempts to perform distributed garbage collection on the wire by pinging. The DCOM wire protocol uses a pinging mechanism to garbage collect remote server object references. These are encapsulated in the IOXIDResolver interface.
CORBA: Does not attempt to perform general-purpose distributed garbage collection.
Java/RMI: Attempts to perform distributed garbage collection of remote server objects using the mechanisms bundled in the JVM.

Reference:

Douglas E. Comer, TCP/IP - Client-Server Programming and Application, 1997.
John Wiley & Sons, Ltd, Engineering Distributed Objects, 2000.
http://www.execpc.com/~gopalan/