COM/DCOM & COM+
A Primer on the Evolution of a Microsoft Development Environment

The Component Object Model

The Component Object Model (COM) has its roots in OLE version 1, which was created in 1991 as a proprietary document integration and management framework for the Microsoft Office suite. Microsoft later realized that document integration is just a special case of component integration. OLE version 2, released in 1993, was a major enhancement over its predecessor. The foundation of OLE version 2, now called COM, provided a general-purpose mechanism for component integration on Windows platforms. Additions such as DCOM have been made since then, but applications that worked then still work now.

COM refers to both a specification and an implementation developed by Microsoft Corporation that provides a framework for integrating components. This framework supports interoperability and reusability of distributed objects by allowing developers to build systems by assembling reusable components from different vendors that communicate via COM. By applying COM to build systems of preexisting components, developers hope to reap the benefits of maintainability and adaptability. Objects created using the COM specification support the fundamental notions of encapsulation, polymorphism, and reusability.

COM defines a language-independent notion of what an object is: how to create objects, how to invoke methods, and so on. This allows development of components that programmers can use (and reuse) in a consistent way, regardless of which languages they use to write the component and its client.

COM is an architecture for the integration and deployment of software components, rather than a body of techniques for problem analysis. In contrast, most Object Oriented Design (OOD) methodologies were created for more monolithic object-oriented applications.
Therefore, the assumptions that you can make with OOD don't necessarily apply to COM, and important new considerations arise, as summarized in Table 1.

Table 1: Differences in design considerations between OOD and COM

Object-Oriented Design Assumptions | Added COM Considerations
Objects typically packaged in the same application (module) as client code | Objects and clients typically in separate modules, both .EXEs and .DLLs
Objects and clients run in a single process | Objects and clients may run in different processes and on different machines
Class (implementation) inheritance | Interface inheritance (no implementation inheritance)
Single interface per object (the object's class definition) | Multiple interfaces per object
Single client per object | Multiple simultaneous clients per object
1:1 relationships between clients and objects typical | Many:many relationships between clients and objects are common

Designing a component-based system in COM is not just a matter of applying an Object Oriented Design methodology; COM introduces new considerations of packaging, components per package, objects per component, interfaces per object, and simultaneous clients per object.

COM is about choice. It provides the choice of the highest volume languages and tools available, as well as the largest base of applications. COM also provides choice in the area of security, as it provides a common interface (SSPI) into which various security providers can be plugged. COM also provides a choice of network transport.

COM Principles

Under COM, the Windows operating system sees applications as objects. The OS takes responsibility for creating objects when they are required, deleting them when they are not, and handling communications between them, whether they are in the same or different processes or machines. The OS maintains a central registry for the objects. One major advantage of this mechanism is versioning. If a COM object changes to a new version, the applications that use that object need not be recompiled, provided the interfaces they rely on are still supported.
All COM objects are registered with a component database. When a client wishes to create and use a COM object:

1. It invokes the COM API to instantiate a new COM object.
2. COM locates the object implementation and initiates a server process for the object.
3. The server process creates the object and returns an interface pointer to the object.

The client can then interact with the newly instantiated COM object through the interface pointer.

COM defines a binary structure for the interface between the client and the object. This binary structure provides the basis for interoperability between software components written in arbitrary languages. A fully compliant COM object can be written in any language that can produce binary-compatible code. As long as a compiler can reduce language structures down to this binary representation, the implementation language for clients and COM objects does not matter; the point of contact is the run-time binary representation.

COM defines an application programming interface (API) to allow for the creation of components for use in integrating custom applications or to allow diverse components to interact. COM components are never linked to any particular application. The only thing that an application may know about a COM object is which functions it does or does not support. In fact, the object model is so flexible that applications can query the COM object at run time to find out what functionality it provides. As shown in Figure 1, services implemented by COM objects are exposed through a set of interfaces that represent the only point of contact between clients and the object.

Figure 1: Client Using COM Object Through an Interface Pointer

Automatic lifetime management is another major advantage of using COM. Each object keeps a count of the outstanding references (that is, interface pointers) to it, and when there are none left, the COM object destroys itself.

COM Runtime Architecture

COM/DCOM is a truly distributed object-oriented architecture.
Components developed using Microsoft's COM provide a way for two objects in different object spaces or networks to talk to each other by calling each other's methods. COM services are provided in a standard way, whether those services are required within a single running process, between two different processes on the same machine, or between two processes across a network using DCOM. As a result, COM and DCOM provide location transparency.

COM servers (objects) are accessed within the same process, between two different processes on the same machine, or across the network using RPC:

1. In-process server: The client can link directly to a library containing the server. The client and server execute in the same process. Communication is accomplished through function calls.
2. Local Object Proxy: The client can access a server running in a different process but on the same machine through an inter-process communication mechanism. This mechanism is actually a lightweight Remote Procedure Call (RPC).
3. Remote Object Proxy: The client can access a remote server running on another machine. The network communication between client and server is accomplished through DCE RPC. The mechanism supporting access to remote servers is called DCOM.

Figure 2: Three Methods for Accessing COM Objects

COM and DCOM give designers three choices for packaging component code into an executable module: in-process (same process as the client), local (separate process from the client on the same machine), and remote (separate processes on separate machines).
Table 2: Pros and Cons of COM packaging choices

In-process
  Pros: High speed (no remoting overhead), no remoting limitations.
  Cons: No security, no process protection (a crash in the component crashes the process that loaded it), UI synchronization (sharing a message pump) is tricky.
  Preferred uses: Add-on types of components that provide simple services (like function libraries or child-window UI elements) to clients.

Local
  Pros: Process security (separation), process ownership (including threading, memory management, etc.), control over UI synchronization.
  Cons: Slower than in-process (remoting overhead), remoting limitations on interfaces, no access security.
  Preferred uses: Heavier components that are too expensive to load in-process, have UI beyond simple child windows, or wish to manage their own files (such as databases).

Remote
  Pros: Process and access security, process and possible machine ownership (e.g., managing a shared component resource), cross-platform.
  Cons: Slower than local, with additional remoting limitations.
  Preferred uses: Components that need to run in close proximity to a particular resource.

If the client and server are in the same process, the sharing of data between the two is simple. However, when the server process is separate from the client process, as in a local server or remote server, COM must format and bundle the data in order to share it. This process of preparing the data is called marshalling. Distributed computing purists describe marshalling as the process of packaging and transmitting data between different address spaces, automatically resolving pointer problems, while preserving the data's original form and integrity. Marshalling is accomplished through a "proxy" object and a "stub" object that handle the cross-process communication details for any particular interface.
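To make the proxy and stub roles concrete, here is a minimal sketch in Python, chosen only for brevity. The names Calculator, Proxy and Stub and the JSON message format are assumptions made for this illustration and are not part of COM. Real COM proxies and stubs are generated from IDL and exchange RPC messages, not JSON.

```python
import json


class Calculator:
    """Hypothetical server object living in the 'server' address space."""

    def add(self, a, b):
        return a + b


class Stub:
    """Server side: unpacks a request message and calls the real object."""

    def __init__(self, target):
        self._target = target

    def dispatch(self, message):
        request = json.loads(message.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        # Pack the return value for the trip back to the client.
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Client side: looks like the object, but only packs and ships calls."""

    def __init__(self, transport):
        # `transport` is any callable that takes bytes and returns bytes.
        self._transport = transport

    def add(self, a, b):
        message = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self._transport(message)
        return json.loads(reply.decode("utf-8"))["result"]


stub = Stub(Calculator())
# A direct function call stands in for the inter-process or network channel.
proxy = Proxy(stub.dispatch)
total = proxy.add(2, 3)
```

The client only ever touches the proxy, so the same client code would work whether the transport is a local call, an inter-process channel, or a network connection.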
Even when COM objects reside in separate processes, separate address spaces, or even on different machines, the operating system takes care of marshalling the call and invoking the object on the other side. The actual internal implementation of marshalling and unmarshalling differs depending on whether the client and server operate on the same machine (COM) or on different machines (DCOM). Given an IDL file, the Microsoft IDL compiler can create default proxy and stub code that performs all necessary marshalling and unmarshalling.

Figure 3: Cross-process communication in COM

The fact that COM can access services within the same process is a huge differentiator between COM and CORBA. Supporting the in-process model allows for the development of components such as ActiveX Controls or JavaBeans. CORBA at present cannot handle in-process components and therefore cannot participate in the component marketplace.

The IDL

Whenever a client needs some service from a remote distributed object, it invokes a method implemented by the remote object. The service that the remote distributed object (server) provides is encapsulated as an object, and the remote object's interface is described in an Interface Definition Language (IDL). The interfaces specified in the IDL file serve as a contract between a remote object server and its clients. Clients can thus interact with these remote object servers by invoking methods defined in the IDL.

COM objects and interfaces are specified using Microsoft Interface Definition Language (IDL), an extension of the DCE Interface Definition Language standard. To avoid name collisions, each object and interface must have a unique identifier.

Interfaces are considered logically immutable. Once an interface is defined, it should not be changed (new methods should not be added and existing methods should not be modified).
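One consequence of immutable, uniquely identified interfaces is that new functionality is published as a new interface with its own identifier, while the old one stays exactly as it was. The Python sketch below illustrates that idea only; the spell-checker classes, the identifier values and the helper functions are all hypothetical, and real interface identifiers are GUIDs fixed in the IDL file.

```python
import uuid

# Hypothetical interface identifiers (IIDs), invented for this example.
IID_ISPELLCHECK = uuid.UUID("11111111-1111-1111-1111-111111111111")
IID_ISPELLCHECK2 = uuid.UUID("22222222-2222-2222-2222-222222222222")

KNOWN_WORDS = {"component", "object", "model"}


class SpellCheckerV1:
    """First release: supports only the original interface."""

    supported = {IID_ISPELLCHECK}

    def check(self, word):
        return word.lower() in KNOWN_WORDS


class SpellCheckerV2:
    """Later release: keeps the old interface unchanged and adds a new one."""

    supported = {IID_ISPELLCHECK, IID_ISPELLCHECK2}

    def check(self, word):
        return word.lower() in KNOWN_WORDS

    def suggest(self, word):
        # New capability, reachable only through the new interface ID.
        return sorted(w for w in KNOWN_WORDS if w[0] == word[:1].lower())


def old_client(checker, word):
    """Written against the original interface; works with either release."""
    assert IID_ISPELLCHECK in checker.supported
    return checker.check(word)


def new_client(checker, word):
    """Uses the newer interface when present, otherwise returns no suggestions."""
    if IID_ISPELLCHECK2 in checker.supported:
        return checker.suggest(word)
    return []
```

Because the original interface never changes, old_client keeps working against the newer component without modification, and new_client still runs against the older one.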
When developing a COM-based system, it's important to get the interface down in IDL code. In modern COM, IDL is the best way to describe COM interfaces. After describing an interface in IDL, run the IDL through the MIDL compiler, which produces C and C++ header files, a type library, and the source code necessary for building a proxy-stub DLL. Interfaces must be well defined in IDL, because the proxy and the stub need to understand exactly how to move data between the client and the object. This is important because the client and the object might be on different machines, and moving data from the client to the object probably involves moving actual bits back and forth.

To invoke a remote method, the client makes a call to the client proxy. The client-side proxy packs the call parameters into a request message and invokes a wire protocol such as IIOP (in CORBA), ORPC (in DCOM), or JRMP (in Java/RMI) to ship the message to the server. At the server side, the wire protocol delivers the message to the server-side stub. The server-side stub then unpacks the message and calls the actual method on the object. In both CORBA and Java/RMI, the client stub is called the stub or proxy and the server stub is called the skeleton. In DCOM, the client stub is referred to as the proxy and the server stub is referred to as the stub.

The final consideration for using COM is that a single object instance may play different roles for different simultaneous clients, where each client is using a different set of interfaces. A COM object can support any number of interfaces. An interface provides a grouped collection of related methods. In addition, a single piece of client code may be using many different objects polymorphically (through the same interface).

COM gives up multiple inheritance in order to provide a binary standard for object implementations. Instead of supporting multiple inheritance, COM uses the notion of an object having multiple interfaces to achieve the same purpose.
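The idea of one object playing several roles, and of client code using different objects through the same interface, can be sketched as follows. This is an illustration in Python under stated assumptions: IUnknownLike, IPrintable, ISavable, Document, Image and print_all are hypothetical names, and query_interface only mimics the spirit of COM's QueryInterface, not its binary contract.

```python
from abc import ABC, abstractmethod


class IUnknownLike(ABC):
    """Stand-in for COM's root interface (IUnknown in real COM)."""

    @abstractmethod
    def query_interface(self, interface):
        """Return an object usable through `interface`, or None."""


class IPrintable(IUnknownLike):
    @abstractmethod
    def print_out(self):
        ...


class ISavable(IUnknownLike):
    @abstractmethod
    def save(self):
        ...


class Document(IPrintable, ISavable):
    """One object, two interfaces: each client asks for the role it needs."""

    def __init__(self, text):
        self._text = text

    def query_interface(self, interface):
        return self if isinstance(self, interface) else None

    def print_out(self):
        return f"printing document: {self._text}"

    def save(self):
        return f"saved {len(self._text)} characters"


class Image(IPrintable):
    """A different object that supports only the printing interface."""

    def __init__(self, name):
        self._name = name

    def query_interface(self, interface):
        return self if isinstance(self, interface) else None

    def print_out(self):
        return f"printing image: {self._name}"


def print_all(objects):
    """Client code that treats different objects polymorphically."""
    results = []
    for obj in objects:
        printable = obj.query_interface(IPrintable)
        if printable is not None:
            results.append(printable.print_out())
    return results
```

A client interested in saving would ask for ISavable instead and would simply receive None from objects, such as Image here, that do not support that role.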
Supporting multiple interfaces per object also allows for some flexible forms of programming.

DCOM

Distributed COM (DCOM) is an extension to COM that allows network-based component interaction. While COM processes can run on the same machine but in different address spaces, the DCOM extension allows processes to be spread across a network. With DCOM, components operating on a variety of platforms can interact, as long as DCOM is available within the environment.

It is best to consider COM and DCOM as a single technology that provides a range of services for component interaction, from services promoting component integration on a single platform to component interaction across heterogeneous networks. In fact, COM and its DCOM extensions are merged into a single runtime. This single runtime provides both local and remote access.

DCOM, which is often called "COM on the wire", supports remoting objects by running on a protocol called the Object Remote Procedure Call (ORPC). This ORPC layer is built on top of DCE's RPC and interacts with COM's run-time services. A DCOM server is a body of code that is capable of serving up objects of a particular type at runtime. Each DCOM server object can support multiple interfaces, each representing a different behavior of the object. A DCOM client calls into the exposed methods of a DCOM server by acquiring a pointer to one of the server object's interfaces. The client object then starts calling the server object's exposed methods through the acquired interface pointer as if the server object resided in the client's address space. As specified by COM, a server object's memory layout conforms to the C++ vtable layout. Since the COM specification is at the binary level, it allows DCOM server components to be written in diverse programming languages such as C++, Java, Object Pascal (Delphi), Visual Basic, and even COBOL. As long as a platform supports COM services, DCOM can be used on that platform. DCOM is now heavily used on the Windows platform.
ActiveX

In October of 1996 Microsoft turned over COM/DCOM, parts of OLE, and ActiveX to the Open Group (a merger of the Open Software Foundation and X/Open). The Open Group has formed the Active Group to oversee the transformation of the technology into an open standard. The aim of the Active Group is to promote the technology's compatibility across systems (Windows, UNIX, and MacOS) and to oversee future extension by creating working groups dedicated to specific functions. However, it is unclear how much control Microsoft will relinquish over the direction of the technology. Certainly, as the inventor and primary advocate of COM and DCOM, Microsoft is expected to have strong influence on the overall direction of the technology and underlying APIs.

An ActiveX control is really just another term for "OLE Object" or, more specifically, "Component Object Model (COM) Object." In other words, a control, at the very least, is some COM object that supports the IUnknown interface and is also self-registering. It usually supports many more interfaces in order to offer functionality, but all additional interfaces can be viewed as optional and, as such, a container should not rely on any additional interfaces being supported. This allows a control to implement as little functionality as it needs to, instead of supporting a large number of interfaces that actually don't do anything. Through IUnknown a container can manage the lifetime of the control (AddRef and Release), as well as dynamically discover the full extent of a control's functionality based on the available interfaces (QueryInterface). In short, this minimal requirement for nothing more than IUnknown allows any control to be as lightweight as it can.

Other than IUnknown and self-registration, there are no other requirements for a control. There are, however, conventions that should be followed about what the support of an interface means in terms of functionality provided to the container by the control.
It should never be assumed that an interface is available, and standard return-checking conventions should always be followed. It is important for a control or container to degrade gracefully and offer alternative functionality if a required interface is not available.

ActiveX controls have become the primary architecture for developing programmable software components for use in a variety of different containers, ranging from software development tools to end-user productivity tools. For a control to operate well in a variety of containers, the control must be able to assume some minimum level of functionality that it can rely on in all containers.

Component Object Model+

COM+ is much younger than COM: it was announced on September 23, 1997, and is a major upgrade of Microsoft's long-term component strategy. The production release of COM+ shipped with Windows 2000. The Web-centered computing industry has begun to align itself into two technology camps, with one camp centered around Microsoft's COM/DCOM/COM+, Internet Explorer, and ActiveX capabilities, and the other camp championing Netscape, CORBA, and Java/J2EE solutions. Both sides argue vociferously about the relative merits of their approach, but at this time there is no clear technology winner.

COM+ is the next step in the evolution of the Microsoft Component Object Model and the Microsoft Transaction Server (MTS). COM+ is the merging of the COM and MTS programming models with the addition of several new features. COM+ handles many of the resource management tasks that developers previously had to program themselves, such as thread allocation and security. It automatically makes an application more scalable by providing thread pooling, object pooling, and just-in-time object activation. COM+ also protects the integrity of the data by providing transaction support, even if a transaction spans multiple databases over a network.

COM+ has come along to unify COM, DCOM, and MTS into a coherent, enterprise-worthy component technology.
Indeed, these and other technologies constitute Microsoft's distributed and web-oriented strategy. This strategy is referred to as a whole as Distributed interNet Applications Architecture(tm) (DNA), and it comprises a full set of products and specifications for implementing net-centric applications.

COM+ integrates MTS services and message queuing into COM, and makes COM programming easier through a closer integration with Microsoft languages such as Visual Basic, Visual C++, and J++. COM+ does not change the wire protocol that Distributed COM (DCOM) uses, so network communication stays unchanged. COM+ not only adds MTS-like quality of service to every COM+ object, it also hides some of the complexities of coding in COM.

COM suffers from some weaknesses that have been recognized by Microsoft and addressed in COM+. COM is hard to use. Reference counting, Microsoft IDL, Globally Unique Identifiers (GUIDs), and so on require developers to have deep knowledge of the COM specification. It has been estimated that up to 30 percent of the effort in writing component-based software is spent writing "object housekeeping" software. COM+ will handle this task for developers by providing implementations of most of the code that deals with the object infrastructure, leaving developers free to write logic that deals with the problem they are trying to solve.

COM+ provides a much better component administration environment, support for load balancing and object pooling, and an easier-to-use event model. These changes are based on a new COM+ runtime that moves most of the COM grunge code (e.g., IUnknown and class factory implementations) into the OS.

COM+ consists of:

1) A runtime or execution environment.
2) Extensible services, provided by Microsoft, which include transactions, security, load balancing, and automatic memory management. Third-party developers will provide additional extensible services.
3) Innovation, with key concepts such as interception, which enables many of the extensible services that COM+ provides.

COM+ Principles

A good number of developers see COM as more than a little challenging to understand and use. The reason for this is simple. Using COM's language-independent objects in any real programming language requires understanding a new object model: the one defined by COM. For example, a C++ programmer knows that creating a new object requires using the language's new operator, while getting rid of that object requires calling delete. If that same C++ programmer wants to use a COM object, however, the developer can't do things in this familiar way. Instead, the standard COM function CoCreateInstance (or one of a few other choices) must be called to create the object. When done with this COM object, the programmer doesn't delete it explicitly, as in C++, but instead invokes the object's Release method. The object relies on an internal reference count that it maintains to determine when it has no more clients, and thus when it's safe to destroy itself.

COM+ still provides a standard library, and objects and their clients still use it. But in contrast to COM, COM+ hides calls to this library beneath the equivalent native functions in the programming language. C++ programmers, for example, can once again use the standard new operator rather than CoCreateInstance to create a COM+ object. In doing so, they are relying on a C++ compiler that is aware of COM+ to generate the correct code to call the COM+ library. To accomplish this, the compiler uses the COM+ library at compile time, then embeds calls to this same COM+ library in the generated binary. Microsoft will provide this library, and any language tool that wants to use COM+ must rely upon it.
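The classic COM lifetime rules described above (create through a library call, add a reference for each new client, call Release when done) can be sketched in a few lines. This is a Python illustration only: RefCounted and create_instance are hypothetical stand-ins, and create_instance is not the real CoCreateInstance, which takes class and interface identifiers and returns a status code.

```python
class RefCounted:
    """Object that tracks its own clients, in the spirit of AddRef/Release."""

    def __init__(self):
        self._refs = 1          # the creator holds the first reference
        self.destroyed = False

    def add_ref(self):
        if self.destroyed:
            raise RuntimeError("object already destroyed")
        self._refs += 1
        return self._refs

    def release(self):
        if self.destroyed:
            raise RuntimeError("object already destroyed")
        self._refs -= 1
        if self._refs == 0:
            # No clients remain, so the object cleans itself up.
            self.destroyed = True
        return self._refs


def create_instance(cls):
    """Hypothetical stand-in for a COM creation call."""
    return cls()


obj = create_instance(RefCounted)   # first client, count is 1
obj.add_ref()                       # second client, count is 2
obj.release()                       # first client is done, count is 1
still_alive = not obj.destroyed
obj.release()                       # last client is done, object destroys itself
```

Forgetting a release keeps the object alive forever, and an extra release destroys it under a client that still needs it, which is the kind of housekeeping that COM+ aims to take off the developer's hands.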
Unlike classic COM, where only COM objects and their clients use the COM library, COM+ also requires compilers (or interpreters, such as those for Visual Basic and for scripting languages like JavaScript) to rely on a standard library to produce the correct code. COM+ eliminates the need for clients to call Release when they are done using an object. COM+ also allows implementation inheritance between COM+ objects running in the same process.

In COM+, developers no longer need to define interfaces using IDL. Instead, they can just use their programming language's syntax to define the object's interfaces. The compiler for that language then works with the COM+ library to generate metadata for the object. Since every COM+ object has metadata, it's also possible to approach marshaling consistently. Marshaling is packaging a method call's parameters in some standard way, allowing these parameters to move effectively between objects written in entirely different languages or running on entirely different machines. A developer in a COM+ world just provides the metadata that is needed and the methods. Most of the other code, dealing with things such as registering components and reference counting for memory management, will be handled by the system.

COM+ addresses an important but challenging problem in creating a language-independent object model: data types. Different languages support different data types, which causes problems when passing parameters between objects written in different languages.

COM+ introduces the concept of attribute-based programming. A developer sets attributes on an object that tell the system how to treat it, for example, as a transactional object. This attribute-based programming model relies on a mechanism known as interception. Interceptors provide services based on attributes that have been previously set on an object by a developer. These interceptors provide automatic behavior at runtime based on the attributes set.
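A rough sketch of attribute-driven interception follows. It is written in Python and makes several assumptions: the Account class, its `attributes` dictionary, and the Interceptor wrapper are invented for this illustration, and the rollback is a simple in-memory state snapshot, not the distributed transaction support that COM+ actually provides.

```python
import copy


class Account:
    # Hypothetical declarative attribute: the business code below contains
    # no transaction handling of its own.
    attributes = {"transaction": "required"}

    def __init__(self):
        self.balance = 100

    def withdraw(self, amount):
        self.balance -= amount
        if self.balance < 0:
            raise ValueError("insufficient funds")


class Interceptor:
    """Sits between client and object and adds behavior based on attributes."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        member = getattr(self._target, name)
        declared = getattr(type(self._target), "attributes", {})
        if not callable(member) or declared.get("transaction") != "required":
            return member

        def transactional_call(*args, **kwargs):
            state = vars(self._target)
            snapshot = copy.deepcopy(state)
            try:
                return member(*args, **kwargs)
            except Exception:
                # Roll back: restore the object's state as it was before the call.
                state.clear()
                state.update(snapshot)
                raise

        return transactional_call
```

A client that holds `Interceptor(Account())` calls withdraw exactly as it would on the bare object; a failed withdrawal still raises, but the balance is left as it was before the call.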
For example, interception allows for things such as dynamic load balancing. An interceptor also makes sure that when a transactional object attempts to change data, either all of the changes succeed or all of them fail and are rolled back.

COM+ also changes COM's persistence model. Today, the creator of a COM object must typically implement one or more of a fairly large set of interfaces related to persistence. A client of this object then calls various methods in those interfaces to have the object load or save its persistent state. The COM+ library, by contrast, provides standard support for persistence, removing much of the burden from the COM+ object implementor. And by representing an object's properties in a standard way ("serialization"), COM+ lets the developer pass objects by value. All that's required is to send this serialized representation of an object's data to another object of the same class.

Today, COM and MTS components place all of their configuration information in the Windows registry. With COM+, however, most component information is stored in a new database, currently called the COM+ Catalog. The COM+ Catalog unifies the COM and MTS registration models and provides an administrative environment for components. A developer interacts with the COM+ Catalog either by using the COM+ Explorer, which is similar to the MTS Explorer, or through a series of new COM interfaces that expose its capabilities.

Another interesting change, support for constructors, makes COM+ objects more like objects in a typical object-oriented programming language. Languages like C++ and Java can define a constructor method that runs when an object is first created. The creator of the object can then pass parameters as needed to this constructor, allowing easy initialization. COM objects do not support constructors, but COM+ objects do. COM+ constructors even allow passing parameters, better integrating COM+ objects with the objects used by today's most popular object-oriented languages.
Another important COM+ feature is its support for declarative programming. This means that a programmer can develop components in a generic way and defer many of the details until deployment time. For example, one can develop a component that supports working in a load-balanced environment, while the decision of whether or not to use load balancing is deferred. Some applications may want to use load balancing and others may not. Support is indicated by setting an attribute that declares the use of load balancing. This is done at an administrative level using the COM+ Explorer. The MTS declarative security model is another example. Instead of handling security programmatically, you let the administrators do it through MTS packages and the MTS administration model.

Windows DNA

Windows Distributed interNet Applications Architecture, or Windows DNA, is Microsoft's latest acronym describing its move from workstation-based to enterprise-level application development. Windows DNA describes those Microsoft technologies that provide a complete, integrated n-tier development model, and those services that developers require to build scalable and dependable enterprise-level systems on the Windows platform.

Figure 1

Figure 1 depicts Windows DNA as it stands today. When building an application, a developer uses several different Windows and Internet technologies based on the application's target user. Rich client applications are written using the Win32 API and distributed as executables, in the typical fashion. Thin client applications, or those that target a browser, use either straight HTML or dynamic HTML at the presentation tier. At the middle tier, developers use DCOM, MTS, IIS, and Active Server Pages to handle business logic and other application services. Components executing on the middle tier access back-end data using ActiveX Data Objects (ADO) or OLE DB. Microsoft also provides tools to access data on non-Windows platforms.
Examples include ODBC, COM services on UNIX, and the new COM Transaction Integrator (COMTI).

Figure 2

Figure 2 depicts the long-term goal of Windows DNA: a technology called Forms+ at the GUI level, COM+ at the middle tier, and Storage+ on the data tier. Details on Forms+ and Storage+ are sketchy, but the majority of COM+ has been delivered with Windows 2000.

The goal of the Forms+ initiative is to merge the Win32 GUI and Web APIs. Forms+ is Microsoft's answer to the difficulties developers face today when deciding what presentation platform to target when developing an application. Today, developers have to choose either to target Windows using the Win32 API or to target the browser and use HTML or dynamic HTML. Forms+ is a move away from the Win32 API and toward DHTML for Windows presentation development. Spend some time with the architecture of Internet Explorer 5 for a glimpse of where Forms+ is headed.

Storage+ is the future of the Windows file system and will probably look a lot like OLE DB, but with several new features. Storage+ is the most distant technology of Windows DNA, and we won't see much on this front until well after Windows 2000.

Conclusion

It is important to note that COM+ is one of the key unifying elements of Windows DNA, as it allows for the development of applications that are flexible and powerful enough to deal with the spectrum of environments found today, from three-tier client/server environments to web-based ones. What all of this means is that Microsoft understands that the software market is moving faster and faster toward the Internet. All of the future "killer applications" will be developed for Web environments, be they intranets or the Internet. To succeed in this new market, Microsoft is making it easier for developers to build applications without a dependency on the Win32 API.

References

Jason Pritchard, Ph.D., COM and CORBA Side by Side, Addison-Wesley Longman, Inc., 1999.
Doreen L.
Galli, Distributed Operating Systems, Prentice Hall, 2000.
Microsoft Corporation. The Component Object Model Specification, Version 0.9, October 24, 1995 [online]. <URL: http://www.microsoft.com/oledev/> (1995).
Microsoft Corporation. Distributed Component Object Model Protocol-DCOM/1.0, draft, November 1996 [online]. <URL: http://www.microsoft.com/oledev/> (1996).
Kirtland, Mary. "The COM+ Programming Model Makes it Easy to Write Components in Any Language". Microsoft Systems Journal. December, 1997.
Object Management Group home page [online]. The site provides information comparing DCOM (ActiveX) to CORBA. <URL: http://www.omg.org/> (1997).
Cluts, Nancy Winnick. "Creating ActiveX Components in C++". Microsoft Corporation. November, 1996.

By: Paul Visokey, Qisheng Hong, Yani Mulyani