Information Dissemination Services – Usage patterns discussion document

Author: Vijay Dialani (vkd00r@ecs.soton.ac.uk)

Optimizing Interaction between IDSs

An information dissemination service (IDS) consists of four distinct artefacts: the publication, the subscription, the propagation and the consumption. An IDS uses a publish/subscribe notification mechanism to facilitate information sharing between a set of publishers and consumers. A publication is described by a publication rule, which essentially consists of a data access statement and a data access schedule. It is assumed that a DS (DAIS Service) provides the data resource access for the IDS. An access statement (usually a query statement) is evaluated by the associated data access service. As specified in the DS specification, either a synchronous or an asynchronous access mechanism can be used to access a DS. This document discusses probable usage scenarios of the IDS and elaborates on its probable interactions with the DS.

There are multiple possibilities for optimizing the performance of the information dissemination service. Examples include, but are not limited to, optimizing data access by identifying similarity between query expressions, optimizing the publication process by sharing access across multiple subscriptions, and optimizing data propagation by intelligently managing staging. This document, however, focuses on the service-level semantics and interface modifications that may allow known optimization techniques to be incorporated.

This document suggests modifications to the operational semantics of the IDS. The following general principles underlie the suggested modifications:

- Maximize the resource sharing capabilities between multiple subscriptions.
- Centralize the maintenance of state within the publication service, and facilitate recovery of state for the consumer and the analyst services.
- Optimize resource management by automating object lifetime management through the use of policies.

Modification to usage patterns

The four usage scenarios described in the GDD specification are:

1. Asynchronous SQL Result Set delivery to 3rd party consumers.
2. Publication on Request: Asynchronous SQL Result Set Delivery.
3. Workflow.
4. Continuous replication.

These scenarios were analyzed to abstract the usage patterns for the operations provided by the IDS. Modifications are suggested to introduce the brokerage and discovery patterns for out-of-band communication between the analyst and prospective subscribers of the information. In addition, the assumption of a direct correlation between the publication and the subscription has been relaxed to introduce the notion of a publication shared by a set of subscriptions.

Suggested Modification-1

Explicitly creating and managing a publication (with the brokerage pattern) involves the following:

- An analyst creates an explicit publication and controls the rights for subscription to the data.
- The analyst acts as a broker: it creates a “placeholder” subscription object with the IDS and distributes the subscription identifier to potential subscribers. Use of the brokerage pattern prevents unauthorized propagation of subscription rights.
- The analyst determines the lifetime management policy of the publication.
- Destruction of the publication results in explicit destruction of all the subscription objects.

Suggested Modification-2

Implicitly creating and managing a publication (with the discovery pattern) involves the following:

- A subscription request results in the creation of the publication object and the primary subscription.
- The implicit publication object is published in a discovery service (provided the policy permits).
- Subsequent subscription objects share the publication object.
- Dropping the primary subscription does not necessarily result in dropping the publication.
- The publication object is automatically dropped when no subscription objects remain for the publication.
- Before the publication object is dropped, it is automatically unpublished from the discovery service.

Suggested Modification-3

Selective association of state with IDS objects involves the following:

- An analyst creates a publication for continual data.
- The analyst controls the publication process by explicit calls to publishData( ).
- In accordance with the publication rule, a query is evaluated by the data access service and a data set is generated.
- The data sets are either transported immediately for consumption or cached for retrieval by explicit calls to the getData( ) method.
- Either the subscription object or the consumption object ends up caching the data set.

This scenario represents a staging problem similar to that encountered in the data access mechanism. State maintenance needs to take the staging policy into account to facilitate efficient use of resources. It is suggested that the subscription rule be modified to include the policy for resolving staging issues.

Suggested Modification-4

Providing reflection for publication and subscription objects, and a means to interrelate their lifetime management.

The IDS maintains the state of the valid publications and subscriptions; an associated object identifier (for example a publication ID or a subscription ID) accompanies each message to maintain the context. In certain cases, for example a terminal fault in the analyst, the client cannot retain the state identifiers needed to recreate the state on the client side. Providing enumeration of publication and subscription objects allows state recovery for the analyst and consumers.

Explicit creation and maintenance of objects in an SOA introduces a requirement for autonomous destruction of the objects. By default, the lifetime of an object is tied to temporal constraints, for example a timeout.
However, the lifetimes of the various resources are interrelated. For example, when managing an explicit publication, the lifetime of all the subscription objects is tied to the lifetime of the publication. Adding features to associate the lifetimes of objects within the service seems a probable way of utilizing resources efficiently. A detailed scenario is explained in the following section.

A probable usage pattern

An IDS publication is activated by either an explicit request for a publication or an implicit request for a subscription. In either case the invocation results in the creation of a publication object, a subscription object and a data access object. When the publication object is created explicitly, the lifetimes of all subsequent subscription objects are controlled by the lifetime of the publication object, whereas if a subscription request creates an implicit publication, the subscription object controls the lifetime of the publication. The lifetime of this combined entity controls the lifetime of the data access session.

In its current form the IDS specification necessitates explicit management of the service objects. For example, createPublication() and dropPublication() control the lifetime of a specific publication object. However, policy-based control of the publication process facilitates resource sharing. For example, suppose an analyst has requested a publication of the “customer data” to be accessed and propagated at some time in the future. If the publication is advertised in a discovery service, or is available through the reflection provided by the IDS, a number of probable subscribers can subscribe to the publication and thereby optimize the data access process. In addition, the publication lifetime policy can determine the semantics for dropping the publication.
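
The lifetime relationships described so far can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the IDS interface: the class ToyIDS and all of its method names are hypothetical, publications are matched by exact equality of the rule string (a real service might instead detect similar query expressions), and the discovery service is reduced to a set of advertised identifiers.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Publication:
    pub_id: int
    rule: str
    explicit: bool  # True: created by an analyst; False: created by a subscription
    subscriptions: set = field(default_factory=set)


class ToyIDS:
    """Hypothetical in-memory stand-in for an IDS; not the real service interface."""

    def __init__(self):
        self._ids = count(1)
        self.publications = {}   # pub_id -> Publication
        self.subscriptions = {}  # sub_id -> pub_id
        self.discovery = set()   # pub_ids currently advertised

    def create_publication(self, rule):
        """Explicit creation by an analyst."""
        pub = Publication(next(self._ids), rule, explicit=True)
        self.publications[pub.pub_id] = pub
        return pub.pub_id

    def subscribe(self, rule):
        """Share an existing publication for the rule, or create an implicit one."""
        pub = next((p for p in self.publications.values() if p.rule == rule), None)
        if pub is None:
            pub = Publication(next(self._ids), rule, explicit=False)
            self.publications[pub.pub_id] = pub
            self.discovery.add(pub.pub_id)  # assumes the policy permits advertising
        sub_id = next(self._ids)
        pub.subscriptions.add(sub_id)
        self.subscriptions[sub_id] = pub.pub_id
        return sub_id

    def drop_subscription(self, sub_id):
        pub = self.publications[self.subscriptions.pop(sub_id)]
        pub.subscriptions.discard(sub_id)
        # An implicit publication lives only as long as it has subscribers.
        if not pub.explicit and not pub.subscriptions:
            self.discovery.discard(pub.pub_id)  # unpublish before dropping
            del self.publications[pub.pub_id]

    def drop_publication(self, pub_id):
        """Dropping a publication destroys all of its subscription objects."""
        pub = self.publications.pop(pub_id)
        self.discovery.discard(pub_id)
        for sub_id in pub.subscriptions:
            del self.subscriptions[sub_id]

    def list_publications(self):
        """Enumeration, so a restarted client can recover identifiers."""
        return sorted(self.publications)
```

In this model two subscribe() calls with the same rule share one publication, the implicit publication disappears only when its last subscription is dropped, and drop_publication() cascades to every subscription. The exclusive-rights and shared-rights drop policies are deliberately left out; they would need an authorization model that this document does not define.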
Probable candidate policies for dropping the publication include the following. First, consider the case in which the analyst holds exclusive authorization for extracting the data from the data resource. In this case the analyst has exclusive rights to drop the publication. However, if any of the subsequent subscriptions also have access rights to the underlying data resource, a dropPublication() call by the analyst drops only the publication and subscription requests associated with the analyst, while retaining the remaining subscriptions.

Successful evaluation of doPublish() may result in the service staging the data for multiple subscribers, or in each subscriber initiating its own propagation. The creation of the propagation object proceeds in accordance with the propagation rules. A probable propagation policy may allow automatic propagation of the data on completion of the publication process. The above example uses policy-based object management to maximize resource sharing between multiple subscriptions.

DS Access Patterns

An IDS publication object creates DS-based access to the underlying data resource. An IDS may use the DS in any of the following three patterns:

1. A synchronous access mechanism.
2. An asynchronous access mechanism.
3. A continual data access mechanism. In this pattern, the IDS receives multiple datasets from the DS, each of which may be delivered asynchronously at a different point in time. An example is an online replication mechanism with periodic updates.

The first two access patterns are described in the existing DS specification, while the continual access pattern is not yet supported.

Additionally, some other unresolved issues in both the IDS and the DS are highlighted below:

1. When an IDS accepts a publication request, does it attempt to verify the publication rule at the time of acceptance, and does it verify the schema and access rights with the underlying DS?
2.
Does the DS notify the IDS of any schema changes that may affect its publication requests?
3. Does the DS provide an IDS with metadata information on the authorization model?
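
Returning to the three access patterns listed under DS Access Patterns, the difference between them can be shown from the caller's side with a toy example. This is a sketch only: ToyDS and its method names are hypothetical and are not taken from the DS specification, the query statement is ignored, and the continual pattern is simulated with a fixed number of updates because that pattern is not yet supported by the DS.

```python
import asyncio


class ToyDS:
    """Hypothetical stand-in for a data access service; names are illustrative."""

    def __init__(self, rows):
        self.rows = list(rows)

    def query(self, statement):
        """Pattern 1, synchronous: the caller blocks until the dataset is ready."""
        return list(self.rows)  # the statement is ignored in this toy model

    async def query_async(self, statement):
        """Pattern 2, asynchronous: a single dataset, delivered later."""
        await asyncio.sleep(0)  # placeholder for the real wait
        return list(self.rows)

    async def query_continual(self, statement, updates):
        """Pattern 3, continual: several datasets delivered at different times."""
        for _ in range(updates):
            await asyncio.sleep(0)  # placeholder for the gap between updates
            yield list(self.rows)


async def demo():
    ds = ToyDS([("alice", 1), ("bob", 2)])
    first = ds.query("SELECT * FROM customer")
    second = await ds.query_async("SELECT * FROM customer")
    batches = [
        batch
        async for batch in ds.query_continual("SELECT * FROM customer", updates=3)
    ]
    return first, second, batches
```

The point of the sketch is the shape of the result: the first two patterns each yield exactly one dataset, while the continual pattern yields a sequence of datasets that the IDS must stage or propagate as they arrive.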