CA Identity Manager
IM clustering on WebSphere – Resolving cluster member startup issues

This documentation and any related computer software help programs (hereinafter referred to as the “Documentation”) is for the end user’s informational purposes only and is subject to change or withdrawal by CA at any time.

This Documentation may not be copied, transferred, reproduced, disclosed, modified or duplicated, in whole or in part, without the prior written consent of CA. This Documentation is confidential and proprietary information of CA and protected by the copyright laws of the United States and international treaties.

Notwithstanding the foregoing, licensed users may print a reasonable number of copies of the Documentation for their own internal use, and may make one copy of the related software as reasonably required for back-up and disaster recovery purposes, provided that all CA copyright notices and legends are affixed to each reproduced copy. Only authorized employees, consultants, or agents of the user who are bound by the provisions of the license for the Product are permitted to have access to such copies.

The right to print copies of the Documentation and to make a copy of the related software is limited to the period during which the applicable license for the Product remains in full force and effect. Should the license terminate for any reason, it shall be the user’s responsibility to certify in writing to CA that all copies and partial copies of the Documentation have been returned to CA or destroyed.

EXCEPT AS OTHERWISE STATED IN THE APPLICABLE LICENSE AGREEMENT, TO THE EXTENT PERMITTED BY APPLICABLE LAW, CA PROVIDES THIS DOCUMENTATION “AS IS” WITHOUT WARRANTY OF ANY KIND, INCLUDING WITHOUT LIMITATION, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT.
IN NO EVENT WILL CA BE LIABLE TO THE END USER OR ANY THIRD PARTY FOR ANY LOSS OR DAMAGE, DIRECT OR INDIRECT, FROM THE USE OF THIS DOCUMENTATION, INCLUDING WITHOUT LIMITATION, LOST PROFITS, BUSINESS INTERRUPTION, GOODWILL, OR LOST DATA, EVEN IF CA IS EXPRESSLY ADVISED OF SUCH LOSS OR DAMAGE.

The use of any product referenced in the Documentation is governed by the end user’s applicable license agreement.

The manufacturer of this Documentation is CA.

Provided with “Restricted Rights.” Use, duplication or disclosure by the United States Government is subject to the restrictions set forth in FAR Sections 12.212, 52.227-14, and 52.227-19(c)(1) - (2) and DFARS Section 252.227-7014(b)(3), as applicable, or their successors.

All trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.

Copyright © 2008 CA. All rights reserved.

Contact Technical Support

For online technical assistance and a complete list of locations, primary service hours, and telephone numbers, contact Technical Support at http://ca.com/support.

Contents

Prerequisites
Workload Manager & ORB Service
    Background
    Workload Manager & each cluster member’s Workflow connection to the ORB service
    Each cluster member’s Workflow connection to the ORB service
Recommended reading

Prerequisites

Identity Manager should be successfully deployed to a WebSphere cluster and validated as functioning correctly for all out-of-the-box capabilities. Consult the documentation for detailed steps.

WebSphere administration skills are essential to follow this document. All steps covered in this document use the WebSphere Deployment Manager console pages unless otherwise noted.

The technical scope of this document is not specific to a particular version of WebSphere.

Workload Manager & ORB Service

Background

CA Identity Manager includes Workflow software, which is EJB intensive. When a server in a cluster starts, its CA Identity Manager Workflow component attempts to communicate with the Object Request Broker (ORB) service. The WebSphere Workload Manager is responsible for facilitating this connection.

Workload Manager & each cluster member’s Workflow connection to the ORB service

The following error is typical when the initial connection to the ORB service cannot be established:

[10/28/13 15:00:29:150 EDT] 00000009 SystemOut O 15:00:29,147 ERROR [ims.tmt.EnvironmentService] Stopping WidgetsPOC-IME
java.lang.RuntimeException: com.netegrity.portal.service.workflow.core.WorkflowException: CORBA NO_IMPLEMENT 0x49421042 No; nested exception is:
org.omg.CORBA.NO_IMPLEMENT:
>> SERVER (id=11c328fe, host=xlab325.widgetst.com) TRACE START:
>> org.omg.CORBA.NO_IMPLEMENT: No Cluster Data Available vmcid: 0x49421000 minor code: 42 completed: No
>> at com.ibm.ws.cluster.router.selection.WLMLSDRouter.select(WLMLSDRouter.java:295)
>> at com.ibm.ws.cluster.propagation.ServerClusterContextListenerImpl.forwardRequest(ServerClusterContextListenerImpl.java:635)
>> at com.ibm.ws.cluster.propagation.ServerClusterContextListenerImpl.validateRequest(ServerClusterContextListenerImpl.java:679)
>> at com.ibm.ws.wlm.server.WLMServerRequestInterceptor.notifyValidationListeners(WLMServerRequestInterceptor.java:317)
>> at com.ibm.ws.wlm.server.WLMServerRequestInterceptor.receive_request_service_contexts(WLMServerRequestInterceptor.java:206)
>> at com.ibm.rmi.pi.InterceptorManager.invokeInterceptor(InterceptorManager.java:607)
>> at com.ibm.rmi.pi.InterceptorManager.iterateServerInterceptors(InterceptorManager.java:520)
>> at com.ibm.rmi.pi.InterceptorManager.iterateReceiveContext(InterceptorManager.java:714)
>> at com.ibm.rmi.iiop.ServerRequestImpl.runInterceptors(ServerRequestImpl.java:156)
>> at com.ibm.rmi.iiop.Connection.respondTo(Connection.java:2902)
>> at com.ibm.rmi.iiop.Connection.doWork(Connection.java:2823)
>> at com.ibm.rmi.iiop.WorkUnitImpl.doWork(WorkUnitImpl.java:65)
>> at com.ibm.ejs.oa.pool.PooledThread.run(ThreadPool.java:118)
>> at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1646)
>> SERVER (id=11c328fe, host=xlab325.widgetst.com) TRACE END.

The CORBA NO_IMPLEMENT message occurs when the EJB client code (the CA Identity Manager Workpoint component) cannot find a server to send the EJB request to. The WebSphere Workload Management component maintains a list of active servers. If the list is empty, the request cannot be sent.

To resolve this issue (which may be intermittent):

1. Ensure that the node agent, deployment manager (dmgr), and application server are up and running and in the same core group.
2. In the administrative console, click "System Administration" on the left side, then click "Cell" underneath it. Select "Custom Properties" in the middle frame and add IBM_CLUSTER_CALLBACK_TIMEOUT with the timeout value in milliseconds. For example, 10000 is 10 seconds; we recommend a setting of 600000, which is equivalent to 10 minutes. Select the “OK” button to save the change.
3. Stop all the cluster members, the node agents, and the deployment manager, then restart.

Notes:

The above property was sufficient to resolve this issue. However, an alternative approach is worth mentioning, because timeouts may not be a panacea for every environment. Set the following two custom properties:

o IBM_CLUSTER_ENABLE_PRELOAD with a value of true to enable the preload function
o IBM_CLUSTER_CALLBACK_TIMEOUT set to 0 (zero)

The preload value must be true if the callback timeout is set to zero. As a result of these two properties, the node agent will cache cluster information. Because the node agent can take much longer to start up, this approach may not be desirable for customers with extremely large topologies.
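
The rules for the two cell-level custom properties described above can be sketched as a small pre-check to run before typing values into the console. This is only an illustration in Python and is not part of WebSphere or CA Identity Manager: the function names `minutes_to_ms` and `check_cluster_properties` are hypothetical, and the sketch never contacts a server. Only the property names, the millisecond unit, and the rule that preload must be true when the callback timeout is zero are taken from this document.

```python
# Hypothetical helpers (not a WebSphere API): they only validate a dict
# of proposed custom property values against the rules in this document.

def minutes_to_ms(minutes):
    """Convert minutes to milliseconds, the unit of IBM_CLUSTER_CALLBACK_TIMEOUT."""
    return int(minutes * 60 * 1000)


def check_cluster_properties(props):
    """Return a list of problems found in the proposed cell custom properties."""
    raw_timeout = props.get("IBM_CLUSTER_CALLBACK_TIMEOUT")
    if raw_timeout is None:
        return ["IBM_CLUSTER_CALLBACK_TIMEOUT is not set"]

    try:
        # Rejects values such as "600,000": the property takes plain digits.
        timeout_ms = int(raw_timeout)
    except ValueError:
        return ["IBM_CLUSTER_CALLBACK_TIMEOUT must be a whole number of milliseconds"]

    preload = str(props.get("IBM_CLUSTER_ENABLE_PRELOAD", "false")).strip().lower()
    problems = []
    # Rule from this document: a zero timeout requires preload to be enabled.
    if timeout_ms == 0 and preload != "true":
        problems.append(
            "IBM_CLUSTER_ENABLE_PRELOAD must be true when the callback timeout is 0"
        )
    return problems


if __name__ == "__main__":
    recommended = {"IBM_CLUSTER_CALLBACK_TIMEOUT": str(minutes_to_ms(10))}
    print(recommended, check_cluster_properties(recommended))

    alternative = {
        "IBM_CLUSTER_CALLBACK_TIMEOUT": "0",
        "IBM_CLUSTER_ENABLE_PRELOAD": "true",
    }
    print(alternative, check_cluster_properties(alternative))
```

An empty list means the combination is consistent with the guidance above. It says nothing about whether the chosen timeout suits a particular topology.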

Each cluster member’s Workflow connection to the ORB service

After the changes in the previous section, the individual servers will connect to the ORB service. However, a delayed response from the ORB listener service can result in the following errors:

[10/29/13 14:28:46:640 EDT] 0000000b SystemOut O 14:28:46,638 ERROR [ims.tmt.EnvironmentService] Stopping WidgetsPOC-IME
java.lang.RuntimeException: com.netegrity.portal.service.workflow.core.WorkflowException: CORBA NO_RESPONSE 0x4942fb01 Maybe; nested exception is:
org.omg.CORBA.NO_RESPONSE: Request 44 timed out vmcid: IBM minor code: B01 completed: Maybe; nested exception is:
java.rmi.RemoteException: CORBA NO_RESPONSE 0x4942fb01 Maybe; nested exception is:
org.omg.CORBA.NO_RESPONSE: Request 44 timed out vmcid: IBM minor code: B01 completed: Maybe
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.startEnvironment(AdaptersConfigServiceImpl.java:157)
    at com.netegrity.ims.businessprocess.IMSEnvironmentServiceImpl.startEnvironmentInternal(IMSEnvironmentServiceImpl.java:445)
    at com.netegrity.ims.businessprocess.IMSEnvironmentServiceImpl.startEnvironment(IMSEnvironmentServiceImpl.java:336)
    at com.netegrity.ims.bootstrap.Main.start(Main.java:267)
    at com.netegrity.webapp.SystemInitializer.contextInitialized(SystemInitializer.java:44)
    at com.ibm.ws.webcontainer.webapp.WebApp.notifyServletContextCreated(WebApp.java:1718)
    at com.ibm.ws.webcontainer.webapp.WebApp.commonInitializationFinish(WebApp.java:385)
    at com.ibm.ws.webcontainer.webapp.WebAppImpl.initialize(WebAppImpl.java:299)
    at com.ibm.ws.webcontainer.webapp.WebGroupImpl.addWebApplication(WebGroupImpl.java:100)
    at com.ibm.ws.webcontainer.VirtualHostImpl.addWebApplication(VirtualHostImpl.java:166)
    at com.ibm.ws.webcontainer.WSWebContainer.addWebApp(WSWebContainer.java:732)
    at com.ibm.ws.webcontainer.WSWebContainer.addWebApplication(WSWebContainer.java:617)
    at com.ibm.ws.webcontainer.component.WebContainerImpl.install(WebContainerImpl.java:376)
    at com.ibm.ws.webcontainer.component.WebContainerImpl.start(WebContainerImpl.java:668)
    at com.ibm.ws.runtime.component.ApplicationMgrImpl.start(ApplicationMgrImpl.java:1127)
    at com.ibm.ws.runtime.component.DeployedApplicationImpl.fireDeployedObjectStart(DeployedApplicationImpl.java:1319)
    at com.ibm.ws.runtime.component.DeployedModuleImpl.start(DeployedModuleImpl.java:611)
    at com.ibm.ws.runtime.component.DeployedApplicationImpl.start(DeployedApplicationImpl.java:944)
    at com.ibm.ws.runtime.component.ApplicationMgrImpl.startApplication(ApplicationMgrImpl.java:740)
    at com.ibm.ws.runtime.component.ApplicationMgrImpl.start(ApplicationMgrImpl.java:2051)
    at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.start(CompositionUnitMgrImpl.java:385)
    at com.ibm.ws.runtime.component.CompositionUnitImpl.start(CompositionUnitImpl.java:123)
    at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.start(CompositionUnitMgrImpl.java:328)
    at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.access$300(CompositionUnitMgrImpl.java:113)
    at com.ibm.ws.runtime.component.CompositionUnitMgrImpl$CUInitializer.run(CompositionUnitMgrImpl.java:895)
    at com.ibm.wsspi.runtime.component.WsComponentImpl$_AsynchInitializer.run(WsComponentImpl.java:496)
    at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1646)
Caused by: com.netegrity.portal.service.workflow.core.WorkflowException: CORBA NO_RESPONSE 0x4942fb01 Maybe; nested exception is:
org.omg.CORBA.NO_RESPONSE: Request 44 timed out vmcid: IBM minor code: B01 completed: Maybe; nested exception is:
java.rmi.RemoteException: CORBA NO_RESPONSE 0x4942fb01 Maybe; nested exception is:
org.omg.CORBA.NO_RESPONSE: Request 44 timed out vmcid: IBM minor code: B01 completed: Maybe
    at com.netegrity.ims.businessprocess.WorkflowManager.getSystemWFSession(WorkflowManager.java:55)
    at com.netegrity.ims.businessprocess.WorkflowServiceImpl.getWorkflowProcesses(WorkflowServiceImpl.java:808)
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.buildWorkflowMapperProperties(AdaptersConfigServiceImpl.java:846)
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.initAdapter(AdaptersConfigServiceImpl.java:826)
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.initVector(AdaptersConfigServiceImpl.java:811)
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.initAdapters(AdaptersConfigServiceImpl.java:800)
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.initEventAdapters(AdaptersConfigServiceImpl.java:589)
    at com.netegrity.ims.bootstrap.AdaptersConfigServiceImpl.startEnvironment(AdaptersConfigServiceImpl.java:150)
    ... 26 more
[10/29/13 14:28:46:640 EDT] 0000000b SystemOut O 14:28:46,640 WARN [ims.tmt.EnvironmentService] * Stopping environment: WidgetsPOC-IME

In the WebSphere Admin Console, under System Administration -> Deployment Manager, change the Request Timeout from 180 to 600 seconds (10 minutes). Select “OK” to save the change. Stop all the cluster members, the node agents, and the deployment manager, then restart.

Notes:

Although the above properties should resolve any issues related to workload management and the ORB service, the following properties may also be investigated if needed:

1. Access the Java Virtual Machine page in the WebSphere Admin Console.
2. On the Java Virtual Machine page, specify one or more of the following command-line arguments in the Generic JVM arguments field:

   o -Dcom.ibm.CORBA.RequestTimeout=timeout_interval
     This argument changes the value of the com.ibm.CORBA.RequestTimeout property, which specifies the timeout period for responding to requests sent from the client. This argument uses the -D option. timeout_interval is the timeout period in seconds. If your network experiences extreme latency, specify a large value to prevent timeouts.
     If you specify a value that is too small, an application server that participates in workload management can time out before it receives a response. Be careful when specifying this property; it has no recommended value. Set it only if your application is experiencing problems with timeouts.

   o -Dcom.ibm.websphere.wlm.unusable.interval=interval
     Use this argument if the workload management state of the client is refreshing too soon or too late. It changes the value of the com.ibm.websphere.wlm.unusable.interval property, which specifies how long the workload management client runtime waits after it marks a server as unavailable before it attempts to contact the server again. This argument uses the -D option. interval is the time in seconds between attempts. The default value is 300 seconds. If the property is set to a large value, the server is marked as unavailable for a long period of time. This prevents the workload management refresh protocol from refreshing the workload management state of the client until after the time period has ended.

3. Click OK and Save to save your configuration changes.
4. Stop the application server and then restart it.

Recommended reading

Best Practices for Large WebSphere Application Server Topologies
http://www3.software.ibm.com/ibmdl/pub/software/dw/wes/0710_largetopologies/LargeWebSphereTopologies.pdf

Ensuring enterprise availability when deploying Enterprise JavaBeans in WebSphere Application Server
http://www.ibm.com/developerworks/websphere/techjournal/1109_col_vanrun/1109_col_vanrun.html
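
As a recap of the two startup failures covered in this document, the mapping from each CORBA exception to the change suggested for it can be sketched as a simple log scan. This is a minimal Python illustration under stated assumptions: the names `REMEDIES` and `triage` are hypothetical, the remedy text only paraphrases the steps in this document, and the scan looks for nothing beyond the two exception names shown in the sample logs.

```python
import re

# Hypothetical lookup table: remedy text paraphrases this document only.
REMEDIES = {
    "NO_IMPLEMENT": (
        "Workload Manager has no cluster data: check that the node agent, dmgr "
        "and application server share a core group, then add the cell custom "
        "property IBM_CLUSTER_CALLBACK_TIMEOUT (for example 600000 ms)."
    ),
    "NO_RESPONSE": (
        "The ORB request timed out: raise the Request Timeout under "
        "System Administration -> Deployment Manager (for example from 180 to 600)."
    ),
}

# Matches the exception names exactly as they appear in the sample logs.
_CORBA_ERROR = re.compile(r"org\.omg\.CORBA\.(NO_IMPLEMENT|NO_RESPONSE)")


def triage(log_text):
    """Return (exception name, suggested change) pairs found in log_text."""
    found = sorted(set(_CORBA_ERROR.findall(log_text)))
    return [(name, REMEDIES[name]) for name in found]


if __name__ == "__main__":
    sample = (
        "ERROR [ims.tmt.EnvironmentService] Stopping WidgetsPOC-IME "
        "nested exception is: org.omg.CORBA.NO_RESPONSE: Request 44 timed out"
    )
    for name, remedy in triage(sample):
        print(name + ": " + remedy)
```

In both cases the document's instruction to stop and restart the cluster members, node agents, and deployment manager still applies after the change is saved.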