Siebel Zookeeper Troubleshooting
Copyright © 2016, Oracle and/or its affiliates. All rights reserved.

Objectives
In this session we discuss Siebel Gateway, Zookeeper, and Gateway cluster troubleshooting.

Agenda
• Changes Introduced in Siebel IP2017
• Siebel Gateway Registry
• Request Flow from AI to Siebel
• About Zookeeper
• Validate Zookeeper & Administration
• Gateway Cluster Validation
• Troubleshooting

Changes Introduced in IP2017

Siebel Enterprise Server Changes
Pre-IP2017
• Gateway data store is static (siebns.dat). Any change requires a Gateway restart.
• Only local provisioning is possible, using Config Wizard.
IP2017
• Gateway hosts a REST API for configuration of the Enterprise and Application Interface.
• Dynamic registry, facilitating elasticity and load balancing.
[Diagram. Pre-IP2017 labels: Siebel Enterprise, Gateway port (usually 2320), Gateway, siebns.dat, Siebel Servers. IP2017 labels: Siebel Enterprise, Gateway application container (REST API, KeyStore, TrustStore, TLS port, authorization, dynamic registry), Siebel Server application container (ConfigAgent, KeyStore, TrustStore); HTTPS port, HTTP port redirecting to HTTPS, and shutdown port.]

Changes to Siebel Gateway

Gateway in Siebel IP2017

About Siebel Gateway Registry
The Siebel Gateway provides the dynamic address registry for Siebel Servers and server components, and also for Siebel Application Interface and other modules, like Siebel Enterprise Cache and Siebel Constraint Engine.

For example, at startup, Siebel Server within the Siebel Enterprise Server stores its network address in the Siebel Gateway’s nonpersistent address registry. Siebel Enterprise Server components query the Siebel Gateway registry for Siebel Server availability and address information. When a Siebel Server shuts down, this information is cleared from the registry.

The Siebel Application Interface and Siebel Gateway work together to provide Siebel Server load balancing. When a user requests a new application connection, Siebel Application Interface sends a request to Siebel Gateway, which returns a connect string for the least-loaded Application Object Manager from among the Siebel Servers supporting that component. The user session will use this Application Object Manager.

The Siebel Gateway also includes persistent storage in the registry for configuration information for Siebel Server, Siebel Application Interface, and other installable components. This information includes:
• Definitions and assignments of component groups and components
• Operational parameters
• Connectivity information

As this configuration information changes, such as during the configuration of Siebel Enterprise, a Siebel Server, or a Siebel Application Interface, this data is written to the Siebel Gateway registry.

[Diagram labels: SAI (HTTPS), Siebel Database]
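
The registration and least-loaded lookup described above can be pictured with a small sketch. It is an illustration only, not Siebel's implementation: the `Registry` class, its method names, and the numeric `load` value are hypothetical, and the host names in the usage lines are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy model of a nonpersistent address registry (hypothetical, not Siebel code)."""

    # component name (lower-cased) -> {connect string: current load}
    entries: dict = field(default_factory=dict)

    def register(self, component: str, connect_string: str, load: int = 0) -> None:
        """Called when a server process starts and publishes its address."""
        self.entries.setdefault(component.lower(), {})[connect_string] = load

    def unregister(self, component: str, connect_string: str) -> None:
        """Called when a server shuts down; its address is cleared."""
        self.entries.get(component.lower(), {}).pop(connect_string, None)

    def least_loaded(self, component: str) -> str:
        """Return the connect string with the smallest load for a component."""
        candidates = self.entries.get(component.lower(), {})
        if not candidates:
            raise LookupError(f"no running process registered for {component}")
        return min(candidates, key=candidates.get)


# Hypothetical hosts and load figures, for illustration only.
registry = Registry()
registry.register("eautoObjMgr_chs", "siebel.TCPIP.NONE.NONE://hostA:2321/ENT/eautoObjMgr_chs", load=12)
registry.register("eautoObjMgr_chs", "siebel.TCPIP.NONE.NONE://hostB:2321/ENT/eautoObjMgr_chs", load=3)
print(registry.least_loaded("eautoobjmgr_chs"))
```

The point of the sketch is the separation the slide describes: addresses come and go with server start and stop, while the lookup only ever picks among processes that are currently registered.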

Siebel Request Flow
When a user requests a new application connection, Siebel Application Interface (SAI) sends a request to Siebel Gateway, which returns a connect string for the least-loaded Application Object Manager from among the Siebel Servers supporting that component. The user session will use this Application Object Manager.
• The request is passed from SAI to CGW (HTTPS).
• CGW does not store anything on its own. It uses Zookeeper to store the configuration.
• CGW needs Siebel configuration and runtime details to expose its RESTful services, and it uses Zookeeper to store and read these details.
• CGW has a few permanent connections into ZK (cloudgateway/v1.0 REST API).
• ZK checks SvcDiscovery for the OM process with the minimum load.
• The request is sent to the Siebel Server.

Siebel Request Flow: Log Review
AI: localhost_access*.log
172.17.44.85 - - [17/Nov/2019:14:13:00 +0800] "GET /siebel/app/eautomotive/chs?SWECmd=Start&SWEHo=172.17.44.85 HTTP/1.0" 200 248
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/app/eautomotive/chs?SWECmd=Start&SWEHo=172.17.44.85 HTTP/1.0" 200 6958
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/files/login.css HTTP/1.0" 200 14084
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/scripts/login.js HTTP/1.0" 200 4634
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/scripts/swecommon.js?_scb=19.9.0.0_SIA_[23073_0]_CHS HTTP/1.0" 200 55512

CG_localhost_access*.log
172.17.44.85 - - [17/Nov/2019:14:13:02 +0800] "GET /siebel/v1.0/cloudgateway/discovery/services/eautoobjmgr_chs/connectstring HTTP/1.1" 200 87

Siebel Request Flow: CommonLogger.log from AI
[DEBUG] 2019-11-17 14:13:02.711 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.handlers.HandlerFactory:getHandler Identified as UI channel
[INFO ] 2019-11-17 14:13:02.712 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.util.CGUtil:cgGet New CGHost after Service Discovery : crmsit.local:9021
[INFO ] 2019-11-17 14:13:02.712 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.util.CGUtil:cgGet New CG URL after service discovery : https://crmsit.local:9021/siebel/v1.0/cloudgateway/discovery/services/eautoobjmgr_chs/connectstring
[INFO ] 2019-11-17 14:13:02.719 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.store.ConfigStore:getConnString Connect String JSON:{ "ConnectString" : "siebel.TCPIP.NONE.NONE://CGWHOST:2321/ENT/eautoObjMgr_chs" }

UI.log from AI
[DEBUG] 2019-11-17 14:13:03.413 [https-openssl-nio-9011-exec-10] UI - com.siebel.swsm.sessmgr.UIConnectionModule:invokeService Calling invokeMethod for Siebel Request : > Value =
• Contains all values of OM Session
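
When reading these logs it can help to split the returned connect string into its parts. The sketch below parses the JSON body shown in CommonLogger.log. The field names (`transport`, `encryption`, `compression`, `enterprise`, `component`) are my reading of the sample `siebel.TCPIP.NONE.NONE://CGWHOST:2321/ENT/eautoObjMgr_chs`, and the helper itself is hypothetical, not part of any Siebel tooling.

```python
import json
import re
from typing import NamedTuple


class ConnectString(NamedTuple):
    transport: str
    encryption: str
    compression: str
    host: str
    port: int
    enterprise: str
    component: str


# Assumed layout: siebel.<transport>.<encryption>.<compression>://<host>:<port>/<enterprise>/<component>
_PATTERN = re.compile(
    r"^siebel\.(?P<transport>[^.]+)\.(?P<encryption>[^.]+)\.(?P<compression>[^.:/]+)"
    r"://(?P<host>[^:/]+):(?P<port>\d+)/(?P<enterprise>[^/]+)/(?P<component>[^/]+)$"
)


def parse_connect_string(payload: str) -> ConnectString:
    """Parse the JSON body returned by the .../connectstring discovery call."""
    value = json.loads(payload)["ConnectString"]
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"unexpected connect string: {value!r}")
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return ConnectString(**parts)


# Body copied from the CommonLogger.log sample above.
sample = '{ "ConnectString" : "siebel.TCPIP.NONE.NONE://CGWHOST:2321/ENT/eautoObjMgr_chs" }'
print(parse_connect_string(sample))
```

A host or port here that does not match a running Siebel Server is a quick hint that the discovery information in the registry is stale.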

Gateway Registry Validation for SAI
• Validate cgclientstore.dat under $sai/applicationcontainer/webapps. It should contain an entry of the form gateway host:HTTPS port.
• Validate GatewayServiceFramework.log:
2019-12-24 15:33:49,176 INFO GtwySvcFrmwrkLog : Discovery: clientAuth for framework is : true .
2019-12-24 15:33:57,690 INFO GtwySvcFrmwrkLog : Discovery: Authentication to framework successfully done
2019-12-24 15:33:57,690 INFO GtwySvcFrmwrkLog : Discovery: CGmetafile created successfully:: C:\Siebel\AI\applicationcontainer\webapps\cgclientstore.dat
2019-12-24 15:33:58,268 INFO GtwySvcFrmwrkLog : Discovery: Connection to Registry successful
2019-12-24 15:33:59,065 INFO GtwySvcFrmwrkLog : Discovery: CGMetafile updated successfully
2019-12-24 15:33:59,096 INFO GtwySvcFrmwrkLog : Discovery: Service Discovery Successful
2019-12-24 15:33:59,096 INFO GtwySvcFrmwrkLog : Discovery: Get Connect String Successful
If there are errors during authentication, fix them first. Otherwise they will cause issues during SvcDiscovery.

What is Zookeeper
ZooKeeper: A Distributed Coordination Service for Distributed Applications

Zookeeper is a distributed configuration coordination system, and we use it for exactly that purpose. In simple terms, ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.

ZooKeeper follows a simple client-server model where clients are nodes (i.e., machines) that make use of the service, and servers are nodes that provide the service. Applications make calls to ZooKeeper through a client library. The client library is responsible for the interaction with ZooKeeper servers.

In Siebel terms, this is called the Siebel Gateway Registry. Zookeeper is started when the Gateway is started.
Basically, CGW does not store anything on its own. It uses Zookeeper to store the configuration.

Zookeeper Service
Unix:
siebel 22512 1 0 Dec13 ? 00:00:00 siebsvc -s gtwyns -a /f /refresh/siebel/ses/gtwysrvr/sys/siebns.dat /t 4330 /c /refresh/siebel/ses/gtwysrvr/bin/gateway.cfg
siebel 22513 22512 0 Dec13 ? 00:17:08 /refresh/siebel/ses/gtwysrvr/../jre/bin/java -Dzookeeper.log.dir=/refresh/siebel/ses/gtwysrvr/zookeeper -Dzookeeper.root.logger=INFO -cp "/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../build/classes:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../build/lib/*.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/log4j-1.2.16.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/jline-0.9.94.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../zookeeper-3.4.8.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../src/java/lib/*.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../conf" -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /refresh/siebel/ses/gtwysrvr/zookeeper/conf/zoo1.cfg
Windows: [screenshot]

How to Verify Zookeeper Status
Zookeeper status can be validated in the following ways:
• Process status: check the OS process for Zookeeper.
• Using zkCli, under the path $gtwysrvr/zookeeper/bin:
Unix: zkCli.sh -server <gatewayhost:registryport>
Windows: zkCli.cmd -server <gatewayhost:registryport>
• Using ZooInspector: How To Read Zookeeper Data In Standalone Mode Using ZooInspector (Doc ID 2427936.1)
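
Alongside the checks above, ZooKeeper's four-letter admin commands (covered in the troubleshooting slides of this deck) can be scripted over a plain TCP socket, which is what telnet or nc do by hand. The sketch below is a minimal version under stated assumptions: the helper names are hypothetical, the host and port must be your own gateway host and registry port, and it assumes the registry answers these commands on its client port as stock ZooKeeper 3.4.8 does.

```python
import socket


def four_letter_word(host: str, port: int, command: str = "srvr", timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    if len(command) != 4:
        raise ValueError("ZooKeeper admin commands are exactly four letters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def server_mode(reply: str) -> str:
    """Extract the 'Mode:' value (standalone, leader, follower) from a srvr/stat reply."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return "unknown"


# Usage (replace the placeholders with your gateway host and registry port):
#   reply = four_letter_word("<gatewayhost>", <registryport>, "srvr")
#   print(server_mode(reply))
```

A refused connection or a timeout here points at the process or the port; a reply without a `Mode:` line suggests the node is running but not currently serving requests.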

Siebel Zookeeper
Zookeeper is a distributed configuration coordination system, and we use it for exactly that purpose, so all component and configuration information is stored in Zookeeper.
Zookeeper contains the following sections:
• Gateways
• SvcDiscovery
• Zookeeper
• Config
• Emdiscovery
• ServiceRegistry
• Enterprises
DEMO

Relevant Information in Zookeeper
• SvcDiscovery: stores component process information. Whenever a component request, such as an OM request, comes to ZK, it is answered from this discovery information.
• Config: stores SMC information such as Profiles and Deployments.
• ServiceRegistry: contains Gateway registry and TLS information.
• EMDiscovery: contains the permanent connection to ZK.
• discovery: https://<gatewayhost:HTTPSPort>/siebel/v1.0/cloudgateway/enterprises?expand=all
• enterprises: contains enterprise information such as component groups, parameters, named subsystems, and so on.
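
For scripted checks, the two Gateway REST paths that appear in this deck can be assembled as below. This is only a URL-building sketch: the function names are hypothetical, the component name is lower-cased simply because the access-log samples show it that way, and authentication is deliberately left out because the deck does not describe it.

```python
from urllib.parse import quote, urlencode

# Base path taken from the URLs shown in the slides and logs.
BASE_PATH = "/siebel/v1.0/cloudgateway"


def enterprises_url(gateway_host: str, https_port: int, expand_all: bool = True) -> str:
    """URL of the enterprises resource, optionally with ?expand=all."""
    query = "?" + urlencode({"expand": "all"}) if expand_all else ""
    return f"https://{gateway_host}:{https_port}{BASE_PATH}/enterprises{query}"


def connect_string_url(gateway_host: str, https_port: int, component: str) -> str:
    """URL of the discovery call that returns a connect string for a component."""
    name = quote(component.lower(), safe="")
    return f"https://{gateway_host}:{https_port}{BASE_PATH}/discovery/services/{name}/connectstring"


# Host and port copied from the CommonLogger.log sample earlier in the deck.
print(enterprises_url("crmsit.local", 9021))
print(connect_string_url("crmsit.local", 9021, "eautoObjMgr_chs"))
```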

Relevant Information in Zookeeper
Every SMC login creates bootstrap requests as below ($ses/applicationcontainer/logs/localhost_access*log):
[16/Dec/2019:18:58:35 -1200] "GET /siebel/v1.0/cloudgateway/bootstrapCG?_=1576565914331 HTTP/1.1" 200 276
[16/Dec/2019:18:58:35 -1200] "GET /siebel/v1.0/cloudgateway/deployments/gatewaycluster?_=1576565914333 HTTP/1.1" 200 1018
[16/Dec/2019:18:58:35 -1200] "GET /siebel/v1.0/cloudgateway/deployments/enterprises?_=1576565914334 HTTP/1.1" 200 365
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/enterprises?_=1576565914332 HTTP/1.1" 200 63
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/cacheserver?_=1576565914337 HTTP/1.1" 200 37
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/migrations?_=1576565914338 HTTP/1.1" 200 348
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/constraintengine?_=1576565914339 HTTP/1.1" 200 42
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/swsm?_=1576565914336 HTTP/1.1" 200 307
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/servers?_=1576565914335 HTTP/1.1" 200 455
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/profiles/cacheserver?_=1576565914341 HTTP/1.1" 200 4136
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/swsm?_=1576565914340 HTTP/1.1" 200 131199
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/cacheclient?_=1576565914342 HTTP/1.1" 200 1300
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/servers?_=1576565914344 HTTP/1.1" 200 1896
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/enterprises?_=1576565914343 HTTP/1.1" 200 3929
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/constraintengine?_=1576565914346 HTTP/1.1" 200 39
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/migrations?_=1576565914345 HTTP/1.1" 200 2766
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/gatewaycluster?_=1576565914348 HTTP/1.1" 200 644
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/security?_=1576565914347 HTTP/1.1" 200 918

Each AOM request creates a discovery request to the Gateway as below ($ses/applicationcontainer/logs/localhost_access*log):
[16/Dec/2019:19:09:20 -1200] "GET /siebel/v1.0/cloudgateway/discovery/services/sccobjmgr_enu/connectstring HTTP/1.1" 200 30
[16/Dec/2019:19:09:59 -1200] "GET /siebel/v1.0/cloudgateway/discovery/services/sccobjmgr_enu/connectstring HTTP/1.1" 200 30

Troubleshooting: Validate ZK from zkCli
Go to $ses/gtwysrvr/zookeeper/bin:
Unix:
./zkCli.sh -server <gtwyhost:registry port>
addauth digest <SADMIN>:<SADMINPASSWORD>
ls /
Windows:
zkCli.cmd -server <gtwyhost:registry port>
addauth digest <SADMIN>:<SADMINPASSWORD>
ls /
NOTE: The credentials should match those in gateway.properties.

Troubleshooting: ZooKeeper Commands: The Four Letter Words
ZooKeeper responds to a small set of commands. Each command is composed of four letters. You issue the commands to ZooKeeper via telnet or nc, at the client port.
Some of the more interesting commands: "stat" gives some general information about the server and connected clients, while "srvr", "wchc", "dump" and "cons" give extended details on the server and its connections.
• conf: Print details about serving configuration.
• cons: List full connection/session details for all clients connected to this server. Includes information on numbers of packets received/sent, session id, operation latencies, last operation performed, etc.
• dump: Lists the outstanding sessions and ephemeral nodes. This only works on the leader.
• srvr: New in 3.3.0: Lists full details for the server.
• stat: Lists brief details for the server and connected clients.
• wchc: Lists detailed information on watches for the server, by session. This outputs a list of sessions (connections) with associated watches (paths). Note that, depending on the number of watches, this operation may be expensive (i.e., impact server performance), so use it carefully.

Enable Zookeeper Server & Client Logs
To enable ZK client and server logs:
• How to Enable Zookeeper Client & Server Logs for Siebel 18.x & later ? (Doc ID 2555834.1)
• Cleanup Zookeeper version-2 folder (Doc ID 2425164.1). This reference contains steps for both Windows and Linux.
Known Issues:
• Bug 29025939 : CLOUD GATEWAY LOSES STATE OF OBJECT MANAGER AND REQUIRES AOM RESTART TO FIX
• Bug 28534810 : ZOOKEEPER IS NOT PURGING TRANSACTION FILES AS EXPECTED
Refer to the Zookeeper Administration Guide: https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html

Gateway Cluster Deployment
Siebel CRM supports an optional native clustering feature for Siebel Gateway to provide high availability benefits to Siebel CRM customers. This feature works at the software level and is the preferred and recommended approach for clustering the Siebel Gateway. This topic is part of Configuring the Siebel Gateway Cluster.
The clustering feature supports both the Siebel Gateway service (application container) and the Siebel Gateway registry (Apache ZooKeeper). You might choose to use Siebel Gateway clustering only for your production environment, for example.
Further, you can use clustering for only the Siebel Gateway service, or only the Siebel Gateway registry. However, it is recommended to configure clustering for both of them.
For a cluster to stay up and running, a majority of the nodes in the cluster must be up. It is therefore recommended to run the Zookeeper (Gateway registry) cluster on an odd number of servers, for example a cluster with 3 nodes or a cluster with 5 nodes.
Refer to: Siebel Gateway Cluster Deployment

Gateway Cluster Validation
• Validate the zoo1.cfg file under $ses\gtwysrvr\zookeeper\conf:
autopurge.purgeInterval=1
initLimit=10
syncLimit=5
autopurge.snapRetainCount=10
maxClientCnxns=10000
snapCount=256
clientPort=8330
tickTime=2000
dataDir=c\:\\Siebel\\ses\\gtwysrvr\\zookeeper
server.1=node1:8335:8336
server.2=node2:8335:8336
server.3=node3:8335:8336
• myid: the myid file consists of a single line containing only the text of that machine's id. So myid of server 1 would contain the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.
• metadata under $ses/siebsrvr
• cgclientstore

Gateway zoo1.cfg Parameters
• clientPort: the port to listen on for client connections; that is, the port that clients attempt to connect to.
• tickTime: the length of a single tick, which is the basic time unit used by ZooKeeper, as measured in milliseconds. It is used to regulate heartbeats and timeouts. For example, the minimum session timeout will be two ticks.
• syncLimit: amount of time, in ticks (see tickTime), to allow followers to sync with ZooKeeper. If followers fall too far behind a leader, they will be dropped.
• initLimit: amount of time, in ticks (see tickTime), to allow followers to connect and sync to a leader. Increase this value as needed if the amount of data managed by ZooKeeper is large.
• initLimit (in other words): the timeout within which a Zookeeper node in the quorum has to connect to the leader.
• syncLimit (in other words): the limit on how far out of sync (i.e., out of date) an individual node can be from the leader.
• snapCount: ZooKeeper logs transactions to a transaction log. After snapCount transactions are written to a log file, a snapshot is started and a new transaction log file is created. The default snapCount is 100,000.
• traceFile: if this option is defined, requests will be logged to a trace file named traceFile.year.month.day. Use of this option provides useful debugging information, but will impact performance. (Note: The system property has no zookeeper prefix, and the configuration variable name is different from the system property. Yes - it's not consistent, and it's annoying.)
• maxClientCnxns: limits the number of concurrent connections (at the socket level) that a single client, identified by IP address, may make to a single member of the ZooKeeper ensemble. This is used to prevent certain classes of DoS attacks, including file descriptor exhaustion. The default is 60 in ZooKeeper 3.4.x (it was 10 in earlier releases). Setting this to 0 entirely removes the limit on concurrent connections.

Gateway Cluster Failover Scenarios Validation
• Stop the Gateway Service (Gateway Tomcat) on any node.
• Stop the Gateway Registry (Gateway Zookeeper) on any node.
• Validate $AI/applicationcontainer/logs/CommonLoggerLog.log. It shows which CGHost the request is moving to.
• Srvrmgr should also still connect through another node when the node it was connecting to is down.
• Exercise with ZK commands.
Demo
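
The zoo1.cfg parameters and the majority rule described in the cluster slides can be checked with a few lines of arithmetic. The sketch below parses the sample configuration from the validation slide and derives the tick-based timeouts and the quorum size. The function names are hypothetical, and the parser handles only plain key=value lines, which is an assumption about how the file is laid out.

```python
# Sample values copied from the Gateway Cluster Validation slide.
SAMPLE = r"""
autopurge.purgeInterval=1
initLimit=10
syncLimit=5
autopurge.snapRetainCount=10
maxClientCnxns=10000
snapCount=256
clientPort=8330
tickTime=2000
dataDir=c\:\\Siebel\\ses\\gtwysrvr\\zookeeper
server.1=node1:8335:8336
server.2=node2:8335:8336
server.3=node3:8335:8336
"""


def parse_zoo_cfg(text: str) -> dict:
    """Parse key=value lines of a ZooKeeper config, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def summarize(settings: dict) -> dict:
    """Derive timeouts (in milliseconds) and quorum figures from parsed settings."""
    tick_ms = int(settings["tickTime"])
    servers = [key for key in settings if key.startswith("server.")]
    ensemble = len(servers) or 1  # no server.N lines means a standalone node
    majority = ensemble // 2 + 1
    return {
        "init_timeout_ms": tick_ms * int(settings["initLimit"]),
        "sync_timeout_ms": tick_ms * int(settings["syncLimit"]),
        "min_session_timeout_ms": 2 * tick_ms,
        "ensemble_size": ensemble,
        "majority": majority,
        "tolerated_failures": ensemble - majority,
    }


for name, value in summarize(parse_zoo_cfg(SAMPLE)).items():
    print(f"{name}: {value}")
```

For the sample file this gives a 20-second init timeout, a 10-second sync timeout, and a majority of 2 out of 3 nodes, so one node can be down. Adding a fourth node raises the majority to 3 without tolerating any more failures, which is why an odd number of servers is recommended.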

Gateway Cluster Known Issues & Bugs
• SADMIN password change does not reflect in zookeeper which is causing gateway cluster failure (Doc ID 2512095.1)
• APPLICATION CAN FAIL TO LOAD IF ONE OF THE GATEWAY NODES IS DOWN WITH GATEWAY CLUSTER (Doc ID 2585549.1)
• Siebel IP18.9 - Gateway Cluster Not Working As Expected/Fails (Doc ID 2530000.1)
• Bug 30301670 : INTERMITTENT COMPONENT JOBS FAILING WHEN GATEWAY NODE IS DOWN

THANK YOU