Zookeeper data validation:

Go to $ses/gtwysrvr/zookeeper/bin and run:

    zkCli -server host:<registryport>

Once the prompt shows that it is connected, authenticate with the registry user ID and password:

    addauth digest SADMIN:<SADMINPassword>

Then run:

    ls /

If authentication is successful, "ls /" lists the top-level ZooKeeper nodes. To validate each tree, run "ls /<name>".

Four letter commands for Zookeeper using nc:

ZooKeeper responds to a small set of commands. Each command is composed of four letters. You issue the commands to ZooKeeper via telnet or nc, at the client port.

Some of the more interesting commands: "stat" gives some general information about the server and connected clients, while "srvr", "wchc", "dump" and "cons" give extended details on the server and its connections.

conf: Prints details about the serving configuration.
cons: Lists full connection/session details for all clients connected to this server. Includes information on numbers of packets received/sent, session id, operation latencies, last operation performed, etc.
dump: Lists the outstanding sessions and ephemeral nodes. This only works on the leader.
srvr: New in 3.3.0. Lists full details for the server.
stat: Lists brief details for the server and connected clients.
wchc: Lists detailed information on watches for the server, by session. This outputs a list of sessions (connections) with associated watches (paths). Depending on the number of watches, this operation may be expensive (that is, it may impact server performance), so use it carefully.

Unix: nc is available on most Unix OSes and Linux. Please consult your Unix/Linux administrator.
Windows: nc is not available on Windows, so the commands are run through a PowerShell function instead.
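On Unix, the four-letter commands listed above can be wrapped in a small shell helper so that several of them run in one pass. This is only a minimal sketch: the function names zk_4lw and zk_report are hypothetical, and it assumes nc is installed and that the host and registry port are passed as arguments. wchc is left out of the loop on purpose because it can be expensive.

```shell
#!/bin/sh
# Hypothetical helper: send one four-letter command to ZooKeeper and print the reply.
# $1 = host, $2 = registry (client) port, $3 = four-letter command
zk_4lw() {
    echo "$3" | nc "$1" "$2"
}

# Hypothetical helper: run a few inexpensive commands in sequence,
# printing a header line before the output of each one.
# $1 = host, $2 = registry (client) port
zk_report() {
    host=$1
    port=$2
    for cmd in conf srvr stat cons; do
        echo "== $cmd =="
        zk_4lw "$host" "$port" "$cmd"
    done
}

# Example invocation (placeholders, not real values):
#   zk_report <host> <registryport>
```

Redirecting the output of zk_report to a file gives a snapshot that can be compared before and after a change.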
date; echo wchc | nc <host> <registryport>

    [siebel@abc bin]$ date; echo wchc | nc abc.us.oracle.com 4330
    Fri Jul 12 03:49:20 BST 2019
    0x16bb66566630001
    0x16bb66566630904
    /ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
    0x16bb6656663090f
    /ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
    0x16bb66566630003
    0x16bb6656663000d
    /ServiceRegistry/Gateway/abc.us.oracle.com:9011
    /ServiceRegistry/Gateway
    0x16bb66566630907
    /ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
    0x16bb6656663001a
    /ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
    0x16bb66566630005
    /ServiceRegistry
    0x16bb66566630018
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/webtools
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application/Authenticationprop/UserSpec
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application/Authenticationprop/GuestSessionTimeout
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/RESTInBound/Restauthenticationprop
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application/EAISOAPNoSessInPref
    /Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/RESTInBound/Restauthenticationprop/OAuthEndPoint

date; echo cons | nc abc.us.oracle.com 4330

    [siebel@abc bin]$ date; echo cons | nc abc.us.oracle.com 4330
    Fri Jul 12 03:50:58 BST 2019
    /10.64.196.169:28029[0](queued=0,recved=1,sent=0)
    /10.64.196.169:27209[1](queued=0,recved=790,sent=790,sid=0x16bb66566630903,lop=PING,est=1562892020871,to=30000,lcxid=0x5d27d6fb,lzxid=0xfb82,lresp=1562899856756,llat=0,minlat=0,avglat=0,maxlat=10)
    /10.64.196.169:62280[1](queued=0,recved=76567,sent=76567,sid=0x16bb6656663005c,lop=PING,est=1562133723843,to=30000,lcxid=0x5d1c43ed,lzxid=0xfb82,lresp=1562899858412,llat=1,minlat=0,avglat=0,maxlat=15)
    /10.64.196.169:61873[1](queued=0,recved=87838,sent=87838,sid=0x16bb66566630000,lop=PING,est=1562133268588,to=40000,lcxid=0x76ca,lzxid=0xfb82,lresp=1562899850530,llat=1,minlat=0,avglat=0,maxlat=14)
    /10.64.196.169:61874[1](queued=0,recved=87841,sent=87842,sid=0x16bb66566630001,lop=PING,est=1562133268872,to=40000,lcxid=0x76ca,lzxid=0xfb82,lresp=1562899848830,llat=0,minlat=0,avglat=0,maxlat=19)
    /10.64.196.169:61912[1](queued=0,recved=57458,sent=57458,sid=0x16bb66566630013,lop=PING,est=1562133295896,to=40000,lcxid=0x1,lzxid=0xfb82,lresp=1562899849825,llat=0,minlat=0,avglat=0,maxlat=13)
    /10.64.196.169:61914[1](queued=0,recved=57458,sent=57458,sid=0x16bb66566630015,lop=PING,est=1562133295929,to=40000,lcxid=0x0,lzxid=0xfb82,lresp=1562899856234,llat=1,minlat=0,avglat=0,maxlat=13)
    /10.64.196.169:62004[1](queued=0,recved=76592,sent=76592,sid=0x16bb66566630027,lop=PING,est=1562133470495,to=30000,lcxid=0x5d1c43ef,lzxid=0xfb82,lresp=1562899855824,llat=0,minlat=0,avglat=0,maxlat=12)
    /10.64.196.169:61876[1](queued=0,recved=87843,sent=87844,sid=0x16bb66566630003,lop=PING,est=1562133268918,to=40000,lcxid=0x76ca,lzxid=0xfb82,lresp=1562899849775,llat=0,minlat=0,avglat=0,maxlat=13)
    /10.64.196.169:62285[1](queued=0,recved=76566,sent=76566,sid=0x16bb6656663005d,lop=PING,est=1562133724964,to=30000,lcxid=0x5d1c43ee,lzxid=0xfb82,lresp=1562899851874,llat=0,minlat=0,avglat=0,maxlat=14)
    /10.64.196.169:61982[1](queued=0,recved=76592,sent=76592,sid=0x16bb66566630022,lop=PING,est=1562133467495,to=30000,lcxid=0x5d1c5e67,lzxid=0xfb82,lresp=1562899853357,llat=1,minlat=0,avglat=0,maxlat=21)

date; echo dump | nc abc.us.oracle.com 4330

    [siebel@abc bin]$ date; echo dump | nc abc.us.oracle.com 4330
    Fri Jul 12 03:55:20 BST 2019
    SessionTracker dump:
    Session Sets (21):
    0 expire at Fri Jul 12 03:55:22 BST 2019:
    0 expire at Fri Jul 12 03:55:24 BST 2019:
    0 expire at Fri Jul 12 03:55:26 BST 2019:
    0 expire at Fri Jul 12 03:55:28 BST 2019:
    0 expire at Fri Jul 12 03:55:30 BST 2019:
    0 expire at Fri Jul 12 03:55:32 BST 2019:
    0 expire at Fri Jul 12 03:55:34 BST 2019:
    0 expire at Fri Jul 12 03:55:36 BST 2019:
    0 expire at Fri Jul 12 03:55:38 BST 2019:
    0 expire at Fri Jul 12 03:55:40 BST 2019:
    2 expire at Fri Jul 12 03:55:42 BST 2019:
        0x16bb66566630047
        0x16bb6656663005b
    3 expire at Fri Jul 12 03:55:44 BST 2019:
        0x16bb66566630038
        0x16bb6656663005d
        0x16bb66566630022
    3 expire at Fri Jul 12 03:55:46 BST 2019:
        0x16bb66566630027
        0x16bb66566630019
        0x16bb66566630059
    5 expire at Fri Jul 12 03:55:48 BST 2019:
        0x16bb66566630905
        0x16bb66566630906
        0x16bb6656663000f
        0x16bb66566630063
        0x16bb66566630903
    10 expire at Fri Jul 12 03:55:50 BST 2019:
        0x16bb66566630015
        0x16bb66566630910
        0x16bb6656663090f
        0x16bb66566630913
        0x16bb6656663090e
        0x16bb66566630908
        0x16bb66566630911
        0x16bb6656663005a
        0x16bb66566630912
        0x16bb6656663005c
    2 expire at Fri Jul 12 03:55:52 BST 2019:
        0x16bb66566630016
        0x16bb66566630017
    4 expire at Fri Jul 12 03:55:54 BST 2019:
        0x16bb66566630012
        0x16bb66566630014
        0x16bb66566630011
        0x16bb6656663000e
    4 expire at Fri Jul 12 03:55:56 BST 2019:
        0x16bb6656663000d
        0x16bb66566630001
        0x16bb66566630005
        0x16bb66566630010
    8 expire at Fri Jul 12 03:55:58 BST 2019:
        0x16bb66566630002
        0x16bb66566630013
        0x16bb66566630004
        0x16bb66566630009
        0x16bb66566630006
        0x16bb66566630008
        0x16bb66566630003
        0x16bb66566630000
    3 expire at Fri Jul 12 03:56:00 BST 2019:
        0x16bb66566630018
        0x16bb66566630007
        0x16bb66566630904
    2 expire at Fri Jul 12 03:56:02 BST 2019:
        0x16bb6656663001a
        0x16bb66566630907
    ephemeral nodes dump:
    Sessions with Ephemerals (11):
    0x16bb6656663005d: /SvcDiscovery/ESIA17/EAIObjMgr_enu###SSIA17/PID_19447
    0x16bb6656663005c: /SvcDiscovery/ESIA17/CustomAppObjMgr_enu###SSIA17/PID_19426
    0x16bb66566630059: /SvcDiscovery/ESIA17/SServiceObjMgr_enu###SSIA17/PID_19393
    0x16bb66566630038: /SvcDiscovery/ESIA17/SynchMgr###SSIA17/PID_19378
    0x16bb6656663005b: /SvcDiscovery/ESIA17/SCCObjMgr_enu###SSIA17/PID_19385
    0x16bb6656663005a: /SvcDiscovery/ESIA17/SWToolsObjMgr_enu###SSIA17/PID_19463
    0x16bb66566630004: /ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
    0x16bb66566630027: /SvcDiscovery/ESIA17/SRProc###SSIA17/PID_19085
    0x16bb66566630047: /SvcDiscovery/ESIA17/eServiceObjMgr_enu###SSIA17/PID_19388
    0x16bb66566630002: /ServiceRegistry/Gateway/abc.us.oracle.com:9011
    0x16bb66566630022: /SvcDiscovery/ESIA17/SRBroker###SSIA17/PID_19062
    [siebel@abc bin]$

Windows:

a) Open a PowerShell window from Run.
b) Copy and paste the following script into the PowerShell window and press Enter:

    function get_zk_status($zkhost, $zkport, $fourlword) {
        # Encode the four-letter command as ASCII bytes
        $fourlw = [System.Text.Encoding]::ASCII.GetBytes("$fourlword")
        # Open a TCP connection to the ZooKeeper client port and send the command
        $zkconn = New-Object System.Net.Sockets.TcpClient("$zkhost", "$zkport")
        $str = $zkconn.GetStream()
        $str.Write($fourlw, 0, $fourlw.Length)
        # Single read of up to 4096 bytes; longer responses are cut off at that size
        $resp = New-Object System.Byte[] 4096
        $count = $str.Read($resp, 0, 4096)
        [System.Text.Encoding]::ASCII.GetString($resp, 0, $count)
        # Release the stream and the connection
        $str.Close()
        $zkconn.Close()
    }

c) Then run get_zk_status with a four-letter command, in this form:

    get_zk_status REGISTRY_HOSTNAME REGISTRY_PORT_NUMBER <four_letter_command>

   For example:

    get_zk_status <hostname> <registryclientport> wchc
    get_zk_status <hostname> <registryclientport> cons
    get_zk_status <hostname> <registryclientport> dump

Gateway Cluster Troubleshooting with Zookeeper Commands:

Make sure all 3 zk nodes are up.

Then, to identify which ZooKeeper the watchers are going to, run either the wchc or the wchp command. In wchc, the session id comes below the watched nodes, which is very important:

    /SvcDiscovery/ESIA17/EAIObjMgr_enu###SSIA17/PID_1556
    /SvcDiscovery/ESIA17/CustomAppObjMgr_enu###SSIA17
    /SvcDiscovery/ESIA17/SServiceObjMgr_enu###SSIA17/PID_4252
    /SvcDiscovery/ESIA17/SWToolsObjMgr_enu###SSIA17/PID_2612
    /SvcDiscovery/ESIA17/SynchMgr###SSIA17/PID_4908
    /SvcDiscovery/ESIA17/SCCObjMgr_enu###SSIA17/PID_5300
    /SvcDiscovery/ESIA17/SServiceObjMgr_enu###SSIA17
    /SvcDiscovery/ESIA17/SIAServiceCEObjMgr_enu###SSIA17
    /SvcDiscovery/ESIA17/SRBroker###SSIA17
    /SvcDiscovery/ESIA17/SRProc###SSIA17/PID_4872
    /SvcDiscovery/ESIA17/MedicalCEObjMgr_enu###SSIA17
    /SvcDiscovery/ESIA17/CustomAppObjMgr_enu###SSIA17/PID_6284
    /SvcDiscovery/ESIA17/SRBroker###SSIA17/PID_8824
    0x26c28eec4760008

So here our session is 0x26c28eec4760008. Now find that session in the cons output (saved here as cons.txt):

    PS C:\Users\Administrator> cat cons.txt | select-string -pattern '760008'

    /0:0:0:0:0:0:0:1:64605[1](queued=0,recved=45,sent=46,sid=0x26c28eec4760008,lop=PING,est=1564065136170,to=40000,lcxid=0x1,lzxid=0xffffffffffffffff,lresp=1564065736091,llat=0,minlat=0,avglat=0,maxlat=0)

Now check whether client port 64605 can be associated with your javaw.exe:

    PS C:\Users\Administrator> netstat -anbo | select-string 64605

    TCP    [::1]:8330     [::1]:64605    ESTABLISHED    3128
    TCP    [::1]:64605    [::1]:8330     ESTABLISHED    4572

The port is there, owned by PID 4572. Now check that this PID is in fact javaw.exe:

    PS C:\Users\Administrator> tasklist | select-string javaw

    javaw.exe    4572 RDP-Tcp#24    2    362,820 K

This confirms that 4572 is our javaw.exe, which has established a watcher to the local ZooKeeper. The watcher could also be on a remote ZooKeeper, so always check the client IP and port here.

Now shut down this particular ZooKeeper instance.
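On a Unix host, the session lookup in the cons output could be scripted rather than searched by eye. The following is a minimal sketch under stated assumptions: the cons output is piped in on standard input, each connection line has the /ip:port[n](...,sid=0x...,...) shape shown in the examples, and the function name zk_client_for_session is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "cons" output on stdin and print the client
# address (ip:port) of the connection that belongs to the given session id.
# $1 = session id, for example 0x26c28eec4760008
zk_client_for_session() {
    # Match the exact session id (the trailing comma avoids prefix matches),
    # then strip the leading "/" and everything from the first "[" onwards.
    grep -F "sid=$1," | sed -e 's|^[[:space:]]*/||' -e 's|\[.*$||'
}

# Example invocation (placeholders, not real values):
#   echo cons | nc <host> <registryport> | zk_client_for_session 0x26c28eec4760008
```

The last colon-separated field of the printed address is the client port, which is the value to look for in the netstat output.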
Once that instance is down, the ZooKeeper client log should show an error about failed watcher re-creation:

    2019-07-25 03:32:10,429 [myid:] - WARN [localhost-startStop-1SendThread(celvpvm07224.us.oracle.com:8330):ClientCnxn$SendThread@1108] - Client session timed out, have not heard from server in 55108ms for sessionid 0x26c28eec4760008
    2019-07-25 03:32:10,429 [myid:] - INFO [localhost-startStop-1SendThread(celvpvm07224.us.oracle.com:8330):ClientCnxn$SendThread@1156] - Client session timed out, have not heard from server in 55108ms for sessionid 0x26c28eec4760008, closing socket connection and attempting reconnect
    2019-07-25 03:32:10,486 [myid:] - ERROR [https-jsse-nio-9011-exec-3EventThread:ClientCnxn$EventThread@532] - Error while calling watcher
    java.lang.NullPointerException
        at com.siebel.opcgw.cloudgateway.ServerChildWatch.process(ServerChildWatch.java:78)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
    2019-07-25 03:32:10,487 [myid:] - ERROR [https-jsse-nio-9011-exec-3EventThread:ClientCnxn$EventThread@532] - Error while calling watcher
    java.lang.NullPointerException

How to Identify Leader & Follower with Gateway Cluster:

To check which node is the leader and which are followers, run the commands below for all three nodes.

Unix:

    echo stat | nc <gtwy1> <registryport> | grep Mode
    echo stat | nc <gtwy2> <registryport> | grep Mode
    echo stat | nc <gtwy3> <registryport> | grep Mode

Windows:

    get_zk_status <host1> <registryport> stat
    get_zk_status <host2> <registryport> stat
    get_zk_status <host3> <registryport> stat

Output example:

Follower:

    Latency min/avg/max: 0/0/31
    Received: 4436
    Sent: 4436
    Connections: 16
    Outstanding: 0
    Zxid: 0x6800002adc
    Mode: follower
    Node count: 13233

Leader:

    Latency min/avg/max: 0/0/141
    Received: 365839
    Sent: 365842
    Connections: 11
    Outstanding: 0
    Zxid: 0x6800002adc
    Mode: leader
    Node count: 13233
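The three Unix checks above can be collapsed into one loop that prints the mode of every node on a single line each. This is a sketch only: the function names zk_mode and zk_cluster_modes are hypothetical, nc is assumed to be installed, and "unknown" is simply what this sketch prints when no Mode line comes back.

```shell
#!/bin/sh
# Hypothetical helper: read "stat" output on stdin and print only the value
# of the Mode line (for example leader or follower).
zk_mode() {
    sed -n 's/^[[:space:]]*Mode:[[:space:]]*//p'
}

# Hypothetical helper: print "<host> <mode>" for each gateway host.
# $1 = registry (client) port, remaining arguments = gateway hosts
zk_cluster_modes() {
    port=$1
    shift
    for host in "$@"; do
        mode=$(echo stat | nc "$host" "$port" | zk_mode)
        # Fall back to "unknown" when the node returned no Mode line
        echo "$host ${mode:-unknown}"
    done
}

# Example invocation (placeholders, not real values):
#   zk_cluster_modes <registryport> <gtwy1> <gtwy2> <gtwy3>
```

In a healthy three-node cluster, exactly one host should report leader and the other two follower.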