Uploaded by Stan Nicolae

Zookeeper data validation

Zookeeper data validation:
Go to $ses/gtwysrvr/zookeeper/bin
Run command: zkCli -server <host>:<registryport>
Once the client reports that it is connected, run: addauth digest SADMIN:<SADMINPassword>
These are the registry user ID and password.
Then run: ls /
If authentication is successful, the ls / command lists the top-level ZooKeeper nodes.
To validate each tree, run "ls /<name>".
Four letter commands for Zookeeper using nc:
ZooKeeper responds to a small set of commands. Each command is composed of four letters. You issue the
commands to ZooKeeper via telnet or nc, at the client port.
Some of the more interesting commands: "stat" gives some general information about the server and connected
clients, while "srvr", "wchc", "dump" and "cons" give extended details on the server, watches, sessions and connections respectively.
conf: Prints details about the serving configuration.
cons: Lists full connection/session details for all clients connected to this server. Includes information on numbers
of packets received/sent, session id, operation latencies, last operation performed, etc.
dump: Lists the outstanding sessions and ephemeral nodes. This only works on the leader.
srvr: New in 3.3.0. Lists full details for the server.
stat: Lists brief details for the server and connected clients.
wchc: Lists detailed information on watches for the server, by session. This outputs a list of sessions (connections)
with associated watches (paths). Note that, depending on the number of watches, this operation may be expensive (i.e.,
impact server performance), so use it carefully.
Unix: nc is available on most Unix OSes and Linux. Please consult your Unix/Linux
administrator if it is not installed.
Windows: nc is not available on Windows, so the commands can be run through a PowerShell function instead (see the Windows steps further down).
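On Unix, the nc invocation can be wrapped in a small shell function so that a mistyped command name is caught before anything is sent. This is only a sketch: the function name zk_4lw is hypothetical (it is not part of ZooKeeper or Siebel), and the accepted list is limited to the commands mentioned in this note.

```shell
# zk_4lw <host> <registryport> <command>
# Hypothetical helper: sends one four-letter command to ZooKeeper through nc.
zk_4lw() {
    case "$3" in
        conf|cons|dump|srvr|stat|wchc|wchp) ;;
        *)
            echo "unsupported four-letter command: '$3'" >&2
            return 2
            ;;
    esac
    # Print a timestamp first, as in the examples in this note.
    date
    echo "$3" | nc "$1" "$2"
}
```

Usage: zk_4lw <host> <registryport> wchc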
date; echo wchc | nc <host> <registryport>
[siebel@abc bin]$ date; echo wchc | nc abc.us.oracle.com 4330
Fri Jul 12 03:49:20 BST 2019
0x16bb66566630001
0x16bb66566630904
/ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
0x16bb6656663090f
/ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
0x16bb66566630003
0x16bb6656663000d
/ServiceRegistry/Gateway/abc.us.oracle.com:9011
/ServiceRegistry/Gateway
0x16bb66566630907
/ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
0x16bb6656663001a
/ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
0x16bb66566630005
/ServiceRegistry
0x16bb66566630018
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/webtools
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application/Authenticationprop/UserSpec
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application/Authenticationprop/GuestSessionTimeout
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/RESTInBound/Restauthenticationprop
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/Applications/callcenter/Application/EAISOAPNoSessInPref
/Config/Profiles/SWSM/AI_Profile/SWSMProfile/ConfigParam/RESTInBound/Restauthenticationprop/OAuthEndPoint
date; echo cons | nc abc.us.oracle.com 4330
[siebel@abc bin]$ date; echo cons | nc abc.us.oracle.com 4330
Fri Jul 12 03:50:58 BST 2019
/10.64.196.169:28029[0](queued=0,recved=1,sent=0)
/10.64.196.169:27209[1](queued=0,recved=790,sent=790,sid=0x16bb66566630903,lop=PING,est=1562892020871,to=30000,lcxid=0x5d27d6fb,lzxid=0xfb82,lresp=1562899856756,llat=0,minlat=0,avglat=0,maxlat=10)
/10.64.196.169:62280[1](queued=0,recved=76567,sent=76567,sid=0x16bb6656663005c,lop=PING,est=1562133723843,to=30000,lcxid=0x5d1c43ed,lzxid=0xfb82,lresp=1562899858412,llat=1,minlat=0,avglat=0,maxlat=15)
/10.64.196.169:61873[1](queued=0,recved=87838,sent=87838,sid=0x16bb66566630000,lop=PING,est=1562133268588,to=40000,lcxid=0x76ca,lzxid=0xfb82,lresp=1562899850530,llat=1,minlat=0,avglat=0,maxlat=14)
/10.64.196.169:61874[1](queued=0,recved=87841,sent=87842,sid=0x16bb66566630001,lop=PING,est=1562133268872,to=40000,lcxid=0x76ca,lzxid=0xfb82,lresp=1562899848830,llat=0,minlat=0,avglat=0,maxlat=19)
/10.64.196.169:61912[1](queued=0,recved=57458,sent=57458,sid=0x16bb66566630013,lop=PING,est=1562133295896,to=40000,lcxid=0x1,lzxid=0xfb82,lresp=1562899849825,llat=0,minlat=0,avglat=0,maxlat=13)
/10.64.196.169:61914[1](queued=0,recved=57458,sent=57458,sid=0x16bb66566630015,lop=PING,est=1562133295929,to=40000,lcxid=0x0,lzxid=0xfb82,lresp=1562899856234,llat=1,minlat=0,avglat=0,maxlat=13)
/10.64.196.169:62004[1](queued=0,recved=76592,sent=76592,sid=0x16bb66566630027,lop=PING,est=1562133470495,to=30000,lcxid=0x5d1c43ef,lzxid=0xfb82,lresp=1562899855824,llat=0,minlat=0,avglat=0,maxlat=12)
/10.64.196.169:61876[1](queued=0,recved=87843,sent=87844,sid=0x16bb66566630003,lop=PING,est=1562133268918,to=40000,lcxid=0x76ca,lzxid=0xfb82,lresp=1562899849775,llat=0,minlat=0,avglat=0,maxlat=13)
/10.64.196.169:62285[1](queued=0,recved=76566,sent=76566,sid=0x16bb6656663005d,lop=PING,est=1562133724964,to=30000,lcxid=0x5d1c43ee,lzxid=0xfb82,lresp=1562899851874,llat=0,minlat=0,avglat=0,maxlat=14)
/10.64.196.169:61982[1](queued=0,recved=76592,sent=76592,sid=0x16bb66566630022,lop=PING,est=1562133467495,to=30000,lcxid=0x5d1c5e67,lzxid=0xfb82,lresp=1562899853357,llat=1,minlat=0,avglat=0,maxlat=21)
date; echo dump | nc abc.us.oracle.com 4330
[siebel@abc bin]$ date; echo dump | nc abc.us.oracle.com 4330
Fri Jul 12 03:55:20 BST 2019
SessionTracker dump:
Session Sets (21):
0 expire at Fri Jul 12 03:55:22 BST 2019:
0 expire at Fri Jul 12 03:55:24 BST 2019:
0 expire at Fri Jul 12 03:55:26 BST 2019:
0 expire at Fri Jul 12 03:55:28 BST 2019:
0 expire at Fri Jul 12 03:55:30 BST 2019:
0 expire at Fri Jul 12 03:55:32 BST 2019:
0 expire at Fri Jul 12 03:55:34 BST 2019:
0 expire at Fri Jul 12 03:55:36 BST 2019:
0 expire at Fri Jul 12 03:55:38 BST 2019:
0 expire at Fri Jul 12 03:55:40 BST 2019:
2 expire at Fri Jul 12 03:55:42 BST 2019:
0x16bb66566630047
0x16bb6656663005b
3 expire at Fri Jul 12 03:55:44 BST 2019:
0x16bb66566630038
0x16bb6656663005d
0x16bb66566630022
3 expire at Fri Jul 12 03:55:46 BST 2019:
0x16bb66566630027
0x16bb66566630019
0x16bb66566630059
5 expire at Fri Jul 12 03:55:48 BST 2019:
0x16bb66566630905
0x16bb66566630906
0x16bb6656663000f
0x16bb66566630063
0x16bb66566630903
10 expire at Fri Jul 12 03:55:50 BST 2019:
0x16bb66566630015
0x16bb66566630910
0x16bb6656663090f
0x16bb66566630913
0x16bb6656663090e
0x16bb66566630908
0x16bb66566630911
0x16bb6656663005a
0x16bb66566630912
0x16bb6656663005c
2 expire at Fri Jul 12 03:55:52 BST 2019:
0x16bb66566630016
0x16bb66566630017
4 expire at Fri Jul 12 03:55:54 BST 2019:
0x16bb66566630012
0x16bb66566630014
0x16bb66566630011
0x16bb6656663000e
4 expire at Fri Jul 12 03:55:56 BST 2019:
0x16bb6656663000d
0x16bb66566630001
0x16bb66566630005
0x16bb66566630010
8 expire at Fri Jul 12 03:55:58 BST 2019:
0x16bb66566630002
0x16bb66566630013
0x16bb66566630004
0x16bb66566630009
0x16bb66566630006
0x16bb66566630008
0x16bb66566630003
0x16bb66566630000
3 expire at Fri Jul 12 03:56:00 BST 2019:
0x16bb66566630018
0x16bb66566630007
0x16bb66566630904
2 expire at Fri Jul 12 03:56:02 BST 2019:
0x16bb6656663001a
0x16bb66566630907
ephemeral nodes dump:
Sessions with Ephemerals (11):
0x16bb6656663005d:
/SvcDiscovery/ESIA17/EAIObjMgr_enu###SSIA17/PID_19447
0x16bb6656663005c:
/SvcDiscovery/ESIA17/CustomAppObjMgr_enu###SSIA17/PID_19426
0x16bb66566630059:
/SvcDiscovery/ESIA17/SServiceObjMgr_enu###SSIA17/PID_19393
0x16bb66566630038:
/SvcDiscovery/ESIA17/SynchMgr###SSIA17/PID_19378
0x16bb6656663005b:
/SvcDiscovery/ESIA17/SCCObjMgr_enu###SSIA17/PID_19385
0x16bb6656663005a:
/SvcDiscovery/ESIA17/SWToolsObjMgr_enu###SSIA17/PID_19463
0x16bb66566630004:
/ServiceRegistry/TLSGateway/abc.us.oracle.com:9014
0x16bb66566630027:
/SvcDiscovery/ESIA17/SRProc###SSIA17/PID_19085
0x16bb66566630047:
/SvcDiscovery/ESIA17/eServiceObjMgr_enu###SSIA17/PID_19388
0x16bb66566630002:
/ServiceRegistry/Gateway/abc.us.oracle.com:9011
0x16bb66566630022:
/SvcDiscovery/ESIA17/SRBroker###SSIA17/PID_19062
[siebel@abc bin]$
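The "Sessions with Ephemerals" part of the dump output shows which components are registered under which session. As a rough sketch, the pairs can be flattened to one line each with awk so they are easier to grep. The function name dump_ephemerals is hypothetical, and the parsing assumes the layout shown in the transcript above (a session id ending in a colon, followed by its node paths).

```shell
# dump_ephemerals: reads "dump" output on stdin and prints
# "<session id> <ephemeral node path>" for each ephemeral node.
# Assumes the layout shown in the transcript above.
dump_ephemerals() {
    awk '
        # Everything before this header is the session expiry list; skip it.
        /^Sessions with Ephemerals/ { in_eph = 1; next }
        # A line such as "0x16bb6656663005d:" starts a new session.
        in_eph && /^[ \t]*0x[0-9a-fA-F]+:[ \t]*$/ {
            sid = $1
            sub(/:$/, "", sid)
            next
        }
        # A line starting with "/" is a node owned by the current session.
        in_eph && /^[ \t]*\// { print sid, $1 }
    '
}
```

Usage: echo dump | nc <host> <registryport> | dump_ephemerals | grep SvcDiscovery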
Windows:
a) Open a PowerShell window (for example, from the Run dialog).
b) Copy and paste the following script into the PowerShell window and press Enter:
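function get_zk_status($zkhost, $zkport, $fourlword) {
    # Send a four-letter command to ZooKeeper and print the full reply.
    $fourlw = [System.Text.Encoding]::ASCII.GetBytes("$fourlword")
    $zkconn = New-Object System.Net.Sockets.TcpClient("$zkhost", [int]$zkport)
    $str = $zkconn.GetStream()
    $str.Write($fourlw, 0, $fourlw.Length)
    $resp = New-Object System.Byte[] 4096
    $out = New-Object System.Text.StringBuilder
    # The server closes the connection after replying, so read until Read returns 0.
    # A single 4096-byte read would truncate long wchc or dump output.
    while (($count = $str.Read($resp, 0, $resp.Length)) -gt 0) {
        [void]$out.Append([System.Text.Encoding]::ASCII.GetString($resp, 0, $count))
    }
    $out.ToString()
    $str.Close()
    $zkconn.Close()
}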
c) Then run get_zk_status with the four-letter commands as below:
get_zk_status <hostname> <registryclientport> <four_letter_command>
get_zk_status <hostname> <registryclientport> wchc
get_zk_status <hostname> <registryclientport> cons
get_zk_status <hostname> <registryclientport> dump
Gateway Cluster Troubleshooting with Zookeeper Commands:
Make sure all three ZooKeeper nodes are up.
Then, to identify which ZooKeeper instance the watchers are connected to, run either the wchc or the wchp command:
In wchc, the session id comes below the watched nodes, which is very important:
/SvcDiscovery/ESIA17/EAIObjMgr_enu###SSIA17/PID_1556
/SvcDiscovery/ESIA17/CustomAppObjMgr_enu###SSIA17
/SvcDiscovery/ESIA17/SServiceObjMgr_enu###SSIA17/PID_4252
/SvcDiscovery/ESIA17/SWToolsObjMgr_enu###SSIA17/PID_2612
/SvcDiscovery/ESIA17/SynchMgr###SSIA17/PID_4908
/SvcDiscovery/ESIA17/SCCObjMgr_enu###SSIA17/PID_5300
/SvcDiscovery/ESIA17/SServiceObjMgr_enu###SSIA17
/SvcDiscovery/ESIA17/SIAServiceCEObjMgr_enu###SSIA17
/SvcDiscovery/ESIA17/SRBroker###SSIA17
/SvcDiscovery/ESIA17/SRProc###SSIA17/PID_4872
/SvcDiscovery/ESIA17/MedicalCEObjMgr_enu###SSIA17
/SvcDiscovery/ESIA17/CustomAppObjMgr_enu###SSIA17/PID_6284
/SvcDiscovery/ESIA17/SRBroker###SSIA17/PID_8824
0x26c28eec4760008
So here our session is 0x26c28eec4760008
Now find that session in cons output:
PS C:\Users\Administrator> cat cons.txt | select-string -pattern '760008'
/0:0:0:0:0:0:0:1:64605[1](queued=0,recved=45,sent=46,sid=0x26c28eec4760008,lop=PING,est=1564065136170,to=40000,lcxid=0x1,lzxid=0xffffffffffffffff,lresp=1564065736091,llat=0,minlat=0,avglat=0,maxlat=0)
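On Unix, the same lookup can be sketched with awk: given a session id, print the client address (IP and port) from the matching cons line. The function name cons_client_for_sid is hypothetical, and the parsing assumes that each connection is on one line in the /<ip>:<port>[n](...,sid=<id>,...) form that cons prints.

```shell
# cons_client_for_sid <session id>
# Reads "cons" output on stdin and prints the client ip:port that owns
# the given session id. Assumes one connection per line.
cons_client_for_sid() {
    awk -v sid="$1" '
        # Match the exact id: "sid=<id>," so that a shorter id cannot
        # match as a prefix of a longer one.
        index($0, "sid=" sid ",") {
            addr = $0
            sub(/^[ \t]*\//, "", addr)   # drop the leading "/"
            sub(/\[.*/, "", addr)        # drop "[1](queued=...)"
            print addr
        }
    '
}
```

Usage: echo cons | nc <host> <registryport> | cons_client_for_sid 0x26c28eec4760008
For the cons line shown above, this prints 0:0:0:0:0:0:0:1:64605.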
Now check whether you can associate your javaw.exe process with port 64605:
PS C:\Users\Administrator> netstat -anbo | select-string 64605
TCP    [::1]:8330     [::1]:64605    ESTABLISHED    3128
TCP    [::1]:64605    [::1]:8330     ESTABLISHED    4572
OK, the connection is there. Now check whether PID 4572 is in fact javaw.exe:
PS C:\Users\Administrator> tasklist | select-string javaw
javaw.exe    4572 RDP-Tcp#24    2    362,820 K
OK, this looks good. PID 4572 is our javaw.exe, which has established a watcher on the local ZooKeeper. The watcher
could also be on a remote ZooKeeper, so always check the client IP and port here.
Now shut down this particular ZooKeeper instance.
The ZooKeeper client log should then show an error about the failed watcher re-creation:
2019-07-25 03:32:10,429 [myid:] - WARN [localhost-startStop-1SendThread(celvpvm07224.us.oracle.com:8330):ClientCnxn$SendThread@1108] - Client session timed out, have not heard from server in 55108ms for sessionid 0x26c28eec4760008
2019-07-25 03:32:10,429 [myid:] - INFO [localhost-startStop-1SendThread(celvpvm07224.us.oracle.com:8330):ClientCnxn$SendThread@1156] - Client session timed out, have not heard from server in 55108ms for sessionid 0x26c28eec4760008, closing socket connection and attempting reconnect
2019-07-25 03:32:10,486 [myid:] - ERROR [https-jsse-nio-9011-exec-3EventThread:ClientCnxn$EventThread@532] - Error while calling watcher
java.lang.NullPointerException
at com.siebel.opcgw.cloudgateway.ServerChildWatch.process(ServerChildWatch.java:78)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
2019-07-25 03:32:10,487 [myid:] - ERROR [https-jsse-nio-9011-exec-3EventThread:ClientCnxn$EventThread@532] - Error while calling watcher
java.lang.NullPointerException
How to Identify Leader & Follower with Gateway Cluster:
To check which node is the leader and which are followers, run the commands below against all three nodes.
Unix:
echo stat | nc <gtwy1> <registryport> | grep Mode
echo stat | nc <gtwy2> <registryport> | grep Mode
echo stat | nc <gtwy3> <registryport> | grep Mode
Windows:
get_zk_status <host1> <registryport> stat
get_zk_status <host2> <registryport> stat
get_zk_status <host3> <registryport> stat
Output example:
Follower:
Latency min/avg/max: 0/0/31
Received: 4436
Sent: 4436
Connections: 16
Outstanding: 0
Zxid: 0x6800002adc
Mode: follower
Node count: 13233
Leader:
Latency min/avg/max: 0/0/141
Received: 365839
Sent: 365842
Connections: 11
Outstanding: 0
Zxid: 0x6800002adc
Mode: leader
Node count: 13233
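On Unix, the three checks can be combined into one loop that prints the mode of each node side by side. This is a sketch only: zk_mode and report_modes are hypothetical helper names, and the parsing assumes the "Mode: <value>" line shown in the stat output above.

```shell
# zk_mode: reads "stat" output on stdin and prints the value of the Mode line.
zk_mode() {
    awk -F': *' '$1 == "Mode" { print $2 }'
}

# report_modes <registryport> <host>...
# Prints "<host>: <mode>" for every host given.
report_modes() {
    port=$1
    shift
    for h in "$@"; do
        mode=$(echo stat | nc "$h" "$port" | zk_mode)
        # An empty result means the node did not answer or sent no Mode line.
        echo "$h: ${mode:-no Mode line returned}"
    done
}
```

Usage: report_modes <registryport> <gtwy1> <gtwy2> <gtwy3>
In a healthy three-node cluster, one host should report leader and the other two follower.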