Uploaded by Stan Nicolae

Siebel Zookeeper & Gateway Cluster Troubleshooting

Siebel Zookeeper Troubleshooting
Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
Objectives
In this session we discuss troubleshooting of the Siebel Gateway, ZooKeeper, and the Gateway cluster.
Agenda
 Changes Introduced in Siebel IP2017
 Siebel Gateway Registry
 Request Flow from AI to Siebel
 About Zookeeper
 Validate Zookeeper & Administration
 Gateway Cluster Validation
 Troubleshooting
Changes Introduced in IP2017
Siebel Enterprise Server Changes
Pre-IP2017
• Gateway data store is static (siebns.dat). Any change requires a Gateway restart.
• Only local provisioning is possible, using the Config Wizard.
IP2017
• Gateway hosts a REST API for configuration of the Enterprise and Application Interface.
• Dynamic registry, facilitating elasticity and load balancing.
[Diagram, pre-IP2017. Labels: Siebel Enterprise; Gateway with siebns.dat on the Gateway port (usually 2320); Siebel Servers.]
[Diagram, IP2017. Labels: Siebel Enterprise; Gateway with REST API, Dynamic Registry, Authorization, TLS port, KeyStore and TrustStore; Application Container with https port, http port (redirect to https) and shutdown port; ConfigAgent with KeyStore and TrustStore; Siebel Servers.]
Changes to Siebel Gateway
Gateway in Siebel IP2017
About Siebel Gateway Registry
The Siebel Gateway provides the dynamic address registry for Siebel Servers and server components, and also for Siebel
Application Interface and other modules, like Siebel Enterprise Cache and Siebel Constraint Engine. For example, at startup,
Siebel Server within the Siebel Enterprise Server stores its network address in the Siebel Gateway’s nonpersistent address
registry.
Siebel Enterprise Server components query the Siebel Gateway registry for Siebel Server availability and address information.
When a Siebel Server shuts down, this information is cleared from the registry.
The Siebel Application Interface and Siebel Gateway work together to provide Siebel Server load balancing. When a user
requests a new application connection, Siebel Application Interface sends a request to Siebel Gateway, which returns a
connect string for the least-loaded Application Object Manager from among the Siebel Servers supporting that component.
The user session will use this Application Object Manager.
The Siebel Gateway also includes persistent storage in the registry for configuration information for Siebel Server, Siebel
Application Interface, and other installable components. This information includes:
• Definitions and assignments of component groups and components
• Operational parameters
• Connectivity information
As this configuration information changes, such as during the configuration of Siebel Enterprise, a Siebel Server, or a Siebel
Application Interface, this data is written to the Siebel Gateway registry.
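The load-balancing step described above can be pictured with a small sketch. This is not the Gateway's implementation: the `AomEndpoint` record, its `active_sessions` load metric, and the host names are assumptions made for illustration. Only the connect-string shape is taken from the log samples later in this deck.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AomEndpoint:
    """Hypothetical registry entry for one running Application Object Manager."""
    server: str           # Siebel Server host name
    port: int             # port the component listens on
    enterprise: str       # enterprise name, e.g. "ENT"
    component: str        # component alias, e.g. "eautoObjMgr_chs"
    active_sessions: int  # stand-in for whatever load metric the Gateway tracks


def pick_least_loaded(entries: Iterable[AomEndpoint], component: str) -> Optional[str]:
    """Return a connect string for the least-loaded endpoint, or None if none is registered."""
    candidates = [e for e in entries if e.component.lower() == component.lower()]
    if not candidates:
        return None
    best = min(candidates, key=lambda e: e.active_sessions)
    return (f"siebel.TCPIP.NONE.NONE://{best.server}:{best.port}"
            f"/{best.enterprise}/{best.component}")
```

When a Siebel Server shuts down, its entries would simply disappear from `entries`, so the next request is routed to one of the remaining servers.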
[Diagram labels: SAI (HTTPS), Siebel Database]
Siebel Request Flow Log Review:
When a user requests a new application connection, Siebel Application Interface sends a request to Siebel Gateway, which returns a
connect string for the least-loaded Application Object Manager from among the Siebel Servers supporting that component. The user
session will use this Application Object Manager.
 The request is passed from SAI to the Cloud Gateway (CGW) over HTTPS.
 CGW does not store anything on its own. It uses ZooKeeper to store the configuration.
 CGW needs Siebel configuration and runtime details to expose its RESTful services, and it stores and reads these details in ZooKeeper.
 CGW keeps a few permanent connections into ZooKeeper (ZK) for the cloudgateway/v1.0 REST API.
 ZK is checked under SvcDiscovery for the Object Manager (OM) process with the minimum load.
 The request is then sent to the Siebel Server.
Siebel Request Flow Log Review:
AI:localhost_access*.log
172.17.44.85 - - [17/Nov/2019:14:13:00 +0800] "GET /siebel/app/eautomotive/chs?SWECmd=Start&SWEHo=172.17.44.85 HTTP/1.0" 200 248
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/app/eautomotive/chs?SWECmd=Start&SWEHo=172.17.44.85 HTTP/1.0" 200 6958
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/files/login.css HTTP/1.0" 200 14084
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/scripts/login.js HTTP/1.0" 200 4634
172.17.44.85 - - [17/Nov/2019:14:13:03 +0800] "GET /siebel/scripts/swecommon.js?_scb=19.9.0.0_SIA_[23073_0]_CHS HTTP/1.0" 200 55512
CG_localhost_access*.log
172.17.44.85 - - [17/Nov/2019:14:13:02 +0800] "GET /siebel/v1.0/cloudgateway/discovery/services/eautoobjmgr_chs/connectstring HTTP/1.1" 200 87
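When reviewing a large access log, it can help to pull out only the service discovery calls. The sketch below assumes the access log format shown in the samples above (client address, timestamp, request line, status, size); the function and type names are illustrative and not part of any Siebel tooling.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# One access log line: client, two ignored fields, [timestamp], "request", status, size.
ACCESS_LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)
# The discovery endpoint seen in the Gateway access log sample.
DISCOVERY_PATH = re.compile(
    r'/siebel/v1\.0/cloudgateway/discovery/services/(?P<component>[^/]+)/connectstring$'
)


class DiscoveryHit(NamedTuple):
    timestamp: str
    component: str
    status: int


def discovery_hits(lines: Iterable[str]) -> Iterator[DiscoveryHit]:
    """Yield one record per connect-string discovery request found in the log lines."""
    for line in lines:
        m = ACCESS_LINE.match(line.strip())
        if not m:
            continue  # not an access log line in the expected format
        d = DISCOVERY_PATH.search(m.group("path"))
        if d:
            yield DiscoveryHit(m.group("ts"), d.group("component"), int(m.group("status")))
```

A status other than 200 for a component, or no discovery hit at all around the time of a failed login, narrows the problem to the SAI-to-Gateway leg of the request flow.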
Siebel Request Flow:
 CommonLogger.log from AI:
[DEBUG] 2019-11-17 14:13:02.711 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.handlers.HandlerFactory:getHandler Identified as UI channel
[INFO ] 2019-11-17 14:13:02.712 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.util.CGUtil:cgGet New CGHost after Service Discovery : crmsit.local:9021
[INFO ] 2019-11-17 14:13:02.712 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.util.CGUtil:cgGet New CG URL after service discovery : https://crmsit.local:9021/siebel/v1.0/cloudgateway/discovery/services/eautoobjmgr_chs/connectstring
[INFO ] 2019-11-17 14:13:02.719 [https-openssl-nio-9011-exec-10] CommonLogger - com.siebel.swsm.store.ConfigStore:getConnString Connect String JSON:{
"ConnectString" : "siebel.TCPIP.NONE.NONE://CGWHOST:2321/ENT/eautoObjMgr_chs"
}
 UI.log from AI:
[DEBUG] 2019-11-17 14:13:03.413 [https-openssl-nio-9011-exec-10] UI - com.siebel.swsm.sessmgr.UIConnectionModule:invokeService Calling invokeMethod for Siebel Request : > Value =
• Contains all values of OM Session
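The Connect String JSON logged in CommonLogger.log can be split into its parts to confirm that the host, port, enterprise and component are the expected ones. The sketch below is illustrative only. The field names `transport`, `encryption` and `compression` reflect the usual reading of the `siebel.TCPIP.NONE.NONE` prefix and should be treated as an assumption, as should the function name.

```python
import json
import re
from typing import NamedTuple


class ConnectString(NamedTuple):
    transport: str
    encryption: str
    compression: str
    host: str
    port: int
    enterprise: str
    component: str


PATTERN = re.compile(
    r'^siebel\.(?P<transport>[^.]+)\.(?P<encryption>[^.]+)\.(?P<compression>[^.]+)'
    r'://(?P<host>[^:/]+):(?P<port>\d+)/(?P<enterprise>[^/]+)/(?P<component>[^/]+)$'
)


def parse_connect_string_json(payload: str) -> ConnectString:
    """Parse the JSON body returned by the discovery call into its components."""
    value = json.loads(payload)["ConnectString"]
    m = PATTERN.match(value)
    if m is None:
        raise ValueError(f"unrecognized connect string: {value!r}")
    parts = m.groupdict()
    return ConnectString(
        parts["transport"], parts["encryption"], parts["compression"],
        parts["host"], int(parts["port"]), parts["enterprise"], parts["component"],
    )
```

For the sample above this yields host `CGWHOST`, port `2321`, enterprise `ENT` and component `eautoObjMgr_chs`.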
Gateway Registry Validation for SAI
 Validate cgclientstore.dat under $sai/applicationcontainer/webapps.
It should contain an entry of the form <gateway host>:<HTTPS port>.
 Validate GatewayServiceFramework.log:
2019-12-24 15:33:49,176 INFO GtwySvcFrmwrkLog : Discovery: clientAuth for framework is : true .
2019-12-24 15:33:57,690 INFO GtwySvcFrmwrkLog : Discovery: Authentication to framework successfully done
2019-12-24 15:33:57,690 INFO GtwySvcFrmwrkLog : Discovery: CGmetafile created successfully:: C:\Siebel\AI\applicationcontainer\webapps\cgclientstore.dat
2019-12-24 15:33:58,268 INFO GtwySvcFrmwrkLog : Discovery: Connection to Registry successful
2019-12-24 15:33:59,065 INFO GtwySvcFrmwrkLog : Discovery: CGMetafile updated successfully
2019-12-24 15:33:59,096 INFO GtwySvcFrmwrkLog : Discovery: Service Discovery Successful
2019-12-24 15:33:59,096 INFO GtwySvcFrmwrkLog : Discovery: Get Connect String Successful
If the log shows errors during authentication, fix those first. Otherwise they will cause issues during service discovery (SvcDiscovery).
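A quick way to apply this check is to look for the success messages in order and report the ones that are missing. The sketch below uses the message texts from the sample log above. The list of expected milestones is an assumption about what a healthy startup looks like, not an official checklist.

```python
# Success messages taken from the GatewayServiceFramework.log sample above.
EXPECTED_MILESTONES = [
    "Authentication to framework successfully done",
    "Connection to Registry successful",
    "Service Discovery Successful",
    "Get Connect String Successful",
]


def missing_milestones(log_lines):
    """Return the expected messages that never appear in the given log lines."""
    text = "\n".join(log_lines)
    return [message for message in EXPECTED_MILESTONES if message not in text]
```

If the first entry in the returned list is the authentication message, start with the credentials and TLS setup before looking at discovery itself.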
What is Zookeeper
ZooKeeper: A Distributed Coordination Service for Distributed Applications
ZooKeeper is a distributed configuration and coordination system, and Siebel uses it for exactly that purpose.
Put simply, ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed
synchronization, and providing group services. All of these kinds of services are used in some form or another by
distributed applications.
ZooKeeper follows a simple client-server model where clients are nodes (i.e., machines) that make use of the service, and
servers are nodes that provide the service. Applications make calls to ZooKeeper through a client library. The client library
is responsible for the interaction with ZooKeeper servers.
In Siebel terms, this is called the Siebel Gateway Registry.
ZooKeeper is started when the Gateway is started.
CGW does not store anything on its own. It uses ZooKeeper to store the configuration.
Zookeeper Service
 Unix:
siebel 22512 1 0 Dec13 ? 00:00:00 siebsvc -s gtwyns -a /f /refresh/siebel/ses/gtwysrvr/sys/siebns.dat /t 4330 /c /refresh/siebel/ses/gtwysrvr/bin/gateway.cfg
siebel 22513 22512 0 Dec13 ? 00:17:08 /refresh/siebel/ses/gtwysrvr/../jre/bin/java -Dzookeeper.log.dir=/refresh/siebel/ses/gtwysrvr/zookeeper -Dzookeeper.root.logger=INFO -cp "/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../build/classes:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../build/lib/*.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/netty-3.7.0.Final.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/log4j-1.2.16.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../lib/jline-0.9.94.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../zookeeper-3.4.8.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../src/java/lib/*.jar:/refresh/siebel/ses/gtwysrvr/zookeeper/bin/../conf" -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /refresh/siebel/ses/gtwysrvr/zookeeper/conf/zoo1.cfg
 Windows:
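For the Unix process listing above, the ZooKeeper process can be picked out of `ps -ef` output by its main class, `org.apache.zookeeper.server.quorum.QuorumPeerMain`, which also reveals which configuration file it was started with. The sketch below assumes the `ps -ef` column layout shown above (PID in the second column); the function name is illustrative.

```python
ZK_MAIN = "org.apache.zookeeper.server.quorum.QuorumPeerMain"


def find_zookeeper(ps_lines):
    """Return a list of (pid, config_path) for process lines that run QuorumPeerMain."""
    found = []
    for line in ps_lines:
        if ZK_MAIN not in line:
            continue
        fields = line.split()
        pid = int(fields[1])  # second column of `ps -ef` is the PID
        # The first argument after the main class is the config file, if present.
        after = line.split(ZK_MAIN, 1)[1].split()
        config = after[0] if after else None
        found.append((pid, config))
    return found
```

An empty result while the `siebsvc -s gtwyns` process is present suggests that the Gateway service started but the registry (ZooKeeper) did not.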
How to Verify Zookeeper status:
ZooKeeper status can be validated in any of the following ways:
 Process status: check the OS process list for the ZooKeeper process
 Using zkCli
Under path: $gtwysrvr/zookeeper/bin
Unix: zkCli.sh -server <gatewayhost:registryport>
Windows: zkCli.cmd -server <gatewayhost:registryport>
 Using ZooInspector
How To Read Zookeeper Data In Standalone Mode Using ZooInspector (Doc ID 2427936.1)
Siebel Zookeeper
ZooKeeper is a distributed configuration and coordination system, and Siebel uses it for exactly that purpose: all component and
configuration information is stored in ZooKeeper.
ZooKeeper contains the following sections:
 Gateways
 SvcDiscovery
 Zookeeper
 Config
 Emdiscovery
 ServiceRegistry
 Enterprises
DEMO
Relevant Information in Zookeeper
 SvcDiscovery: stores component process information.
Whenever a component request, such as an OM request, comes to ZK, it
is resolved from the discovery information.
 Config: stores SMC information such as Profiles and Deployments.
 ServiceRegistry: contains Gateway registry and TLS information.
 EMDiscovery: contains the permanent connection to ZK discovery:
https://<gatewayhost:HTTPSPort>/siebel/v1.0/cloudgateway/enterprises?expand=all
 enterprises: contains enterprise information such as component groups,
parameters, named subsystems, and so on.
Relevant Information in Zookeeper
 Every SMC login creates bootstrap requests such as the following ($ses/applicationcontainer/logs/localhost_access*log):
[16/Dec/2019:18:58:35 -1200] "GET /siebel/v1.0/cloudgateway/bootstrapCG?_=1576565914331 HTTP/1.1" 200 276
[16/Dec/2019:18:58:35 -1200] "GET /siebel/v1.0/cloudgateway/deployments/gatewaycluster?_=1576565914333 HTTP/1.1" 200 1018
[16/Dec/2019:18:58:35 -1200] "GET /siebel/v1.0/cloudgateway/deployments/enterprises?_=1576565914334 HTTP/1.1" 200 365
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/enterprises?_=1576565914332 HTTP/1.1" 200 63
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/cacheserver?_=1576565914337 HTTP/1.1" 200 37
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/migrations?_=1576565914338 HTTP/1.1" 200 348
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/constraintengine?_=1576565914339 HTTP/1.1" 200 42
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/swsm?_=1576565914336 HTTP/1.1" 200 307
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/deployments/servers?_=1576565914335 HTTP/1.1" 200 455
[16/Dec/2019:18:58:36 -1200] "GET /siebel/v1.0/cloudgateway/profiles/cacheserver?_=1576565914341 HTTP/1.1" 200 4136
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/swsm?_=1576565914340 HTTP/1.1" 200 131199
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/cacheclient?_=1576565914342 HTTP/1.1" 200 1300
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/servers?_=1576565914344 HTTP/1.1" 200 1896
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/enterprises?_=1576565914343 HTTP/1.1" 200 3929
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/constraintengine?_=1576565914346 HTTP/1.1" 200 39
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/migrations?_=1576565914345 HTTP/1.1" 200 2766
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/gatewaycluster?_=1576565914348 HTTP/1.1" 200 644
[16/Dec/2019:18:58:37 -1200] "GET /siebel/v1.0/cloudgateway/profiles/security?_=1576565914347 HTTP/1.1" 200 918
 An AOM request creates a discovery request to the Gateway such as the following ($ses/applicationcontainer/logs/localhost_access*log):
[16/Dec/2019:19:09:20 -1200] "GET /siebel/v1.0/cloudgateway/discovery/services/sccobjmgr_enu/connectstring HTTP/1.1" 200 30
[16/Dec/2019:19:09:59 -1200] "GET /siebel/v1.0/cloudgateway/discovery/services/sccobjmgr_enu/connectstring HTTP/1.1" 200 30
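To see at a glance whether a bootstrap completed and whether any cloudgateway call failed, the Gateway access log can be summarized by the first path segment after `/siebel/v1.0/cloudgateway/`. The sketch below assumes the request-line format of the excerpts above; the function name and the grouping rule are illustrative choices.

```python
import re
from collections import Counter

# Request line, status code; the query string (e.g. "?_=1576565914331") is ignored.
REQUEST = re.compile(r'"GET (?P<path>[^ ?"]+)(?:\?[^ "]*)? HTTP/[\d.]+" (?P<status>\d{3}) ')
PREFIX = "/siebel/v1.0/cloudgateway/"


def summarize(lines):
    """Return (request counts per cloudgateway area, list of lines with a non-200 status)."""
    counts = Counter()
    failures = []
    for line in lines:
        m = REQUEST.search(line)
        if not m or not m.group("path").startswith(PREFIX):
            continue
        # First path segment after the prefix: bootstrapCG, deployments, profiles, discovery, ...
        area = m.group("path")[len(PREFIX):].split("/", 1)[0]
        counts[area] += 1
        if m.group("status") != "200":
            failures.append(line.strip())
    return counts, failures
```

A login that shows `bootstrapCG` but few or no `deployments` and `profiles` requests, or any entry in `failures`, is a reasonable place to start digging.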
Troubleshooting:
Validate ZK from zkCli
Go to $ses/gtwysrvr/zookeeper/bin:
Unix:
./zkCli.sh -server <gtwyhost:registry port>
addauth digest <SADMIN>:<SADMINPASSWORD>
ls /
Windows:
zkCli.cmd -server <gtwyhost:registry port>
addauth digest <SADMIN>:<SADMINPASSWORD>
ls /
NOTE: The credentials must match those in gateway.properties.
Troubleshooting:
ZooKeeper Commands: The Four Letter Words
ZooKeeper responds to a small set of commands. Each command is composed of four letters. You issue the commands to
ZooKeeper via telnet or nc, at the client port.
Some of the more interesting commands: "stat" gives some general information about the server and connected clients, while
"srvr", "wchc", "dump" and "cons" give extended details on the server, watches, sessions and connections respectively.
conf: Prints details about the serving configuration.
cons: Lists full connection/session details for all clients connected to this server. Includes information on numbers of packets received/sent,
session id, operation latencies, last operation performed, etc.
dump: Lists the outstanding sessions and ephemeral nodes. This only works on the leader.
srvr: (New in 3.3.0) Lists full details for the server.
stat: Lists brief details for the server and connected clients.
wchc: Lists detailed information on watches for the server, by session. This outputs a list of sessions (connections) with associated watches
(paths). Note that, depending on the number of watches, this operation may be expensive (that is, it may impact server performance), so use it carefully.
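Where telnet or nc is not available, the same commands can be sent over a plain TCP socket. The sketch below is a minimal illustration of that exchange: the client writes the four letters, and the server replies and closes the connection. The function names are illustrative, and the host and port you pass are whatever your registry listens on.

```python
import socket


def four_letter_word(host: str, port: int, command: str, timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter command to the client port and return the text reply."""
    if len(command) != 4:
        raise ValueError("ZooKeeper admin commands are exactly four letters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def server_mode(reply: str):
    """Extract the 'Mode:' value (leader, follower or standalone) from a srvr or stat reply."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, `server_mode(four_letter_word("<gtwyhost>", <registry port>, "srvr"))` run against each node of a cluster should report exactly one leader.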
Enable Zookeeper Server & Client Logs:
To enable ZK client & server logs:
How to Enable Zookeeper Client & Server Logs for Siebel 18.x & later ? (Doc ID 2555834.1)
Cleanup Zookeeper version-2 folder (Doc ID 2425164.1)
This reference contains steps for both Windows & Linux
Known Issues:
Bug 29025939 : CLOUD GATEWAY LOSES STATE OF OBJECT MANAGER AND REQUIRES AOM RESTART TO FIX
Bug 28534810 : ZOOKEEPER IS NOT PURGING TRANSACTION FILES AS EXPECTED
Refer: Zookeeper Administration Guide: https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html
Gateway Cluster Deployment
Siebel CRM supports an optional native clustering feature for Siebel Gateway to provide high availability benefits to Siebel
CRM customers. This feature works at the software level and is the preferred and recommended approach for clustering
the Siebel Gateway. This topic is part of Configuring the Siebel Gateway Cluster.
The clustering feature supports both the Siebel Gateway service (application container) and the Siebel Gateway registry
(Apache ZooKeeper). You might choose to use Siebel Gateway clustering only for your production environment, for
example. Further, you can use clustering for only the Siebel Gateway service, or only the Siebel Gateway registry.
However, it is recommended to configure clustering for both of them.
For a cluster to stay up and running, a majority of the nodes in the cluster must be up. It is therefore
recommended to run the ZooKeeper (Gateway registry) cluster on an odd number of servers, for example a cluster with 3 nodes or
a cluster with 5 nodes.
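The majority rule can be worked through in a few lines, which also shows why an even node count adds no fault tolerance. This is plain arithmetic rather than anything Siebel-specific, and the function names are illustrative.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be down while a majority is still up."""
    return nodes - quorum_size(nodes)


for n in (1, 2, 3, 4, 5):
    print(f"{n} node(s): quorum={quorum_size(n)}, can lose {tolerated_failures(n)}")
```

A 4-node cluster tolerates one failure, the same as a 3-node cluster, and a 2-node cluster tolerates none, so the extra node in an even-sized cluster buys nothing.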
 Refer Siebel Gateway Cluster Deployment :
Gateway Cluster Validation
 Validate zoo1.cfg file under $ses\gtwysrvr\zookeeper\conf
autopurge.purgeInterval=1
initLimit=10
syncLimit=5
autopurge.snapRetainCount=10
maxClientCnxns=10000
snapCount=256
clientPort=8330
tickTime=2000
dataDir=c\:\\Siebel\\ses\\gtwysrvr\\zookeeper
server.1=node1:8335:8336
server.2=node2:8335:8336
server.3=node3:8335:8336
myid: The myid file consists of a single line containing only that machine's id. The myid of server 1 therefore contains the text "1" and
nothing else. The id must be unique within the ensemble and must have a value between 1 and 255.
Metadata under $ses/siebsrvr: cgclientstore
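The zoo1.cfg and myid checks above can be scripted. The sketch below parses the `key=value` lines, collects the `server.N` entries, and compares them with the content of the myid file. The function names and the exact set of checks are illustrative choices, not an Oracle-provided validation.

```python
import re

SERVER_KEY = re.compile(r"^server\.(\d+)$")


def parse_zoo_cfg(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_ensemble(props: dict, myid: str) -> list:
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    servers = {}
    for key, value in props.items():
        m = SERVER_KEY.match(key)
        if m:
            servers[int(m.group(1))] = value
    if not myid.strip().isdigit():
        problems.append(f"myid must contain only a number, got {myid!r}")
    else:
        node_id = int(myid)
        if not 1 <= node_id <= 255:
            problems.append(f"myid {node_id} is outside 1..255")
        if servers and node_id not in servers:
            problems.append(f"myid {node_id} has no matching server.{node_id} entry")
    if servers and len(servers) % 2 == 0:
        problems.append(f"{len(servers)} servers configured; an odd count is recommended")
    if "clientPort" not in props:
        problems.append("clientPort is not set")
    return problems
```

Run it on each node with that node's own myid content; every node should list the same `server.N` entries and have a different id.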
Gateway zoo1.cfg Parameters
 clientPort: The port to listen on for client connections; that is, the port that clients attempt to connect to.
 tickTime: The length of a single tick, which is the basic time unit used by ZooKeeper, measured in milliseconds. It is used to regulate heartbeats and timeouts. For example, the minimum session timeout is two ticks.
 syncLimit: Amount of time, in ticks (see tickTime), to allow followers to sync with ZooKeeper. It limits how far out of sync (that is, out of date) an individual node can be from the leader. If followers fall too far behind a leader, they are dropped.
 initLimit: Amount of time, in ticks (see tickTime), to allow followers to connect and sync to a leader; that is, the timeout within which a ZooKeeper node in the quorum has to connect to the leader. Increase this value as needed if the amount of data managed by ZooKeeper is large.
 snapCount: ZooKeeper logs transactions to a transaction log. After snapCount transactions are written to a log file, a snapshot is started and a new transaction log file is created. The default snapCount is 100,000.
 traceFile: If this option is defined, requests will be logged to a trace file named traceFile.year.month.day. Use of this option provides useful debugging information, but will impact performance. (Note: The system property has no zookeeper prefix, and the configuration variable name is different from the system property. Yes - it's not consistent, and it's annoying.)
 maxClientCnxns: Limits the number of concurrent connections (at the socket level) that a single client, identified by IP address, may make to a single member of the ZooKeeper ensemble. This is used to prevent certain classes of DoS attacks, including file descriptor exhaustion. The default is 60 as of ZooKeeper 3.4.0 (10 in earlier releases). Setting this to 0 entirely removes the limit on concurrent connections.
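Because initLimit and syncLimit are expressed in ticks, it helps to convert them to wall-clock time before tuning them. The sketch below does that arithmetic for the sample zoo1.cfg values shown earlier (tickTime=2000, initLimit=10, syncLimit=5); the function name is illustrative.

```python
def timeouts_ms(tick_time_ms: int, init_limit: int, sync_limit: int) -> dict:
    """Convert tick-based limits into milliseconds."""
    return {
        "initLimit_ms": init_limit * tick_time_ms,   # time allowed to connect and sync to a leader
        "syncLimit_ms": sync_limit * tick_time_ms,   # time a follower may lag before being dropped
        "minSessionTimeout_ms": 2 * tick_time_ms,    # minimum session timeout is two ticks
    }


print(timeouts_ms(tick_time_ms=2000, init_limit=10, sync_limit=5))
# {'initLimit_ms': 20000, 'syncLimit_ms': 10000, 'minSessionTimeout_ms': 4000}
```

With these values a follower has 20 seconds to connect and sync at startup and may fall at most 10 seconds behind the leader.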
Gateway Cluster Failover Scenarios Validation
 Stop the Gateway service (Gateway Tomcat) on any one node.
 Stop the Gateway registry (Gateway ZooKeeper) on any one node.
 Validate $AI/applicationcontainer/logs/CommonLoggerLog.log.
It shows which CGHost the requests are moving to.
 Also confirm that srvrmgr still connects through another node when the node it was connecting to is down.
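To see which CGHost requests move to during a failover test, the "New CGHost after Service Discovery" entries in the AI log can be counted per host. The sketch below relies on the message text from the CommonLogger sample earlier in this deck; the function name is illustrative.

```python
import re
from collections import Counter

# Message text as it appears in the CommonLogger.log sample.
CGHOST = re.compile(r"New CGHost after Service Discovery : (?P<host>\S+):(?P<port>\d+)")


def cghost_counts(lines):
    """Count how often each Gateway host:port was selected by service discovery."""
    counts = Counter()
    for line in lines:
        m = CGHOST.search(line)
        if m:
            counts[f"{m.group('host')}:{m.group('port')}"] += 1
    return counts
```

After a node is stopped, new entries should name only the surviving Gateway nodes. If the stopped node keeps appearing, the AI is not failing over.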
Exercise with Zk commands Demo:
Gateway Cluster Known Issues & Bugs
 SADMIN password change does not reflect in zookeeper which is causing gateway cluster failure (Doc ID 2512095.1)
 APPLICATION CAN FAIL TO LOAD IF ONE OF THE GATEWAY NODES IS DOWN WITH GATEWAY CLUSTER (Doc ID 2585549.1)
 Siebel IP18.9 - Gateway Cluster Not Working As Expected/Fails (Doc ID 2530000.1)
 Bug 30301670 : INTERMITTENT COMPONENT JOBS FAILING WHEN GATEWAY NODE IS DOWN
THANK YOU