607D - Leveraging NetScaler clusters to improve application performance
David Jimenez
Sr. Technical Readiness Specialist
May 8, 2012

Today’s session
• Why clustering?
• What is new?
• Architectural components
• Deployment types
• Cluster configuration and management
• Logging and maintenance

How to Scale?
#CitrixSummit

How is it done today?
• HA
  ○ Solution does not scale up. Upon hitting traffic limits, appliances have to be upgraded
  ○ It is not cost effective. One resource is always idle
• Active/Active
  ○ Configuration is managed independently
  ○ Dependent on upstream device for load distribution
  ○ Not all L4-L7 features work across all nodes (max. clients, persistence, session reuse, etc…)

Why clustering?
• 32X
• Configuration replication
• Fault tolerance
• Efficient utilization
• Elegant solution to scale up traffic
• Dynamic capacity
• Ease of management and configuration
• Satisfies same requirements as HA

What’s new?
• nCore only
  ○ No classic build
• Managed as a single system
  ○ Single IP to manage the cluster
• New license required
  ○ Clustering License to enable the feature
• Scalable and fault tolerant
  ○ Satisfies same requirements as HA

Architectural components

Cluster logical topology

Quorum Service Provider
• Main protocol of the cluster system
• Consensus based
• Decides which node should serve traffic or remain passive
• Master role election
• Implemented by the cluster daemon

Cluster daemon
• Runs on every node in the cluster …
• Interfaces with cluster daemons running on each node
• Distributes node and interface information to all nodes in cluster
• Sends cluster configuration commands to all packet engines and other cluster daemons running on other nodes

The life of a command…
[Diagram labels: Propagate command to other nodes…; CCO; Is this user allowed to run this command?]
[Diagram labels, continued: Distribute to all packet engines…; Authorize; Validate; CLI, GUI, XML-API, Nitro API; add lb vserver…; Configuration Daemon; Cluster Daemon (ClusterD) on Node 1, Node 2, Node 3; Other cluster nodes]

Cluster heart-beat
• Heart-beat module: Transmission and reception of HB messages
• Used to detect when a node is no longer available
• Default time values:
  ○ Hello interval: 200 ms
  ○ Dead interval: 3 s
• UDP unicast packets
• Timers can be modified

Cluster configuration
• CLIP: Cluster IP used for management
• CCO (Cluster Coordinator) replicates configuration to all other nodes
• Command propagation
• File synchronization

Spotted vs. Striped entities
• Spotted entities
  ○ Objects active on a single node
  ○ e.g.: Interface configuration, NSIP, spotted vservers (VIP marked to be serviced by one node)
• Striped entities
  ○ Active on multiple nodes
  ○ e.g.: Striped vservers (a VIP configured to be serviced by multiple nodes in cluster)

Cluster IP addresses
Type         Options
NSIP         Spotted
SNIP / MIP   Spotted (recommended), Striped
VIP          Spotted, Striped
CLIP         Neither striped nor spotted. Floating IP owned by CCO

Distributor flow distributor
• Controls traffic flow in cluster systems
• Ensures even distribution across cluster
• Three main components:
  ○ Cluster interface manager
  ○ Flow distributor
  ○ Flow processor

Cluster interface manager
• Equal Cost Multipath Routing (ECMP)
• Cluster Link Aggregation (CLAG)
• Link sets (LS)

Deployment types

ECMP
[Diagram: VIP/32 on Node0, Node1, Node2, Node3; Flow receiver; Flow processor]

CLAG
• ARP request: CIP:CMAC -> VIP:broadcast
• ARP reply: VIP:CLAGMAC -> CIP:CMAC

CLAG (cont.)
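
The heart-beat defaults listed earlier (hello interval 200 ms, dead interval 3 s) can be illustrated with a small sketch. This is not the NetScaler implementation: the `HeartbeatTracker` class and its method names are hypothetical, and the only values taken from the slides are the two timers. It shows the general idea of declaring a node unavailable once no heart-beat has arrived within the dead interval.

```python
from dataclasses import dataclass, field

# Defaults taken from the slide; everything else here is illustrative.
HELLO_INTERVAL_S = 0.2   # a node is expected to send a heart-beat this often
DEAD_INTERVAL_S = 3.0    # silence longer than this marks a node as unavailable


@dataclass
class HeartbeatTracker:
    """Hypothetical helper: remembers when each peer was last heard from."""
    dead_interval: float = DEAD_INTERVAL_S
    last_seen: dict = field(default_factory=dict)

    def record(self, node_id: int, now: float) -> None:
        """Note that a heart-beat from node_id arrived at time `now` (seconds)."""
        self.last_seen[node_id] = now

    def dead_nodes(self, now: float) -> list:
        """Return the IDs of nodes whose last heart-beat is older than the dead interval."""
        return sorted(
            node_id
            for node_id, seen in self.last_seen.items()
            if now - seen > self.dead_interval
        )


if __name__ == "__main__":
    tracker = HeartbeatTracker()
    tracker.record(1, now=0.0)
    tracker.record(2, now=0.0)
    tracker.record(1, now=2.8)          # node 1 keeps sending, node 2 goes silent
    print(tracker.dead_nodes(now=3.1))  # node 2 has been silent for more than 3 s
```

Timestamps are passed in explicitly rather than read from a clock so the behaviour is easy to reason about; a real detector would also have to handle the modified timers the slide mentions.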

Link set
• ARP request: CIP:CMAC -> VIP:broadcast
• ARP reply: VIP:ARP_OWNER_MAC -> CIP:CMAC

Distribution mechanisms compared
• ECMP
  ○ Upstream device connectivity: All nodes must be connected. It can be used in combination with LinkSets
  ○ Upstream device configuration: YES
  ○ Pros: Best traffic distribution
  ○ Cons: Routes are limited to maximum number supported by router
• LinkSets
  ○ Upstream device connectivity: Does not require all nodes to be connected
  ○ Upstream device configuration: NO
  ○ Pros: Transparent to the upstream device
  ○ Cons: Potential bottleneck. Each VIP is initially handled by only one node
• CLAG
  ○ Upstream device connectivity: All nodes must be connected. It can be used in combination with LinkSets
  ○ Upstream device configuration: YES
  ○ Pros: Better traffic distribution
  ○ Cons: Number of switch ports used can be a limitation

State sharing

Session persistence
• Extension of the nCore persistence algorithms
• Use of consistent hashing techniques
• Persistence entries are replicated across all nodes
• Fault-tolerance when a node joins/leaves the cluster

Session coordination and caching
[Diagram: CIP -> SRV1]

Service monitoring challenges
• Increases the load on the server

Service owner monitoring
[Diagram: SVC1: UP on each node]
• Only one monitoring probe is sent by the service owner

Path validation
• Service owner informs all other nodes in the cluster about the health of the service
• Remaining nodes validate the path to ensure connectivity

Interface naming convention
• In clustering, interfaces are renamed with a 3-tuple
• Interface name prefixed by node ID
• Naming convention: N/C/U, where N = node ID, C = controller number, U = unit number
• E.g.: Interface 3/1/2 represents interface 1/2 on node with ID 3

Configuration synchronization
• One node is selected as configuration coordinator (CCO)
• Owns the cluster IP
• If CCO goes down, a new CCO is elected by cluster protocol (QSP)
• Sync triggered when new
node joins the cluster
• Two types of sync:
  ○ Full
  ○ Incremental
• Propagation occurs when a command is executed on the CCO

Cluster configuration

Cluster setup
• Add the cluster IP to the initial node:
    add ns ip 192.168.10.140 255.255.255.255 -type CLIP
• Connect to the CLIP. Create and enable the cluster instance:
    add cluster instance 1
    enable cluster instance 1
• From the CCO, add all nodes in the cluster:
    add cluster node 1 192.168.10.110 -state ACTIVE -backplane 1/1/2
    add cluster node 2 192.168.10.120 -state ACTIVE -backplane 2/1/2
    add cluster node 3 192.168.10.130 -state ACTIVE -backplane 3/1/2
• From each node, join the cluster instance:
    join cluster -clip 192.168.10.140 -password nsroot

Cluster status
• Verify the cluster status and the state of each node:

File synchronization
• When a node joins the cluster, files from the CCO are synced
• File sync daemon synchronizes files automatically
• Manual synchronization:

File synchronization (cont.)
• /nsconfig/ssl/
• /nsconfig/ssh/
• /var/netscaler/ssl/
• /nsconfig/rc.netscaler
• /var/vpn/bookmark/
• /nsconfig/resolv.conf
• /nsconfig/dns/
• /nsconfig/inetd.conf
• /nsconfig/htmlinjection/
• /nsconfig/syslog.conf
• /netscaler/htmlinjection/ens/
• /nsconfig/snmpd.conf
• /nsconfig/monitors/
• /nsconfig/ntp.conf
• /nsconfig/nstemplates/
• /nsconfig/httpd.conf

File synchronization (cont.)
• /nsconfig/sshd_config
• /var/wi/tomcat/conf/Catalina/localhost/
• /nsconfig/hosts
• /var/wi/java_home/lib/security/cacerts
• /nsconfig/enckey
• /var/wi/java_home/jre/lib/security/cacerts
• /var/nslw.bin/etc/krb5.conf
• /nsconfig/license/
• /var/nslw.bin/etc/krb5.keytab
• /nsconfig/rc.conf
• /var/lib/likewise/db/
• /var/download/
• /var/wi/tomcat/webapps/

Cluster interfaces
• Upon joining the cluster, all nodes display all cluster interfaces and each member's NSIP:

Removing a cluster node permanently
• Log in to the node and remove the cluster instance reference:
    rm cluster instance 1
• Log in to the cluster and remove the node:
    rm cluster node 2

Logs and reporting
• Independent
  ○ Each node maintains its own set of logs and counters
  ○ Logs reside on each node’s local storage, as is done today
• On-demand
  ○ Aggregation is done on-demand. Counters are summarized, and logs are merged in order
• Tech support file
  ○ Support file can be generated for the node or the entire cluster
    NS10_node1> show techsupport -scope CLUSTER NODE

Packet tracing
• New options in nstrace format
• Needs latest version of Wireshark
• New display filters:
  ○ nstrace.snode
  ○ nstrace.dnode
  ○ nstrace.flags
  ○ nstrace.flags.rssh
  ○ nstrace.flags.srrs
  ○ nstrace.flags.dfd
  ○ nstrace.flags.fr
  ○ nstrace.flags.fp

Upgrade / downgrade process
• All cluster nodes must run the same build/version for the cluster to operate
• Upgrade/downgrade requires disabling the cluster instance
• Cluster is NOT able to serve traffic while upgrade/downgrade is in progress
• Reboot of all appliances is necessary to complete the firmware update

Before you leave…
• Recommended related breakout sessions:
  ○ SUM609D: NetScaler 10 – Learn, configure, and up-skill in this latest feature-packed release
• Conference surveys are available online at www.citrixsummit.com starting Thursday, May 10
  ○ Provide your feedback and pick up a complimentary gift at the registration desk
Download presentations starting Monday, May 21, from your My Organizer tool located in your My Account

We value your feedback!
Take a survey of this session now in the mobile app
• Click 'Sessions' button
• Click on today's tab
• Find this session
• Click 'Surveys'

Tweet about this session with hashtag #SUM607D and #CitrixSummit

Lab Environment Login
Launch your browser and type http://training.citrixsynergy.net
Your session code is: “session code”
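
As a companion to the interface naming convention covered earlier (N/C/U, where 3/1/2 is interface 1/2 on the node with ID 3), the following sketch splits a cluster interface name into its three parts. The `ClusterInterface` type and `parse_cluster_interface` function are hypothetical names chosen for illustration; they are not part of any NetScaler API.

```python
from typing import NamedTuple


class ClusterInterface(NamedTuple):
    """Hypothetical container for the N/C/U 3-tuple described in the slides."""
    node_id: int      # N: ID of the node that owns the interface
    controller: int   # C: controller number
    unit: int         # U: unit number

    @property
    def local_name(self) -> str:
        """Interface name as seen on the owning node, i.e. C/U without the node prefix."""
        return f"{self.controller}/{self.unit}"


def parse_cluster_interface(name: str) -> ClusterInterface:
    """Split a name such as '3/1/2' into node ID, controller and unit numbers."""
    parts = name.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected N/C/U, got {name!r}")
    node_id, controller, unit = (int(part) for part in parts)
    return ClusterInterface(node_id, controller, unit)


if __name__ == "__main__":
    iface = parse_cluster_interface("3/1/2")
    print(iface.node_id, iface.local_name)  # prints: 3 1/2
```

The same parsing applies to the backplane arguments in the cluster setup commands (1/1/2, 2/1/2, 3/1/2), where the first field matches the node being added.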