Data Replication Scheme Based On Neighbor Replica Distribution Technique for Web-Server Cluster System

W.A. SURYANI, M. MAT DERIS*, H.M. SUZURI*, M. ZARINA, Z. AZNIDA

*Department of Computer Science, University College of Science and Technology Malaysia, 21030 Terengganu, MALAYSIA

Computer Science Centre, KUSZA College, 21030 Terengganu, MALAYSIA

Abstract: Data replication across a Web Server Cluster (WSC) is crucial due to the explosive increase in Internet users and accesses. With the ever-increasing importance of WWW applications such as financial transactions and e-business, reliable service is in high demand. In a WSC, if data need to be shared by other users or require a more stable backup for availability and reliability, the best alternative is to replicate a copy of the data across the servers. Thus, providing reliable and efficient services is our primary goal in designing a cluster-architecture data replication scheme. This paper proposes a new technique called the Neighbor Replica Distribution Technique (NRDT) and presents the NRDT data replication algorithm, based on asynchronous replication, to improve the reliability of a web server cluster system.

Keywords: web server cluster, data replica, Rsync, reliability.

1. Introduction

With the ever-increasing applications on the world wide web (WWW) such as distance learning and e-commerce, the need for a reliable web server is likely to increase [8]. Thus, providing reliable and efficient services is a primary goal in designing a web server cluster (WSC) [9]. This is due to the eventual failure of hardware and/or software components. In order to provide reliable services, a WSC needs to maintain the availability of some data replicas while preserving one-copy consistency among all replicas [3]. Therefore, data replication plays an important role in a WSC as a highly reliable system.

The most common approaches to replication are synchronous and asynchronous replication. However, synchronous replication has drawbacks in practice [4]. The major argument is that the response time to execute an operation is high, because all servers holding the same copy must 'agree' before the operation is executed. Asynchronous replication, in contrast, provides only 'loose consistency' between data stores: there is always some degree of lag between the time the originating transaction is committed and the time its effects become available at the replicas [1]. Nevertheless, its response time is lower than that of the synchronous technique.

Reliability refers to the probability that the system under consideration does not experience any failure in a given time interval. Thus, a reliable WSC is one that can continue to process user requests even when the underlying system is unreliable [2]. When components fail, it should still be able to execute requests without violating database consistency. In this paper we concentrate on an implementation of replication based on loose consistency, that is, asynchronous replication.

The rest of the paper is organized as follows. Section 2 presents the NRDT model, an overview of the Rsync utility used in the implementation, and the NRDT data replication algorithm. Section 3 presents the performance analysis, and Section 4 concludes the proposed work.
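To make the loose-consistency behaviour concrete before the model is introduced, the following minimal Python sketch (ours, not part of the paper's implementation) shows a primary that commits writes immediately and propagates them to replicas in the background, so a reader of a replica may briefly observe stale data:

    import queue
    import threading
    import time

    class AsyncReplicatedStore:
        """Toy illustration of asynchronous (lazy) replication:
        the primary commits immediately; replicas are updated later."""

        def __init__(self, replicas):
            self.primary = {}
            self.replicas = replicas     # dicts standing in for remote servers
            self.log = queue.Queue()     # updates not yet propagated
            threading.Thread(target=self._propagate, daemon=True).start()

        def write(self, key, value):
            self.primary[key] = value    # commit returns without waiting for replicas
            self.log.put((key, value))

        def _propagate(self):
            while True:
                key, value = self.log.get()
                time.sleep(0.1)          # stands in for network/scheduling lag
                for replica in self.replicas:
                    replica[key] = value # replicas converge after the lag

    if __name__ == "__main__":
        r1, r2 = {}, {}
        store = AsyncReplicatedStore([r1, r2])
        store.write("x", 1)
        print(r1.get("x"))               # likely None: the replica still lags
        time.sleep(0.3)
        print(r1.get("x"), r2.get("x"))  # 1 1: loose consistency reached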
2. NRDT Technique

2.1 NRDT Architecture

In the design of the WSC, a client on the Internet sees only one IP address for the cluster, not the addresses of the individual servers in it. The cluster (with only one IP address visible to the public) is composed of a server called the Request Distributor Agent (RDA) and a group of servers. The servers are logically connected to each other in the form of a grid structure, and each of them is connected to the RDA. Figure 1 shows the architecture of the cluster-server system.

[Figure 1. A cluster with 6 servers: the Internet connects to the RDA, which connects to the cluster servers A to F.]

The RDA is designed to perform two jobs. Firstly, the RDA communicates with clients: it forwards legitimate Internet requests to the appropriate servers in the cluster and returns any replies from the servers back to the clients. Secondly, the RDA acts as a manager that controls the servers in the cluster. As a manager, the RDA communicates with one of the primary servers, depending on the needs of the request. In the proposed implementation, the RDA, through Rsync, monitors the file changes on each server and needs to know the source and destination of every copy operation in this cluster-server environment. Each server holds the premier copy of some data (e.g., file a is located on server 1, file b on server 2, and so on). The RDA forwards any request to the appropriate server holding the premier data file. This is done by maintaining a server-service table that contains all the services provided by the cluster together with the corresponding addresses and their neighbors.

One advantage of a server cluster over a monolithic server is its higher security. A monolithic server is reachable from the Internet and is therefore vulnerable to vicious attacks [2]. In the cluster, only the RDA has an IP address visible from the Internet; all other stations of the cluster bear only private IP addresses and are therefore not reachable directly from the outside. A firewall system may be installed on the RDA to protect the whole cluster: to attack one of the cluster-server stations, one first has to land on the RDA and launch the attack from there. Network Address Translation is used on the RDA to translate the destination IP address of incoming packets to an internal IP address, and the source IP address of outgoing packets to the public IP address visible on the Internet.

2.2 NRDT Data Replica Model

Let IP address 192.168.100.5 be the primary server; the configuration of its history log file is then as shown in Table 1. The aim of the history log file is to record the history status (previous status) and the current status of the primary server. The status is either '1' (up/active) or '0' (down/inactive).

Table 1: History Log File

    IP Address      History Status   Current Status   Time
    192.168.100.5         1                1           t1
    192.168.100.5         1                0           t2
    192.168.100.35        0                1           t3
    192.168.100.35        0                0           t4

Assume that a web server cluster comprises eight servers arranged logically in a grid, and that outside the web server cluster one server acts as a manager to control the eight servers. Let server 5 be the primary to neighbor servers 2, 4, 6 and 8. There are two scenarios to be discussed.

CASE 1: No failure. All servers are up, and server 5 (the primary server) has both history status and current status 1 (up). In this case, file x on server 5 is copied to its neighbors 2, 4, 6 and 8 asynchronously via the cron daemon, which is discussed in Section 2.3.

CASE 2: Primary server is down. All the neighboring servers are up, and the primary server (server 5) is down, with current status 0 and history status either up or down. In this case, the first neighbor copies file x to the other neighbors; if the first neighbor fails, the second neighbor takes over. The configuration is based on the order in which the neighbors of the primary are assigned in an array: first, second, third and fourth. A neighbor replicates file x back to the primary server once the primary becomes available again (recovers) after the failure.
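The decision implied by one history-log entry can be stated compactly. The following Python sketch is ours, not the authors' script; the field names follow Table 1, and it maps the (history status, current status) pair of the primary to the replication action described in the two cases above:

    # Map a history-log entry (Table 1) for the primary server to the
    # replication action it implies. Statuses are 1 (up) or 0 (down).

    def replication_action(history_status: int, current_status: int) -> str:
        if current_status == 1 and history_status == 1:
            # CASE 1: primary stayed up -> primary replicates to its neighbours.
            return "copy primary -> neighbours"
        if current_status == 1 and history_status == 0:
            # Primary has just recovered -> a neighbour copies the data back.
            return "copy neighbour -> primary"
        # CASE 2: primary currently down -> neighbours replicate among themselves.
        return "copy neighbour -> neighbour"

    # Rows t1 and t2 of Table 1:
    print(replication_action(1, 1))   # copy primary -> neighbours
    print(replication_action(1, 0))   # copy neighbour -> neighbour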
2.3 NRDT Data Replica Algorithm

We first give an overview of the Rsync utility used in the implementation of the data replication algorithm. Rsync is an open-source utility that provides incremental file transfer capabilities. It is a utility for Unix systems and provides a fast mechanism for synchronizing remote file systems. It is not real-time mirroring software: it must be invoked by the user or by a program, either periodically (via a cron job) or manually after updates have been made. In general, Rsync is well suited to replicating data in a WSC environment. There is a lot of software, free and commercial, for mirroring and replicating data between machines; however, almost all of it duplicates data to only one destination machine at a time. To get around this limitation, we wrote scripts that invoke Rsync to copy data from the primary server to each of its neighboring servers in the cluster.

Rsync was chosen to implement NRDT mainly because it is intelligent enough to transfer just the differences between two sets of files instead of transferring the whole files, which improves performance and speeds up the replication process. Rsync uses an efficient checksum-search algorithm to accomplish this, and it has the ability to copy links, devices, owners, groups, and permissions. Latency cost is minimized because Rsync pipelines file transfers, and it can run over any transparent remote shell such as the remote shell (rsh) or the secure shell (ssh). Furthermore, Rsync does not require root privileges, it supports anonymous and authenticated Rsync servers, and it can copy selected files only and recurse into subdirectories.

In the implementation, the cron daemon is used to obtain loose consistency between replicas. Cron is a daemon that executes scheduled commands: it wakes up every minute and checks the crontab files.

There are six important modules in the NRDT data replication algorithm: the initialization module, the check-server module, the copy1 module, the update module, the copy2 module and the neighboring module. The initialization module reads data from the configuration file and puts it into an array. The check-server module checks whether a server is up or down (failed). The copy1 module replicates data from the primary server to its neighboring servers in the cluster. The update module is a recovery module whereby data is replicated from one of the neighbors back to the primary after a primary failure; the copy takes place once the primary is up again (recovered). The neighboring module uses the data in the array to find the appropriate neighbors of a given primary, which are used when that primary fails. Finally, the copy2 module replicates data from neighbor to neighbor in the case of a primary failure. A sketch of how the copy1 module can drive Rsync is given below, followed by the pseudocode of the full algorithm.
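As an illustration, the copy1 module can be realized as a short script driven by cron. The following Python sketch is ours (host addresses and paths are hypothetical) and uses only standard rsync options:

    # A minimal sketch of the copy1 module, assuming rsync over ssh.
    import subprocess

    NEIGHBOURS = ["192.168.100.2", "192.168.100.4",
                  "192.168.100.6", "192.168.100.8"]
    DATA_DIR = "/var/www/data/"      # hypothetical replicated directory

    def copy1():
        """Replicate the primary's data to every neighbour (CASE 1)."""
        for host in NEIGHBOURS:
            # -a preserves links, owners, groups and permissions; -z compresses;
            # rsync transfers only the differences between the two file sets.
            subprocess.run(
                ["rsync", "-az", "-e", "ssh", DATA_DIR, f"{host}:{DATA_DIR}"],
                check=True,
            )

    if __name__ == "__main__":
        copy1()

A crontab entry such as "* * * * * /usr/local/bin/nrdt_copy1.py" (illustrative, not from the paper) would run the script every minute, giving the loose-consistency window described above.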
NRDT Algorithm

Main
    Call Initialization module
    While (1)
        Call Check-server module (returns life)
        Read history-status            # from the history log file
        If life = 1 && history-status = 1
            Call Copy1 module
        End If
        If life = 1 && history-status = 0
            Call Update module
        End If
        If life = 0 && (history-status = 1 || history-status = 0)
            While life-neighbour = 1
                Call Neighbour module
                Call Copy2 module
            End While
        End If
    End While
End Main

Procedure Initialization
    Get n                              # n is the number of rows in the grid
    Row = 0, Col = 0
    While Not EOF                      # read the configuration file
        Row += 1
        Metric = Row.Col               # Metric encodes the grid position
        Copy IP to ARRAY(1).IP
        Copy ADD to ARRAY(2).ADD
        Assign 2 to ARRAY(3).Status
        Push all data into @NRDT(Row, Col)
        If (Row == n)
            Row = 0, Col += 1
        End If
    End While
End Procedure

Procedure Check-server
    Life = 1
    Pinghost = `ping -c 1 IP`
    If (Pinghost =~ /100% packet loss/)
        Life = 0
    End If
    Return Life
End Procedure

Procedure Copy1                        # normal NRDT copy from the primary to its neighbours
    For each neighbour of the primary
        Rsync primary -> neighbour
    End For
End Procedure

Procedure Update                       # files are copied back to the primary once it is up again after a failure
    For each neighbour of the primary
        Call Check-server
        Rsync neighbour -> primary
    End For
End Procedure

Procedure Neighbour
    For each element of @NRDT
        Row = substr(Metric, 1, 1)
        Col = substr(Metric, 2, 1)
        Get IP
        Count = 0
        Prow = Row + 1, Mrow = Row - 1
        Pcol = Col + 1, Mcol = Col - 1
        For each element of @NRDT
            Nrow = substr(Metric, 1, 1)
            Ncol = substr(Metric, 2, 1)
            If (((Mrow == Nrow || Prow == Nrow) && Col == Ncol) ||
                (Row == Nrow && (Mcol == Ncol || Pcol == Ncol)))
                Get IP_neighbour
                Neighbour{IP}{Count} = IP_neighbour
                Count += 1
            End If
        End For
    End For
End Procedure

Procedure Copy2                        # copy from neighbour to neighbour if the primary is down
    For each neighbour of the primary
        Call Check-server
        Rsync neighbour -> neighbour
    End For
End Procedure

3. Performance Analysis

In this section we analyze and compare the reliability of the two-replica distribution technique proposed by H. H. Shen et al. [6,7] and our neighbor-replica distribution technique, and then compare the performance of the two techniques. For simplicity, we analyze the reliability for the cases N = 4, 5 and 6 only. Suppose that p is the probability that a server in the cluster system (CS) is alive/available.

3.1 Two-Replica Distribution Technique (TRDT)

Suppose that all data have two replicas on different servers and all servers hold two data replicas in the CS. For N = 2n, all servers can be divided into n disjoint sets with two servers in each set. In this case the distribution reaches the highest reliability, R_N, that provides all data in a consistent state, namely

    R_N = R_2^n = [1 - (1-p)^2]^n.

For the case N = 2n+1, the servers are divided into (n-1) sets of two servers and one set of three servers; each server of the (n-1) sets holds two replicas and each server of the remaining set holds three replicas. Then the highest reliability, R_N, given in [3], is

    R_N = R_2^{n-1} R_3 = p^2 (3 - 2p) [1 - (1-p)^2]^{n-1}.

For example, if N = 5 (n = 2), then

    R_5 = p^2 (3 - 2p) [1 - (1-p)^2] = p^3 (6 - 7p + 2p^2).
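As a quick numerical check of these TRDT formulas (our own arithmetic; the same values reappear in the p = 0.7 column of Table 3 below):

    R_4 = [1 - (1-0.7)^2]^2 = 0.91^2 \approx 0.828,
    R_5 = (0.7)^2 (3 - 1.4) [1 - (1-0.7)^2] = 0.49 \cdot 1.6 \cdot 0.91 \approx 0.713.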
3.2 Neighbor-Replica Distribution Technique (NRDT)

In this technique, all data have several replicas on different servers and all servers hold several data replicas in the CS. Let S = {S_1, S_2, ..., S_N} be the set of N servers available in the cluster system. A server i that fails or is unavailable is denoted S_i^F.

Definition 3.2.1: The system is unoperational when the primary server and all of its neighbors have failed. A possible combination of failed servers i, j, ..., m (m <= N) for which the system is unoperational is denoted S_{i,j,...,m}^F = {S_i^F, S_j^F, ..., S_m^F}. A group of such combinations is called a fail group, FG.

[Figure 2: A cluster-server with 5 nodes, labeled 1 to 5.]

For example, in Figure 2, where the servers are labeled 1, 2, 3, 4 and 5, the possible combinations of failed servers for which the system is unoperational are S_{1,2,4}^F, S_{1,2,3,5}^F, S_{2,3}^F, S_{1,4,5}^F and S_{2,4,5}^F. Thus,

    FG_{N=5} = {{1,2,4}, {1,2,3,5}, {2,3}, {1,4,5}, {2,4,5}}.

Definition 3.2.2: The degree of failed servers, deg, is the number of servers unavailable in the system.

Definition 3.2.3: Let FG_i be a member of FG_N. Then FG_i^deg denotes the fail group FG_i taken with degree deg, i.e., the set of failure patterns of exactly deg servers that contain FG_i. Its probability is

    alpha_i^deg = |FG_i^deg| (1-p)^deg p^{N-deg},  with  |FG_i^deg| = C(N - |FG_i|, deg - |FG_i|).

For example, in Figure 2, with FG_1 = {1,2,4} and deg = 4, we have FG_1^4 = {{1,2,3,4}, {1,2,4,5}}, so

    alpha_1^4 = C(5-3, 4-3) (1-p)^4 p^{5-4} = 2p(1-p)^4.

Definition 3.2.4: Let FG_i^deg and FG_j^deg, i != j, be two fail groups with the common degree deg. We denote the probability of FG_i^deg AND FG_j^deg by alpha_{ij}^deg; likewise, the probability of FG_1^deg AND FG_2^deg AND ... AND FG_N^deg is alpha_{12...N}^deg.

The reliability R_N = {probability that the system with N servers does not experience any failure in a given time interval} = 1 - {probability that the system does experience a failure in that time interval}, that is,

    R_N = 1 - sum over deg of alpha^deg,                                                  ...(1)

where, by inclusion-exclusion,

    alpha^deg = sum_i alpha_i^deg - sum_{i<j} alpha_{ij}^deg + sum_{i<j<k} alpha_{ijk}^deg - ... + (-1)^{N-1} alpha_{12...N}^deg.   ...(2)

For the case N = 4, from equations (1) and (2), the reliability R_4 can be represented as

    R_4 = 1 - [4p(1-p)^3 + (1-p)^4] = p^2 (6 - 8p + 3p^2),                                ...(3)

while for the case N = 5,

    R_5 = 1 - [(1-p)^2 p^3 + 6(1-p)^3 p^2 + 5(1-p)^4 p + (1-p)^5] = p^2 (4 - 3p - p^2 + p^3),   ...(4)

and for the case N = 6,

    R_6 = 1 - [4p^3 (1-p)^3 + 12p^2 (1-p)^4 + 6p(1-p)^5 + (1-p)^6] = p (3p + 4p^2 - 15p^3 + 12p^4 - 3p^5).   ...(5)

3.3 The Correctness

Definition 3.3.1: The reliability, R_N, under a CS of N servers is in closed form if 0 <= R_N <= 1 for 0 <= p <= 1.

Proposition 3.3.1: The reliability under NRDT with N servers is in closed form.

Proof: Equation (2) is an instance of the inclusion-exclusion method, in which each term alpha_i^deg <= 1. Thus 0 <= R_N <= 1, since 0 <= p^deg <= 1.

Table 3: The reliability of the CS under the TRDT and NRDT techniques for N = 4, 5 and 6.

                          Probability of a server available (p)
    Technique        0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
    TRDT   N=4     0.036  0.130  0.260  0.410  0.563  0.706  0.828  0.922  0.980
           N=5     0.005  0.037  0.110  0.225  0.375  0.544  0.713  0.860  0.962
           N=6     0.007  0.047  0.133  0.262  0.422  0.593  0.754  0.883  0.970
    NRDT   N=4     0.052  0.181  0.348  0.525  0.688  0.821  0.916  0.973  0.996
           N=5     0.036  0.135  0.273  0.433  0.594  0.740  0.859  0.942  0.987
           N=6     0.026  0.132  0.283  0.463  0.641  0.793  0.904  0.970  0.996
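Table 3 can be reproduced directly from the closed-form expressions above. The following short Python script is ours, written only to verify the tabulated values:

    # Reproduce one column of Table 3 from the closed-form reliabilities
    # (Section 3.1 for TRDT, equations (3)-(5) for NRDT).

    def trdt(n_servers: int, p: float) -> float:
        if n_servers % 2 == 0:                       # N = 2n
            return (1 - (1 - p) ** 2) ** (n_servers // 2)
        n = n_servers // 2                           # N = 2n + 1
        return p**2 * (3 - 2*p) * (1 - (1 - p) ** 2) ** (n - 1)

    def nrdt(n_servers: int, p: float) -> float:
        if n_servers == 4:
            return p**2 * (6 - 8*p + 3*p**2)         # equation (3)
        if n_servers == 5:
            return p**2 * (4 - 3*p - p**2 + p**3)    # equation (4)
        if n_servers == 6:
            return p * (3*p + 4*p**2 - 15*p**3 + 12*p**4 - 3*p**5)  # equation (5)
        raise ValueError("only N = 4, 5, 6 are tabulated")

    for n in (4, 5, 6):
        print(n, round(trdt(n, 0.7), 3), round(nrdt(n, 0.7), 3))
    # -> 4 0.828 0.916 / 5 0.713 0.859 / 6 0.754 0.904,
    # matching the p = 0.7 column of Table 3.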
3.4 Performance Comparison

Table 3 and Figure 4 show the reliability of the CS under the TRDT and NRDT techniques for N = 4, 5 and 6 servers. The reliability under the NRDT technique is better than that under the TRDT technique. NRDT needs only a server-availability probability of p = 0.7, while TRDT needs p > 0.8, in order to maintain R_N > 0.9. For example, when the probability that a server is available is 70%, the reliability of TRDT is approximately 83% whereas that of NRDT is approximately 92%, more than 10% better than TRDT for N = 4. For N = 6, the reliability of TRDT is approximately 75% whereas that of NRDT is approximately 90%, which is about 20% better than TRDT.

[Figure 4: Comparison of the reliability between TRDT and NRDT for N = 4 and 6. Axes: reliability versus probability of a node available.]

4. Conclusion

A web server cluster is a popular architecture for improving the reliability and availability of web sites. In this paper, the Neighbor Replica Distribution Technique (NRDT) has been proposed to improve the reliability of the WSC. The technique provides high reliability by imposing a neighbor logical structure on data copies: data from one server is replicated to its neighboring servers, and vice versa in the case of failures. An algorithm for the data replication scheme, based on asynchronous replication, was presented to improve web server cluster reliability, and the analysis showed that the technique provides a convenient approach to high reliability for a WSC. We plan to implement the proposed model on various architectures, including data grids, peer-to-peer systems, and mobile systems.

References

[1] M. Buretta, "Data Replication: Tools and Techniques for Managing Distributed Information", John Wiley, New York, 1997.
[2] J. Liu, L. Xu, B. Gu, J. Zhang, "Scalable, High Performance Internet Cluster Server", IEEE 4th Int'l Conf. HPC-ASIA, Beijing, 2000, pp. 941-944.
[3] M. Mat Deris, A. Mamat, P. C. Seng, H. Ibrahim, "Three Dimensional Grid Structure for Efficient Access of Replication Data", Int'l Journal of Interconnection Networks, World Scientific, Vol. 2, No. 3, 2001, pp. 317-329.
[4] A. Moissis, "Sybase Replication Server: A Practical Architecture for Distributing and Sharing Information", Technical Document, Sybase Inc., 1996.
[5] E. Pacitti and E. Simon, "Update Propagation Strategies to Improve Freshness in Lazy Master Replicated Databases", The VLDB Journal, Vol. 8, No. 4, 2000.
[6] H. H. Shen, S. M. Chen, and W. M. Zheng, "Research on Data Replication Distribution Technique for Web Server Cluster", IEEE Proc. 4th Int'l Conference on Performance Computing, Beijing, 2000, pp. 966-968.
[7] H. H. Shen, S. M. Chen, W. M. Zheng and S. M. Shi, "A Communication Model for Data Availability on Web Server Clusters", Proc. Int'l Symposium on Distributed Computing and Applications, Wuhan, 2001, pp. 169-171.
[8] R. Short, R. Credle and J. Pelles, "Clustering and High Availability Guide for IBM Netfinity and IBM PC Servers", 1997.
[9] W. Zhou and A. Goscinski, "Managing Replicated Remote Procedure Call Transactions", The Computer Journal, Vol. 42, No. 7, 1999, pp. 592-608.