NETWORK PERFORMANCE TUNING USING REINFORCEMENT LEARNING

G. Prem Kumar and P. Venkataram
Department of Electrical Communication Engineering
Indian Institute of Science, Bangalore 560012, INDIA
E-Mail: {prem,pallapa}@ece.iisc.ernet.in

ABSTRACT

Performance degradation in communication networks is caused by a set of faults, called soft failures, owing to which network resources like bandwidth cannot be utilised to the expected level. An automated solution to the performance management problem involves identifying these soft failures, followed by the required change in a few of the performance parameters to overcome the performance degradation. The identification of soft-failures can be done using any of the fault-diagnosis models [9], and is not discussed here. In this paper, the emphasis is on providing a viable solution to network performance tuning. Since the exploration of the network performance tuning search space to bring out the required control is not well known in advance, a reinforcement control model is developed to tune the network performance. Ethernet performance management is taken up as a case study. The results obtained by the proposed approach demonstrate its effectiveness in solving the network performance management problem [11].

INTRODUCTION

Performance management is a complex part of present-day network management [6]. Any abnormal behaviour of the network in the absence of hard failures is supposed to be handled by the network performance manager. Network performance degradation is viewed to be caused by a set of faults, called soft-failures. The solution to the network performance management problem involves two steps. Firstly, identify the soft-failures that cause the performance degradation. Secondly, tune the network parameters to bring its performance back to normalcy.

In another work [9], we proposed to use the realistic abductive reasoning model (Realistic-ARM) to identify the network soft-failures. Each of the soft-failures is expected to have a remedy in the form of tuning up of network performance parameters. Performance tuning is done by using a neural network model that applies the necessary correction to some of the parameters of the degraded communication network to bring it back to normalcy. A reinforcement learning mechanism, called the stochastic real-valued algorithm [12], is used for on-line tuning of the network. Based on the improvement in the network performance, the learning process takes place by adapting to the behaviour of the communication network.

In this paper, the soft-failure identification is not discussed. Any of the fault-diagnosis models which arrive at a set of explanations for a given set of abnormalities, each with some assigned probability, is assumed to provide the input to this work. A model for network fault diagnosis using the realistic abductive reasoning model that also assigns a probability to each of the explanations is discussed in [10]. A model to identify the soft-failures in Ethernet using realistic abductive reasoning is presented below. The reinforcement learning mechanism for performance tuning and the results of the performance tuning for Ethernet follow it.

SOFT-FAILURE IDENTIFICATION IN ETHERNET

We consider a restricted Ethernet model to illustrate the soft-failure identification. For details on the Ethernet operation, refer to [2][7]. We consider an Ethernet performance management model with the following assumptions.
The information that needs to be monitored for the purpose of performance tuning is collected from the nodes and the channel. That information which is beyond the normal values (both above and below the normal limits) is reported as symptom(s). Some monitoring information, such as "load is normal" and "collisions are within the range", is included to support the diagnostic process for minimizing the number of explanations. There may be some missing information, and the entire information may not be available at the time of performance tuning.

PM Knowledge Model

The Ethernet performance management knowledge base [2][3][4][5][7] is constructed as a hyper-bipartite network (see Figure 1). This maps the Ethernet performance management knowledge onto a model suitable for the Realistic-ARM. Table 1 describes the suggested remedies for the soft-failures of the Ethernet model.

Table 1. Suggested remedies for the soft-failures

No.  Fault                    Remedy
1.   (F1) Babbling node       (R1): Faulty Ethernet card, replace it
2.   (F2) Hardware problem    (R2): Request the network manager to initiate fault diagnostic measures
3.   (F3) Jabbering node      (R3): Ensure many packets are not above the specified size
4.   (F4) Bridge down         (R4): Report to the network manager
5.   (F5) Network paging      (R5): Allocate more primary memory to the required nodes
6.   (F6) Broadcast storm     (R6): Selectively control the broadcast packets
7.   (F7) Bad tap             (R7): Report to the network manager along with the specified tap
8.   (F8) Runt storm          (R8): Ensure many packets are not below the specified size

Fig. 1. Ethernet performance management knowledge model. Legend:

Layer #1 : 1. Packet loss below normal; 2. Packet loss normal; 3. Packet loss above normal; 4. Load below normal; 5. Load normal; 6. Load above normal; 7. Collisions below normal; 8. Collisions normal; 9. Collisions above normal; 10. Large packets below normal; 11. Large packets normal; 12. Large packets above normal; 13. Small packets below normal; 14. Small packets normal; 15. Small packets above normal; 16. Broadcast packets normal; 17. Broadcast packets above normal; 18. Packet loss on spine above normal; 19. Load on spine normal; 20. Load on spine above normal.

Layer #2 : 1. Light traffic; 2. Heavy traffic; 3. Buffers are insufficient; 4. Users are too many; 5. Preambles are too many; 6. Broadcast packets are too many; 7. Spine flooded with too many small packets; 8. Heavy traffic on spine.

Layer #3 : 1. (F1) Babbling node; 2. (F2) Hardware problem; 3. (F3) Jabbering node; 4. Too many retransmissions; 5. Under-utilization of channel as many small packets are in use; 6. Attempt for too many broadcasts; 7. (F4) Bridge down; 8. (F5) Network paging; 9. (F6) Broadcast storm; 10. (F7) Bad tap; 11. (F8) Runt storm.
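Table 1 is, in effect, a small lookup from a diagnosed soft-failure to its suggested remedy, which a performance manager can consult once the diagnosis stage has named a fault. The sketch below (in Python) is one plausible encoding of that table; the dictionary and function names are illustrative assumptions, and the remedy strings simply restate Table 1.

    # Illustrative encoding of Table 1: soft-failure code -> suggested remedy.
    # The fault and remedy texts are taken from Table 1; the names
    # SUGGESTED_REMEDY and remedy_for are assumptions made for this sketch.
    SUGGESTED_REMEDY = {
        "F1": "R1: faulty Ethernet card, replace it",                                   # Babbling node
        "F2": "R2: request the network manager to initiate fault diagnostic measures",  # Hardware problem
        "F3": "R3: ensure many packets are not above the specified size",               # Jabbering node
        "F4": "R4: report to the network manager",                                      # Bridge down
        "F5": "R5: allocate more primary memory to the required nodes",                 # Network paging
        "F6": "R6: selectively control the broadcast packets",                          # Broadcast storm
        "F7": "R7: report to the network manager along with the specified tap",         # Bad tap
        "F8": "R8: ensure many packets are not below the specified size",               # Runt storm
    }

    def remedy_for(fault_code):
        # Return the suggested remedy for a diagnosed soft-failure,
        # escalating unknown faults to the network manager.
        return SUGGESTED_REMEDY.get(fault_code, "report to the network manager")

    print(remedy_for("F8"))   # e.g., the remedy suggested for a runt storm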
The Results

To identify the soft-failures, the algorithm Realistic-ARM is run by setting "the prespecified number of symptoms required to support an observed symptom before concluding a fault" to 1. The algorithm is run for various sets of symptoms (from layer #1 of Figure 1) and some of the results are given in Table 2. The probabilities assigned to each of the links (not shown here) have to be provided by an expert of the particular network. From Table 2, it may be observed that the covers generated by the proposed model contain an appropriate explanation for any given set of symptoms without much extra guesswork.

Table 2. Sample results for the Ethernet performance model

No.  Symptoms (layer #1)
1.   3, 6, 12, 18, 20
2.   1, 4, 10, 15, 17
3.   3, 9, 18, 20
4.   10, 15, 16, 18

Consider the case given in No. 4 of Table 2. The observed symptoms in layer #1 are: Symptom 10: number of large packets below normal; Symptom 15: number of small packets above normal; Symptom 16: number of broadcast packets normal; Symptom 18: packet loss on spine above normal. The soft-failure concluded is "Runt storm" (with some plausibility), and the remedy is to ensure that too many small packets are not injected into the network.

NETWORK PERFORMANCE TUNING

Network performance management cannot be complete without a control mechanism that tunes the network to overcome the performance degradation. When there is a degradation in network performance, we first identify the soft-failures with the plausibility of their existence and then take up the performance tuning. A reinforcement learning (RL) paradigm is employed to accomplish performance tuning. We use an RL model with two modules, similar to that discussed in [8]. The first module, called the "actor", learns the correct action for a given input, and the second module, called the "critic", learns the maximum reinforcement expected from the environment for a given input.

Reinforcement Learning

Reinforcement learning problems are usually modelled as an agent interacting with an environment (see Figure 2). The agent receives an input from the environment and outputs a suitable action. The environment accepts the action from the agent as input and outputs a scalar evaluation (reinforcement) of the action as well as the next input to the agent. The agent's task is to learn how to react to a given input so as to maximize the reinforcement it receives from the environment. In supervised learning, the agent is provided with the desired output, from which it can form a gradient to follow so as to minimize the error in the output. In RL, such explicit gradient information is absent. Hence the agent has to explore the output space to find the right direction. Reinforcement learning is suitable for network performance tuning because the desired output to bring the performance back to normalcy is not known well in advance.

Fig. 2. The reinforcement learning mechanism.

The Neural Network Model

A connectionist model, as described in [12], is used to realize the reinforcement learning. This model uses two neural networks, each of them based on the backpropagation learning algorithm (see Figure 3). The neural network (NN) has an input layer that is designed to accept the significant events of the environment as inputs. The output of the NN is the value on which the correction to the communication network parameter depends. The NN also has one or more hidden layers, depending on the complexity of the problem. The number of neurons in the hidden layer(s) also depends on the estimate of the number of exemplars provided for training [1]. Each layer of the NN consists of one or more neurons. The output of one layer is fed as the input of the next. Associated with each of these connections is an adaptive weight. The output of a neuron in a hidden layer or output layer is calculated as an activation function of the weighted sum of the inputs to the neuron, with an optional bias term which is also, in principle, adaptive.

Fig. 3. A connectionist model for network PM (actor and critic).
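As a concrete illustration of the neuron computation just described, the following sketch (in Python, with NumPy) evaluates one layer of such a network: a weighted sum of the preceding layer's outputs, with the bias treated as an adjustable weight on a constant input of -1, passed through the sigmoidal activation 1/(1 + e^{-x}) used by the model. The array names, shapes and sample values are assumptions made for the example; the 5-input, 4-neuron hidden layer mirrors the configuration used later for the Ethernet case study.

    import numpy as np

    def sigmoid(x):
        # Sigmoidal transfer function used by the connectionist model
        return 1.0 / (1.0 + np.exp(-x))

    def layer_output(weights, bias_weights, prev_outputs):
        # weights      : (n_current, n_previous) adaptive link weights w_ij
        # bias_weights : (n_current,) adjustable weights on the bias link,
        #                whose input is fixed at -1 (acting as a threshold)
        # prev_outputs : (n_previous,) outputs of the preceding layer
        net_input = weights @ prev_outputs + bias_weights * (-1.0)
        return sigmoid(net_input)

    # Example: a 5-input, 4-neuron hidden layer with weights initialised in (0, 1)
    rng = np.random.default_rng(0)
    w_hidden = rng.uniform(0.0, 1.0, size=(4, 5))
    b_hidden = rng.uniform(0.0, 1.0, size=4)
    x = np.array([0.2, 0.7, 0.5, 0.1, 0.3])   # illustrative monitored inputs
    print(layer_output(w_hidden, b_hidden, x))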
For every pattern (or exemplar) presented to the neural network for training, two passes are used. The forward pass is used to evaluate the output of the neural network for the given input with the existing weights. In the reverse pass, the neural network output is compared with the desired output and the difference is fed back to the neural network as an error to change the weights of the neural network.

For the input layer neurons, the user input is applied, and at the other neurons the net-input is calculated using

h_i = \sum_j w_{ij} \, input_j,

where w_{ij} is the weight of the link connecting neuron i in the current layer and neuron j in the preceding layer, and input_j is the output of neuron j fed to neuron i. The input to the bias neuron in the preceding layer is set to -1 and its weight is adjustable. Essentially, the weight on the bias link in the previous layer indicates the threshold of each neuron in the current layer. The output of each neuron is determined by applying a transfer function f(.) to the net-input to the neuron. The commonly used functions are sigmoidal functions and the hyperbolic tangent. We use the function

f(x) = (1 + e^{-x})^{-1}.   (1)

In the reverse pass, i.e., for a particular neuron i in the output layer L, suppose that the output of the neuron is y_i^L and the desired output is d_i^L. The backpropagated error is

\delta_i^L = f'(h_i^L) [d_i^L - y_i^L],   (2)

where h_i^L represents the net-input to the i-th unit in the L-th layer and f'(.) is the derivative of f(.). For layers l = 1, ..., (L-1), and for each neuron i in layer l, the error is computed using

\delta_i^l = f'(h_i^l) \sum_j w_{ji}^{l+1} \delta_j^{l+1}.   (3)

The adaptation of the weights of the link connecting neurons i and j is done using

w_{ij}^l \leftarrow w_{ij}^l + K \, \delta_i^l \, y_j^{l-1},   (4)

where K is the learning constant that depends on the rate at which the neural network is expected to converge. This neural network algorithm essentially minimizes the mean square error between the neural network output and the desired output using a gradient descent approach.

RL for Performance Tuning

The realistic abductive reasoning model evaluates the performance degradation with a set of soft-failures and respective plausibilities. These values are fed to the two neural network modules. One of the neural networks is the "actor" and the other is the "critic". Let the output of the "actor" be μ, and that of the "critic" be r̂, which is the goodness of μ. The correction x applied to the communication network is an instance of a Gaussian distribution with mean μ and variance σ, where σ is a non-increasing function of r̂. The variance term is used to control the exploration of the output space by having a small value when r̂ is close to the maximum value and otherwise a large value. Once again, the realistic abductive reasoning model evaluates the communication network to yield a new set of soft-failures and respective percentage degradation in performance. Based on that, the reinforcement is given to both the "actor" and the "critic". The error used to adapt the weights of both networks is a function of the values of μ, r̂, and r. This process of soft-failure identification and performance tuning repeats till the network performance is brought back to normalcy.
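A minimal sketch of this exploration step is given below, assuming (as in the critic equations that follow) that the spread of the Gaussian is σ = max(0, 1 - r̂) and using it directly as the scale of the sampled correction. The function name and the example values are illustrative only.

    import numpy as np

    def exploratory_correction(mu, r_hat, rng=np.random.default_rng()):
        # mu    : output of the "actor" network (suggested parameter correction)
        # r_hat : output of the "critic" network (expected reinforcement)
        # sigma shrinks as r_hat approaches its maximum value of 1, so the
        # correction stays close to mu when the critic is confident and
        # explores more widely otherwise.
        sigma = max(0.0, 1.0 - r_hat)           # non-increasing function of r_hat
        x = rng.normal(loc=mu, scale=sigma)     # x is an instance of the Gaussian N(mu, sigma)
        return x, sigma

    # Illustrative use: a confident critic keeps the correction close to mu
    x, sigma = exploratory_correction(mu=0.35, r_hat=0.9)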
The following are the equations for the "actor" and "critic" neural networks that are incorporated into the connectionist model.

"Actor" neural network :
  input : the degradation in performance; the plausibility of the performance parameter being controlled by the neural network; the current setting of the parameter; other parameters of significance.
  output : μ.
  r = 1 if the performance improves and -1 if it degrades further.
  backpropagated error : (r - r̂) * (x - μ)/σ, where x = f(μ, σ) and f(.) is a Gaussian with mean μ and variance σ.

"Critic" neural network :
  input : x.
  output : r̂, with σ = max(0, (1 - r̂)).
  r = 1 if the performance improves and -1 if it degrades further.
  backpropagated error : (r - r̂).

Algorithm Performance Tuning
Begin
1. Choose a neural network with 1, ..., L layers.
2. Set the input of the bias neurons to -1.
3. Initialize the weights w_{ij}, for all i, j, to random values in the range (0, 1).
4. Feed the input parameters.
5. Propagate the signal forward through the neural network till the output layer, evaluating the output of each neuron i as f(\sum_j w_{ij} input_j), where input_j is the output from neuron j in the preceding layer, fed to neuron i.
6. For each neuron i in the output layer (layer L), compute the error \delta_i^L using the equations given above for the respective μ and r̂ networks.
7. For l = 1, ..., (L-1), compute the error \delta_i^l = f'(h_i^l) \sum_j w_{ji}^{l+1} \delta_j^{l+1}.
8. Update the weights using w_{ij}^l \leftarrow w_{ij}^l + K \delta_i^l y_j^{l-1}, where K is the learning rate.
9. Repeat by going to Step 4 till PM is active.
End

ETHERNET PERFORMANCE TUNING

The Ethernet performance management model considered for soft-failure identification is further extended to demonstrate the learning capability, i.e., improving the communication network behaviour in case of performance degradation. The parameters considered for improving the network performance are (i) arrival probability; (ii) basic retry slot; (iii) maximum packet length; and (iv) minimum packet length. Besides the parameter's value, the tuning algorithm accepts the performance degradation and the average number of packets queued up in each of the source buffers. The output is the recommended value of the parameters.

The simulation model is developed in the C++ programming language on a Sun Sparc 20 workstation running SunOS 4.2. Each of the "actor" and the "critic" neural networks is trained by using the backpropagation learning procedure, with 5 neurons in the input layer, 4 neurons in the hidden layer and one neuron in the output layer.

To demonstrate the proposed model for network performance tuning, the Ethernet model is simulated both in the absence of neural networks and after integrating with neural networks. The performance of the Ethernet at several instances of simulation is presented. The improvement can be attributed to the adaptability of the neural networks to learn the behaviour of the Ethernet and make appropriate changes to result in better performance. From Figure 4, it can be observed that there is a substantial improvement in the performance (by over 5%, on an average) when the connectionist model for performance tuning is employed. The initial dip in performance is because of the neural network training to reach the expected level of tuning. This may be overcome by providing expert knowledge to the neural network before it is placed on the Ethernet for performance tuning.

Fig. 4. Performance improvement on using the neural network (performance vs. simulation time, with and without the NN).
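To tie the pieces together, the following sketch outlines the tuning loop in the form described above: two small 5-4-1 backpropagation networks act as the "actor" and the "critic", a correction x is sampled around the actor's output μ with a spread governed by the critic's output r̂, the network is re-evaluated, and the reinforcement r = ±1 drives the backpropagated errors (r - r̂)(x - μ)/σ for the actor and (r - r̂) for the critic. The function evaluate_network stands in for the C++ Ethernet simulator together with the Realistic-ARM evaluation and is purely hypothetical, as are the input values and the learning constant; both networks are fed the same monitored quantities in this simplification, and a small floor is placed on σ to avoid division by zero.

    import numpy as np

    rng = np.random.default_rng(1)

    def f(x):                 # logistic activation, equation (1)
        return 1.0 / (1.0 + np.exp(-x))

    def f_prime(h):           # derivative of f, expressed via the net-input h
        y = f(h)
        return y * (1.0 - y)

    class MLP:
        # 5-4-1 backpropagation network (one hidden layer), as in the case study.
        def __init__(self, n_in=5, n_hid=4, K=0.1):
            self.K = K                                        # learning constant
            self.W1 = rng.uniform(0, 1, (n_hid, n_in + 1))    # last column: bias link (input -1)
            self.W2 = rng.uniform(0, 1, (1, n_hid + 1))

        def forward(self, x):
            self.x = np.append(x, -1.0)                       # bias input fixed at -1
            self.h1 = self.W1 @ self.x
            self.y1 = np.append(f(self.h1), -1.0)
            self.h2 = self.W2 @ self.y1
            return f(self.h2)[0]

        def backward(self, output_error):
            # Equations (2)-(4), with the output error supplied externally
            d2 = f_prime(self.h2) * output_error
            d1 = f_prime(self.h1) * (self.W2[:, :-1].T @ d2)
            self.W2 += self.K * np.outer(d2, self.y1)
            self.W1 += self.K * np.outer(d1, self.x)

    def evaluate_network(correction):
        # HYPOTHETICAL stand-in for applying the correction to the simulated
        # Ethernet and re-running the Realistic-ARM evaluation; returns the
        # resulting performance figure (higher is better).
        return 0.75 + 0.1 * np.tanh(correction)

    actor, critic = MLP(), MLP()
    state = np.array([0.3, 0.8, 0.5, 0.2, 0.4])   # degradation, plausibility, setting, ...
    performance = evaluate_network(0.0)

    for _ in range(100):
        mu = actor.forward(state)
        r_hat = critic.forward(state)
        sigma = max(1e-3, 1.0 - r_hat)            # exploration spread, floored at 1e-3
        x = rng.normal(mu, sigma)                 # correction applied to the network
        new_performance = evaluate_network(x)
        r = 1.0 if new_performance > performance else -1.0
        actor.backward((r - r_hat) * (x - mu) / sigma)    # actor's backpropagated error
        critic.backward(r - r_hat)                        # critic's backpropagated error
        performance = new_performance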
CONCLUSION

In this paper, the network performance management problem is presented in two parts. The first part identifies the causes of network performance degradation and the second provides the necessary tuning to bring the performance back to normalcy using a reinforcement learning model. The model is tested with a hypothetical Ethernet and the results show a significant improvement of over 5%.

References

[1] A. G. Barto, R. S. Sutton and C. W. Anderson, Neuronlike Elements that can Solve Difficult Learning Control Problems, IEEE Transactions on Systems, Man and Cybernetics, 13, pp. 835-846, 1983.

[2] D. R. Boggs, J. C. Mogul and C. A. Kent, Measured Capacity of an Ethernet: Myths and Reality, Computer Communication Review, pp. 222-234, 1992.

[3] F. E. Feather, Fault Detection in an Ethernet Network via Anomaly Detectors, Doctoral Dissertation, Carnegie Mellon University, July 1992.

[4] F. E. Feather, D. Siewiorek and R. Maxion, Fault Detection in an Ethernet Network Using Anomaly Signature Matching, Computer Communication Review, pp. 279-288, 1993.

[5] J. P. Hansen, The Use of Multi-Dimensional Parametric Behavior of a CSMA/CD Network for Network Diagnosis, Ph.D. thesis, Department of Electrical and Computer Engineering, Carnegie Mellon University, 1992.

[6] S. Hayes, Analyzing Network Performance Management, IEEE Communications Magazine.

[7] R. M. Metcalfe and D. R. Boggs, Ethernet: Distributed Packet Switching for Local Computer Networks, Communications of the ACM, 19(7).

[8] S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, to appear in Sadhana.

[9] G. P. Kumar and P. Venkataram, Network Performance Management Using Realistic Abductive Reasoning Model, in: IEEE/IFIP International Symposium on Integrated Network Management, Santa Barbara, California, pp. 187-198, May 1995.

[10] G. P. Kumar and P. Venkataram, Network Fault and Performance Management Using Realistic Abductive Reasoning Model, Journal of the Indian Institute of Science, Special Issue on Intelligent Systems, Vol. 28, No. 6, 1997.

[11] G. P. Kumar, Integrated Network Management Using Extended Blackboard Architecture, Doctoral Dissertation, Indian Institute of Science, Bangalore, July 1996.

[12] V. Gullapalli, J. A. Franklin and H. Benbrahim, Acquiring Robot Skills via Reinforcement Learning, IEEE Control Systems, pp. 13-24, February 1994.