Cache replacement using BPNN

Cache Replacement Scheme based on Back Propagation Neural Networks

Project for course ECE 539 (Introduction to Artificial Neural Network and Fuzzy Systems)
Guided by Prof. Yu Hen Hu
By Rakesh Ramananda-9070043170

Index
1.1 Introduction
1.2 Selection of architecture of BPN network
1.3 BPNN based cache algorithm
1.3.1 Steps for cache replacement
1.4 Simulation and observation
1.5 Conclusion
1.6 Reference

1.1 Introduction

Cache memory: The cache is a smaller, faster memory that stores copies of the data from frequently used main memory locations. Cache memory is the memory nearest to the processor. Accessing the cache reduces the time the processor spends on instruction execution, thereby reducing the average CPI (cycles per instruction). Most processors have an instruction cache and a data cache into which the corresponding information is copied from main memory. Instructions and data are copied from main memory in blocks of fixed size known as cache lines. As the cache is smaller than main memory, the number of lines in the cache is less than that of main memory. Lines are copied into the cache based on two types of locality, temporal locality and spatial locality.

Temporal locality (locality in time): If an item is referenced, it is likely to be referenced again soon.

Spatial locality (locality in space): If an item is referenced, items whose addresses are close to it are likely to be referenced as well.

Based on these two principles the cache is loaded with lines of data or instructions. The way in which these lines are selected, and the way in which existing lines are replaced when a new line comes into the cache, has a large impact on the performance of the computer system. The commonly used replacement algorithms are LRU, MRU and FIFO.
Various cache replacement policies:

a) Optimal cache replacement policy: This algorithm always discards the line that will not be needed for the longest time in the future. Because the future use of data cannot be predicted during execution, this algorithm cannot be implemented in practice.

b) LRU algorithm: The cache line that was least recently used is replaced when a new line comes into the cache.

c) MRU algorithm: The cache line that was most recently used is replaced when a new line comes into the cache.

d) FIFO algorithm: The cache line that came into the cache first is replaced by the new line coming into the cache.

The problem with the LRU, MRU and FIFO algorithms is that they must track either when each cache line came into the cache or when it was last accessed; that is, a time stamp must be associated with each line. These time stamps consume storage inside the cache and reduce the space actually available for data. To reduce the storage used for recording when a line came into the cache or when it was last accessed, we use a back propagation neural network (BPNN) to help determine whether a line was accessed recently. The algorithm tries to improve the performance of cache memory with a replacement strategy that looks at a very long history of addresses, by training the BPNN on the addresses accessed by the system.
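As an illustration of the time-stamp bookkeeping that LRU, MRU and FIFO require, the following Python sketch models a single cache set. It is only a sketch under stated assumptions: the names Line, pick_victim, access and count_hits are hypothetical, the position in the trace is used as the time stamp, and none of this is taken from the MATLAB code that accompanies the report.

```python
from dataclasses import dataclass

@dataclass
class Line:
    tag: int
    arrived: int    # time stamp of when the line entered the set (used by FIFO)
    last_used: int  # time stamp of the most recent access (used by LRU and MRU)

def pick_victim(lines, policy):
    """Return the index of the line to evict from a full set."""
    indices = range(len(lines))
    if policy == "LRU":
        return min(indices, key=lambda i: lines[i].last_used)
    if policy == "MRU":
        return max(indices, key=lambda i: lines[i].last_used)
    if policy == "FIFO":
        return min(indices, key=lambda i: lines[i].arrived)
    raise ValueError(f"unknown policy: {policy}")

def access(lines, ways, tag, now, policy):
    """Access `tag` in one set; return True on a hit and False on a miss."""
    for line in lines:
        if line.tag == tag:
            line.last_used = now
            return True
    new_line = Line(tag=tag, arrived=now, last_used=now)
    if len(lines) < ways:
        lines.append(new_line)                         # free way available
    else:
        lines[pick_victim(lines, policy)] = new_line   # evict and replace
    return False

def count_hits(trace, ways, policy):
    """Replay a list of tags against one set and count the hits."""
    lines, hits = [], 0
    for now, tag in enumerate(trace):
        hits += access(lines, ways, tag, now, policy)
    return hits
```

Each line in this sketch carries two integer time stamps, which is the per-line overhead that the BPNN-based scheme aims to replace with two status bits.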
1.2 Selection of the architecture of the BPN network:

The parameters that have to be decided during the design of the network are:
1) Learning rate
2) Momentum
3) Number of hidden layers
4) Number of neurons in the hidden layer
5) Activation function of the neurons in the different layers

These parameters were assigned the sets of values listed below, and the network was tested for the best hit rate.
1) Learning rate: 0.01, 0.1, 0.2, 0.4 and 0.8
2) Momentum: 0, 0.5 and 0.8
3) Number of hidden layers: 1 and 2
4) Number of neurons in the hidden layers: 2, 3, 4, 5, 6, 7, 8, 9 and 10
5) Activation function: linear, sigmoidal and hyperbolic tangent

It was found that the following network provided the best hit rate:
1) Learning rate: 0.1
2) Momentum: 0.8
3) Number of hidden layers: 1
4) Number of neurons in the hidden layer: 7
5) Activation function: hyperbolic tangent for the hidden layer and sigmoidal for the output layer

1.3 BPNN based cache algorithm:

The replacement policy described below is best suited to a set associative cache. Here we use the BPNN as a shadow directory. A shadow directory is a directory that stores the previously referenced addresses for each set of a set associative cache. The BPNN is trained whenever a memory access is performed, and the weights of its neurons are adjusted so that it remembers the addresses accessed by the system. This helps in tracking whether a particular address is being accessed repeatedly.

The BPNN helps to classify cache lines as transient lines and shadow lines. Transient lines are new arrivals to the cache, whereas shadow lines are lines that were recently replaced in the cache in favor of other lines. As described before, a line is placed in the cache based on temporal and spatial locality.
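The network selected in Section 1.2 (one hidden layer of 7 hyperbolic tangent neurons, a sigmoidal output layer, learning rate 0.1 and momentum 0.8) and its per-access training can be sketched in plain Python as below. This is an illustrative sketch and not the report's MATLAB code: the class name TinyBPNN, the squared-error loss, the initial weight range of ±0.05 and the example sizes (15 inputs, 8 outputs) are assumptions made here.

```python
import math
import random

class TinyBPNN:
    """One tanh hidden layer and a sigmoid output layer, trained by
    error back propagation with momentum (illustrative sketch)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1, momentum=0.8, seed=0):
        rng = random.Random(seed)
        # Non-zero random values close to zero (the range is an assumption).
        small = lambda: rng.uniform(-0.05, 0.05) or 0.01
        # One row per neuron; the last entry of each row is the bias weight.
        self.w_h = [[small() for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w_o = [[small() for _ in range(n_hidden + 1)] for _ in range(n_out)]
        # Previous weight changes, needed for the momentum term.
        self.dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw_o = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        self.lr, self.momentum = lr, momentum

    def forward(self, x):
        """Return (hidden activations, output activations) for input bits x."""
        xb = list(x) + [1.0]
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in self.w_h]
        hb = h + [1.0]
        y = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hb))))
             for row in self.w_o]
        return h, y

    def train(self, x, target):
        """One back propagation step; returns the outputs before the update."""
        h, y = self.forward(x)
        xb, hb = list(x) + [1.0], h + [1.0]
        # Output deltas: error times the sigmoid derivative y * (1 - y).
        d_o = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
        # Hidden deltas: back-propagated error times the tanh derivative 1 - h^2.
        d_h = [(1.0 - h[j] ** 2) * sum(d_o[k] * self.w_o[k][j] for k in range(len(y)))
               for j in range(len(h))]
        for k, row in enumerate(self.w_o):
            for j in range(len(row)):
                self.dw_o[k][j] = self.lr * d_o[k] * hb[j] + self.momentum * self.dw_o[k][j]
                row[j] += self.dw_o[k][j]
        for j, row in enumerate(self.w_h):
            for i in range(len(row)):
                self.dw_h[j][i] = self.lr * d_h[j] * xb[i] + self.momentum * self.dw_h[j][i]
                row[i] += self.dw_h[j][i]
        return y
```

In this sketch, repeatedly calling train with the same address bits and a target of 1 on one output neuron drives that output toward 1, which is the sense in which the network "remembers" a previously seen address.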
We use the addresses/tags stored in the shadow directory to identify shadow lines, i.e. lines that were present in the cache and were later replaced, and give them special status the next time cache lines are replaced. If a line is not a shadow line, i.e. it is coming into the cache for the first time, it is given an opportunity to improve its status; if it does not, it is later treated as dead.

To implement this algorithm we need two additional bits associated with each cache line to record its status. These bits are called special and shad. The special bit identifies the cache lines that have special status during replacement, and the shad bit tells whether the line under consideration is a shadow line or not. A shadow line may lose its special status if it is not accessed frequently.

This algorithm avoids the memory that would otherwise be spent on a shadow directory or on time stamps associated with each cache line. Instead, only two additional bits are used per cache line, which leaves more space to store data lines.

1.3.1 Steps followed for implementation of cache replacement:

Step 1: Initialize the weights of all the neurons in the network to non-zero random values very close to zero. The number of neurons in the input layer is equal to the sum of the number of bits in the tag field and the set field of the memory address. The neurons in the output layer have a one-to-one correspondence with the sets of the cache memory. Initialize the rest of the network parameters as specified above.

Step 2: Feed the system the address of the memory location being accessed, in binary form and with the offset bits removed.

Step 3: Determine whether the cache access is a hit or a miss. If it is a miss, go to step 5.

Step 4: If it is a cache hit, place the hit line in the last-used position of the set and push the other lines to the previous positions in the set.
The special bit is set to 1 only if the shad bit is set; otherwise the special bit is set to 0. Increment the hit count. Continue with the next memory access.

Step 5: Train the neural network on the corresponding memory address. If the value of the output neuron corresponding to the desired set is 1, the line is considered a shadow line, i.e. the shad bit is set to 1; otherwise it is set to 0. Update the weights of the neurons by the error back propagation method.

Step 6: If the shad bit is set to 1, set the special bit; otherwise reset the special bit.

Step 7: To replace a line, check the special bit of all the members of the set. If the special bit of every member is 1, go to step 8; if the special bit of one of the members of the desired set is 0, go to step 9.

Step 8: If all the members have the special bit set to 1, store the new line in the last member position of the set and shift all the other members one position down in the set. Increment the miss count. Continue with the next memory access.

Step 9: If one of the members of the set has the special bit set to 0, replace that member with the new incoming line. Increment the miss count. Continue with the next memory access.

1.4 Simulation and Observation:

Various data sets were prepared and the output for each was determined using the various algorithms. A comparison of the outputs of the algorithms is provided below. A cache controller was designed using fuzzy logic as part of the comparison.

1.4.1 Cache controller using fuzzy logic:

A cache controller was built based on fuzzy logic. The following input linguistic variables are used in the design of the cache controller.

1) Access frequency: This variable indicates the number of times a block has been accessed during the execution of the program. Each time the block is accessed, the variable is incremented. This variable is denoted as AF.
The access frequency variable is divided by the total number of memory accesses before determining the degree of membership of the value with respect to each fuzzy set. Three fuzzy sets are associated with this linguistic variable:
a) High access frequency: denoted by the green line in the figure. This set indicates that the address is accessed regularly. It is denoted as HAF.
b) Medium access frequency: denoted by the blue line in the figure. It is denoted as MAF.
c) Low access frequency: denoted by the red line in the figure. It is denoted as LAF.

2) Recently accessed: This variable indicates how recently the cache line has been accessed. It is reset every time the block is accessed, while the same variable for the other cache lines is incremented. It is denoted as AR. This variable is divided by the total number of memory accesses before determining the degree of membership of the value with respect to each fuzzy set. Three fuzzy sets are associated with this linguistic variable:
a) Least recently accessed: denoted by the red line in the figure. It is denoted as LRA.
b) Moderately recently accessed: denoted by the blue line in the figure. It is denoted as MRA.
c) Very recently accessed: denoted by the green line in the figure. It is denoted as VRA.

3) Distance between consecutive accesses: This variable represents the distance between two consecutive accesses to a block. It is calculated as (1 - AR/100). It is denoted as DCA. Three fuzzy sets are associated with this linguistic variable:
a) Short gap: depicted by the red line in the figure. This set indicates that the distance between two accesses is small. It is denoted as SG.
b) Medium gap: depicted by the blue line in the figure. It is denoted as MG.
c) Long gap: depicted by the green line in the figure.
This set indicates that the distance between two accesses is large. It is denoted as LG.

The output of the fuzzy inference system is a variable called block replaceable, denoted as BR. Its fuzzy sets are denoted HBR, MBR and LBR (high, medium and low block replaceability). If a cache line has to be replaced in a set, the cache line with the highest value of block replaceable is replaced with the new incoming cache line.

The following rules are used in the fuzzy inference system:
If [(AR is LRA) AND (AF is LAF) AND (DCA is SG)] THEN BR is HBR
If [(AR is LRA) AND (AF is LAF) AND (DCA is MG)] THEN BR is HBR
If [(AR is LRA) AND (AF is LAF) AND (DCA is LG)] THEN BR is MBR
If [(AR is LRA) AND (AF is MAF) AND (DCA is SG)] THEN BR is HBR
If [(AR is LRA) AND (AF is MAF) AND (DCA is MG)] THEN BR is MBR
If [(AR is LRA) AND (AF is MAF) AND (DCA is LG)] THEN BR is MBR
If [(AR is LRA) AND (AF is HAF) AND (DCA is SG)] THEN BR is HBR
If [(AR is LRA) AND (AF is HAF) AND (DCA is MG)] THEN BR is MBR
If [(AR is LRA) AND (AF is HAF) AND (DCA is LG)] THEN BR is LBR
If [(AR is MRA) AND (AF is LAF) AND (DCA is SG)] THEN BR is HBR
If [(AR is MRA) AND (AF is LAF) AND (DCA is MG)] THEN BR is MBR
If [(AR is MRA) AND (AF is LAF) AND (DCA is LG)] THEN BR is MBR
If [(AR is MRA) AND (AF is MAF) AND (DCA is SG)] THEN BR is HBR
If [(AR is MRA) AND (AF is MAF) AND (DCA is MG)] THEN BR is MBR
If [(AR is MRA) AND (AF is MAF) AND (DCA is LG)] THEN BR is MBR
If [(AR is MRA) AND (AF is HAF) AND (DCA is SG)] THEN BR is HBR
If [(AR is MRA) AND (AF is HAF) AND (DCA is MG)] THEN BR is LBR
If [(AR is MRA) AND (AF is HAF) AND (DCA is LG)] THEN BR is LBR
If [(AR is VRA) AND (AF is LAF) AND (DCA is SG)] THEN BR is MBR
If [(AR is VRA) AND (AF is LAF) AND (DCA is MG)] THEN BR is MBR
If [(AR is VRA) AND (AF is LAF) AND (DCA is LG)] THEN BR is MBR
If [(AR is VRA) AND (AF is MAF) AND (DCA is SG)] THEN BR is MBR
If [(AR is VRA) AND (AF is MAF) AND (DCA is MG)]
THEN BR is LBR
If [(AR is VRA) AND (AF is MAF) AND (DCA is LG)] THEN BR is LBR
If [(AR is VRA) AND (AF is HAF) AND (DCA is SG)] THEN BR is LBR
If [(AR is VRA) AND (AF is HAF) AND (DCA is MG)] THEN BR is LBR
If [(AR is VRA) AND (AF is HAF) AND (DCA is LG)] THEN BR is LBR

The symbols used have been described in the section above.

1.4.2 Observations:

Trial 1:
Input file: memtracetest.txt
Size of address bus = 8 bits
Line size = 1 byte
Associativity = 4
Sets = 8
Total number of memory accesses = 55

[Figure: Trial 1. Number of hits for each replacement algorithm: LRU, MRU, FIFO, cache using fuzzy logic, cache using BPNN]

Trial 2:
Input file: memtracetest3.txt
Size of address bus = 16 bits
Line size = 2 bytes
Associativity = 4
Sets = 8
Number of memory accesses = 103

[Figure: Trial 2. Number of hits for each replacement algorithm: LRU, MRU, FIFO, cache using fuzzy logic, cache using BPNN]

Trial 3:
Input file: memtracetest4.txt
Size of address bus = 16 bits
Line size = 2 bytes
Associativity = 8
Sets = 4
Number of memory accesses = 103

[Figure: Trial 3. Number of hits for each replacement algorithm: LRU, MRU, FIFO, cache using fuzzy logic, cache using BPNN]

1.5 Conclusion:

The cache replacement algorithm implemented above performs similarly to the LRU, MRU and FIFO algorithms. Because it needs only two additional bits per cache line to assist in replacement, it has the advantage of consuming less memory for overhead information and leaving the maximum amount of memory for actual data.

** The MATLAB code given with this report depends on the architecture of the computer. The current code is configured for a 16-bit address bus with a cache offset of 2 bytes.
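To make the dependence on the address layout concrete, the Python sketch below splits an address into tag, set and offset fields and builds the network input from the tag and set bits, as described in Step 1 and Step 2 of Section 1.3.1. It is a sketch under assumptions rather than a transcription of the MATLAB code: the function names are hypothetical, the line size and number of sets are assumed to be powers of two, and "a cache offset of 2 bytes" is read here as a line size of 2 bytes (one offset bit).

```python
def field_widths(line_size_bytes, num_sets, address_bits):
    """Return (tag_bits, set_bits, offset_bits); sizes assumed powers of two."""
    offset_bits = (line_size_bytes - 1).bit_length()
    set_bits = (num_sets - 1).bit_length()
    tag_bits = address_bits - set_bits - offset_bits
    return tag_bits, set_bits, offset_bits

def split_address(addr, line_size_bytes, num_sets, address_bits):
    """Split an address into its (tag, set index, offset) fields."""
    _, set_bits, offset_bits = field_widths(line_size_bytes, num_sets, address_bits)
    offset = addr & ((1 << offset_bits) - 1)
    set_index = (addr >> offset_bits) & ((1 << set_bits) - 1)
    tag = addr >> (offset_bits + set_bits)
    return tag, set_index, offset

def network_input(addr, line_size_bytes, num_sets, address_bits):
    """Tag and set bits of the address, most significant bit first.
    The offset bits are dropped, so there is one input neuron per remaining bit."""
    _, _, offset_bits = field_widths(line_size_bytes, num_sets, address_bits)
    n_inputs = address_bits - offset_bits
    value = addr >> offset_bits
    return [(value >> i) & 1 for i in reversed(range(n_inputs))]
```

Under these assumptions, the Trial 2 configuration (16-bit address, 2-byte lines, 8 sets) gives 12 tag bits, 3 set bits and 1 offset bit, so the network would have 15 input neurons and 8 output neurons.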
1.6 Reference:

1) Humayun Khalid and M.S. Obaidat, "Performance Evaluation of a New Cache Replacement Scheme Using SPEC".
2) Mojtaba Sabeghi and Mohammad Hossein Yaghmaee, "Using Fuzzy Logic to Improve Cache Replacement Decisions".
3) Prof. Yu Hen Hu, lecture presentations of the course UWM ECE 539.