Pattern association

Associative memories provide an approach to the computer-engineering problem of storing and retrieving data based on content rather than on storage address. Because the information stored in a neural net is distributed throughout the weights, a pattern does not have a storage address in the same sense that it would if it were stored in a traditional computer.

Associative memory NNs are single-layer nets in which the weights are determined in such a way that the net can store a set of pattern associations. Each association is an input-output vector pair, s:t. If each vector t is the same as the vector s with which it is associated, the net is called an autoassociative memory; if the t's are different from the s's, the net is called a heteroassociative memory. In either case, the net not only learns the specific pattern pairs that were used for training, but is also able to recall the desired response pattern when given an input stimulus that is similar, but not identical, to a training input.

Before an associative memory NN can be used, the original patterns must be converted to an appropriate representation for computation. However, different representations of the same pattern may not be equally powerful or efficient. In a simple problem a pattern may be a binary vector or a bipolar vector; the input and output values can be real or binary.

The architecture of an associative memory NN may be feedforward or recurrent (iterative). In a feedforward net, information flows from the input units to the output units. In a recurrent net, there are connections among the units that form closed loops.

A key question for any associative net is how many patterns (or pattern pairs) can be stored before the net starts to "forget" patterns it has learned previously. Several factors influence how much the net can learn:
- the complexity of the patterns (the number of components);
- the similarity of input patterns that are associated with significantly different response patterns.

Heteroassociative memory NN

A heteroassociative NN can store a set of pattern associations. Each association is a pair of vectors (s(p), t(p)), p = 1, 2, ..., P. Each vector s(p) is an n-tuple and each t(p) is an m-tuple. The weights may be found using the Hebb rule or the delta rule. In general, a bipolar representation of the patterns is computationally preferable to a binary representation.
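For concreteness, the binary-to-bipolar conversion used in the formulas that follow maps 0/1 values to -1/+1 via x -> 2x - 1. A minimal sketch, assuming NumPy (the function name is illustrative):

```python
import numpy as np

def to_bipolar(binary_pattern):
    """Map a binary {0, 1} pattern to its bipolar {-1, +1} form via x -> 2x - 1."""
    return 2 * np.asarray(binary_pattern) - 1

print(to_bipolar([1, 0, 0, 1]))   # -> [ 1 -1 -1  1]
```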
One advantage of a bipolar representation is that it gives a simple way of expressing two different kinds of noise that may be present in an input pattern:
1- missing data, or an "unsure" response, denoted by 0;
2- mistakes, where the pattern contains a response of "yes" when the correct response is "no" and vice versa; a response of "yes" is represented by +1, "no" by -1.

The weight matrix may be computed using the Hebb rule (outer products). To store a set of binary vector pairs s(p):t(p), p = 1, 2, ..., P, a weight matrix W = {w_ij} can be formed from the corresponding bipolar vectors:

    w_ij = Σ_p (2 s_i(p) - 1)(2 t_j(p) - 1)

However, if the vector pairs are bipolar, the matrix W = {w_ij} is given by

    w_ij = Σ_p s_i(p) t_j(p),   or in matrix form   W = Σ_p s(p)^T t(p),

i.e. the weight matrix for the set of patterns is the sum of the weight matrices that would store each pattern pair separately.

N.B. The outer product of a vector pair is simply the matrix product of the training vector (a column vector, written as an n x 1 matrix) and the target vector (a row vector, a 1 x m matrix).

Algorithm:
Step 0: Initialize the weights using either the Hebb rule or the delta rule.
Step 1: For each input vector, do Steps 2-4.
Step 2: Set the activations of the input units equal to the current input vector: x_i = s_i.
Step 3: Compute the net input to the output units: y_in_j = Σ_i x_i w_ij.
Step 4: Determine the activation of the output units; for bipolar targets:
            y_j =  1 if y_in_j > 0
            y_j =  0 if y_in_j = 0
            y_j = -1 if y_in_j < 0

The output vector y gives the pattern associated with the input vector x. The heteroassociative memory is not iterative. If the responses of the net are binary, a suitable activation function is

    f(x) = 1 if x > 0,   0 if x <= 0.

Example (heteroassociative net using the Hebb rule):
Suppose the net is to be trained to store the following mapping from bipolar input vectors s = (s_1, s_2, s_3, s_4) to output vectors with two components, t = (t_1, t_2):

    s(1) = ( 1, -1, -1, -1)    t(1) = ( 1, -1)
    s(2) = ( 1,  1, -1, -1)    t(2) = ( 1, -1)
    s(3) = (-1, -1, -1,  1)    t(3) = (-1,  1)
    s(4) = (-1, -1,  1,  1)    t(4) = (-1,  1)

The weight matrix is w_ij = Σ_p s_i(p) t_j(p); hence

    W = [  4  -4
           2  -2
          -2   2
          -4   4 ]

The weight matrix that stores the first pattern pair alone is the outer product of the vectors s = (1, -1, -1, -1) and t = (1, -1):

    [  1  -1
      -1   1
      -1   1
      -1   1 ]

and similarly for the 2nd, 3rd and 4th pairs. The weight matrix that stores all four pattern pairs is the sum of the weight matrices that store each pair separately:

    [  1  -1 ]   [  1  -1 ]   [  1  -1 ]   [  1  -1 ]   [  4  -4 ]
    [ -1   1 ] + [  1  -1 ] + [  1  -1 ] + [  1  -1 ] = [  2  -2 ]
    [ -1   1 ]   [ -1   1 ]   [  1  -1 ]   [ -1   1 ]   [ -2   2 ]
    [ -1   1 ]   [ -1   1 ]   [ -1   1 ]   [ -1   1 ]   [ -4   4 ]

To test the net using the training inputs, note that the net input to any particular output unit is the dot product of the input vector (a row vector) with the column of the weight matrix that holds the weights for that output unit. The row vector of all the net inputs is the product of the input vector and the weight matrix, i.e. x W = (y_in_1, y_in_2). Hence, for the four training inputs:

    1st:  ( 1, -1, -1, -1) . W = (  8,  -8)  ->  ( 1, -1)
    2nd:  ( 1,  1, -1, -1) . W = ( 12, -12)  ->  ( 1, -1)
    3rd:  (-1, -1, -1,  1) . W = ( -8,   8)  ->  (-1,  1)
    4th:  (-1, -1,  1,  1) . W = (-12,  12)  ->  (-1,  1)

using the activation function for bipolar targets,

    f(x) = 1 if x > 0,   0 if x = 0,   -1 if x < 0.

These results show that the correct response is obtained for every training pattern.
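The example above can be reproduced in a few lines. The sketch below (assuming NumPy; names such as `recall` are illustrative) forms W as the sum of outer products and checks that each training input maps to its target:

```python
import numpy as np

# Training pairs from the example above (bipolar inputs s, bipolar targets t).
S = np.array([[ 1, -1, -1, -1],
              [ 1,  1, -1, -1],
              [-1, -1, -1,  1],
              [-1, -1,  1,  1]])
T = np.array([[ 1, -1],
              [ 1, -1],
              [-1,  1],
              [-1,  1]])

# Hebb rule as a sum of outer products: W = sum_p s(p)^T t(p), a 4 x 2 matrix here.
W = S.T @ T
print(W)   # rows: (4, -4), (2, -2), (-2, 2), (-4, 4)

def recall(x, W):
    """Net input y_in = x W, then the bipolar activation (1 if >0, 0 if =0, -1 if <0)."""
    return np.sign(x @ W).astype(int)

for s, t in zip(S, T):
    print(s, "->", recall(s, W), "target:", t)
```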
However, using an input vector x = (-1, 1, -1, -1), which is similar to the training vector s(2) = (1, 1, -1, -1) and differs from it only in the first component,

    (-1, 1, -1, -1) . W = (4, -4)  ->  (1, -1)

i.e. the net associates a known output pattern with this input. Testing the net with the pattern x = (-1, 1, 1, -1), which differs from each of the training patterns in at least two components,

    (-1, 1, 1, -1) . W = (0, 0)  ->  (0, 0)

i.e. the output is not one of the outputs with which the net was trained; the net does not recognize the pattern.

Autoassociative net

In this net the training process is often called storing the vectors, which may be binary or bipolar. A stored vector can be retrieved from a distorted or noisy input if the input is sufficiently similar to it; the performance of the net is judged by its ability to reproduce a stored pattern from noisy input. Often the weights on the diagonal (the weights that connect an input pattern component to the corresponding component in the output pattern) are set to zero. Setting these weights to zero may improve the net's ability to generalize.

(Architecture: input units x_1, ..., x_i, ..., x_n fully connected to output units y_1, ..., y_i, ..., y_n.)

Usually the weights are set using the Hebb rule (outer products):

    W = Σ_{p=1..P} s(p)^T s(p)

The autoassociative net can be used to determine whether an input vector is "known" or "unknown" (i.e. not stored in the net). The procedure is:

Step 0: Set the weights using the Hebb rule.
Step 1: For each test input vector, do Steps 2-4.
Step 2: Set the activations of the input units equal to the input vector.
Step 3: Compute the net input to each output unit: y_in_j = Σ_i x_i w_ij.
Step 4: Apply the activation function:
            y_j = f(y_in_j) = 1 if y_in_j > 0,  -1 if y_in_j <= 0.

Example: an autoassociative net to store the single vector (1, 1, 1, -1).

Step 0: Compute the weights for the vector (1, 1, 1, -1):

    W = [  1   1   1  -1
           1   1   1  -1
           1   1   1  -1
          -1  -1  -1   1 ]

Step 1: Test the input vector.
Step 2: x = (1, 1, 1, -1)
Step 3: y_in = (4, 4, 4, -4)
Step 4: y = f(4, 4, 4, -4) = (1, 1, 1, -1)

i.e. the input vector is recognized. N.B. using the net is simply computing (1, 1, 1, -1) . W.

Similarly, it can be shown that the net recognizes the vectors (-1, 1, 1, -1), (1, -1, 1, -1), (1, 1, -1, -1) and (1, 1, 1, 1), i.e. the net recognizes vectors with one mistake in the input. The net can also be shown to recognize the vectors formed when one component is missing, i.e. (0, 1, 1, -1), (1, 0, 1, -1), (1, 1, 0, -1) and (1, 1, 1, 0). Again, the net can recognize input vectors with two missing entries, such as (0, 0, 1, -1), (0, 1, 0, -1), (0, 1, 1, 0), (1, 0, 0, -1), (1, 0, 1, 0) and (1, 1, 0, 0).

However, it can be shown that the net does not recognize an input vector with two mistakes, such as (-1, -1, 1, -1). In general, a net is more tolerant of missing data than it is of mistakes in the data. It is fairly common for an autoassociative net to have its diagonal terms set to zero, e.g.

    W_0 = [  0   1   1  -1
             1   0   1  -1
             1   1   0  -1
            -1  -1  -1   0 ]

Using this net, it still does not recognize an input vector with two mistakes, such as (-1, -1, 1, -1).

Storage capacity

This is the number of vectors (or pattern pairs) that can be stored in the net before the net begins to forget. More than one vector can be stored in an autoassociative net by adding the weight matrices for the individual vectors together. The capacity of an autoassociative net depends on the number of components the stored vectors have and on the relationships among the stored vectors: more vectors can be stored if they are mutually orthogonal.
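As an illustration of the autoassociative example above, the sketch below (again assuming NumPy; the zero-diagonal choice follows the text) stores (1, 1, 1, -1) and then tests the stored vector, a one-mistake input, a one-missing-entry input, and a two-mistake input:

```python
import numpy as np

s = np.array([1, 1, 1, -1])       # vector to be stored

W = np.outer(s, s)                # Hebb rule: W = s^T s
np.fill_diagonal(W, 0)            # set the diagonal terms to zero, as in the text

def recall(x, W):
    """Autoassociative activation: y_j = 1 if y_in_j > 0, else -1."""
    return np.where(x @ W > 0, 1, -1)

print(recall(np.array([ 1,  1,  1, -1]), W))   # stored vector: recognized
print(recall(np.array([-1,  1,  1, -1]), W))   # one mistake: still recognized
print(recall(np.array([ 0,  1,  1, -1]), W))   # one missing entry: recognized
print(recall(np.array([-1, -1,  1, -1]), W))   # two mistakes: not recovered
```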
An autoassociative net with four nodes can store three mutually orthogonal vectors (each orthogonal to each of the other two). However, with the diagonal terms set to zero, the weight matrix for four mutually orthogonal vectors is singular (in fact, all of its elements are zero).

Iterative autoassociative net

The input and output units can be made the same units, giving a recurrent (iterative) autoassociative net: the output is fed back as the next input, i.e. x_1 . W = x_2, then x_2 . W = x_3, and so on.

(Architecture: a single layer of units y_1, ..., y_i, ..., y_n with recurrent connections among them.)
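A minimal sketch of this iterative recall, under the assumption that iteration stops when the pattern no longer changes (the fixed-point test and the maximum step count are safeguards added here, not part of the text):

```python
import numpy as np

s = np.array([1, 1, 1, -1])       # stored vector
W = np.outer(s, s)
np.fill_diagonal(W, 0)

def iterative_recall(x, W, max_steps=10):
    """Feed the output back as the next input: x1.W = x2, x2.W = x3, ...
    Stops at a fixed point (pattern unchanged) or after max_steps iterations."""
    x = np.asarray(x)
    for _ in range(max_steps):
        y = np.where(x @ W > 0, 1, -1)
        if np.array_equal(y, x):
            return y
        x = y
    return x

print(iterative_recall([0, 0, 1, -1], W))   # converges to the stored vector (1, 1, 1, -1)
```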