
Towards Immunocomputer

Alexander O. Tarakanov
Russian Academy of Sciences, St. Petersburg Institute for Informatics and Automation, 14 line, 39, VO, St. Petersburg, 199178, Russia; tarakanov@togetherlab.nw.ru

Abstract: The paper shows how the principles of information processing by proteins and immune networks could lead to a new kind of computer. We propose to call such computers "immunocomputers", by analogy with the widespread neurocomputers, which are based on models of neurons and neural networks. We consider a rigorous mathematical basis and possible applications of the immunocomputer.

Index Terms: formal protein, formal immune network, immunocomputer

I. INTRODUCTION

From the computational viewpoint, proteins can be considered to realize the main functions of information processing in living nature. Indeed, it is proteins that recognize and execute the programs (instructions) represented in the form of genetic code. As the neuromediators and receptors of neurons, proteins control the electrical activity of the brain. Proteins are also the main components of the immune system, which involves both free proteins (antibodies, messengers, etc.) and proteins acting as receptors of immune cells (B-cells and T-cells).

Apparently, proteins play the key role in both immune and intellectual processes. For example, specialists call the immune system "the second brain of vertebrates" [6]. Indeed, the immune system possesses all the main features of Artificial Intelligence (AI): memory, and the abilities to learn, to recognize, and to decide how to treat any non-self protein (antigen), even one that has never existed before on Earth.

Of special interest for computer science is the widespread theory of immune networks, formed by the interactions between proteins of the immune system. Nowadays there is no doubt that such networks exist: their fragments and interactions have been detected experimentally. It is worth noting that similar networks, under the name of molecular circuits, have been proposed as a possible molecular basis of neuronal memory in the human brain [2].

Based on the biological principles of the immune system, a new and rapidly growing field of Artificial Immune Systems (AIS) has arisen, offering powerful and robust information processing capabilities for solving complex problems [3]. Like Artificial Neural Networks (ANN), AIS can learn new information, recall previously learned information, and perform pattern recognition in a highly decentralized mode. AIS have already been applied to several specific problems, such as information security, fault detection in manufacturing, vaccine design, data mining, and robotics.

However, compared with ANN, the field of AIS has as yet neither a clear and sound mathematical basis nor a hardware implementation analogous to the existing neurocomputers based on ANN. At present most AIS are hybrid, heuristic algorithms that borrow ideas from genetic algorithms, cellular automata, ANN, etc. [3]. On the other hand, the role of proteins as the basic natural elements of information processing has not yet been fully exploited by computer science, including AI, ANN and AIS.

This paper therefore attempts to bridge these gaps. We propose a concept of the immunocomputer (IC) that is based on the principles of information processing executed by proteins and immune networks. We develop a proper mathematical basis of the IC by introducing the notions of formal protein (FP) and formal immune network (FIN) [15-17]. The need for such novel notions is caused by the very specific objects and interactions of immune networks, which differ remarkably from those of genetic algorithms, cellular automata, ANN, or intelligent agents.
We hope that such a mathematical basis could raise AIS to the level of the widespread ANN, and even make it possible to speak about a hardware implementation of FIN in a special electronic circuit (immune chip). Such a chip could be treated as the core of the future IC. We also demonstrate potential applications of the IC in terms of numerical algorithms that are able to solve specific complex problems. Among them we briefly consider complex evaluation of ecological and medical indicators, information security, and some others.

II. BIOPHYSICAL BACKGROUND

According to biological prototypes and their mathematical models [19], the principal difference between the IC and the neurocomputer should be determined by the functions of their basic elements. If an artificial neuron is considered as a summator with a threshold, connected to fixed neurons [23], then a protein, as the basic element of the IC, provides quite different capabilities. Let us consider them in more detail.

From the computational viewpoint, living nature has a uniform information basis. It consists of the universal genetic code and the universal alphabet, where words are molecules of proteins. By analogy with a computer, it can be said that the genetic code is like "software", i.e. the instructions of a program that the cell receives from its parent cell, while proteins are "hardware", i.e. the biophysical mechanisms that execute the program. No wonder that proteins are the most complex of the known molecules and the most universal in their properties and functions.

Although genes and proteins are exceptionally complex, some of their features can be explained by rather simple and general mechanisms. These biophysical mechanisms, though, are not easy to uncover. A striking example is the discovery in 1953 of the double helix structure of the chain molecules that store the genetic code.
This spatial structure is formed by the so-called weak interactions between very strictly determined molecular shapes situated in the same plane. This is one of the most significant examples of geometrical correspondence between biomolecules. For proteins, however, mechanisms of similar simplicity and explanatory power have not yet been found. Nevertheless, the following principles are evident [5]:

1. The spatial conformation of a protein is determined by the linear sequence (word) of its amino acid code;
2. This conformation determines the functions of any protein.

The first correspondence, between the code and the stable conformation of the protein (the so-called native form), is realized by mechanisms of self-assembly (or folding). The second correspondence, between the spatial conformation and the function of the protein, is realized by mechanisms of molecular recognition. Just as for the double helix, these mechanisms are based essentially on weak interactions between different parts of the protein molecule and between different protein molecules.

As a result of such interactions, a protein can bind with another protein or molecule. As a result of binding, the protein can change its spatial shape (the so-called allosteric effect [5]). Through this effect the protein can, furthermore, acquire the ability to bind with another protein (antigen, antibody, receptor, etc.) with which it could not bind before. New proteins can then be involved in this process of subsequent binding, forming molecular circuits or immune networks.

Therefore, binding can be seen as the main information processing function of proteins. Like "key and lock", such binding is highly specific, because it depends on the existence of highly adjusted local shapes of the interacting proteins. It could also be said that the protein is able to select or "recognize" the appropriate "pattern" and to reject all others. The main biophysical characteristic of the interaction between proteins is the free energy [5].
The lower the energy, the stronger the binding, and vice versa. Thus a negative energy, lower than the energy of Brownian motion, corresponds to proper binding, while a positive energy corresponds to repulsion between proteins. Following [19], we refer to the free energy of interaction between proteins as the binding energy, to distinguish it from the free energy of protein folding.

III. BASIC COMPONENTS

Besides proteins, cells can be considered the second basic component of information processing by immune networks. Two main sorts of immune cells can be distinguished: B-cells and T-cells. Cells produce and secrete proteins, and also expose proteins as their receptors. Accordingly, let us distinguish two kinds of proteins: "free proteins", independent of cells, and proteins anchored in cell membranes as receptors. Examples of free proteins are peptides (small proteins), antigens, antibodies produced by B-cells, and the numerous peptides (lymphokines) produced by T-cells. Examples of receptors are the so-called MHC I and MHC II proteins (Major Histocompatibility Complex class I and II). These proteins are used by the immune system as universal markers of any own (self) cell of the body, to distinguish it from non-self antigens.

On the other hand, the architecture of any computer includes at least two basic components: memory and processor. They can be gathered in separate modules, like the RAM (Random-Access Memory) and CPU (Central Processing Unit) of the traditional PC, or distributed among other structural elements, like the neurons of a neurocomputer or the cells of a cellular automaton [1]. Nevertheless, memory and processing units are intrinsic components of any computer. Thus, consider the architecture of the memory of the IC shown in Fig. 1.

Fig. 1. Architecture of memory of the IC

Each memory unit is depicted in Fig. 1 as a square. Units are gathered in four main arrays, depicted in Fig. 1 as horizontal layers (top-down):

1. The output array, corresponding to the binding energies between free proteins (array 2) and receptors (array 3);
2. The input array, corresponding to the free proteins;
3. The array of receptors, corresponding to the stored patterns;
4. The array of cells, corresponding to the control of patterns.

Every memory unit has only strictly determined neighboring units. Namely, every memory unit has:

1. Four vertical neighbors, arranged strictly above and/or below the unit in any other array;
2. Four horizontal neighbors, arranged in a cross manner in the same array, as shown in Fig. 2.

Fig. 2. Four horizontal neighbors of a memory unit

Consider the content of any memory unit as its state. Then the general function of the IC is to determine the states of the output array from the states of the input array, in accordance with stored patterns that can change dynamically. For this purpose the processing units of the IC determine interactions only between the states of neighboring units.

It is worth noting that vertical interactions in such an IC could be realized by already existing biochips, also called microarrays [9, 13]. A microarray of a biochip is an orderly arrangement of probes, such as short strands of DNA (deoxyribonucleic acid) or antibodies (corresponding to array 3 of the IC). These probes are immobilized on a solid surface, such as a nylon, glass, or silicon substrate (array 4), and exposed to a set of testing samples (array 2). The results of binding between samples and probes are determined by fluorescence or electric signals (array 1).

On the other hand, if any memory array of the IC is able to store only a few discrete states, and all units of the array change states simultaneously in discrete time, then the well-known cellular automata machines [22] or excitable media [1] could realize horizontal interactions within the array.

However, such special kinds of the IC are obviously insufficient to simulate the features of immune networks. Hence, consider that any memory unit of the IC is able to store a set of real numbers, and that the processing units of the IC are able to compute this set using the set of any horizontal or vertical neighbor.

IV. MATHEMATICAL BASIS

Designate the states of the memory units as $w_{ij}$, $P_{ij}$, $R_{ij}$, $C_{ij}$, where $i$ and $j$ are the row and column numbers (address) of the unit within the array, $w_{ij}$ is the real value of the binding energy, and $P_{ij}$, $R_{ij}$, $C_{ij}$ are vectors with real components of dimensions $n_P$, $n_R$, and $n_C$, respectively. Consider $P_{ij}$, $R_{ij}$, and $C_{ij}$ as column vectors that code the states of proteins, receptors, and cells, respectively. Let the binding energy be defined by the bilinear form

$$ w_{ij} = -P_{ij}^T M R_{ij}, \tag{1} $$

where $M$ is an $n_P \times n_R$ matrix of real values and the superscript $T$ denotes transposition. Consider several special cases that implement important mathematical constructions.

A. Singular Value Decomposition

Let a matrix $M$ be given. Consider a set of unit vectors $P_{ij}$ and $R_{ij}$ of dimensions $n_P$ and $n_R$, respectively:

$$ P_{ij}^T P_{ij} = R_{ij}^T R_{ij} = 1, \quad \forall i, j. $$

Compute $w_{ij}$ for all pairs of such vectors by Eq. (1). Select the minimal $w_{ij} = w^*$ and the corresponding pair of vectors $P^*$, $R^*$:

$$ w^* = \min_{i,j} \{ w_{ij} \}, \quad w^* = -[P^*]^T M R^*. \tag{2} $$

Let the value $w^* = w(P^*, R^*)$ satisfy the following condition:

$$ w^* \le w(P, R), \quad \forall P, R: \; P^T P = R^T R = 1. $$

Then, according to [16], $s_1 = -w^*$ is the maximal singular value of the matrix $M$, while $X_1 = P^*$ and $Y_1 = R^*$ are the left and right singular vectors of this matrix. Compute the matrices $M_k$ by the following recurrent rule:

$$ M_k = M_{k-1} - s_{k-1} X_{k-1} Y_{k-1}^T, \quad k = 2, \dots, r, \quad M_1 = M, $$

where $r$ is the rank of the matrix $M$. Analogously to the matrix $M$, determine the maximal value $s_k = -w^*$ and the corresponding vectors $X_k = P^*$ and $Y_k = R^*$ for the matrix $M_k$. Finally, we obtain the so-called Singular Value Decomposition (SVD) of the initial matrix $M$:

$$ M = s_1 X_1 Y_1^T + \dots + s_r X_r Y_r^T. $$

Note that the IC allows the value $w_{ij}$ to be minimized in at least two different ways. Firstly, the IC can use a process of random "mutations" of the vectors' coordinates such that $P_{ij}$ and $R_{ij}$ still remain unit vectors. For example, if the IC has obtained a value of $w_{ij}$ that satisfies Eq. (2), and this value has not been reduced after a large number of mutations, then this value can be considered a minimum.

Secondly, the IC is able to use a more strictly determined procedure. For example, let $w$, $P$, $R$, and $C$ be computed by the following recurrent scheme (we omit the lower indexes for convenience):

$$ [R_{(k)}]^T = [P_{(k-1)}]^T M, \quad C_{(k)} = M R_{(k)}, \quad P_{(k)} = C_{(k)}, \quad w_{(k)} = -[P_{(k)}]^T M R_{(k)}, \quad k = 2, \dots, $$

while $|w_{(k)} - w_{(k-1)}| > \varepsilon$. Here $P_{(k)}$ and $R_{(k)}$ are presumably renormalized to unit length at every step, as the unit-vector condition above requires. According to [16], such a scheme converges to the maximal singular value and singular vectors in the general case of the matrix $M$.

B. Formal Immune Networks

Consider only unit vectors of dimension 2. Then any vector can be represented as depending on one angle, for example:

$$ S(\alpha) = [\cos\alpha, \; \sin\alpha]^T. $$

Let this angle accept only one of $n$ discrete values:

$$ \alpha_1 = \frac{2\pi}{n}, \quad \alpha_k = k\,\alpha_1, \quad k = 0, 1, \dots, n-1. $$

Then we have exactly $n$ types of vectors. Designate them as $S(k)$, $k = 0, \dots, n-1$. Define the matrix $M$ as a unit matrix:

$$ M = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. $$

Then the binding energy between vectors $S(k_1)$ and $S(k_2)$ is

$$ w = -\cos(\alpha_{k_1} - \alpha_{k_2}). \tag{3} $$

Define binding as the event $w \le w_h$, where $w_h$ is a given threshold of binding. Let an integer $n_h$ define the threshold as follows:

$$ w_h = -\cos(n_h \alpha_1). $$

Hence, the binding condition can be reduced to the following inequality:

$$ \min\{ |k_1 - k_2|, \; n - |k_1 - k_2| \} \le n_h. $$

Let the memory of the IC be one-dimensional. Then the states can be marked by an index $j$ and represented in the form of a matrix:

$$ \begin{bmatrix} w_1 & \dots & w_j & \dots \\ P_1 & \dots & P_j & \dots \\ R_1 & \dots & R_j & \dots \\ C_1 & \dots & C_j & \dots \end{bmatrix} $$

Designate an empty memory unit (gap) by the symbol $\varnothing$. Let the initial sequence (population) $\{R\}$ of length $m$ without gaps be given:

$$ \{R\}: \quad R_j \ne \varnothing \;\; \forall j \le m, \quad R_j = \varnothing \;\; \forall j > m. \tag{4} $$

Let the population $\{P\}$ be arbitrary, and the initial population $\{C\}$ be empty. Consider processing of the population $\{R\}$ by the following algorithm.

Algorithm 1.

1. Compute $w_j$ between $P_j$ and $R_j$ by Eq. (3);
2. Change $R_j$ and $C_j$ according to $w_j$ and $w_h$;
3. Merge the sequences $\{R\}$ and $\{C\}$;
4. Repeat steps 1-3 until $\{R\}$ becomes empty or overflows a memory limit.

Step 2 is performed simultaneously for all $j$ by the following rules:

If $w_j > w_h$ or $P_j = \varnothing$ then $R_j = \varnothing$.
If $w_j = -1$ then $C_j = R_j$.
If $-1 < w_j \le w_h$ then $C_j = [\mathrm{Rot}(\alpha_1)] R_j$, $R_j = [\mathrm{Rot}(-\alpha_1)] R_j$, where

$$ \mathrm{Rot}(\alpha_1) = \begin{bmatrix} \cos\alpha_1 & -\sin\alpha_1 \\ \sin\alpha_1 & \cos\alpha_1 \end{bmatrix}. $$

Simply put, if $P_j$ does not bind with $R_j$, then $R_j$ dies; otherwise $R_j$ reproduces. If the strength of the binding is the highest possible, then $R_j$ creates a copy; otherwise $R_j$ reproduces in the two nearest types (mutates).

Step 3 includes the following sub-steps:

3.1. Attach the sequence $\{C\}$ to the end of the sequence $\{R\}$;
3.2. If $R_k = \varnothing$ for any $k$ then shift $R_{j-1} = R_j$ for any $j > k$;
3.3. Perform step 3.2 until Eq. (4) is satisfied;
3.4. Compute the length $m$ of the resulting sequence $\{R\}$;
3.5. Make the sequence $\{C\}$ empty.

According to [19], Algorithm 1 implements the simplest variant of FIN. Namely, it is the so-called AB($n$, $n_h$) network, where the sequence $\{P\}$ corresponds to antigens and the sequence $\{C\}$ corresponds to B-cells. Several mathematical results can be obtained for such AB networks [16].

Theorem 1. If all antigens in any AB($n$, $n_h$) network are of the same type, and at least one B-cell binds an antigen, then after a finite number of steps, for every antigen there will be a corresponding matching B-cell.

This result affirms that even the simplest variant of FIN models the mechanism by which antigens can control the reproduction and death of B-cells. Besides, we have determined the conditions under which the (formal) immune response arises and is sustained, which implies the tendency of B-cells to adopt the antigen's type.

Consider now processing of the initial population $\{R\}$, with an empty initial population $\{C\}$, by another algorithm.

Algorithm 2.

1. Form the sequence $\{P: P_{j-1} = R_j, \; j = 2, \dots, m\}$;
2. Compute $w_j$ between $P_j$ and $R_j$ by Eq. (3);
3. Change the values of $R_j$ and $C_j$ according to $w_j$ and $w_h$;
4. Merge $\{R\}$ and $\{C\}$ (see step 3 of Algorithm 1);
5. Repeat steps 1-4 until $\{R\}$ becomes empty or overflows the memory limit.

Step 3 is performed simultaneously for all $j$ by the following rules:

If $w_j > w_h$ or $P_j = \varnothing$ then $R_j = \varnothing$.
If $w_j \le w_h$ then $C_j = [\mathrm{Rot}(\alpha_1)] R_j$, $R_j = [\mathrm{Rot}(-\alpha_1)] R_j$.

Simply put, if $P_j$ does not bind with $R_j$, then $R_j$ dies; otherwise $R_j$ reproduces with mutations.

According to [19], Algorithm 2 implements another variant of FIN. Namely, it is the so-called BB($n$, $n_h$) network, where several types of B-cells are generated and stored through interactions among themselves, in the absence of any antigen.

Theorem 2. For any initial population of any BB($n$, $n_h$) network only one of three regimes is possible:

1. Death of all B-cells;
2. Unlimited reproduction of B-cells;
3. Cyclic reproduction of the initial population (formal immune memory).

Theorem 3. For any $n$ there exists a threshold $n_h$ such that at least one cyclic regime is possible in the BB($n$, $n_h$) network.

Studies [16] show that there exist variants of cyclic regimes with several periods and lengths of populations, including those where the number of B-cells changes from population to population.

C. Formal Grammars

Consider any vector as a coded FP [16]. Let $w_h = -1$. Then, according to Eq. (3), any FP $S(k)$, $k = 0, \dots, n-1$, can bind only with an FP of the same type. Consider the following initial sequences $\{P\}$, $\{R\}$, and $\{C\}$:

$$ \{P\}: \; P_j \ne \varnothing, \; j \le n, \quad P_j = \varnothing, \; j > n, $$
$$ \{R\}: \; R_0 \ne \varnothing, \quad R_j = \varnothing, \; j > 0, $$
$$ \{C\}: \; C_j \ne \varnothing \;\; \forall j \le m, \quad C_j = \varnothing, \; j > m. $$

Consider processing of the sequence $\{P\}$ by the following algorithm.

Algorithm 3.
1. Assign $k = 0$;
2. Compute $w_k$ between $P_k$ and $R_k$;
3. If $w_k = -1$ then change the sequence $P_k, P_{k+1}, \dots, P_n$ to the sequence $C_{k+1}, \dots, C_{k+m}, P_{k+1}, \dots, P_n$, and assign $n = n + m$;
4. Shift the sequence $C_k, C_{k+1}, \dots, C_{k+m}$ to $C_{k+1}, C_{k+2}, \dots, C_{k+m+1}$, and the sequence $R_k, R_{k+1}, \dots, R_{k+m}$ to $R_{k+1}, R_{k+2}, \dots, R_{k+m+1}$;
5. While $k < n$ assign $k = k + 1$ and repeat steps 2-5.

According to [19], the algorithm implements a variety of the so-called formal T-cell. Such a T-cell has a receptor, the type of which is stored in $R_0$. When the receptor is matched, the T-cell becomes activated and synthesizes a sequence of FPs $P_1, \dots, P_m$, the types of which are stored in $C_1, \dots, C_m$, respectively. Then the IC inserts this sequence instead of $P_0$. Thus, moving along the sequence $\{P\}$, the T-cell replaces every $P_j$ whose type matches the type stored in $R_0$.

Designate by $S(k_j)$ the type of the vector stored in $C_j$. Then the function of any T-cell can be described formally by the following rule:

$$ S(k_0) \rightarrow S(k_1) \dots S(k_m). \tag{5} $$

Consider a correspondence between types of vectors and symbols, for example, S(0)='A', S(1)='B', S(2)='a', S(3)='b', etc. Let us take a set of $n+1$ symbols: $S(0), \dots, S(n)$. Let the set consist of two disjoint subsets: non-terminals, say $S(0), \dots, S(k)$, and terminals $S(k+1), \dots, S(n)$. Single out exactly one particular non-terminal (the so-called axiom), say $S(0)$. Consider now a set of rules (5) that satisfy the following conditions:

1. Any symbol of the left side is a non-terminal;
2. There exists only one rule that contains the axiom.

According to [8, 16], such a set of rules (5) is equivalent to a context-free (CF) grammar. Hence, the behavior of the set of corresponding T-cells can also be described by a CF grammar. It is worth noting, according to [8], that the class of CF grammars is the most interesting class of formal grammars for both theory and applications.

V. POTENTIAL APPLICATIONS

Consider examples of potential applications of the IC. They include, but are not limited to, pattern recognition, information security, problem solving, and modeling of natural systems.

A. Pattern Recognition

Pattern recognition can be defined as follows. Let us treat real values $x_1, \dots, x_n$ as a set of characteristics. Consider an arbitrary vector $X = [x_1, \dots, x_n]^T$ as a pattern that belongs to a characteristic space $\{X\}$. Suppose that the space can be partitioned into subsets (classes) $\{X\}_k$, $k = 1, 2, \dots, k_n$. Then recognition of $X$ consists in determining the class $k$ such that $X \in \{X\}_k$, while learning consists in the partition (classification) of the characteristic space. If the space is partitioned into known classes (e.g. by experts), this is called supervised learning. If the number of classes $k_n$ and the classes themselves are unknown a priori, this is called unsupervised learning.

The main feature of the IC approach to pattern recognition consists in treating an arbitrary pattern as a way of setting a binding energy by the bilinear form in Eq. (1). The mathematical basis of the approach is considered in detail in our previous works [12, 16]. It is based essentially on the properties of the SVD of an arbitrary matrix over the field of real numbers. According to the approach, the task of pattern recognition can be solved as follows.

1) Supervised Learning

a) Folding vectors to matrices. Fold the vector $X$ of dimension $n \times 1$ into a matrix $M$ of dimension $n_P \times n_R$, where $n_P n_R = n$. It has been shown rigorously in [16] that such folding increases the specificity of recognition.

b) Learning. Form matrices $M_1, \dots, M_k$ for all classes $c = 1, \dots, k$, and compute their singular vectors: $\{P_1, R_1\}$ for $M_1$, ..., $\{P_k, R_k\}$ for $M_k$.

c) Recognition. Compute $k$ values of binding energy for every input pattern $M$:

$$ w_1 = -P_1^T M R_1, \quad \dots, \quad w_k = -P_k^T M R_k. $$

Determine the class to be found by the minimal value of binding energy, according to Eq. (2).

2) Unsupervised Learning

Consider the matrix $M = [X_1 \dots X_m]^T$ of dimension $m \times n$ formed by the $m$ vectors (patterns) $X_1, \dots, X_m$.
Consider the SVD of this matrix:

$$ M = s_1 \begin{bmatrix} p_{11} \\ \vdots \\ p_{1m} \end{bmatrix} R_1^T + s_2 \begin{bmatrix} p_{21} \\ \vdots \\ p_{2m} \end{bmatrix} R_2^T + \dots, \tag{6} $$

where $s_1$, $s_2$ are the first two singular values, and $R_1$, $R_2$ are the right singular vectors. Note that every row $i$ of the matrix $M$ represents the values $x_{ij}$ of the $n$ characteristics of the pattern $X_i$, where $i = 1, \dots, m$ and $j = 1, \dots, n$. Hence, according to [16], the components $p_{1i}$, $p_{2i}$ of the left singular vectors $P_1$, $P_2$ satisfy the following equations:

$$ p_{1i} = \frac{1}{s_1} X_i^T R_1, \quad p_{2i} = \frac{1}{s_2} X_i^T R_2. \tag{7} $$

Comparison of Eq. (1) and Eq. (7) makes it obvious that the IC is able to compute the components $p_{1i}$, $p_{2i}$ as binding energies $w_{1i}$, $w_{2i}$ between $X_i$ and $R_1$, $R_2$, respectively. Thus, every vector $X_i$ with $n$ characteristics is mapped to only two values of binding energy. Such a mapping gives a mathematically rigorous way to represent and view all patterns, no matter how many characteristics they have, as points in the two-dimensional space of binding energies $\{w_1, w_2\}$. This plane could also be treated as a shape space of the IC, according to [7].

Such a representation of patterns in the shape space of the IC makes it possible to classify them in a very natural way, by groups (clusters) of neighboring points. Such classification can be performed by experts using supervised learning, as well as by the IC using unsupervised learning.

The approach has already proved useful in solving a number of important practical tasks, including detection of dangerous ballistic situations in near-Earth space [17], complex evaluation of ecological and medical indicators in Russia [11, 12], and prediction of danger from the space-time dynamics of plague infection in Central Asia [20].

For example, consider the task of complex ecological evaluation [12]. The task is the following. Let a set of special ecological characteristics, also called indicators (SEI), be given. It is required to find its complex ecological characteristic, also called index (CEI).
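The two recognition schemes of this subsection can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation of [16]: the helper names (`leading_triple`, `deflate`, `fold`, `learn`, `recognize`, `shape_space`) are hypothetical, the singular vectors are found by a normalized power iteration in the spirit of the recurrent scheme of Section IV.A, the starting vector and the iteration count are arbitrary choices, and each class is assumed to be represented by a single training pattern.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def energy(P, M, R):
    """Binding energy of Eq. (1): w = -P^T M R."""
    return -dot(P, matvec(M, R))

def leading_triple(M, iters=200):
    """Normalized power iteration; returns (s, P, R) with s = -w = P^T M R."""
    Mt = [list(col) for col in zip(*M)]
    R = unit([1.0 + 0.1 * j for j in range(len(M[0]))])  # arbitrary start (assumption)
    for _ in range(iters):
        P = unit(matvec(M, R))
        R = unit(matvec(Mt, P))
    return -energy(P, M, R), P, R

def deflate(M, s, P, R):
    """Recurrent rule of Section IV.A: M_k = M_{k-1} - s X Y^T."""
    return [[M[i][j] - s * P[i] * R[j] for j in range(len(R))]
            for i in range(len(P))]

def fold(X, n_rows):
    """Step a): fold a vector of length n into an n_rows x (n / n_rows) matrix."""
    n_cols = len(X) // n_rows
    return [X[i * n_cols:(i + 1) * n_cols] for i in range(n_rows)]

def learn(class_patterns, n_rows):
    """Step b): store one pair (P_c, R_c) of singular vectors per class."""
    return {c: leading_triple(fold(X, n_rows))[1:]
            for c, X in class_patterns.items()}

def recognize(X, probes, n_rows):
    """Step c): choose the class with minimal binding energy."""
    M = fold(X, n_rows)
    return min(probes, key=lambda c: energy(probes[c][0], M, probes[c][1]))

def shape_space(patterns):
    """Unsupervised case: map every pattern X_i to (p_1i, p_2i) by Eq. (7)."""
    s1, P1, R1 = leading_triple(patterns)
    s2, _, R2 = leading_triple(deflate(patterns, s1, P1, R1))
    return [(dot(X, R1) / s1, dot(X, R2) / s2) for X in patterns]

# Supervised use: two hypothetical classes, one training pattern each.
probes = learn({"A": [2.0, 0.0, 0.0, 0.5], "B": [0.5, 0.0, 0.0, 2.0]}, n_rows=2)
label = recognize([1.8, 0.1, 0.1, 0.4], probes, n_rows=2)  # -> "A"

# Unsupervised use: four hypothetical patterns forming two visible groups.
points = shape_space([[1.0, 0.9, 0.1, 0.0], [0.9, 1.0, 0.0, 0.1],
                      [0.1, 0.0, 1.0, 0.9], [0.0, 0.1, 0.9, 1.0]])
```

In the unsupervised example the first two patterns and the last two patterns receive second coordinates of opposite sign, so the two groups separate along the second axis of the shape space. The sketch assumes a matrix of rank at least two; for a rank-one matrix the second call to `leading_triple` would divide by zero.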
In other words, it is required to classify this set by assigning an index (class), usually denoted by a number 1, 2, 3, and so on. The general solution of the task comprises two stages: learning and recognition. The stage of learning comprises choosing a set of typical patterns of SEI. These patterns may be the results of monitoring areas with known CEI, or data determined by experts. Using such data, several samples of SEI are then formed for every class of CEI. At the stage of recognition, a testing set of SEI (probes) is compared with the samples. The class of the sample pattern that most closely resembles the testing pattern is then taken as the CEI.

The described approach has been used to solve important practical tasks. Among them are the detection of detailed correlations and causal relationships between the quality of the environment and children's morbidity in Tula city [12], and the computation of the CEI map of Kaliningrad city [11].

The results obtained so far [12, 17, 20] show that this approach to pattern recognition is rather powerful, robust and flexible. It is able to give a rather fine classification and to focus attention sharply on the most dangerous situations, which is beyond the possibilities of traditional statistics.

B. Information Security

As in the natural immune system, the problem of protecting computer systems from malicious intrusions can be viewed as the problem of distinguishing "self" from dangerous "other" (or "non-self") and eliminating this "other". In this case the "non-self" may be an unauthorized user, foreign code in the form of a computer virus or worm, unanticipated code in the form of a Trojan horse, corrupted data, etc.

According to [10], information security could be completely specified on the basis of an abstract representation of "self" and "non-self" as sets of bit strings, designated even as "proteins" and "peptides".
For example, a "protein" could be a sequence of viral bytes in a legitimate program, or a "signature" of a computer virus. To preserve generality, it has been proposed in [10] to represent both the protected system (self) and infectious agents (non-self) as dynamically changing sets of bit strings, because in the cells of the body the profile of expressed proteins (self) changes over time. Besides, a "peptide" for a computer system is defined in terms of short sequences of system calls executed by privileged processes in a networked operating system. Preliminary experiments on a limited testbed of intrusions and other anomalous behavior [10] show that short sequences of system calls (currently sequences of length 6) provide a compact signature for self that distinguishes normal from abnormal behavior. By this analogy, proteins can be thought of as "the running code" of the body, while peptides serve as indicators of behavior.

Consider now that a vector $X$ represents a set of information security indicators. For example, it can be a bit string of a legitimate program, a signature of a computer virus, a coded sequence of system calls, statistics of the current activity of the network, etc. Consider a space $\{X\}$ of such indicators, partitioned into $k$ subspaces (classes) $\{X\}_1, \dots, \{X\}_k$. For example, it can be simply $k = 2$, where $\{X\}_1$ is normal behavior and $\{X\}_2$ is "infection". Then, given a concrete vector $X$, the task consists in determining its class $c$, that is, the subspace $\{X\}_c$, $c = 1, \dots, k$, to which $X$ belongs. Thus the problem is reduced to the pattern recognition described above.

In addition, the IC could also be applied to some other issues of information security. Consider, for example, data hiding and encryption. According to [4], data hiding, a form of steganography, embeds data into digital media for the purpose of identification, annotation and copyright.
It represents a class of processes used to embed data, such as copyright information, into various forms of media such as image, audio, or text with a minimum amount of degradation to the "host" signal; i.e., the embedded data should be invisible and inaudible to a human observer. Note that data hiding, while similar to compression, is distinct from encryption. Its goal is not to restrict or regulate access to the host signal, but to ensure that embedded data remain inviolate and recoverable.

As an example of data hiding by the IC, consider that a matrix $M$ represents an initial data array. It could be an image, a folded audio signal, etc. Consider the SVD of the matrix in the form of Eq. (6). Let us add to this sum an item of the form $s_{r+1} P_{r+1} R_{r+1}^T$, where $r$ is the rank of the matrix, $P_{r+1}$, $R_{r+1}$ are unit vectors, $s_r$ is the minimal singular value of the matrix, and $s_{r+1} < s_r$. According to [16], such an addition only slightly disturbs the matrix. Although the disturbance is invisible or inaudible to a human observer, the presence of the "hidden" addition can be reliably detected in the shape space of the IC. In this case the IC functions like the natural immune system, which verifies identity by the presence of peptides, or protein fragments.

Consider now data encryption. In modern cryptography, the encryption of information is based on a widely known algorithm and a number or string, called a key, which is kept secret. The key is used as a parameter of the algorithm to encrypt and decrypt the data. Decryption with the key is simple, but without the key it is very difficult and in some cases nearly impossible. Therefore the "fundamental rule of cryptography" is that both the sending and receiving sides know the method of encryption [14].

As an example of encryption by the IC, consider BB-networks.
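Returning briefly to the data-hiding construction described above, the idea can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the 3 x 3 "host" array, its singular vectors, the strength 0.01 of the hidden item, the function names `hide` and `detect`, and in particular the detection rule, which simply evaluates the binding energy of Eq. (1) with the hidden pair of vectors used as a key. The source states only that the addition can be detected in the shape space, so this is one possible reading rather than the method of [16].

```python
import math

def outer(s, P, R):
    """Rank-one item s * P * R^T."""
    return [[s * p * r for r in R] for p in P]

def add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def energy(P, M, R):
    """Binding energy of Eq. (1): w = -P^T M R."""
    return -sum(P[i] * M[i][j] * R[j]
                for i in range(len(P)) for j in range(len(R)))

def hide(host, s_hidden, P_key, R_key):
    """Add the item s_{r+1} P_{r+1} R_{r+1}^T to the host array."""
    return add(host, outer(s_hidden, P_key, R_key))

def detect(M, P_key, R_key):
    """Assumed detection rule: binding energy of M with the key vectors."""
    return energy(P_key, M, R_key)

# Three mutually orthogonal unit vectors (hypothetical choice).
X1 = [c / math.sqrt(3.0) for c in (1.0, 1.0, 1.0)]
X2 = [c / math.sqrt(2.0) for c in (1.0, -1.0, 0.0)]
X3 = [c / math.sqrt(6.0) for c in (1.0, 1.0, -2.0)]

# Host array of rank r = 2 with singular values s_1 = 5 and s_2 = 1.
host = add(outer(5.0, X1, X1), outer(1.0, X2, X2))

# Hidden item with s_3 = 0.01 < s_2, carried by the key vectors X3, X3.
marked = hide(host, 0.01, X3, X3)

distortion = max(abs(a - b) for row_a, row_b in zip(marked, host)
                 for a, b in zip(row_a, row_b))
w_host = detect(host, X3, X3)      # close to 0: no hidden item
w_marked = detect(marked, X3, X3)  # close to -0.01: hidden item present
```

Under these assumptions no element of the host array changes by more than the strength of the hidden item, while the binding energy measured with the key vectors shifts from about zero to about -0.01, which separates the marked array from the unmarked one.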
In particular, in the network BB(10,2), for any type $i = 0, \dots, 9$ there exist populations of the type

$$ S(i+2)\, S(i)\, S(i-2)\, S(i) $$

(indices taken modulo 10), which are cyclic with period 4, according to [16]. For example,

1979 → 187800 → 1770991 → 17980 → 1979 → ...,

where the type $S(i)$ is denoted by the single number $i$. Consider now the numbers 10 and 2 as a key that defines the network BB(10,2). Then the string 1979 could encrypt the string 1770991. Knowing the key, the data can be decrypted, say, as the string of maximal length generated by the network BB(10,2) from the given string 1979. Although the example seems rather simple, it shows the principal possibility of using the IC in cryptography.

C. Problem Solving

Consider a modification of the rule (5):

$$ S(k_0) \leftarrow S(k_1)\, S(k_2) \dots S(k_m). \tag{8} $$

Let Algorithm 3 also be modified to implement such a T-cell, with a name corresponding to a type stored in $R_1$, and with receptors whose types are stored in $R_2, \dots, R_m$ (right side of the rule). When all receptors are matched, the T-cell is activated, synthesizes an FP of the type stored in $C_0$, and puts the corresponding vector into $P_0$ (left side of the rule).

Let us also add T-cells of two specific types. Suppose we have an "initial" rule for some type $k_1$:

$$ S(0) \leftarrow S(k_1), \tag{9} $$

and a set of "terminal" rules for some of the types $k_2, \dots, k_m$:

$$ S(k_2) \leftarrow \langle S(k_2) \rangle, \quad \dots, \quad S(k_m) \leftarrow \langle S(k_m) \rangle. \tag{10} $$

According to [19], $S(0)$ can be regarded as corresponding to an antigen, while rules (10) correspond to T-cells that synthesize FPs independently of binding with any FP. According to [16], a set of T-cells described by rules (8)-(10) is equivalent to a special kind of attributive CF grammar, where the antigen corresponds to the axiom of the grammar, the types of $R_2, \dots, R_m$ to non-terminals, the types of $R_1, \dots, R_m$ to terminals, and the names in angle brackets to synthesized attributes. This method can produce grammars that solve tasks in the manner of an inference engine.

For example, consider the triangle ABC in Fig. 3 with angles A, B, C and sides a, b, c.

Fig. 3. Triangle

It is known that the parameters of any triangle satisfy the following equations:

$$ A + B + C = \pi \quad \text{(theorem of angles)}, $$
$$ \frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C} \quad \text{(theorem of sines)}, $$
$$ a^2 = b^2 + c^2 - 2bc\cos A \quad \text{(theorem of cosines)}, $$
$$ b^2 = a^2 + c^2 - 2ac\cos B \quad \text{(theorem of cosines)}, $$
$$ c^2 = a^2 + b^2 - 2ab\cos C \quad \text{(theorem of cosines)}. $$

Hence, a model of the triangle for the IC could be the following:

S(1)=A, S(2)=B, S(3)=C, S(4)=a, S(5)=b, S(6)=c,
T(1)=Tang, T(2)=Tsin, T(3)=Tcos,
S(1) ← <T(1)> S(2) S(3),
S(1) ← <T(2)> S(2) S(4) S(5),
S(2) ← <T(1)> S(1) S(3),
S(2) ← <T(2)> S(3) S(5) S(6),
S(3) ← <T(1)> S(1) S(2),
S(3) ← <T(2)> S(1) S(6) S(4),
S(4) ← <T(3)> S(1) S(5) S(6),
S(5) ← <T(3)> S(2) S(4) S(6),
S(6) ← <T(3)> S(3) S(4) S(5).

Consider the following task: find a, when C, b, c (circled in Fig. 3) are given:

S(0) ← S(4), (11)
S(3) ← <S(3)>, S(5) ← <S(5)>, S(6) ← <S(6)>. (12)

The IC can solve the task in the following way. Firstly, the given FPs, which correspond to the rules (12), activate the T-cell that synthesizes an FP of the type S(2):

S(2) ← <T(2)> S(3) S(5) S(6).

Secondly, this FP binds the receptor of the same type and, together with the given S(3), activates the corresponding T-cell that synthesizes S(1):

S(1) ← <T(1)> S(2) S(3).

Thirdly, this FP, together with the given S(5) and S(6), activates the corresponding T-cell that synthesizes S(4):

S(4) ← <T(3)> S(1) S(5) S(6).

Finally, this S(4) activates the T-cell of the rule (11), which gives the following solution of the task:

S(0) ← <T(3)><T(1)><T(2)> S(3) S(5) S(6) S(3) S(5) S(6).

In the usual notation this means

a = <Tcos><Tang><Tsin>CbcCbc.

Thus, the IC has synthesized the following solution:

1. Find angle B from the given angle C and sides b and c using the theorem of sines;
2. Find angle A from the known angles B and C using the theorem of angles;
3. Find side a from the known angle A and sides b and c using the theorem of cosines.
Moreover, the IC gives the solution in the so-called “prefix Polish notation”, which can be interpreted strictly in the program of computations. Although this geometrical example seems to be rather simple, it shows the general principles of using the IC as a problem solver. Namely, the IC would represent a kind of engine, where inference simulates behavior of immune networks. It is known, that proteins represent chains consisting of 20 basic amino acids, like words consist of letters of alphabet. Usually, it escapes attention that this number is approximately equal to the number of letters in the alphabets of the so-called “classical” Indo-European languages (e.g. the Italian alphabet consists of 21 letters). But the similar analogy induces an idea, that the IC could be used also for linguistic modeling. Specifically, consider T-cells, corresponding to Eq. (8), which bind FPs by the receptors. Consider behavior of such T-cells as a model of formation of “correct words” (morphology) and/or “correct sentences” (syntax). Of particular interest is the fact that there is a rather advanced linguistic model of language, as if it was developed especially for the proposing IC. It is L.Teniere's theory of linguistic valence [21]. This theory stays somewhat isolated in linguistics, because it differs strongly from the widespread generative (or formal) grammars of N.Chomsky [8]. Meanwhile, on the biological level nothing like “innate grammars”, postulated by Chomsky, has been detected. Moreover, the existence of such grammars is rather problematical. At the same time, the theory of linguistic valence considers ability of a word to enter into syntactic relations with other elements based on the straight analogy with chemical interactions, even fixed in the name of the theory. It is worth to note, that such way allows unite digital computations with language representation by the IC. 
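The T-cell rules of Eq. (8), which drive the triangle example of Section C and are proposed here for linguistic modeling as well, can be illustrated by a small forward-chaining sketch. It is a software analogy only, not the IC algorithm or its hardware: the table `RULES`, the dictionary `NAMES`, the function `solve` and the numeric triangle are hypothetical, and the binding of FPs to receptors is reduced to a check that all receptor types are already known. The sine rule is evaluated on the acute branch of the arcsine, which is an added simplification.

```python
import math

# Types as in the triangle model: 1=A, 2=B, 3=C (angles), 4=a, 5=b, 6=c (sides).
# Each rule is (synthesized type, T-cell name, receptor types).
RULES = [
    (1, "Tang", (2, 3)), (1, "Tsin", (2, 4, 5)),
    (2, "Tang", (1, 3)), (2, "Tsin", (3, 5, 6)),
    (3, "Tang", (1, 2)), (3, "Tsin", (1, 6, 4)),
    (4, "Tcos", (1, 5, 6)),
    (5, "Tcos", (2, 4, 6)),
    (6, "Tcos", (3, 4, 5)),
]

# Numeric meaning of each name (argument order follows the receptor order).
NAMES = {
    # third angle from the two other angles
    "Tang": lambda x, y: math.pi - x - y,
    # angle from a known angle, the side opposite the target angle,
    # and the side opposite the known angle (acute branch only)
    "Tsin": lambda ang, side_t, side_k: math.asin(side_t * math.sin(ang) / side_k),
    # side from the opposite angle and the two adjacent sides
    "Tcos": lambda ang, s1, s2: math.sqrt(s1 * s1 + s2 * s2 - 2 * s1 * s2 * math.cos(ang)),
}


def solve(goal, given):
    """Fire rules until the goal type is synthesized.

    Returns (prefix expression, numeric value) or None if the goal
    cannot be reached from the given types.
    """
    known = {t: (f"S({t})", v) for t, v in given.items()}
    progress = True
    while goal not in known and progress:
        progress = False
        for head, name, receptors in RULES:
            # A T-cell is activated only when all its receptors are matched.
            if head in known or any(r not in known for r in receptors):
                continue
            expr = " ".join([f"<{name}>"] + [known[r][0] for r in receptors])
            value = NAMES[name](*(known[r][1] for r in receptors))
            known[head] = (expr, value)
            progress = True
    return known.get(goal)


# Task of the text: find a when C, b and c are given (here a 3-4-5 right triangle).
expression, side_a = solve(4, {3: math.pi / 2, 5: 4.0, 6: 5.0})
print(expression)
print(side_a)
```

The expression built by the sketch lists the names and given types in the same prefix order as the solution synthesized in Section C, and evaluating the names along the way gives the numeric value of the side.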
At the same time, representation of linguistic knowledge is a very serious problem for neurocomputers.

The IC could also be a promising device for simulating the natural immune system, including important diseases (e.g. AIDS). This simulation is essentially based on the hardware implementation of FIN. As shown above, even the simplest variants of FIN possess the inherent properties of immune response and immune memory. For example, Theorems 1-3 affirm that a one-dimensional FIN with a small number of FP types is able to demonstrate such important effects as:

1. Immune response in AB-networks under the control of antigens;
2. Immune memory and generation of a new immune repertoire in the absence of outer antigens by means of the cyclic regimes in BB-networks.

While a one-dimensional FIN still yields to pure mathematics, a two-dimensional FIN is already much more fuzzy. Investigation of its properties is practically impossible without computer simulation. At the same time, the properties of such a FIN seem closer to those of the natural immune system. It is no wonder that biochips in recent use are also two-dimensional [9, 13].

On the other hand, the mathematical basis of the IC relies on the notion of FP. According to [15], the features of FP make it possible to stay close to the natural protein, rather than drifting away from it as the artificial neuron did from its biological prototype. In any case, modeling of the natural immune system by the IC seems more promising than by neurocomputers or even cellular automata with discrete states.

In addition, a promising approach to modeling the continuous-state dynamics of natural systems could be connected with the IC through the so-called Cellular Immune Networks. Such networks have been introduced in [18] as a combination of FIN with hybrid cellular automata. Their application to the particular task of virtual clothing gives an almost three-fold speed-up compared with traditional methods of computation (numerical integration, finite element method, etc.).

VI. CONCLUSION

Although the present paper gives only a sketch of the IC, we would like to highlight three features which make the way towards the IC especially promising:

1. Highly appropriate biological prototype of immune networks;
2. Rigorous mathematical basis of FIN;
3. Possibility of hardware implementation by a special immune chip.

Such an implementation could raise artificial immune systems, as well as their principal applications (e.g. to information security), to a new level of reliability, flexibility and operating speed.

On the other hand, there is a growing need to overcome the main disadvantages of neural network models, including spurious patterns, small storage capacity relative to the dimension of the network, non-localized errors, etc. These disadvantages block wide application of neurocomputers in fields where the cost of a single error is too high (e.g. aviation, medicine, information security). Natural immune networks, however, successfully protect the organism from precisely such dangerous "errors" and invaders. This gives hope that the IC will eventually be able to play a similar role in control systems and computer networks.

Acknowledgement

This work was supported by the EU in the frame of the project IMCOMP IST-2000-26016.

References

[1] Adamatzky A. Universal computation in excitable media: the 2+medium. Advanced materials for optics and electronics, 1997, v.7, pp.263-272.
[2] Agnati L.F. Human brain in science and culture. Casa Editrice Ambrosiana, Milano, 1998 (in Italian).
[3] Artificial immune systems and their applications (ed. D.Dasgupta). Springer-Verlag, Berlin, 1999.
[4] Bender W., Gruhl D., Morimoto N. and Lu A. Techniques for data hiding. IBM Systems J., v.35, no.3-4, 1996, pp.313-336.
[5] Cantor C. and Schimmel P. Biophysical chemistry. W.H. Freeman, San Francisco, CA, 1980.
[6] Coutinho A. Immunology: the heritage of the past. Letters of the Institute of L.Pasteur. Paris, 1994, no.8, pp.26-29 (in French).
[7] DeBoer R.J., Segel L.A. and Perelson A.S. Pattern formation in one and two-dimensional shape space models of the immune system. J. Theoret. Biol., 1992, no.155, pp.295-333.
[8] Ginsburg S. The mathematical theory of context-free languages. McGraw-Hill, NY, 1966.
[9] Ekins R. and Chu F.W. Microarrays: their origins and applications. Trends in Biotechnology, 1999, 17, pp.217-218.
[10] Forrest S., Hofmeyr S. and Somayaji A. Computer immunology. Communications of the ACM, v.40, no.10, 1997, pp.88-96.
[11] Kuznetsov V.I., Gubanov A.F., Kuznetsov V.V., Tarakanov A.O., Tchertov O.G. Map of complex ecological evaluation of Kaliningrad city environment. In: Kaliningrad. Ecological atlas (11 maps), 1999 (in Russian and English).
[12] Kuznetsov V.I., Milyaev V.B. and Tarakanov A.O. Mathematical basis of complex ecological evaluation. St.Petersburg University Press, 1999.
[13] MacBeath G. and Schreiber S.L. Printing proteins as microarrays for high-throughput function determination. Science, 2000, v.289, no.5485, pp.1760-1763.
[14] Tanenbaum A.S. Computer networks. Prentice Hall (3rd Edition), 1996.
[15] Tarakanov A.O. Mathematical models of biomolecular information processing: formal peptide instead of formal neuron. Problems of Informatization, 1998, no.1, pp.46-51 (in Russian).
[16] Tarakanov A.O. Mathematical models of information processing based on the results of self-assembly. Thesis for the Doctor of Sciences degree in physics & mathematics. St.Petersburg, Russia, 1999 (in Russian).
[17] Tarakanov A. Formal peptide as a basic agent of immune networks: from natural prototype to mathematical theory and applications. Proc. of the 1st Int. Workshop of Central and Eastern Europe on Multi-Agent Systems (CEEMAS'99), St.Petersburg, Russia, 1999, pp.281-292.
[18] Tarakanov A. and Adamatzky A. Virtual clothing in hybrid cellular automata. 2000, http://www.ias.uwe.ac.uk/~aadamat/clothing/cloth_06.htm
[19] Tarakanov A. and Dasgupta D. A formal model of an artificial immune system. BioSystems, 2000, vol.55(1-3), pp.151-158.
[20] Tarakanov A., Sokolova S., Abramov B. and Aikimbayev A. Immunocomputing of the natural plague foci. Proc. of the Genetic and Evolutionary Computation Conference (GECCO-2000), Workshop on Artificial Immune Systems, Las Vegas, USA, 2000, pp.38-39.
[21] Tesnière L. Basis of structural syntax. Moscow, 1998 (in Russian, translated from French).
[22] Toffoli T. and Margolus N. Cellular automata machines. London, MIT Press, 1987.
[23] Wasserman P. Neural computing. Theory and practice. New York: Van Nostrand Reinhold, 1990.
