Immunocomputing: a survey
I.Antoniou, S.Gutnikov, V.Ivanov, Yu.Melnikov, A.Tarakanov
International Solvay Institutes for Physics and Chemistry, Campus Plaine ULB, CP231,
Bd.du Triomphe, Brussels 1050, Belgium
The University of Oxford, Department of Biochemistry, South Parks Road, Oxford OX1
3QU, United Kingdom
Amtel Systems Overseas Ltd., PO Box 307, Circular Road 19/21, Douglas IM99 2BE,
United Kingdom
Abstract
The recently introduced notion of immunocomputing is currently being
implemented within the framework of the EU project IMCOMP. The aim of this project is to
create a new computational paradigm based on principles of information
processing by proteins and immune networks in living nature. This paradigm will be
used for solving specific complex problems and for protection against computer viruses,
intruder attacks, noise, and random errors. Implementing the immunocomputing
paradigm will lead to the development of a new kind of computer, which we propose to call
an immunocomputer, by analogy with the widespread neurocomputers based on models
of neurons and neural networks. The objective of this review is to compare IMCOMP
with existing approaches in computer science and to highlight its novelty and
advantages.
1. Introduction
Biological systems, even at the levels of cells and biomolecules, can be regarded as
sophisticated information processing systems and can provide inspiration for various
ideas in engineering and technology. However, there are only two systems in animals that
possess extraordinary capabilities of information processing such as learning and
memory, ability to recognize patterns and to make decisions about how to behave in an
unfamiliar environment. These two are (1) the nervous system and (2) the immune system.
The animal nervous system has already been used intensively in computer science as a
biological prototype for the mathematical algorithms of artificial neural networks (ANN).
Software based on ANN has been created and has found hardware implementation in
neural computers [20,27].
However, the extraordinary information processing capabilities of the natural immune
system have been appreciated only recently. The aim of the IMCOMP project is to create
mathematical algorithms based on the principles of functioning of natural immune
systems and to develop software and hardware implementations of these algorithms. In
this paper we present an overview of the IMCOMP project.
An introduction to the main principles of functioning of the natural immune systems is
given in Section 2. Section 3 contains a brief review of Artificial Immune Systems (AIS)
and their applications, as this is the field of computer science closest to IMCOMP. The
basic elements and main functional principles of immunocomputing are described in
Section 4. These include mathematical background (4.1), principles of information
processing and their application for the solution of some data processing problems (4.2),
and a brief description of a prototype of the immuno-chip: the basic element of the future
immunocomputer (4.3). Finally, in Section 5 the main innovations and objectives of the
IMCOMP project are discussed.
2. Overview of the natural immune system
The word immunity (from Latin immunitas) means "freedom from". The main purpose
of the immune system is to keep the organism free from unfriendly foreign organisms,
cells, or molecules (collectively called pathogens). The organism’s defense against
intrusion is multilayered. First there are mechanical barriers: skin, mucus of the
respiratory, digestive and urogenital tracts, tears, etc. The second barrier is
environmental: excreted body fluids (sweat, saliva, tears, etc.) have physical and
chemical characteristics that provide inappropriate living conditions for many pathogens.
Pathogens that manage to pass these first lines of defense and enter the body are handled
by the immune system.
Non-specific innate defence mechanisms exist; this innate immunity is primarily
maintained by circulating scavenger cells such as macrophages that ingest extracellular
molecules and materials, clearing the system of both pathogens and debris. For the most
efficient protection, specific acquired immunity, based on recognition and selective
targeting of "non-self" patterns, has evolved. Acquired immunity is based on a sophisticated
physiological mechanism that involves many different types of cells and molecules. It is
also called adaptive because it is responsible for immunity that is adaptively acquired
during the lifetime of the organism. An important part of adaptive immunity is the ability
of the immune system to "memorise" encountered pathogens and to produce enhanced
response in the case of repeated intrusion of the same or similar pathogen.
Parts of the pathogen that are recognised by the immune system are called antigens. A
single pathogen, e.g. a bacterium, may carry a large number of different antigens. The
adaptive immune system can be viewed as a distributed detection system, which consists
primarily of white blood cells called lymphocytes, which circulate through the body in the
blood and lymph. Detection, or recognition, occurs when molecular bonds are formed
between antigens and receptors that cover the surface of the lymphocyte. When an antigen is
detected, a mechanism is triggered that causes proliferation of cells producing antibodies
capable of selective binding to that particular antigen. When an antigen is bound by an
antibody, its carrier pathogen becomes a target for destruction by macrophages.
Both antigens and cell receptors are protein molecules. The immune system's
pattern recognition mechanism must be highly effective: it can distinguish about 10^5
"self" proteins from more than 10^16 "non-self" ones. This powerful recognition
mechanism is a property of the immune system as a whole, not that of a single
lymphocyte. Each lymphocyte has on its surface receptors of only one type and hence it
can recognise only one antigen.
The ability to detect most pathogens requires a huge diversity of lymphocyte
receptors, which is achieved by generating lymphocyte receptors through a genetic process
that introduces a huge amount of randomness. When this random process creates a lymphocyte
with receptors to a "self" protein, that lymphocyte is eliminated before it
matures. Thus, only lymphocytes with receptors to "non-self" are released into
circulation. In this respect, lymphocytes can be viewed as negative detectors, because
they detect only “non-self” patterns and ignore “self” patterns.
Even though receptors are randomly generated, there are not enough lymphocytes in
the body to provide complete coverage of the space of all possible antigens: one
estimate is that there are some 10^8 different lymphocyte receptors in the body at any
given time, while the potential number of antigens is on the order of 10^16. Immune
protection is a probabilistic process. First, pathogens usually have several different
antigens, so there is a chance that at least some of them will be recognised and that is
sufficient for triggering immune response. Second, protection is made dynamic by
continual circulation of lymphocytes through the body, and by continual turnover of the
lymphocyte population. Lymphocytes are typically short-lived (several days) and are
continually replaced with new lymphocytes that have new randomly generated receptors.
Finally, if by misfortune the immune system of a single organism fails to recognise and
resist infection, there is sufficient probability that other organisms in the population will
have appropriate detectors at the time of infection and this is sufficient for survival of the
species.
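The first of these arguments can be made concrete with a small calculation. The sketch below is only an illustration under a simplifying assumption that is not stated in the text: each of a pathogen's antigens is recognised independently and with the same probability. The function name `detection_probability` and its parameters are hypothetical.

```python
def detection_probability(p_single: float, n_antigens: int) -> float:
    """Probability that at least one of n_antigens is recognised.

    Assumes (for illustration only) that each antigen is recognised
    independently with the same probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_antigens < 0:
        raise ValueError("n_antigens must be non-negative")
    # Complement of the event "no antigen is recognised".
    return 1.0 - (1.0 - p_single) ** n_antigens
```

Under this assumption the chance of missing a pathogen shrinks geometrically with the number of distinct antigens it carries, which is why recognising only some of them can be sufficient.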
Immune learning and memory provide more efficient protection against a
specific pathogen. If the immune system detects an antigen it has not encountered before, it
undergoes a primary response, during which it “learns” to recognise that specific antigen
more effectively, i.e. it produces a large number of lymphocytes with high affinity for
that antigen, through a process called affinity maturation. These so-called memory cells
remain in circulation and provide faster detection and elimination of the pathogen at the
next encounter.
Summary. The natural immune system has many features that are desirable from a
computer science standpoint. The system is massively parallel and its functioning is truly
distributed. Individual components are disposable and unreliable, yet the system as a
whole is robust. Previously encountered infections are detected and eliminated quickly,
while novel intrusions are detected on a slower time scale, using a variety of adaptive
mechanisms. The system is autonomous, controlling its own behaviour both at the
detector and effector levels. The immune systems of individual organisms detect infections in
slightly different ways, so pathogens that are able to evade the defences of one organism
cannot necessarily evade those of every other organism in the population.
3. Artificial immune systems
The field closest to IMCOMP is that of Artificial Immune Systems (AIS). The
formation of this field can be regarded as completed in 1999, when the first book on the
subject was published [2].
AIS represent a new and rapidly growing field of computer science. AIS are
expected to give rise to powerful and robust information processing capabilities for
solving complex problems. Like ANN, AIS can learn new information, recall previously
learned information and perform pattern recognition in a highly decentralized fashion.
AIS have already been applied in:
– detection of faults in manufacturing
– security of information
– design of vaccines
– control of autonomous mobile robots
– mining of commercial data
– monitoring of plague foci in Central Asia.
3.1. Immune Network Model
Of special interest is the widespread theory of immune networks, formed by the
interactions among antibodies and immune cells. Niels Jerne, who worked at
the Pasteur Institute in Paris, proposed in 1973 the general theory of idiotypic networks,
also called immune networks [18]. This theory is based on the concept that immune
cells (lymphocytes) are not isolated, but communicate with each other across different
species of lymphocytes through interactions among antibodies. Accordingly, the
identification of antigens is not performed by a single recognizing set, but is rather a system-level
recognition by sets connected through antigen-antibody reactions into a network.
The existence of immune networks is now established beyond doubt.
Their fragments and interactions have been detected experimentally. It is worth noting
that similar networks, under the name of molecular circuits, have even been proposed as a
possible molecular basis of neuronal memory in the human brain.
Jerne's immune network theory has received a lot of attention from researchers over
the last two decades, and many computational aspects of this model have been developed for
practical use.
From the mathematical viewpoint, it was N. Jerne who initiated the development of a
rigorous framework for modelling the immune system. His theory is modelled with
differential equations, which simulate the dynamics of lymphocytes.
Based on Jerne's work, Perelson [22] presented a probabilistic approach to idiotypic
networks. His approach is highly mathematical and focuses on phase transitions in
idiotypic networks.
3.2. Negative Selection Algorithm
Forrest et al. [12] developed a negative-selection algorithm for change detection
based on the principles of self-nonself discrimination in the immune system.
This approach can be summarised as follows:
1. Define self as a collection S of strings of length l over a finite alphabet, a
collection that needs to be protected or monitored. For example, S may be a normal
pattern (program, data file) of activity, which is segmented into equal-sized substrings.
2. Generate a set R of detectors, each of which fails to match any string in S.
Instead of exact or perfect matching, the method uses a partial matching rule, in
which two strings match if and only if they are identical in at least r contiguous
positions, where r is a suitably chosen parameter.
3. Monitor S for changes by continually matching the detectors in R against S. If
any detector ever matches, then a change is known to have occurred, because the
detectors are designed not to match any of the original strings in S.
The algorithm seems to have many potential applications in change detection.
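The three steps of this approach can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of Forrest et al. [12]: the random generate-and-test loop, the binary alphabet, the function names, and the parameters `seed` and `max_tries` are all assumptions introduced here.

```python
import random


def r_contiguous_match(a: str, b: str, r: int) -> bool:
    """True if equal-length strings a and b agree in at least r contiguous positions."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    if r < 1:
        raise ValueError("r must be at least 1")
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0  # length of the current run of agreement
        if run >= r:
            return True
    return False


def generate_detectors(self_set, length, r, n_detectors,
                       alphabet="01", seed=0, max_tries=100_000):
    """Step 2: keep random candidates that match no string of the self set."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    detectors = set()
    for _ in range(max_tries):
        if len(detectors) >= n_detectors:
            break
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if not any(r_contiguous_match(candidate, s, r) for s in self_set):
            detectors.add(candidate)
    return sorted(detectors)


def detect_change(strings, detectors, r):
    """Step 3: return the monitored strings matched by at least one detector."""
    return [s for s in strings
            if any(r_contiguous_match(d, s, r) for d in detectors)]


# Step 1: a toy self set of 8-bit strings (hypothetical data).
self_set = ["00000000", "00001111", "11110000"]
detectors = generate_detectors(self_set, length=8, r=4, n_detectors=20)
unchanged = detect_change(self_set, detectors, r=4)  # empty by construction
```

By construction no detector matches an unmodified self string, so any match signals a change. The converse does not hold: a modified string is flagged only if some detector happens to cover it, so detection is probabilistic and depends on the number of detectors and on r.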
3.3. Other Models
There exist other computational models [10,11] which emulate different
aspects of the immune system, for example, its ability to detect common patterns in a noisy
environment, its ability to discover and maintain coverage of diverse pattern classes, and
its ability to learn effectively, even when not all antibodies are expressed and not all
antigens are presented. Hoffman has compared the immune system and the nervous
system, and has found many similarities at the level of system behaviour. Farmer et al.
[10], and Bersini and Varela [3], have compared the immune system with learning
classifier systems. Gilbert and Routen [14] experimented with an immune network model to
create a content-addressable auto-associative memory, specifically for image recognition.
3.4. Some Applications
The models based on immune system principles are finding increasing applications in
the fields of science and engineering.
3.4.1. Computer Security
S. Forrest and her group at the University of New Mexico are working on a research
project with the long-term goal of building an artificial immune system for computers. Their
computer immune system is intended to protect a computer against unauthorized use of
computer facilities, maintain the integrity of data files, and prevent the spread of
computer viruses. Their research program is based on the negative-selection algorithm.
3.4.2. Anomaly Detection in time series data
Dasgupta and Forrest [6] experimented with several time series data sets (both real and
model data) to investigate the performance of the negative-selection algorithm in detecting
anomalies in the data series. The objective of this work is to develop an efficient algorithm
that can be used for noticing any changes in the steady-state characteristics of a system or a
process. In this case, the notion of self is taken to be the normal behaviour patterns of
the monitored system. Any deviation that exceeds an allowable variation in the observed
data is considered an anomaly in the behaviour pattern.
The results have shown that this approach can be used as a tool for automated
monitoring of safety-critical operations.
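One way to apply negative selection to a time series is sketched below. It is a hedged illustration, not the procedure of Dasgupta and Forrest [6]: the equal-width quantisation, the sliding windows, the exhaustive enumeration of candidate detectors, and all function names and parameter values are assumptions made for this example.

```python
from itertools import product


def quantise(series, low, high, levels=4):
    """Map each value to a symbol '0'..str(levels - 1) by equal-width binning."""
    width = (high - low) / levels
    symbols = []
    for v in series:
        k = int((v - low) / width)
        k = min(max(k, 0), levels - 1)  # clamp values outside [low, high]
        symbols.append(str(k))
    return "".join(symbols)


def windows(symbols, size):
    """All overlapping substrings of the given size."""
    return [symbols[i:i + size] for i in range(len(symbols) - size + 1)]


def matches(a, b, r):
    """True if a and b agree in at least r contiguous positions."""
    run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        if run >= r:
            return True
    return False


def build_detectors(self_windows, size, r, levels=4):
    """Enumerate every candidate window and keep those matching no self window."""
    alphabet = [str(k) for k in range(levels)]
    candidates = ("".join(p) for p in product(alphabet, repeat=size))
    return [c for c in candidates
            if not any(matches(c, s, r) for s in self_windows)]


def anomalous_positions(series, detectors, low, high, size, r, levels=4):
    """Start indices of windows matched by at least one detector."""
    symbols = quantise(series, low, high, levels)
    return [i for i, w in enumerate(windows(symbols, size))
            if any(matches(d, w, r) for d in detectors)]


# Hypothetical data: "self" is a regular oscillation between two levels.
normal = [0.1, 0.3, 0.1, 0.3, 0.1, 0.3, 0.1, 0.3]
self_windows = set(windows(quantise(normal, 0.0, 1.0), 3))
detectors = build_detectors(self_windows, size=3, r=2)

observed = [0.1, 0.3, 0.1, 0.9, 0.1, 0.3]  # one spike at index 3
flagged = anomalous_positions(observed, detectors, 0.0, 1.0, size=3, r=2)
```

In this toy run the three windows that contain the spike are flagged, while the normal series itself raises no alarm. The bin width plays the role of the allowable variation: values that stay within the same bin as the normal pattern are not reported.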
3.4.3. Fault Diagnosis
Ishida [16] studied the mutual recognition feature of the immune network model for
fault diagnosis. In his implementation, fault tolerance was attained by mutual recognition
of interconnected units in the studied plant. That is, system level recognition was
achieved by unit level recognition. The results are very promising and worth further
investigation.
Ishiguro et al. [17] applied the immune network model to on-line fault diagnosis of
plant systems. This work attempts to develop an integrated fault diagnosis method, which
can be used in industrial plants.
3.4.4. AIS for Pattern Recognition
Hunt and Cooke [15] investigated an AIS based on immune network theory
within the context of machine learning. Such a system combines the advantages of
learning classifier systems with some of the advantages of neural networks, machine
induction and case-based retrieval. They have shown the potential of AIS on a pattern
recognition problem, namely the recognition of promoters in DNA sequences.
3.5. Summary
AIS are a subject of great research interest because of their powerful information
processing capabilities. In particular, they perform many complex computations in a
completely parallel and distributed fashion. Like ANN, AIS can learn new information,
recall previously learned information and perform pattern recognition tasks in a highly
decentralized fashion. In addition, learning takes place through evolutionary processes
similar to those of evolutionary computation.
There are many potential application areas in which immunity-based computational
models appear to be very useful.
However, a comparison with ANN shows that the field of AIS still lacks:
1. A clear and sound mathematical basis;
2. A hardware implementation analogous to the existing neurocomputers
based on ANN.
At present, AIS are represented by software tools based on heuristic algorithms that use ideas
from genetic algorithms, cellular automata, ANN, etc. Thus, solving the above problems
could raise AIS, as well as their principal applications (e.g. to information security), to a
new level of reliability, flexibility and operating speed.
4. Immunocomputing
The natural immune system is based on the interaction of proteins. The main goal of
IMCOMP is to implement the principles of information processing by proteins and
immune networks in a new computational paradigm, in order to solve specific
complex problems while remaining protected from viruses, noise, errors and intrusions. We shall
demonstrate that immunocomputing leads to a new kind of computer, which we propose to
call an immunocomputer, by analogy with the widespread neurocomputers, which are based
on models of neurons and neural networks.
Three main innovations are expected to emerge from the IMCOMP project:
1. Appropriate mathematical framework (formal immune networks);
2. New approach to information processing (immunocomputing);
3. New hardware (immuno-chips).
These are discussed in detail below.
4.1. Appropriate mathematical framework.
According to the biological prototypes and their mathematical models [23-26], the
principal difference between IMCOMP and other types of computation should be
determined by the functions of their basic elements. For example, whereas the artificial neuron, the
basic element of ANN and neural computing, is regarded as a summation unit with a
threshold, with fixed connections to other neurons [27], the protein, as the basic element of
IMCOMP, satisfies quite different conditions [4]:
• The spatial conformation of a protein is determined by the linear sequence (word) of its
amino acid code;
• This conformation determines the functions of the protein.
In fact, no existing mathematical model comes even close to meeting these requirements. Thus we
need to develop a new concept of the formal protein (FP) as a mathematical abstraction of the
key biophysical mechanisms of natural proteins’ behavior. The FP has the same
importance for IMCOMP as the well-known concept of the artificial (or formal) neuron has
for neural computing.
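For comparison, the "summation with a threshold" view of the formal neuron mentioned above can be written down directly. The sketch below shows only this classical textbook unit; it is not a model of the formal protein, which is not defined in this section. The function name, the binary output convention and the example weights are assumptions made for illustration.

```python
def formal_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted summation over fixed connections, followed by a hard threshold.
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0


# Example: with unit weights and threshold 2 the unit behaves as a logical AND.
and_outputs = [formal_neuron([a, b], [1, 1], 2) for a in (0, 1) for b in (0, 1)]
```

The behaviour of such a unit is fixed entirely by its weights and threshold, whereas the formal protein is meant to derive its function from a conformation determined by a sequence.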
It is within the framework of interactions between formal proteins that we intend to develop the
new concept of Formal Immune Networks (FIN) and to demonstrate rigorously that such
networks are able to learn, recognize and solve problems, like artificial intelligence
systems.
The closest to FIN are the mathematical models based on N. Jerne's theory
of idiotypic networks. His theory can also be modeled with differential
equations, which simulate the dynamics of lymphocytes, i.e. the increase or decrease of the
concentration of a set of lymphocyte clones and the corresponding immunoglobulins.
However, such an approach does not consider the concrete mechanisms of interaction
between biomolecules and cells, so it cannot form a basis for a new approach to
information processing.
4.2. New approach to information processing (immunocomputing).
As an information processing approach, IMCOMP gives rise to the following
innovations:
1. New methods for pattern recognition and data mining based on the principles of
biomolecular recognition (binding energy of proteins);
2. New methods for synchronization of events in computer networks based on
biomolecular principles of self-synchronization (biomolecular messengers);
3. New methods for simulating dynamics of natural processes in 3D modeling based
on the principles of biomolecular interaction (excitable lattices of biomolecules);
4. New methods for language representation and problem solving based on the theory
of linguistic valence (behavior of words is equivalent to the behavior of
biomolecules).
4.3. New hardware (immuno-chips).
As a new approach to information processing, IMCOMP needs a hardware
implementation in a special type of electronic circuit, the immuno-chip. The reason is that
the architectures of traditional PCs and neurocomputers are not well suited to fast,
distributed immunocomputations. Apparently, the most appropriate architecture of the
immuno-chip can be developed by analogy with the architecture of modern
biochips or microarrays [9,21].
5. Discussion of follow-up
The following innovations are expected to be achieved through immunocomputing as
a follow-up of the IMCOMP project:
• Immunocomputers would be able to overcome the main drawbacks of
neurocomputers (spurious patterns, low capacity in relation to the size of the neural
network, difficulty in locating errors). These drawbacks block the wide
application of neurocomputers in fields where errors cost too much, such as control
and navigation of spacecraft, aircraft, ships and submarines, security systems, and
intensive care medicine.
• Immunocomputers could provide an effective simulation of the natural immune
system and of aspects of relevant diseases, such as AIDS. Even the simplest variants of
formal immune networks effectively simulate important properties of the immune
response and immune memory.
• We expect that in the field of diagnostics (fault detection) for spaceships,
aircraft, nuclear power plants and ecology it will be possible to:
- deal with huge amounts of data under hard time constraints;
- detect critical situations, errors and faults early and reliably;
- overcome the difficulties of neurocomputing.
• In the field of information security for computer networks, the development of:
- self-learning security systems to resist unknown invaders (viruses,
unauthorized users);
- software/hardware implementations of security systems.
• In the field of control of mobile objects (robots, etc.), the improvement of
reliability and flexibility of system behavior in unpredictable situations.
• In the field of data mining, the detection of small deviations from
normal behavior and of errors in large amounts of data (credit card, mortgage fraud detection).
• In the field of management of complex socio-ecological systems, the development of
integrated approaches to modeling of interactions between population and environment
based on the resilience concept.
Acknowledgements
This work was supported by the Commission of the European Communities within the
framework of Contract IST-2000-26016 IMCOMP.
References
1. Agnati L.F. Human brain in science and culture. Casa Editrice Ambrosiana, Milano,
1998 (in Italian).
2. Artificial immune systems and their applications (ed. D.Dasgupta). Springer-Verlag,
Berlin, 1999.
3. Bersini H. and Varela F. Hints for adaptive problem solving gleaned from immune
networks. Proc. of the 1st workshop on Parallel Problem Solving from Nature, 1990,
343-354.
4. Bohinski R. Modern concepts in biochemistry. Allyn and Bacon, Boston, 1983.
5. Dasgupta D. and Attoh-Okine N. Immunity-based systems: a survey. Proc. of the
IEEE Int. Conf. on Systems, Man and Cybernetics. Orlando, USA, 1997.
6. Dasgupta D. and Forrest S. Novelty detection in time series data using ideas from
immunology. ISCA (th Int. Conf. on Intelligent Systems. Reno, USA, 1996.
7. DeBoer R.J., Segel L.A. and Perelson A.S. Pattern formation in one- and two-dimensional
shape space models of the immune system. J. Theoret. Biol., 1992, 155,
295-333.
8. Coutinho A. Immunology: the heritage of the past. Letters of the L.Pasteur Institute of
Paris, 1994, 8, 26-29 (in French).
9. Ekins R. and Chu F.W. Microarrays: their origins and applications. Trends in
Biotechnology, 1999, 17, 217-218.
10. Farmer J.D., Packard N.H. and Perelson A.S. The immune system, adaptation and
machine learning. Physica D, 1986, 22, 187-204.
11. Forrest S., Javornik B., Smith R. and Perelson A. Using genetic algorithms to explore
pattern recognition in the immune system. Evolutionary Computation, 1993, 1(3),
191-211.
12. Forrest S., Perelson A., Allen L. and Cherukuri R. Self-nonself discrimination in a
computer. Proc. of IEEE symposium on research in security and privacy. Oakland,
USA, 1994, 202-212.
13. Forrest S., Hofmeyr S. and Somayaji A. Computer immunology. Communications of
the ACM, 1997, 40(10), 88-96.
14. Gilbert C. and Routen T. Associative memory in an immune-based system. Proc. of
the 12th Nat. Conf. on Artificial Intelligence. Seattle, USA, 1994, 852-857.
15. Hunt J. and Cooke D. Learning using an artificial immune system. J. of Network and
Computer Applications, 1996, 19, 189-212.
16. Ishida Y. An immune network model and its applications to process diagnosis.
Systems and Computers in Japan, 1993, 24(6), 38-45.
17. Ishiguro A., Watanabe Y. and Uchikawa Y. Fault diagnosis of plant system using
immune networks. Proc. of the IEEE Int. Conf. on Multisensor Fusion and Integration
for Intelligent Systems. Las Vegas, USA, 1994, 34-42.
18. Jerne N.K. The immune system. Scientific American, 1973, 229(1), 52-60.
19. Jerne N.K. Towards a network theory of the immune system. Ann. Immunol. (Inst.
Pasteur), 1974, 125, 373-389.
20. Haykin S. Neural networks: a comprehensive foundation. Prentice Hall Inc., 1999.
21. MacBeath G. and Schreiber S.L. Printing Proteins as Microarrays for High-Throughput
Function Determination. Science, 2000, September 8; 289(5485): 1760-1763.
22. Perelson A. Immune network theory. Immunological Reviews, 1989, 10, 5-36.
23. Tarakanov A.O.: Mathematical models of information processing by biomolecules:
formal peptide instead of formal neuron. Russian Academy of Sciences, Problems of
Informatization J., 1998, 1, 46-51 (in Russian).
24. Tarakanov A. and Adamatzky A. Virtual clothing in hybrid cellular automata. 2000,
http://www.ias.uwe.ac.uk/~a-adamat/clothing/cloth_06.htm
25. Tarakanov A. and Dasgupta D. A formal model of an artificial immune system.
BioSystems, 2000, 55(1-3), 151-158.
26. Tarakanov A., Sokolova S., Abramov B. and Aikimbayev A. Immunocomputing of
the natural plague foci. Proc. of the Genetic and Evolutionary Computation
Conference (GECCO-2000), Workshop on Artificial Immune Systems, Las Vegas,
USA, 2000, 38-39.
27. Wasserman P. Neural computing. Theory and practice. Van Nostrand Reinhold, New
York, 1990.