Sample article and terms with definitions - SABIA

advertisement
SAMPLE ENCYCLOPEDIA SUBMISSION
Artificial Neural Networks and Data Mining Techniques
Juan R. Rabuñal, juanra@udc.es
Julián Dorado, julian@udc.es
Alejandro Pazos, apazos@udc.es
Dept. of Information & Communications Technologies,
University of A Coruña, Spain
INTRODUCTION
The world of Data Mining (Cios, Pedrycz & Swiniarrski, 1998) is in constant
expansion. New information is obtained from databases thanks to a wide range of
techniques, which are all applicable to a determined set of domains and count with a
series of advantages and inconveniences. The Artificial Neural Networks (ANNs)
technique (McCulloch & Pitts, 1943) (Orchad, 1993) (Haykin, 1999) allows us to resolve
complex problems in many disciplines (classification, clustering, regression, etc.), and
presents a series of advantages that convert it into a very powerful technique that is easily
adapted to any environment. The main inconvenience of ANNs, however, is that they can
not explain what they learn and what reasoning was followed to obtain the outputs. This
implies that they can not be used in many environments in which this reasoning is
essential.
This article presents a hybrid technique that not only benefits from the advantages
of ANNs in the data-mining field, but also counteracts their inconveniences by using
other knowledge extraction techniques. Firstly we extract the requested information by
applying an ANN, then we apply other Data Mining techniques to the ANN in order to
explain the information that is contained inside the network. We thus obtain a twolevelled system that offers the advantages of the ANNs and compensates for its
shortcomings with other Data Mining techniques.
BACKGROUND
Ever since artificial intelligence appeared, ANNs have been widely studied. Their
generalisation capacity, and their inductive learning convert them into a very robust
technique that can be used in almost any domain. An ANN is an information processing
technique that is inspired on neuronal biology and consists of a large amount of
interconnected computational units (neurons), usually in different layers. When an input
vector is presented to the input neurons, this vector is propagated and processed in the
network until it becomes an output vector in the output neurons.
The ANNs have proven to be a very powerful tool in an array of applications, but
they present a big problem: their reasoning process cannot be explained, i.e. there is no
clear relationship between the inputs that are presented to the network and the outputs it
produces. This means that ANNs cannot be used in certain domains, even though several
approaches and attempts to explain their behaviour have tried to solve this problem.
…………………………………………..
…………………………………………..
…………………………………………..
The most recent works in rules extraction from ANNs are presented by Rivero et al
and Rabuñal et al (Rivero, Rabuñal, Dorado, Pazos & Pedreira, 2004) (Rabuñal, Dorado,
Pazos & Rivero, 2003). They extract rules by applying a symbolic regression system,
based on Genetic Programming (GP) (Koza, Keane, Streeter, Mydlowec, Yu & Lanza,
2003) (Wong & Leung, 2000) (Engelbrecht, Rouwhorst & Schoeman, 2001), to a set of
inputs / outputs produced by the ANN. The set of network inputs / produced outputs is
dynamically modified, as explained on this paper.
MAIN FOCUS OF THE CHAPTER
This article presents an architecture in two levels for the extraction of knowledge
from databases. In a first level, we apply an ANN as Data Mining technique; in the
second level, we apply a knowledge extraction technique to this network.
Data Mining with ANNs
Artificial Neural Networks constitute a Data Mining technique that has been widely
used as a technique for the extraction of knowledge from databases. Their training
process is based on examples, and presents several advantages that other models do not
offer.
…………………………………………..
…………………………………………..
…………………………………………..
The purpose of this article is to correct this defect through the use of another
knowledge extraction technique. This technique is applied to the ANN in order to extract
its internal knowledge and express it in terms that are proper of the used technique (i.e.
decision trees, semantic networks, IF-THEN-ELSE rules, etc.).
Extracting knowledge from ANNs
In general, the second knowledge extraction technique is applied to the network to
explain its behaviour when it produces a set of outputs from a set of inputs, without
considering its internal functioning.
…………………………………………..
…………………………………………..
…………………………………………..
In this way, we create a closed system in which the patterns set is constantly
updated and the search continues for new areas that are not yet covered by the knowledge
that we have of the network. A detailed description of the method and of its application to
concrete problems can be found in Rabuñal et al (Rabuñal, Dorado, Pazos & Rivero,
2003) (Rabuñal, Dorado, Pazos, Gestal, Rivero & Pedreira, 2004) (Rabuñal, Dorado,
Pazos, Pereira & Rivero, 2004).
FUTURE TRENDS
The proposed architecture is based on the application of a knowledge extraction
technique to an ANN, whereas the technique that will be used depends on the type of
knowledge that we wish to obtain. Since there is a great variety of techniques and
algorithms that generate information of the same kind (IF-THEN rules, trees, etc.), we
need to study them and carry out experiments to test their functioning in the proposed
system, and in particular their adequacy for the new patterns generation system.
Also, this system for the generation of new patterns involves a large number of
parameters, such as the percentage of patterns change, the number of new patterns, or the
maximum error with which we consider that a pattern is represented by a rule. Since there
are so many parameters, we need to study not only each parameter separately but also the
influence of all the parameters on the final result.
CONCLUSION
This article proposes a system architecture that makes good use of the advantages
of the ANNs, such as Data Mining, and avoids its inconveniences. On a first level, we
apply an ANN to extract and model a set of data. The resulting model offers all the
advantages of ANNs, such as noise tolerance and generalisation capacity. On a second
level, we apply another knowledge extraction technique to the ANN, and thus obtain the
knowledge of the ANN, which is the generalisation of the knowledge that was used for its
learning; this knowledge is expressed in the shape that is decided by the user. It is
obvious that the union of various techniques in a hybrid system conveys a series of
advantages that are associated to them.
REFERENCES
Cios, K., Pedrycz, W., & Swiniarrski, R. (1998). Data Mining Methods for Knowledge
Discovery. Kluwer Academic Publishers.
Engelbrecht, A.P., Rouwhorst, S.E., & Schoeman L. (2001). A Building Block Approach
to Genetic Programming for Rule Discovery. Data Mining: A Heuristic Approach,
Abbass, R. Sarkar, C. Newton editors, Information Science Reference (formerly
Idea Group Reference) Publishing.
Haykin, S. (1999). Neural Networks (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.
Koza, J. R., Keane, M. A., Streeter, M. J., Mydlowec, W., Yu, J., & Lanza, G. (Editors)
(2003). Genetic Programming IV: Routine Human-Competitive Machine
Intelligence. Dordrecht, The Netherlands: Kluwer Academic Publishers.
McCulloch, W.S., & Pitts, W. (1943). A Logical Calculus of Ideas Immanent in Nervous
Activity. Bulletin of Mathematical Biophysics. (5) 115-133.
Orchad, G. (1993). Neural Computing. Research and Applications. Ed. Institute of
Physics Publishing, Londres.
Rabuñal, J.R., Dorado, J., Pazos, A., & Rivero, D. (2003). Rules and Generalization
Capacity Extraction from ANN with GP. Lecture Notes in Computer Science. 606613.
Rabuñal, J.R., Dorado, J., Pazos, A., Gestal, M., Rivero, D., & Pedreira, N. (2004 a).
Search the Optimal RANN Architecture, Reduce the Training Set and Make the
Training Process by a Distribute Genetic Algorithm. Artificial Intelligence and
Aplications 1, 415-420.
Rabuñal, J.R., Dorado, J., Pazos, A., Pereira, J., & Rivero, D. (2004 b). A New Approach
to the Extraction of ANN Rules and to Their Generalization Capacity Through GP.
Neural Computation, (16) 7, 1483-1523.
Rivero, D., Rabuñal, J.R., Dorado, J., Pazos, A., & Pedreira, N. (2004). Extracting
Knowledge from Databases and ANNs with Genetic Programming: Iris Flower
Classification Problem. Intelligent Agents for Data Mining and Information
Retrieval. 136-152.
Wong, M.L., & Leung, K.S. (2000). Data Mining using Grammar Based Genetic
Programming and Applications, Kluwer Academic Publishers.
TERMS AND DEFINITIONS
Area of the Search Space: Set of specific ranges or values of the input variables that
constitute a subset of the search space.
Artificial Neural Networks: A network of many simple processors (“units” or
“neurons”) that imitates a biological neural network. The units are connected by
unidirectional communication channels, which carry numeric data. Neural networks
can be trained to find nonlinear relationships in data, and are used in applications
such as robotics, speech recognition, signal processing or medical diagnosis.
Backpropagation algorithm: Learning algorithm of ANNs, based on minimising the
error obtained from the comparison between the outputs that the network gives
after the application of a set of network inputs and the outputs it should give (the
desired outputs).
Data Mining: The application of analytical methods and tools to data for the purpose of
identifying patterns, relationships or obtaining systems that perform useful tasks
such as classification, prediction, estimation, or affinity grouping.
Evolutionary Computation: Solution approach guided by biological evolution, which
begins with potential solution models, then iteratively applies algorithms to find the
fittest models from the set to serve as inputs to the next iteration, ultimately leading
to a model that best represents the data.
Knowledge Extraction: Explicitation of the internal knowledge of a system or set of
data in a way that is easily interpretable by the user.
Rule Induction: Process of learning, from cases or instances, if-then rule relationships
that consist of an antecedent (if-part, defining the preconditions or coverage of the
rule) and a consequent (then-part, stating a classification, prediction, or other
expression of a property that holds for cases defined in the antecedent).
Search Space: Set of all possible situations of the problem that we want to solve could
ever be in.
Download