8.1. How to use a multi-layer neural net for recognition? (Translation by Piotr Ciskowski; piotr.ciskowski@pwr.wroc.pl)

Multi-layer neural nets, which you studied thoroughly in the previous chapter, may be used for many tasks. Nevertheless, if we want to analyze their properties and the way they work, the most suitable area is image recognition. Image recognition is a problem in which a neural net (or another kind of learning machine) decides which of several classes an image belongs to. The analyzed objects may be of various kinds – from digital camera images to scanned or grabbed analog images. In fig. 8.1 we compare an analog and a digital image, both to illustrate what we are talking about and to displease all those readers too confident that "digital" always means "better".

Fig. 8.1. A comparison of an analog and a digital image

This book is about neural nets, however, not about images, so we are not going to go deep into the theory of image recognition, especially understood as picture recognition. Still, it is worth noticing that the problem of recognizing images, namely pictures, actually started this field of research and gave it its name. The first (historical!) neural net, built by Frank Rosenblatt (presented in fig. 8.2), was used to recognize images, and that is why it was called a "Perceptron". Please regard this picture (even if quite old and of poor quality) with proper respect, as a relic of one of the first achievements in the discipline we are studying in this book.

Fig. 8.2. Rosenblatt's "Perceptron" – the first neural network recognizing images

The meaning of the word "image" has been generalized so much that neural nets are now used for recognizing samples of sound signals (e.g. spoken commands), seismic or other geophysical signals (when searching for geological deposits), symptoms of patients to be diagnosed, scores of companies applying for loans – and much more. We consider all these tasks as image recognition, even if the above-mentioned "images" are in fact, respectively, acoustic, geophysical, diagnostic, economic, or of some other nature.

A neural net used for image recognition usually has several inputs, supplied with signals representing features of the objects being recognized. These may be, for example, coefficients describing the shape of a machine part, or the texture of liver tissue. We often use many inputs, because we want to show the neural net all the features of the analyzed object, so that the net is able to learn properly how to recognize it. Even so, the number of image features is much smaller than the number of image elements (pixels). If you supplied the net with a raw digital image, the number of its inputs would run into hundreds of thousands, or even a few million! That is why we practically never use neural nets to analyze raw images. The nets usually "see" features of the analyzed image, extracted by other, independent programs outside the net. This image preparation is sometimes called "preprocessing".

A neural net used for recognition usually has several outputs as well. Generally speaking, each output is assigned to a specific class. For example, an OCR (optical character recognition) system may use over 60 outputs, each assigned to a certain character – e.g. the first output neuron indicates the letter A, the second neuron the letter B, and so on.
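To make the idea of preprocessing a little more concrete, here is a minimal sketch in Python (it is not taken from the original text, and the particular features it computes – mean brightness, contrast, fill ratio and centre of mass – are only illustrative assumptions). It shows how a raw image with thousands of pixels can be reduced to a handful of numbers that become the net's input signals.

import numpy as np

def extract_features(image):
    # Reduce a grayscale image (a 2-D array of pixel intensities in [0, 1])
    # to a short feature vector: mean brightness, contrast, fill ratio
    # and the normalized centre of mass along each axis.
    mean_brightness = image.mean()
    contrast = image.std()
    fill_ratio = (image > 0.5).mean()            # fraction of "ink" pixels
    rows, cols = np.indices(image.shape)
    total = image.sum() + 1e-12                  # avoid division by zero
    centre_row = (rows * image).sum() / total / image.shape[0]
    centre_col = (cols * image).sum() / total / image.shape[1]
    return np.array([mean_brightness, contrast, fill_ratio,
                     centre_row, centre_col])

# A 100 x 100 raw image has 10 000 pixels, but the net will only see 5 inputs:
raw_image = np.random.rand(100, 100)
features = extract_features(raw_image)
print(features.shape)                            # -> (5,)

Real preprocessing for OCR or medical images would of course compute task-specific features, but the principle – a few informative numbers instead of a huge pixel array – stays the same.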
We discussed such output coding in chapter 2; if you would like to recall the possible output signals of a net used for recognition, go back to figures 2.30, 2.31 and 2.33. Between the input and the output of the net there is usually at least one hidden layer of neurons – we will now study the hidden layer in more detail. Generally speaking, the hidden layer may contain many neurons or just a few of them. A brief look at the processes going on inside neural nets suggests that more neurons in the hidden layer make the net more "clever". However, you will soon learn that it is not always worth having a net with a large "built-in intelligence", as such a net sometimes turns out to be surprisingly disobedient!
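Before we look at the hidden layer more closely, the following minimal sketch (again an illustrative assumption, not the book's own program) puts the whole structure described above in one place: a few feature inputs, one hidden layer whose size n_hidden we are free to choose, and one output neuron per class, with the strongest output deciding the answer.

import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden, n_classes = 5, 10, 26        # e.g. 26 outputs for the letters A-Z

# In a real net these weights would be found by learning;
# random values stand in for them here.
W_hidden = rng.normal(size=(n_hidden, n_inputs))
W_output = rng.normal(size=(n_classes, n_hidden))

def recognize(features):
    # Forward pass: a sigmoid hidden layer followed by a linear output layer;
    # the index of the strongest output is taken as the recognized class.
    hidden = 1.0 / (1.0 + np.exp(-W_hidden @ features))
    outputs = W_output @ hidden
    return int(np.argmax(outputs))               # 0 -> 'A', 1 -> 'B', ...

features = rng.random(n_inputs)                  # a feature vector from preprocessing
print(chr(ord('A') + recognize(features)))       # prints the "winning" letter

Changing n_hidden is exactly the design decision discussed above: a larger hidden layer makes the net potentially "cleverer", but, as you will soon see, not necessarily better behaved.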