Supplementary Material

Technology: Level 1
Robotics and the meaning of life: a practical guide to things that think
T184

RobotLab8: Neural networks

Prepared for the course team by Jeffrey Johnson, Tony Hirst and Jon Rosewell

You may like to print out this booklet. It contains the instructions you will need for the laboratory session for Lesson 8.

Contents
1 Introduction
2 Multidimensional data
    Activity: Classifying fruit with a neural network
3 The anatomy of a neural network
4 Training a neural network
    Activity: Building the network
    Activity: Entering data and training a network
    Activity: Testing the network on unseen data
5 Recognizing textured backgrounds
    Activity: Collect texture data
6 The Pedro challenge (optional)

Copyright 2003 The Open University

Before you start make sure you have downloaded the RobotLab8 programs. You will find instructions on how to do this in Session 8.5 of the website. The program files can be found in the RobotLab8 folder of the T184 Guide.

1 Introduction

We introduced the idea of neural networks in Lesson 3. Neural networks can solve subtle pattern recognition problems, which are very important in robotics. In RobotLab8, you will get hands-on experience of using neural networks, and you will build and train neural networks to perform specific tasks.

There are many different kinds of neural network, but we will focus exclusively on one of the best known, the multilayer perceptron. Despite this rather daunting name, building and training this type of network is very easy in RobotLab.

2 Multidimensional data

Consider four classes of fruit: pears, bananas, strawberries and oranges. How might a robot recognize and distinguish between them? Let us suppose that a robot’s vision system can isolate the fruit objects and make two measurements. The first is the length of the ‘long axis’ (Figure 1). This is called the long measurement.
The second measurement is taken at right angles to the long axis, half way along it. This is called the short measurement. Oranges are approximately spherical in shape, which means that their long and short measurements are nearly the same. Bananas, on the other hand, are long and thin, so their long measurements are much greater than their short measurements.

Figure 1 Measurements for classifying fruits

In this way I made the following pairs of measurements for the fruit in Figure 1. The long measurement is given first, followed by the short measurement.

pears          (5.2, 3.1)   (6.3, 2.4)   (6.7, 1.8)   (5.3, 2.9)
bananas        (8.5, 1.9)   (8.3, 1.6)   (9.7, 2.0)   (7.5, 1.7)
strawberries   (2.1, 1.4)   (2.8, 1.8)   (2.0, 1.8)   (2.2, 2.0)
oranges        (4.7, 4.5)   (4.6, 4.2)   (4.6, 4.1)   (4.0, 3.7)

These measurements can be plotted on a grid. The long measurement is plotted on the horizontal axis, and the short measurement is plotted on the vertical axis. As can be seen in Figure 2, the objects form clusters. When data are grouped like this, we can use various techniques to learn about these clusters. Neural networks provide a powerful method of finding them.

Figure 2 Clustering objects in a two-dimensional data space

Let us suppose that a robot has the data in Figure 2 in its memory, and it comes across (as yet) unclassified fruit having the following pairs of measurements. Write down which class of fruit the robot would associate each measurement pair with.

(2.5, 2.1)   strawberry
(4.6, 4.5)   orange
(6.3, 2.9)   _________
(9.5, 1.9)   _________
(1.8, 1.5)   _________
(5.1, 2.1)   _________
(4.5, 4.1)   _________

You probably found this quite easy. Arranging data points on a grid, as in Figure 2, is a simple idea, and there are many techniques that use it to enable automatic classification.
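One of the simplest such techniques can be sketched in a few lines of Python. This is not a neural network and not part of RobotLab: it is a nearest-centre rule, in which each cluster is summarized by its average position and a new point is given the class of the closest average. The names TRAINING, centroid and classify are illustrative only.

```python
import math

# Training measurements (long, short) copied from the list above.
TRAINING = {
    "pear": [(5.2, 3.1), (6.3, 2.4), (6.7, 1.8), (5.3, 2.9)],
    "banana": [(8.5, 1.9), (8.3, 1.6), (9.7, 2.0), (7.5, 1.7)],
    "strawberry": [(2.1, 1.4), (2.8, 1.8), (2.0, 1.8), (2.2, 2.0)],
    "orange": [(4.7, 4.5), (4.6, 4.2), (4.6, 4.1), (4.0, 3.7)],
}


def centroid(points):
    """Return the mean (long, short) position of one cluster."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


# One cluster centre per class of fruit.
CENTROIDS = {name: centroid(points) for name, points in TRAINING.items()}


def classify(long_measurement, short_measurement):
    """Return the class whose cluster centre is nearest to the new point."""
    point = (long_measurement, short_measurement)
    return min(CENTROIDS, key=lambda name: math.dist(point, CENTROIDS[name]))


# The two worked answers given in the exercise above.
print(classify(2.5, 2.1))  # strawberry
print(classify(4.6, 4.5))  # orange
```

You could use classify to check your own answers for the remaining pairs. A rule this simple works here only because the four clusters are compact and well separated.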
The general idea behind classification using neural networks is that data are fed into the network, and one of the outputs ‘fires’, signifying that the class associated with this output has been recognized. For example, in Figure 3 the measurements (9.5, 1.9) are fed into the network, and the output corresponding to ‘banana’ fires.

Figure 3 A trained neural network responding to input data

The network in Figure 3 has four outputs, each of which corresponds to one of the classes of fruit. If the numbers on the output neurons are in the range 0 to 1, we let 0 mean ‘not recognized’ and 1 mean ‘recognized’. So, for example, if the four outputs of the network were (0.1, 0.9, 0.2, 0.1) the second output would be closest to 1, while the other three outputs are closer to 0. This would be interpreted as the second neuron having fired, and the class associated with it having been recognized (in this case, ‘banana’).

So, for the input (9.5, 1.9) the desired output would be (0, 1, 0, 0). Thus (9.5, 1.9) and (0, 1, 0, 0) can be used as a ‘training pair’ of known inputs and outputs. A set of training pairs can be prepared from the measurements of the fruit in Figure 1 as follows:

pears                          strawberries
input data    output data      input data    output data
5.2, 3.1      1, 0, 0, 0       2.1, 1.4      0, 0, 1, 0
6.3, 2.4      1, 0, 0, 0       2.8, 1.8      0, 0, 1, 0
6.7, 1.8      1, 0, 0, 0       2.0, 1.8      0, 0, 1, 0
5.3, 2.9      1, 0, 0, 0       2.2, 2.0      0, 0, 1, 0

bananas                        oranges
input data    output data      input data    output data
8.5, 1.9      0, 1, 0, 0       4.7, 4.5      0, 0, 0, 1
8.3, 1.6      0, 1, 0, 0       4.6, 4.2      0, 0, 0, 1
9.7, 2.0      0, 1, 0, 0       4.6, 4.1      0, 0, 0, 1
7.5, 1.7      0, 1, 0, 0       4.0, 3.7      0, 0, 0, 1

These pairs of inputs and desired outputs are called the training data. Once a system is trained, it can be used to classify previously unseen data.

Neural networks provide a means of classifying data like this. Given a pair of input numbers, the network will report which class has been recognized.
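The output coding described above can be written down as a short Python sketch. The function names desired_output and interpret are illustrative, not RobotLab's. Picking the largest output as the one that has ‘fired’ assumes, as in the text, that all outputs lie between 0 and 1.

```python
# Output order used in the text: pears, bananas, strawberries, oranges.
CLASSES = ["pear", "banana", "strawberry", "orange"]


def desired_output(label):
    """Build the training target: 1 for the named class, 0 for the others."""
    return [1 if name == label else 0 for name in CLASSES]


def interpret(outputs):
    """Return the class whose output is closest to 1, i.e. the largest output."""
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[winner]


# The training pair discussed above: input (9.5, 1.9), desired output (0, 1, 0, 0).
training_pair = ((9.5, 1.9), desired_output("banana"))
print(training_pair)                    # ((9.5, 1.9), [0, 1, 0, 0])
print(interpret([0.1, 0.9, 0.2, 0.1]))  # banana
```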
Activity: Classifying fruit with a neural network

From the T184 Guide, double-click on Neural network editor. Your screen will be similar to Figure 4. This network has already been trained with the fruit data from above.

Figure 4 The T184 RobotLab neural network screen

The screen in Figure 4 is divided into three parts. On the left is the Training window, in the middle is the Data window and on the right is the Network window.

The network has two inputs, the long and short measurements of the fruit, which are shown at the bottom of the Network window. It has four outputs, shown at the top, which indicate the class of fruit that has been recognized. Each output is associated with one class of fruit. Left to right, these are: pears, bananas, strawberries and oranges. For example, a number close to 1 on the ‘orange’ output and numbers close to 0 on the other three signify that ‘orange’ has been recognized.

Item 1 in the Data window reads: Tr 1: 5.2 3.1 = 1 0 0 0 ‘pear’. The letters Tr mean this item was selected for training. The 1 next to it means this is the first item in the list. The numbers 5.2 3.1 are the measurements of the first pear in Figure 1. The numbers 1 0 0 0 mean that this is the first of four classes of things to be recognized, and ‘pear’ means that the members of this class are called pears. The numbers 1 0 0 0 are what we would expect the network to output when it is presented with the input pattern (5.2, 3.1).

In item 2, 0 1 0 0 ‘banana’ means that banana is the second of the four classes. The third class (item 3) is strawberry (0 0 1 0 ‘strawberry’) and the fourth (item 4) is orange (0 0 0 1 ‘orange’).

This network was trained using items 1 to 16. The other data items, 17 to 23, are previously unseen by the network.
Note that these are the pairs you tried to recognize from Figure 2 earlier in this session.

If you have not already done so, click on item 17 in the Data window. Your screen should be similar to Figure 5. The third output from the left has a value approaching 1 (0.960) while the other three outputs are nearly 0 (0.011, 0.000 and 0.023). The class associated with this set of outputs is ‘strawberries’, which is the class that the program has associated with the two measurements 2.5 and 2.1.

Figure 5 The T184 RobotLab neural network screen

To find out if the trained neural network is able to classify the remaining test items correctly, highlight items 18 to 23 in turn, and check their classifications in the Network window. You should find that the trained network is able to recognize each fruit correctly from the input data.

3 The anatomy of a neural network

The main components of neural networks are the neurons, shown as circles in Figure 6. (You saw something similar to this in Lesson 3.) Neurons have inputs, where the data enter to be processed, and outputs for the processed data. The outputs of some of the neurons can be inputs to others, and neurons are typically arranged in layers. Neural networks of this type (‘multilayer perceptrons’) have an input layer of neurons, an output layer of neurons, and one or more hidden layers of neurons, as illustrated in Figure 6.

Figure 6 The parts of a neural network

When designing a neural network it is necessary to decide how many neurons there will be in the input layer, how many neurons there will be in the output layer, how many hidden layers there will be and how many neurons there should be in the hidden layers.

The number of inputs is usually the number of data items to be entered into the network. Usually it is necessary to process the data before they enter the network to get them into the most appropriate form. This is shown in Figure 6 as a preprocessor.
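To make the layered structure concrete, here is a minimal Python sketch of data passing through an untrained network with two inputs, one hidden layer and four outputs. Several things here are assumptions and are not taken from RobotLab: the sigmoid function (chosen because it keeps every output between 0 and 1), the five hidden neurons, the random starting weights, and the preprocessing step of dividing the measurements by 10.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range 0 to 1 (an assumed neuron function)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_inputs, n_neurons, rng):
    """One row of weights per neuron; the last entry of each row is a bias."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward(layers, inputs):
    """Pass the inputs through each layer in turn and return the final outputs."""
    values = list(inputs)
    for layer in layers:
        # zip stops at the shorter list, so the bias is added separately.
        values = [sigmoid(sum(w * v for w, v in zip(neuron, values)) + neuron[-1])
                  for neuron in layer]
    return values


rng = random.Random(0)
# Hidden layer: 2 inputs feeding 5 neurons. Output layer: 5 inputs feeding 4 neurons.
network = [make_layer(2, 5, rng), make_layer(5, 4, rng)]

# Hypothetical preprocessing: the pair (9.5, 1.9) divided by 10.
outputs = forward(network, [0.95, 0.19])
print([round(o, 3) for o in outputs])
```

Because the weights are random, the four outputs printed here mean nothing yet. Training, described in Section 4, is the process of adjusting the weights until the outputs match the desired codes.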
The number of outputs depends on the purpose of the network and how the outputs will be interpreted. When the network is used as a classifier, as in our example, the number of outputs is usually the number of classes. For example, the network used to classify the four types of fruit had four outputs.

In most cases a single hidden layer is all that is required. The number of hidden neurons required depends on how clear the clusters are. Too few or too many hidden neurons may result in the network not behaving as it should. There is no analytic way of determining the optimum number of hidden neurons, and engineers tend to make their selection based on their experience of what has worked in the past.

The outputs of a neural network have to be interpreted, often being converted into a useful form by a post-processor. For example, in the fruit classifier the post-processor displayed the name of the fruit identified.

4 Training a neural network

Consider a robot bartender. One of its many jobs could be to pick up bar stools when they have fallen over. This means it has to know whether a stool has fallen over or not. The correct identification of a fallen bar stool is what you are going to ‘teach’ the robot, and you are going to do this using a neural network.

Let us suppose that the robot has a vision system and is able to recognize the stools as 2D objects. It surrounds them with a red rectangle whose dimensions it can calculate (Figure 7).

Figure 7 Four stools to be used as training data

Bar stools will be classified as either ‘upright’ or ‘fallen over’. The network will have two inputs (the dimensions of the horizontal and vertical sides of the red rectangles) and two outputs (upright: 1, 0 and fallen over: 0, 1).

Activity: Building the network

With the Neural network editor program open, click on File and select the New Network option, as shown in Figure 8(a).
The New Network window, containing the settings for the existing data, will then appear (Figure 8(b)). First click the box with a tick in it next to the words ‘For existing data’, to clear the tick. You must do this before you can change the network settings.

Figure 8 Defining a new network: panels (a), (b) and (c)

Click in the Output Layer box, and type in the number 2. Click in the Input Layer box, and change this to 2 if necessary. The inputs to the network will be the horizontal and vertical dimensions of the box bounding the stool.

We need to decide how many hidden layers should be used and how many neurons there should be in each. This is a simple example, so one hidden layer with five neurons should be enough. In the box marked 1st hidden change the value to 5. Change the value in the box marked 2nd hidden to 0, which denotes that there is no second hidden layer (Figure 8(c)).

Click on OK and a message will appear. Click OK again, and you will have a new network (Figure 9). Your numbers will be different to mine because the network assigns random values. You are now ready to input your data in order to train the robot to recognize upright and fallen bar stools.

Figure 9 The T184 neural network windows, ready to input data

Activity: Entering data and training a network

The training data for the bar stools in Figure 7 are:

Inputs      Outputs    Position of bar stool
75, 110     1, 0       upright
76, 142     1, 0       upright
107, 68     0, 1       fallen
124, 71     0, 1       fallen

Input the data for the first bar stool as shown in Figure 10(a). Click on OK, then New Item and input the data for the second stool. Input the data for the remaining two stools. Your finished data should look similar to Figure 10(b).

You may get an error message and be unable to proceed (Figure 11(a)). If so, click on OK, then Delete. This will take you out of data entry mode and enable you to proceed.

You now need to tell RobotLab that the input data are to be used for training.
To do this, click to the left of each item number in the Data window to select it. The data item will turn dark red, and the letters Tr for ‘training’ will appear (Figure 11(b)). You are now ready to train the network.

Figure 10 (a) Inputting the first data set (b) After inputting all four data items

Figure 11 (a) Error message after data entry (b) Training items selected

Click on ‘Seed and Scale’ to initialize the network with random weights, and click on ‘Cycle Until’. The network should then cycle until it is trained.

Figure 12 The menu for training the network

How can you tell if the network is trained? In the following, the outputs in your network may vary from mine because they depend on the initial (random) values.

When a neural network is trained it will recognize its own training data. If you click on a data item in the Data window it becomes the currently selected item (Figure 13). Click on each of the four training items in turn, and look in the top left-hand corner of the Network window. For example, when you click on item 3, the post-processor gives the message Matches ‘Fallen Stool’ = item 3. The outputs are 0.084 and 0.915, which are very close to 0 and 1, the desired outputs for this training item. The network has thus been ‘trained to recognize’ this item.

Figure 13 Testing the network on its training data

You should find that the network correctly recognizes all the training data, although there is a small chance that this won’t happen. If you get the wrong matches, retrain the network by clicking on ‘Seed and Scale’, then ‘Cycle Until’ and try again.

Saving and retrieving data

To save the data and network you have just entered, click on ‘File’ and ‘Save As...’. Select the c:\T184\Lessons\Lesson-8\ directory and give the file the name mystore.nnd.

To reload a saved file, click on ‘File’, then ‘Load…’, and browse the c:\T184\Lessons\Lesson-8\ directory.
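If you are curious about what training involves, the Python sketch below trains a network of the same shape (two inputs, five hidden neurons, two outputs) on the four bar stool items. It is only a sketch under stated assumptions. The text does not say which training method RobotLab uses, so the method here (backpropagation with sigmoid neurons) is an assumption, as are the scaling divisor of 150, the learning rate, the seed and the fixed number of cycles. The first two lines of train play a role similar to ‘Seed and Scale’, and the loop plays a role similar to ‘Cycle Until’, except that it runs for a fixed number of cycles instead of stopping when the error is small.

```python
import math
import random

# Training data from the activity: (horizontal, vertical) sides of the rectangle
# and the desired outputs (upright: 1, 0; fallen over: 0, 1).
TRAINING = [
    ((75, 110), (1, 0)),
    ((76, 142), (1, 0)),
    ((107, 68), (0, 1)),
    ((124, 71), (0, 1)),
]
SCALE = 150.0  # assumed divisor that brings the inputs into roughly 0 to 1


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def new_layer(n_in, n_out, rng):
    # One weight row per neuron; the final entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_output(layer, values):
    return [sigmoid(sum(w * v for w, v in zip(row, values)) + row[-1]) for row in layer]


def predict(hidden, output, inputs):
    scaled = [v / SCALE for v in inputs]
    return layer_output(output, layer_output(hidden, scaled))


def train(cycles=5000, rate=0.5, seed=1):
    rng = random.Random(seed)  # start from random weights
    hidden, output = new_layer(2, 5, rng), new_layer(5, 2, rng)
    for _ in range(cycles):
        for inputs, target in TRAINING:
            x = [v / SCALE for v in inputs]
            h = layer_output(hidden, x)
            o = layer_output(output, h)
            # Error signal at each output neuron (squared error, sigmoid slope).
            d_out = [(o[k] - target[k]) * o[k] * (1 - o[k]) for k in range(len(o))]
            # Error signal passed back to each hidden neuron.
            d_hid = [h[j] * (1 - h[j]) * sum(d_out[k] * output[k][j] for k in range(len(o)))
                     for j in range(len(h))]
            # Nudge every weight and bias in the direction that reduces the error.
            for k, row in enumerate(output):
                for j in range(len(h)):
                    row[j] -= rate * d_out[k] * h[j]
                row[-1] -= rate * d_out[k]
            for j, row in enumerate(hidden):
                for i in range(len(x)):
                    row[i] -= rate * d_hid[j] * x[i]
                row[-1] -= rate * d_hid[j]
    return hidden, output


hidden, output = train()
for inputs, target in TRAINING:
    print(inputs, target, [round(v, 3) for v in predict(hidden, output, inputs)])
```

As in RobotLab, the exact outputs depend on the random starting weights, so a different seed will give slightly different numbers.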
Activity: Testing the network on unseen data

With the trained network of the previous activity open, input the test data as follows (Figure 14 and items 5 to 8 below). Note: this time you only enter the input data; leave the Output and Item Label boxes blank.

Figure 14 Previously unseen stools for testing

Click on each item of ‘unseen’ data. I hope you agree that this is rather remarkable. Somehow the network has learnt the essence of being upright or fallen over from the training data you provided.

Of course it’s ‘obvious’ that if the height of the enclosing rectangle is greater than its width the stool is upright, and if it is less the stool is on its side. So surely a simple logical test would do just as well!

The important thing to note here is that the network learnt from examples. This approach to information processing is completely different from programming, which you did in previous RobotLabs. In programming, the programmer tries to abstract some principles from the available data, and then writes code to implement these principles. When the data are complex the programmer may not see the underlying pattern. The neural network approach, however, adapts to the data and automatically picks out the patterns.

5 Recognizing textured backgrounds

In RobotLab4 Simon had to ‘decide’ which room it was in on the basis of the colour of the floors. If the floors were the same colour but different textures, how might Simon distinguish them? Figure 15 shows six textures to be classified.

Figure 15 Textures to be classified

Double-click on Collect texture data in the T184 Guide. The program code is shown below:

This program causes the robot to circle around while collecting data on the floor textures. It collects three statistics: the minimum grey level (gmin), the maximum grey level (gmax) and the average, or mean, grey level (gmean).

I ran the program a number of times. The data I obtained are listed under ‘My results’ below.
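The three statistics are easy to state precisely. The following Python sketch is not the RobotLab program itself. It only shows how gmin, gmax and gmean could be calculated from a list of grey-level readings. The function name texture_statistics and the sample readings are hypothetical.

```python
def texture_statistics(readings):
    """Return (gmin, gmax, gmean) for a list of grey-level readings."""
    if not readings:
        raise ValueError("at least one reading is needed")
    gmin = min(readings)                   # darkest reading
    gmax = max(readings)                   # lightest reading
    gmean = sum(readings) / len(readings)  # average reading
    return gmin, gmax, gmean


# Hypothetical grey levels logged during one circuit (not real RobotLab data).
sample = [54, 7, 85, 60, 41, 77]
print(texture_statistics(sample))  # (7, 85, 54.0)
```

Note how much information is thrown away: six readings (or many more) are reduced to three numbers, and it is these three numbers that become the inputs to the network.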
My results

Trial 1    Denim   Parchment   Granite   Sand   Oak   Marble
gmin       7       92          32        20     38    10
gmax       85      96          92        86     75    42
gmean      54      94          66        58     60    22

Trial 2    Denim   Parchment   Granite   Sand   Oak   Marble
gmin       15      92          31        39     41    10
gmax       81      96          98        87     74    51
gmean      56      94          66        62     60    23

Trial 3    Denim   Parchment   Granite   Sand   Oak   Marble
gmin       9       92          9         26     46    7
gmax       82      96          98        96     76    51
gmean      47      94          65        56     63    22

Activity: Collect texture data

Now it’s your turn. Run the ‘Collect texture data’ program to collect data to fill in the blank table below. Do this by positioning the robot in the centre of each texture square in turn, then run the program. Repeat three times for each texture.

The values of gmin, gmax and gmean for each square can be obtained from the Variables window as shown in Figure 16. To open the Variables window click on ‘6. Variables’ in the Windows dropdown menu.

Figure 16 The Variables window showing the values of gmin, gmax and gmean

Trial 1    Denim     Parchment   Granite   Sand      Oak       Marble
gmin       _______   _______     _______   _______   _______   _______
gmax       _______   _______     _______   _______   _______   _______
gmean      _______   _______     _______   _______   _______   _______

Trial 2    Denim     Parchment   Granite   Sand      Oak       Marble
gmin       _______   _______     _______   _______   _______   _______
gmax       _______   _______     _______   _______   _______   _______
gmean      _______   _______     _______   _______   _______   _______

Trial 3    Denim     Parchment   Granite   Sand      Oak       Marble
gmin       _______   _______     _______   _______   _______   _______
gmax       _______   _______     _______   _______   _______   _______
gmean      _______   _______     _______   _______   _______   _______

Click on File and New Network, as you did before. This time your network requires three inputs (gmin, gmax and gmean) and six outputs (denim, parchment, granite, sand, oak and marble).
I suggest you use the following output codes:

Denim       1, 0, 0, 0, 0, 0
Parchment   0, 1, 0, 0, 0, 0
Granite     0, 0, 1, 0, 0, 0
Sand        0, 0, 0, 1, 0, 0
Oak         0, 0, 0, 0, 1, 0
Marble      0, 0, 0, 0, 0, 1

Input the data from Trials 1 and 2 as training data. Recall that training items are set up by clicking on the item in the Data window, to the left of the item number. When your network is trained, use the data from Trial 3 to test the network.

My results

I got the following results:

13. Matches ‘Denim’ = item 1        Correct
14. Matches ‘Parchment’ = item 2    Correct
15. Matches ‘Granite’ = item 3      Correct
16. Matches ‘Granite’ = item 3      Error!
17. Matches ‘Oak’ = item 5          Correct
18. Matches ‘Marble’ = item 6       Correct

Discussion

The random numbers used to set up your untrained network will be different to mine, so your results may differ. When I looked at my data I was not surprised by my observations. I plotted two so-called scatter diagrams of gmin against gmean and gmax against gmean (Figure 17).

Figure 17 Scatter diagrams for the texture data

Scatter diagrams can be a bit misleading, but given the closeness of the classes for denim (D) and oak (O), I think the network has done a remarkable job. Sand (S) and granite (G) are also close but much more difficult to separate with these data. The classes for parchment (P) and marble (M) are quite distinct from the others. Where classes are distinct like this, a network usually has no problem discriminating between them.

The art of using neural networks is to find measurements that separate classes well. For example, the classes of parchment and marble are well separated by the gmin, gmax and gmean data. When the existing data are not sufficiently discriminating, it is necessary to look for a new measurement. What could be used here? I experimented with the numbers of pixels having light sensor values below 10% and 20%.
I don’t think that the gmin statistics are very robust, but these values gave me better results for discriminating the troublesome granite and sand classes. A more sophisticated approach would be to exploit the differences in the waveforms shown by the downloaded data in Figure 18, but I did not try this.

Figure 18 The different downloaded data logged for granite and sand

If you have a MindStorms kit you may go on to Section 6, which is optional. If you don’t have a MindStorms kit, or if you have a kit but would like to skip Section 6, you should return to the T184 website now.

6 The Pedro challenge (optional)

Double-click on Pedro data collection in the T184 Guide. This program makes Pedro move forward about 5 cm, collecting data as it goes. The send command is used to return values of gmin, gmax and gmean.

Before running the program make sure you have Monitor messages switched on. (Click on Monitor messages in the Connect dropdown menu.) The output data will be played as .wav files, but you can ignore these. Instead, maximize the Messages window and read off the output data from there (the numbers only, not the text).

Collect three sets of data for each of the four areas shown in Figure 19. Use these data to train a neural network with three inputs and four outputs. I suggest you have six neurons in a single hidden layer. When your network is trained, collect another set of data. Are the textures correctly recognized?

You may find that the network fails to classify all four backgrounds. If so, you could try using some different statistics, such as the number of measurements with light sensor readings less than 47. Note that for Pedro, black is typically recorded as 30% and white as 51%, compared with 0% and 100% for Simon. This sensor response is a property of the Lego light sensor and of the RCX electronics and software.

This is the end of RobotLab8. You should now return to the T184 website.
Figure 19 Test textures for the Pedro challenge (clockwise from top left: tiles, granite, squares, lines)