Exercise 1. Function Approximation

Your task is to create and train a neural network that solves the XOR problem. XOR is a function that returns 1 when its two inputs are not equal, see the table below:

The XOR problem
 A   B   A XOR B
 1   1      0
 1   0      1
 0   1      1
 0   0      0

To solve this we will need a feedforward neural network with two input neurons and one output neuron. Because the problem is not linearly separable, it will also need a hidden layer with two neurons.

Now we know what our network should look like, but how do we create it? To create a new feedforward neural network, use the command newff. You have to enter the minimum and maximum of the input values, the number of neurons in each layer, and optionally the activation functions.

>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'})

The variable net now contains an untrained feedforward neural network with two neurons in the input layer, two neurons in the hidden layer and one output neuron, exactly as we want it. The [0 1; 0 1] tells Matlab that the input values range between 0 and 1. The {'logsig','logsig'} tells Matlab that we want to use the logsig function as the activation function in all layers. The first parameter tells the network how many nodes there should be in the input layer, so you do not have to specify this in the second parameter. You have to specify as many transfer functions as there are layers, not counting the input layer. If you do not specify any transfer functions, Matlab uses the default settings.

[Figure: the logsig activation function]

Now we want to test how good our untrained network is on the XOR problem. First we construct a matrix of the inputs. The inputs to the network always go in the columns of the matrix. To create a matrix with the inputs "1 1", "1 0", "0 1" and "0 0" we enter:

>> input = [1 1 0 0; 1 0 1 0]

Now we have constructed the inputs to our network. Let us push these into the network to see what it produces as output. The command sim is used to simulate the network and calculate the outputs; for more information on how to use the command, type helpwin sim. The simplest way to use it is to enter the name of the neural network and the input matrix; it returns an output matrix.

>> output = sim(net,input)
output =
    0.5923    0.0335    0.9445    0.3937

The output was not exactly what we wanted! We wanted (0 1 1 0) but got roughly (0.60 0.03 0.95 0.40). (Note that your network might give a different result, because the network's weights are given random values at initialization.)

You can now plot the output and the targets; the targets are the values that we want the network to generate. Construct the target vector:

>> target = [0 1 1 0]

To plot points we use the command plot. We want the targets to be drawn as small circles, so we use the command:

>> plot(target,'o')

We want to plot the output in the same window. Normally the contents of a window are erased when you plot something new in it. In this case we want the targets to remain in the picture, so we use the command hold on. The output is plotted as +'s.

>> hold on
>> plot(output,'+')

In the resulting figure it is easy to see that the network does not give the wanted results. To change this we have to train it.
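Under the hood, sim simply propagates the input through the layers. As a minimal sketch of that forward pass for one input pattern, using the weight fields net.IW{1,1} and net.LW{2,1} that are examined in the next section, plus the bias fields net.b{1} and net.b{2} (not discussed in this exercise, but also part of the net structure), the following should reproduce the first column of the output above:

>> p = [1; 1];                              % one input pattern, as a column
>> a1 = logsig(net.IW{1,1}*p + net.b{1});   % hidden layer activations
>> a2 = logsig(net.LW{2,1}*a1 + net.b{2})   % network output, equals sim(net,p)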
Manually set weights

Now we will train the network by hand, by adjusting the weights manually. The network we have constructed so far does not behave as it should, and to correct this the weights will be adjusted. All the weights are stored in the net structure that was created with newff. The weights are numbered by the layers they connect and by the neurons within these layers. To get the values of the weights between the input layer and the first hidden layer we type:

>> net.IW
ans =
    [2x2 double]
              []

>> net.IW{1,1}
ans =
    5.5008   -5.6975
    2.5404   -7.5011

This means that the weight from the second neuron in the input layer to the first neuron in the hidden layer is -5.6975. To change it to 1, enter:

>> net.IW{1,1}(1,2)=1;
>> net.IW{1,1}
ans =
    5.5008    1.0000
    2.5404   -7.5011

The weights between the hidden layers and the output layer are stored in the .LW component, which is used in the same manner as .IW.

>> net.LW
ans =
              []    []
    [1x2 double]    []

>> net.LW{2,1}
ans =
   -3.5779   -4.3080

The change we made to the weight makes our network give a different output when we simulate it; try it by entering:

>> output = sim(net,input)
output =
    0.8574    0.0336    0.9445    0.3937
>> plot(output,'g*');

The new output appears as green stars in your figure.

Training Algorithms

The neural network toolbox has several training algorithms already implemented. That is good, because they can do the heavy work of training much more smoothly and quickly than we can by adjusting the weights manually. Now let us apply the default training algorithm to our network. The Matlab command to use is train; it takes the network, the input matrix and the target matrix as input, and returns a new, trained network. For more information type helpwin train. In this example we do not need all the information that the training algorithm shows, so we turn it off by entering:

>> net.trainParam.show = NaN;

The most important training parameters are .epochs, which determines the maximum number of epochs to train, and .show, the interval between each presentation of the training progress. If the gradient of the performance is less than .min_grad, training is stopped. The .time component determines the maximum time to train.

To train the network, enter:

>> net = train(net,input,target);

Because of the small size of the network, training is done in only a second or two. Now we simulate the network again, to see how it reacts to the inputs:

>> output = sim(net,input)
output =
    0.0000    1.0000    1.0000    0.0000

That is exactly what we wanted the network to output! You may now plot the output and see that the +'s fall inside the o's. Now examine the weights that the training algorithm has set. Do they look like the weights that you found?

>> net.IW{1,1}
ans =
   11.0358   -9.5595
   16.8909  -17.5570

>> net.LW{2,1}
ans =
   25.9797  -25.7624
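To summarize Exercise 1, the whole workflow can be collected in one short script; a minimal sketch built only from the commands above (the trained weights and outputs will vary from run to run):

>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'});  % 2-2-1 network
>> input = [1 1 0 0; 1 0 1 0];                         % the four XOR patterns
>> target = [0 1 1 0];                                 % desired outputs
>> net.trainParam.show = NaN;                          % suppress training output
>> net = train(net,input,target);                      % default training algorithm
>> output = sim(net,input)                             % should be close to target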
Exercise 2. Prediction (Timeseries, stock …)

Solution to the timeseries competition using an MLP network. First load the variable data (1 x 1000): import the timeseries.txt file and then enter:

>> data = transpose(timeseries)

Reset the random generators to their initial state; these generators are used when the network weights are initialized:

>> randn('state',0);
>> rand('state',0);

The idea is to use the time series that we have to set up a network, train it, and test it. We will give the network three inputs (the three previous values of the series), using 997 data points. So we define a 3x997 input matrix from the loaded time series. In the same way we define our output matrix, which consists of the fourth element up to the end:

>> in_data = [ data(1:end-3); data(2:end-2); data(3:end-1)];
>> out_data = data(4:end);

Now, split the data into training and test sets:

>> in_tr = in_data(:,1:900);
>> out_tr = out_data(:,1:900);
>> in_tst = in_data(:,901:end);
>> out_tst = out_data(:,901:end);

Let N be the number of neurons in the hidden layer:

>> N = 7;

Now, to create a feedforward network with 3 inputs, N hidden neurons with a tanh nonlinearity, and one output with a linear activation function, we use the same function as in Exercise 1, newff:

>> net = newff([min(in_tr,[],2) max(in_tr,[],2)],[N 1],{'tansig' 'purelin'});
>> net.trainParam.epochs = 1000;
>> V.P = in_tst;
>> V.T = out_tst;

Let's train the network; the default training method is Levenberg-Marquardt:

>> [net,tr] = train(net,in_tr,out_tr,[],[],V);

The structure of the network can be optimized by monitoring the error on the test set. Here is a list of test-set errors as a function of the number of hidden neurons:

N = 2:  1.6889e-03
N = 3:  1.5500e-03
N = 4:  6.0775e-10
N = 5:  5.2791e-08
N = 6:  3.2476e-08
N = 7:  3.2816e-10
N = 8:  6.3030e-10
N = 9:  5.8722e-10
N = 10: 2.7228e-09
N = 15: 5.16033e-08
N = 30: 2.3727e-11

The number of delays could be optimized in the same way. Note that different random initializations would also give different values.

Now, the trained network can be used for predicting new data points recursively (ten steps ahead):

>> for i=1:10
>> data(end+1)=sim(net,data(end-2:end)');
>> end
>> data(end-9:end)
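The error table above can be reproduced with a small loop over the number of hidden neurons. A sketch, assuming the reported measure is the mean squared error on the test set (the exact values depend on the random initialization):

>> for N = [2:10 15 30]
>>     net = newff([min(in_tr,[],2) max(in_tr,[],2)],[N 1],{'tansig' 'purelin'});
>>     net.trainParam.epochs = 1000;
>>     net.trainParam.show = NaN;                  % suppress training output
>>     net = train(net,in_tr,out_tr,[],[],V);     % monitor the test set as before
>>     err = mean((sim(net,in_tst) - out_tst).^2); % test-set MSE
>>     fprintf('N = %d: %g\n', N, err);
>> end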