Exercise 1. Function Approximation
Your task is to create and train a neural network that solves the XOR problem.
XOR is a function that returns 1 when the two inputs are not equal, see table below:
The XOR problem

A   B   A XOR B
1   1   0
1   0   1
0   1   1
0   0   0
To solve this we will need a feedforward neural network with two input neurons and one
output neuron. Because the problem is not linearly separable, the network also needs a
hidden layer with two neurons.
Now we know what our network should look like, but how do we create it?
To create a new feedforward neural network, use the command newff. You have to enter
the min and max of the input values, the number of neurons in each layer, and optionally
the activation functions.
>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'})
The variable net will now contain an untrained feedforward neural network with two
neurons in the input layer, two neurons in the hidden layer and one output neuron, exactly
as we want it. The [0 1; 0 1] tells Matlab that the input values range between 0 and 1.
The {'logsig','logsig'} tells Matlab that we want to use the logsig function as activation
function in all layers. The first parameter already tells the network how many nodes there
should be in the input layer, so you do not have to repeat this in the second parameter. You
have to specify at least as many transfer functions as there are layers, not counting the
input layer. If you do not specify any transfer functions, Matlab will use the default
settings.
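If you want to verify that the created network has the structure we asked for, you can inspect a few fields of the net object (a quick sketch; the exact display may vary with the toolbox version):
>> net.numLayers      % 2: the hidden layer and the output layer
>> net.layers{1}.size % 2 neurons in the hidden layer
>> net.layers{2}.size % 1 neuron in the output layer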
[Figure: the logsig activation function]
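For reference, logsig is the logistic sigmoid, logsig(n) = 1/(1 + exp(-n)), which squashes any input into the range (0,1). For example:
>> logsig(0)
ans =
0.5000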
Now we want to test how good our untrained network is on the XOR problem. First we
construct a matrix of the inputs. The input to the network is always in the columns of the
matrix. To create a matrix with the inputs "1 1", "1 0", "0 1" and "0 0" we enter:
>> input = [1 1 0 0; 1 0 1 0]
Now we have constructed inputs to our network. Let us push these into the network to see
what it produces as output. The command sim is used to simulate the network and
calculate the outputs; for more info on how to use the command, type helpwin sim. The
simplest way to use it is to enter the name of the neural network and the input matrix, and
it returns an output matrix.
>> output=sim(net,input)
output =
0.5923 0.0335 0.9445 0.3937
The output was not exactly what we wanted! We wanted (0 1 1 0) but got something close
to (0.59 0.03 0.94 0.39). (Note that your network might give a different result, because the
network's weights are given random values at initialization.)
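(As an aside, if you want to re-randomize the weights and start again from a fresh untrained network, the toolbox command init does this:)
>> net = init(net);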
You can now plot the output and the targets; the targets are the values that we want the
network to generate. Construct the target vector:
>> target = [0 1 1 0]
To plot points we use the command "plot". We want the targets to be drawn as small
circles, so we use the command:
>> plot(target, 'o')
We want to plot the output in the same window. Normally the contents of a figure window
are erased when you plot something new in it. In this case we want the targets to remain in
the picture, so we use the command hold on. The output is plotted as +'s.
>> hold on
>> plot(output, '+')
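(Optionally, you can add a legend so it is clear which symbols are the targets and which are the outputs:)
>> legend('target','output')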
In the resulting figure it is easy to see that the network does not give the desired
results. To change this we have to train it. First we will train the network by hand by
adjusting the weights manually.
Manually set weights
The network we have constructed so far does not really behave as it should. To correct
this, the weights will be adjusted. All the weights are stored in the net structure that was
created with newff. The weights are numbered by the layers they connect and the neurons
within these layers. To get the value of the weights between the input layer and the
hidden layer we type:
>> net.IW
ans =
[2x2 double]
[]
>> net.IW{1,1}
ans =
5.5008 -5.6975
2.5404 -7.5011
This means that the weight from the second neuron in the input layer to the first
neuron in the hidden layer is -5.6975. To change it to 1, enter:
>> net.IW{1,1}(1,2)=1;
>> net.IW{1,1}
ans =
5.5008 1.0000
2.5404 -7.5011
The weights between the hidden layer and the output layer are stored in the .LW
component, which is indexed in the same manner as .IW.
>> net.LW
ans =
    []              []
    [1x2 double]    []
>> net.LW{2,1}
ans =
-3.5779 -4.3080
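(As a side note, the biases are stored in the .b component of the net structure and can be inspected and changed in the same way; the exercise itself only asks about the weights:)
>> net.b{1}
>> net.b{2}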
The change we made to the weight makes our network give a different output when we
simulate it; try it by entering:
>> output=sim(net,input)
output =
0.8574 0.0336 0.9445 0.3937
>> plot(output,'g*');
Now the new output will appear as green stars in your picture.
Training Algorithms
In the neural network toolbox there are several training algorithms already implemented.
That is good, because they can do the heavy work of training much more smoothly and faster
than we can by manually adjusting the weights. Now let us apply the default training
algorithm to our network. The Matlab command to use is train; it takes the network, the
input matrix and the target matrix as input. The train command returns a new trained
network. For more information type helpwin train. In this example we do not need all the
information that the training algorithm shows, so we turn it off by entering:
>> net.trainParam.show=NaN;
The most important training parameters are .epochs, which determines the maximum
number of epochs to train, and .show, the interval between each presentation of training
progress. If the gradient of the performance is less than .min_grad the training is ended.
The .time component determines the maximum time to train.
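For example, to cap training at 500 epochs and at most 60 seconds (the values here are only illustrative):
>> net.trainParam.epochs = 500;
>> net.trainParam.time = 60;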
And to train the network enter:
>> net = train(net,input,target);
Because of the small size of the network, the training is done in only a second or two.
Now we try to simulate the network again, to see how it reacts to the inputs:
>> output = sim(net,input)
output =
0.0000 1.0000 1.0000 0.0000
That was exactly what we wanted the network to output! You may now plot the output
and see that the +'s fall on the o's. Now examine the weights that the training algorithm
has set; do they look like the weights that you found?
>> net.IW{1,1}
ans =
11.0358 -9.5595
16.8909 -17.5570
>> net.LW{2,1}
ans =
25.9797 -25.7624
Exercise 2. Prediction (Timeseries, stock …)
Solution to the timeseries competition using an MLP network.
First load the data as a variable of size 1 x 1000: import the timeseries.txt file and then
transpose it into a row vector:
>> data = transpose(timeseries)
Reset the random number generators to their initial state. These generators are used during
network creation:
>> randn('state',0);
>> rand('state',0);
The idea is to use the timeseries that we have to set up a network, train it and test it.
We will feed the network 3 inputs (the three previous values of the series), which gives us
997 data points. So, we define a 3x997 matrix from the loaded timeseries. In the same way,
we define our output matrix, which consists of the fourth element up to the end:
>> in_data = [ data(1:end-3); data(2:end-2); data(3:end-1)];
>> out_data = data(4:end);
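A quick sanity check of the dimensions (just to confirm the reshaping above):
>> size(in_data)    % should be 3 x 997
>> size(out_data)   % should be 1 x 997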
Now, split the data into training and test sets:
>> in_tr=in_data(:,1:900);
>> out_tr=out_data(:,1:900);
>> in_tst=in_data(:,901:end);
>> out_tst=out_data(:,901:end);
Let N be the number of neurons in the hidden layer:
>> N = 7;
Now, to create a feedforward network with 3 inputs, N hidden neurons with a tanh
nonlinearity and one output with a linear activation function, we use the same function as
in Exercise 1, newff:
>> net = newff([min(in_tr,[],2) max(in_tr,[],2)],[N 1],{'tansig' 'purelin'});
>> net.trainParam.epochs = 1000;
The test set is packed into a structure V, with field .P for the inputs and .T for the targets,
so that train can monitor the error on it during training:
>> V.P = in_tst;
>> V.T = out_tst;
Let's train the network; the default training method is Levenberg-Marquardt:
>> [net,tr] = train(net,in_tr,out_tr,[],[],V);
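To see how well the trained network does on the test set, you can for example compare its predictions with the targets (a minimal sketch; out_pred and err are just illustrative variable names):
>> out_pred = sim(net,in_tst);
>> err = mean((out_pred - out_tst).^2)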
The structure of the network can be optimized by monitoring the error on the test set.
Here is a list of test set errors as a function of the number of hidden neurons:
N=2: 1.6889e-03
N=3: 1.5500e-03
N=4: 6.0775e-10
N=5: 5.2791e-08
N=6: 3.2476e-08
N=7: 3.2816e-10
N=8: 6.3030e-10
N=9: 5.8722e-10
N=10: 2.7228e-09
N=15: 5.16033e-08
N=30: 2.3727e-11
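A sketch of how such a table could be produced (your numbers will differ, since they depend on the random initialization; tmpnet is a temporary variable so that the trained network from above is not overwritten):
>> for N = [2 3 4 5 6 7 8 9 10 15 30]
>>     tmpnet = newff([min(in_tr,[],2) max(in_tr,[],2)],[N 1],{'tansig' 'purelin'});
>>     tmpnet.trainParam.epochs = 1000;
>>     tmpnet.trainParam.show = NaN;
>>     tmpnet = train(tmpnet,in_tr,out_tr,[],[],V);
>>     fprintf('N=%d: %g\n', N, mean((sim(tmpnet,in_tst) - out_tst).^2));
>> end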
The number of delays could also be optimized in the same way. Different random
initializations would also give different values.
Now, the trained network can be used for predicting new datapoints recursively (ten
steps ahead):
>> for i=1:10
>>     data(end+1) = sim(net,data(end-2:end)');
>> end
>> data(end-9:end)
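If you want to look at the ten predicted points, they can be plotted just as in Exercise 1 (a small sketch):
>> plot(data(end-9:end),'o-')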