Trial to trial variability of monkey spiking data

Project in Introduction to Computational Neuroscience
Final Report
Marek Oja and Andre Tättar
Supervisors: Kristjan Korjus and Raul Vicente
Introduction
In this small project we study the trial-to-trial variability of neurons. When the same
stimulus is presented repeatedly, the response of a neuron differs from trial to trial. According
to recent studies, the trial-to-trial variability of spiking activity that is characteristic of
cortical neurons could be a source of information about the state of neurons and their
participation in behavioural tasks [1]. The variability drops rapidly with the onset of stimuli
and after that declines more slowly [1,2]. These experiments were carried out with monkeys
performing motor preparation and motion discrimination tasks [1,2]. In our project we have
data recorded from a monkey's brain while it performed a task that requires short-term
memory. Variability can be described with different measures; we use the coefficient of
variation [3]. The aim of this project is to calculate the coefficient of variation, to cluster the
neurons by a similarity measure, to interpret the results, and to check whether we see a
change in variability before the stimuli similar to that found in other experiments.
Methods
Coefficient of variation
The coefficient of variation (CV), also known as “relative variability”, equals the standard
deviation divided by the mean [5]. The CV of a single variable describes the dispersion of
the variable in a way that does not depend on the variable's measurement unit. The higher
the CV, the greater the dispersion. The CV of a model describes the model fit in terms of the
relative sizes of the squared residuals and outcome values: the lower the CV, the smaller the
residuals relative to the predicted value, which suggests a good model fit. We use the CV
because it is suited to comparing variation between datasets whose means differ
considerably. The CV is meaningful here because spike counts are never negative; it should
not be used for data that can take negative values. A disadvantage of the CV is that when
the mean is close to zero, the CV approaches infinity and is therefore sensitive to small
changes in the mean.
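The definition can be illustrated with a minimal Python sketch. The report's own code is written in Matlab and is not reproduced here; the function name and the sample counts below are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption. The example shows that rescaling the data leaves the CV unchanged and that a zero mean leaves it undefined.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (NaN if the mean is 0)."""
    mean = statistics.mean(values)
    if mean == 0:
        return math.nan  # the CV is undefined when the mean is zero
    return statistics.stdev(values) / mean

# Hypothetical spike counts of one neuron in one time window over five trials.
counts = [4, 6, 5, 7, 3]
# The same measurements in a different unit (every value multiplied by 10).
scaled = [10 * c for c in counts]

cv_counts = coefficient_of_variation(counts)
cv_scaled = coefficient_of_variation(scaled)
print(cv_counts, cv_scaled)  # the two values agree up to rounding
```

Returning NaN for a zero mean is one possible convention; a real analysis might instead exclude such windows.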
The monkey memory test explanation
The experiment was performed on a female rhesus monkey (Macaca mulatta) with
microelectrodes placed in the monkey’s prefrontal cortex [4]. The experiment is
illustrated in figure 1. First the monkey is shown a sample picture for one second; then
there is a three-second delay before a test picture is shown. The test picture can be the
same as or different from the sample picture. The monkey then has to indicate whether the
test picture is the same as the sample picture. The times at which the pictures were shown,
the monkey responded and the reward was given are also recorded.
Figure 1. The experiment to study monkey’s short-term memory. [4 and references therein]
The dataset description
The dataset was collected experimentally by (and belongs to) Professor Matthias Munk
at the Max Planck Institute for Brain Research in Germany [4 and references therein].
Signals were recorded from 58 neurons and the experiment was repeated 871 times during
one day. One trial lasted 6.5 seconds, during which the electrodes recorded when the
neurons fired. The recordings have a resolution of 1 ms, which gives 6500 data points per
trial. The trial numbers in which the monkey answered correctly and incorrectly were also
recorded. In this project we use only the trials in which the monkey answered correctly, of
which there are 615. The full data form a 3D matrix with dimensions 58 × 871 × 6500
(neurons × trials × time points). Sample data are displayed in figure 2.
[Plot: Rasterplot of spiking in trial 1. x-axis: Time (ms); y-axis: Neuron.]
Figure 2. Rasterplot of spiking for trial one. The x-axis shows time in milliseconds and the y-axis
shows the neurons. Each neuron has its own row, and every blue line marks a time point at which
that neuron fired. Between about 1000 ms and 2000 ms (first two red lines) the sample picture was
shown. At about 5000 ms (third red line) the test picture was shown. The fourth red line shows
when the monkey answered and the last red line shows when the monkey received the reward.
Description of the code
To run the code, run the file TrialToTrialVariabilityProject.m in Matlab. The data file
TrialToTrialVariabilityProject.mat has to be in the same folder. First we convert the data:
we are given the time points at which spikes occur, but we want data containing 1 where a
spike occurs and 0 where there is no spike. Then we divide the data into 100 ms or 500 ms
segments (time windows) and sum the number of spikes in each window. After that we
compute the mean and standard deviation over all trials and calculate the coefficient of
variation. Finally come the plotting and clustering. We also normalize the coefficient of
variation using the data points between 0 and 1000 ms. Depending on the time window, we
take one or more values from the region 0 – 1000 ms, average them, and divide the other
data points by this average. In this way we can see by what factor the coefficient of
variation changes in different parts of the experiment.
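The processing steps described above can be sketched in Python for a single neuron. This is an illustrative sketch, not the project's Matlab code: all function names and the toy spike times are hypothetical, spike times are assumed to be integer milliseconds counted from 0, and the sample standard deviation is assumed.

```python
import statistics

def spike_times_to_binary(spike_times_ms, n_samples):
    """Turn spike times (ms) into a 0/1 list with one entry per millisecond."""
    binary = [0] * n_samples
    for t in spike_times_ms:
        idx = int(t)
        if 0 <= idx < n_samples:
            binary[idx] = 1
    return binary

def bin_counts(binary, window_ms):
    """Sum the spikes in consecutive, non-overlapping time windows."""
    return [sum(binary[s:s + window_ms]) for s in range(0, len(binary), window_ms)]

def cv_per_window(trial_counts):
    """trial_counts[trial][window] -> CV across trials for every window."""
    cvs = []
    for window_counts in zip(*trial_counts):
        mean = statistics.mean(window_counts)
        sd = statistics.stdev(window_counts)
        cvs.append(sd / mean if mean > 0 else float("nan"))
    return cvs

def normalize_to_baseline(cvs, window_ms, baseline_ms=1000):
    """Divide every CV by the average CV of the windows inside 0..baseline_ms."""
    n_base = max(1, baseline_ms // window_ms)
    baseline = statistics.mean(cvs[:n_base])
    return [cv / baseline for cv in cvs]

# Toy data: three trials of 2000 ms for one neuron (hypothetical spike times).
trials = [
    [10, 20, 600, 1200, 1300, 1800],
    [15, 700, 710, 1250, 1900, 1950],
    [5, 25, 35, 650, 1100, 1600],
]
window_ms = 500
counts = [bin_counts(spike_times_to_binary(t, 2000), window_ms) for t in trials]
cvs = cv_per_window(counts)
normalized = normalize_to_baseline(cvs, window_ms)
print(counts)
print(cvs)
print(normalized)
```

With the real data the same functions would be applied to each of the 58 neurons, using 6500 samples per trial and a window of 100 or 500 ms.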
Clustering
In order to see which neurons behave similarly, we clustered the neurons according to
their coefficient of variation. Clustering was done using Matlab’s hierarchical clustering
functionality [6]. In hierarchical clustering the first step is to find the similarity or
dissimilarity between every pair of objects in the data set. The distance between pairs can
be calculated using different measures, e.g. Euclidean distance, cosine distance (one minus
the cosine of the angle between points, with the data points treated as vectors), correlation
distance (one minus the correlation between points, with the data points treated as
sequences of values) and others. In Matlab the distance is calculated using the function
pdist. In the next step the objects are grouped into a binary, hierarchical cluster tree. This is
done in Matlab using the command linkage. An important parameter is the linkage method,
which describes how the distance between clusters is calculated. In this project we use
“average”, which means that the distance between two clusters is the average of the
distances between all pairs of points drawn from the two clusters. The last step of the
algorithm is to determine where to cut the hierarchical tree into clusters. This is done using
the cluster function in Matlab. Matlab combines all these steps in one function called
clusterdata. For clustering we used Euclidean distance and a maximum of 15 clusters.
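To make the three steps concrete, here is a small from-scratch Python sketch of average-linkage agglomerative clustering with Euclidean distance. It is not Matlab's implementation: the function names and the toy CV profiles are hypothetical, and stopping the merging once a given number of clusters remains is assumed to approximate cutting the tree with a maximum cluster count.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage_clusters(points, max_clusters):
    """Merge the two closest clusters until at most max_clusters remain.

    The distance between two clusters is the average of the distances
    between all pairs of points taken from the two clusters.
    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(c1, c2):
        dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > max(1, max_clusters):
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters

# Hypothetical CV profiles: one row per neuron, one column per time window.
cv_profiles = [
    [1.0, 1.1, 1.0],
    [1.1, 1.0, 1.1],
    [3.0, 3.2, 3.1],
    [3.1, 3.0, 3.0],
    [9.0, 8.5, 9.5],
]
clusters = sorted(average_linkage_clusters(cv_profiles, max_clusters=3))
print(clusters)
```

The naive search over all cluster pairs is slow for large data sets but is adequate for a few dozen neurons.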
Results and Discussion
Figures 3 and 4 show the coefficient of variation for two different time windows, 100 and
500 ms. It is difficult to understand from the plots alone what the blue (values around 1) and red
(values around 10) colours mean. To understand the coefficient of variation values better, we also
plotted histograms (figure 5) for neuron 56 in the region 5000 – 6000 ms (red in the plot) and for
neuron 15 in the region 2000 – 3000 ms (blue in the plot) to see how many spikes occur in these
regions in different trials. From these histograms we can see that blue values correspond to
neurons whose spike counts are almost normally distributed, with almost no zeroes in this time
window (figure 5, right). Going from smaller to larger coefficient of variation values, the
histogram peak shifts to the left towards zero (figure 5, left). When the coefficient of variation
is large, there are many trials in which no spikes were present in this region (figure 5, left).
Many zero values make the mean small relative to the standard deviation, and therefore the
coefficient of variation is large. We also clustered these data using the hierarchical clustering
method. The results are displayed in figures 6 and 7. From these plots we can see that there is one
big cluster and several smaller clusters, but we cannot tell whether there is any drop or rise in the
coefficient of variation before the stimulus.
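As a rough illustration of why many zero counts inflate the coefficient of variation, consider an idealized time window (an assumption for illustration, not taken from the data) in which a neuron fires exactly one spike in a fraction p of the trials and no spike otherwise. The symbols p, \mu and \sigma are introduced here only for this example. The spike count is then a Bernoulli variable, so:

```latex
\mu = p, \qquad
\sigma = \sqrt{p(1-p)}, \qquad
\mathrm{CV} = \frac{\sigma}{\mu} = \sqrt{\frac{1-p}{p}}
```

For p = 0.5 this gives CV = 1, while for p = 0.01 it gives CV = \sqrt{99} ≈ 9.95. A window that is empty in almost every trial therefore lands at the high end of the colour scale, even though the neuron is nearly silent rather than strongly varying.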
Figure 3. Coefficient of variation for trials where the monkey answered correctly, plotted using a
time window of 100 ms.
Figure 4. Coefficient of variation for trials where the monkey answered correctly, plotted using a
time window of 500 ms.
[Histograms, two panels. x-axis: Number of spikes in one trial in certain time window; y-axis: Number of spikes.]
Figure 5. Histograms for neuron 56 in the region 5000 – 6000 ms (left) and for neuron 15 in the
region 2000 – 3000 ms (right).
[Plot. x-axis: Time (ms); y-axis: Neuron.]
Figure 6. Clustered results for trials where the monkey answered correctly, plotted using a time
window of 100 ms.
[Plot. x-axis: Time (ms); y-axis: Neuron.]
Figure 7. Clustered results for trials where the monkey answered correctly, plotted using a time
window of 500 ms.
To see the change in the neurons' variability we normalized the data using the data points in the
region 0 – 1000 ms. The results are displayed in figure 8 (100 ms time window) and figure 9 (500 ms
time window). From these data we see that the variability of the neurons rises just before the test
picture and stays high until the reward has been given (figures 8 and 9). In previous studies it was
shown that the variability drops just before the stimulus [1,2]. We also clustered these results
(figures 10, 11 and 12). Figure 10 presents the dendrogram for the 100 ms time window. From this
dendrogram we can estimate the number of clusters present in our data. From these clusters we can
see that a large number of neurons behave similarly and that there are a number of outliers. There
are two large sets of neurons. In one set the coefficient of variation rises before the test picture
and stays high until the monkey receives the reward. In the second set the coefficient of variation
does not change much when the test picture is shown. One similarity shared by almost all neurons is
that the coefficient of variation drops after the sample picture is shown. We did not identify
neurons behaving in the same way as in previous publications [1,2].
[Plot. x-axis: Time (ms); y-axis: Neuron.]
Figure 8. Normalized results for trials where the monkey answered correctly, plotted using a time
window of 100 ms.
[Plot. x-axis: Time (ms); y-axis: Neuron.]
Figure 9. Normalized results for trials where the monkey answered correctly, plotted using a time
window of 500 ms.
[Dendrogram. x-axis: Neuron.]
Figure 10. Dendrogram for the normalized data (time window of 100 ms). The maximum number of
clusters should be about 8.
[Plot. x-axis: Time (ms); y-axis: Neuron.]
Figure 11. Normalized and clustered results for trials where the monkey answered correctly,
plotted using a time window of 100 ms.
[Plot. x-axis: Time (ms); y-axis: Neuron.]
Figure 12. Normalized and clustered results for trials where the monkey answered correctly,
plotted using a time window of 500 ms.
Conclusion
We worked with data recorded from a monkey’s brain while it performed a task that tested
short-term memory. We carried out data pre-processing, normalization and clustering. We found that
neurons behave differently. For some neurons the coefficient of variation rises when the sample
picture and the test picture are shown. For other neurons the coefficient of variation does not
change much during the experiment. We did not identify neurons behaving in the same way as in
previous publications [1,2].
References
[1] C. Hussar, T. Pasternak, PNAS, 107(50), 2010, 21842-21847.
http://www.pnas.org/lookup/doi/10.1073/pnas.1009956107
[2] M. M. Churchland, B. M. Yu, S. I. Ryu, G. Santhanam, K. V. Shenoy, The Journal of
Neuroscience 26(14), 2006, 3697-3712. DOI:10.1523/JNEUROSCI.3762-05.2006
[3] http://en.wikipedia.org/wiki/Coefficient_of_variation
[4] K. Martšenko, “Using Machine Learning to Analyze Brain Activity During a Short-Term
Memory Task”, Bachelor’s Thesis, Tartu, 2014.
http://comserv.cs.ut.ee/forms/ati_report/datasheet.php?id=41081&year=2014
[5] http://apacgemba7.wikidot.com/statistics:variance-standard-deviation-and-coefficient
[6] http://www.mathworks.se/help/stats/hierarchical-clustering.html