Visually guided behaviour in Drosophila: can R2 cell activation predict navigational performance?

Aviv Kadair
Neuroscience

Abstract
Drosophila uses view-based homing to navigate. Although it has been shown that the central complex is

Table of Contents
Abstract
Acknowledgements
List of Figures
1. Introduction
1.1. Drosophila visual system
1.2. Visually guided behaviour and the central complex
1.3. Visual navigation in insects
2. Methods
2.1. Materials
2.2. Computational visual system
3. Results
3.1. Orientation error
3.2. Catchment area: depth
3.3. Catchment area: width
4. Discussion
4.1. Interpretation
5. Conclusions
5.1. Summary
5.2. Further research
6. Figures
7. Appendix
8. List of references

Introduction:
Insects navigate by view-based homing (Graham and Philippides, 2012; Ofstad et al., 2011; Zeil, 2012): they memorise snapshots of the environment and use them to extract a heading. They do so successfully even though insects such as Drosophila have compound eyes that provide only low-resolution vision (Paulk et al., 2013). Where these navigational memories are stored is of great interest, and the leading candidate in the fruit fly is the ellipsoid body (Ofstad et al., 2011). Unlike insects such as bees and ants, flies do not receive visual input in the mushroom bodies; Ofstad et al. (2011) suggested this may reflect the working memory flies need, whereas other insects need long-term memory to store such representations. Drosophila's eye has many lenses: the retina comprises ~750 ommatidia (eye units), each made of a single lens and photoreceptors (Graham and Philippides, 2012; Paulk et al., 2013). Spatial resolution is therefore highly limited, since diffraction restricts the resolution achievable by each small aperture (Graham and Philippides, 2012). Each ommatidium samples a specific angle of the visual field, and the density of the eye units is not constant, which helps to overcome the limitation described above. Despite the low spatial resolution, insects have high temporal resolution and enjoy a wide field of view (Graham and Philippides, 2012). Downstream of the photoreceptor cells, the cartridges process visual information from the corresponding region of the visual field. The synapses linking the R1-R6 photoreceptors to the ~750 cartridges lie in the lamina, the first neuropil of the optic lobe (Takemura et al., 2013), and they are involved in motion detection. Inhibition of the lamina neurons L1 and L2, which are responsible for detecting increases and decreases in light, blocks wide-field motion detection (Zhu, 2013). The medulla, downstream of the lamina in the optic lobe, is composed of ~750 columns that synapse with the R7 and R8 cells; it supports colour processing as well as the second synapses of the motion circuit (Zhu, 2013). The medulla's columns contain more than 60 neuron types and are organised into 10 layers, six of which are connected to the R8 receptors (Paulk et al., 2013; Zhu, 2013).
The neurons in the medulla are categorised by their morphology and connectivity: Mi neurons are intrinsic to the medulla; Tm (transmedulla) neurons connect the medulla and lobula; TmY neurons are Y-shaped transmedulla neurons connecting the medulla, lobula and lobula plate; and the bushy T neurons connect different layers of the medulla and lobula (Zhu, 2013). Eight types of photoreceptors, R1 to R8, detect light ranging from UV to green. The photoreceptors and most of the retinal neurons project retinotopically: the spatial relationship between activated neurons is maintained as the information is mapped onto the optic lobe (Paulk et al., 2013). Vision begins when photons hit the light-capturing structure, the rhabdomere, and initiate the rhodopsin visual cascade in the photoreceptors (Paulk et al., 2013). Five types of rhodopsin are expressed in the eye, differing in their absorption peaks (345-508 nm), and each photoreceptor expresses only one type: R1-R6 cells express Rh1, R7 expresses Rh3 or Rh4, and R8 expresses Rh5 or Rh6. This varied rhodopsin expression implies that the fly has colour vision. In addition to varied wavelengths and light directions, specialised ommatidia next to the dorsal rim detect polarised light, as do scattered facets of the ventral eye. The dorsal rim uses celestial cues for navigation, whereas the ventral eye uses water, plants and other ground-level objects for pattern recognition (Zhu, 2013). From the medulla, the information flows into the lobula complex, which contains the lobula and lobula plate; the retinotopic map is loosely preserved there too (Paulk et al., 2013). Beyond the optic lobe, the information flows into the central complex, which is made up of the protocerebral bridge, located in the posterior dorsal brain, the noduli, and the central body (Pfeiffer and Homberg, 2014). Four types of neurons innervate the central complex: tangential neurons link to the protocerebral bridge (TB neurons), to the upper (TU neurons) or lower (TL neurons) division of the central body, or to the noduli (TN neurons); amacrine neurons are anaxonal and present only in the fan-shaped body; pontine neurons are intrinsic neurons of the fan-shaped body; and columnar neurons link the protocerebral bridge and the central body. The central body has two compartments, the fan-shaped body in the upper division and the ellipsoid body in the lower one (Pfeiffer and Homberg, 2014). The latter has four ring-shaped layers made up of the tangential ring neurons. These GABAergic neurons have inhibitory and excitatory subfields, and they are the analogue of the mammalian primary visual cortex cells (Seelig and Jayaraman, 2013): like those cells they show strong orientation tuning, but with a preference for vertical lines, whereas the mammalian cells prefer horizontal lines. As seen in figure 1, the R1 cells form the core of the ring while the R2 and R4d cells create the outer ring. R2 and R4d cells have overlapping receptive fields with different peak sensitivities, and they are essential for landmark-driven spatial memory and visual pattern recognition (Neuser et al., 2008; Seelig and Jayaraman, 2013).
Zhu (2013) suggested that the visual response can be divided into three categories: motion perception, supported by the time lag with which a moving object stimulates each separate ommatidium; colour vision, supported by comparison of the spectral characteristics of the photoreceptors; and pattern recognition. Drosophila's visual system is therefore low-cost, yet it offers advantages when building artificial systems: a large field of view with cheap, simple construction, and just enough information to navigate without overloading the system (Graham and Philippides, 2012). In a world where systems are designed to be smaller and more efficient without reducing performance, the fly eye is an excellent model. However, while behavioural genetics tells us that structures such as the central complex are important, the lack of in vivo recordings prevents us from knowing how the processing is carried out (Seelig and Jayaraman, 2013). A computational model such as the one used in this project can help bridge that gap.

Methods:
The photos were taken with a Nexus 4 smartphone running the stock ROM (Android 4.4.2) and the stock camera app (version 2.4.008), using the photo sphere function. They were taken in October and November, mostly between 25/10 and 10/11, a period characterised by frequent changes in the weather. Filming took place at various hours of the day, to provide a wide range of illumination conditions. The photos were categorised into six scene types: urban (Brighton's streets); the woods around the university; university buildings; the downs of Stanmer Park; my flat's kitchen; and library shelves. The photos were spread across 19 locations; some were different points of view of the same place, while others were different locations. The same point of view was maintained to allow a clean comparison. A total of 156 photos were taken and analysed. A few filming ideas were unsuccessful: faces at close range are impossible to capture with this panorama algorithm; angry cattle prevented me from going back to some of my filming locations; and low-level shelves in the library, where the light is not visible, were also impossible to shoot.

Computational visual system:
The photos were then analysed in MATLAB 2014, using code (see appendix) provided by the supervisor. The raw photos were all converted to greyscale and resized to 39×360 pixels. The code compares pairs of photos by placing one on top of the other and rotating one against the other. It then finds the position of the minimum of the resulting difference curve and outputs the width and depth of the catchment area; greater similarity between photos results in a deeper and wider catchment area (figure 2). Three levels of processing were applied to the raw photos in MATLAB (figure 3): high resolution, as if the photo were processed by a human visual system; low resolution, mimicking the fly's visual processing; and R2 resolution, coarser still than the low setting (figure 3b-d). The results were categorised by resolution level and averaged to obtain the orientation error of each comparison. The code's output was examined with two-tailed paired t-tests in Excel 2013, to assess the impact of reducing the information as the resolution is lowered. A minimal sketch of the comparison pipeline is given below.
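The actual analysis code was supplied by the supervisor and is reproduced in the appendix; the listing below is only an illustrative sketch of the kind of rotational comparison described above. The file names, the 39×360 working resolution, the root-mean-square difference metric and the preprocessing details are assumptions made for the sketch, not the supervisor's implementation.

% Minimal sketch (not the supplied code): rotational comparison of two
% panoramic photos at an assumed working resolution of 39x360 pixels.
function sketch_rotational_comparison()
    ref  = preprocess(imread('snapshot.jpg'), [39 360]);   % stored snapshot (placeholder file)
    test = preprocess(imread('current.jpg'),  [39 360]);   % current view (placeholder file)

    % Rotate the current view column by column against the snapshot and
    % record the root-mean-square pixel difference at every offset.
    nCols = size(ref, 2);
    diffs = zeros(1, nCols);
    for shift = 0:nCols-1
        rotated        = circshift(test, [0 shift]);
        diffs(shift+1) = sqrt(mean((ref(:) - rotated(:)).^2));
    end

    % The best-matching heading is the offset with the smallest difference;
    % the orientation error is its angular distance from zero rotation.
    [~, best]   = min(diffs);
    bestAngle   = (best - 1) * 360 / nCols;
    orientError = min(bestAngle, 360 - bestAngle);
    fprintf('Orientation error: %.1f degrees\n', orientError);
end

function img = preprocess(raw, dims)
    % Greyscale, resize to the working resolution, and rescale to [0, 1].
    % The "low" and "R2" settings could be mimicked by resizing to coarser
    % dimensions before the comparison.
    img = im2double(imresize(rgb2gray(raw), dims));
end

Comparing an image with itself in this sketch gives an orientation error of zero, since the difference vanishes at zero rotation; poorer matches push the minimum away from zero and make it shallower.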
The analysis was carried out under three conditions: self-comparison of all the photos; comparison of photos of the same location taken on different dates and at different times, so that the impact of weather and illumination was considered; and comparison of the same location from different points of view, assessing whether the fly can navigate a scene without fully recovering the original perspective, thereby mimicking a learning flight.

Results:
The orientation error was measured by averaging the image differences and plotting the data in two histograms, one for the same-perspective comparisons and one for the different-perspective comparisons. The errors lay between 50 and 80 degrees (figure 4). The orientation error was smaller when the perspective was maintained (figure 4a; table 1a), at 50-60 degrees, and it increased significantly as the resolution was reduced to the R2 level (p<0.05). In the different-perspective comparison (figure 4b; table 1b) the range was wider, 50-80 degrees, and the low resolution showed a lower error than the high and R2 resolutions. In that case R2 was not significantly different from either of the other two resolutions, but the difference between the high and low resolutions was significant (p=0.049). When the depth of the catchment area was examined with paired t-tests (figure 5), both comparisons showed a similar pattern of depth increasing as resolution decreased. The resolutions were compared in pairs, as presented in table 2. When the same perspective was maintained (figure 5a; table 2a), the only significant result was the comparison between the high and R2 resolutions (p=0.01), so a gradual reduction in the level of detail is unlikely to affect performance. When the perspective had to be recovered, every pairing that involved the high resolution gave a significant difference (figure 5b; table 2b). In neither condition was there a significant difference between the low and R2 resolutions. The width analysis (figure 6; table 3) showed no significant differences in any of the categories (p>0.05).

Discussion:
This project aimed to compare R2-cell activation with the output of the visual system as a whole, and to assess whether there is a predictive trend. The orientation error gives an evaluation of the pixel differences and so ensures the results are not random: random results would lie around 90 degrees, whereas ideal results would be close to zero. The outcomes show that although the match was not perfect, being significantly higher than 0 degrees, it was also not arbitrary. The imperfect match may have arisen from spinning on the spot during the photo shoot. This could be addressed in a further experiment by using a rotating tripod, or ideally a monopod, which would keep the camera at a constant height without relying on human movement, so that only the filming angle changes. Since the orientation error increased significantly as resolution decreased, we can conclude that it is indeed harder to find a match between pairs of photos based only on R2-cell output. As for perspective changes, the low resolution gave a significantly smaller orientation error, supporting the idea that other cells are important in the processing and that R2 cells cannot act alone when the images are not aligned. Zeil et al. (2003) define the catchment area in terms of the image differences that result as an image is displaced from its reference position. The desired catchment area (figure 2) is as deep and wide as possible, as such an area provides more angles at which a match can be found. One possible way of reading these two measures off a difference curve is sketched below.
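The depth and width values reported in the Results come from the supplied code. Purely as an illustration, they could be read off a rotational difference curve as follows; the "drop below the mean baseline" definition of depth and the "half-depth span" definition of width are assumptions for this sketch and may not match the definitions used in the actual analysis.

% Illustrative only: one possible way to measure the depth and width of the
% catchment-area curve. 'diffs' is a vector of image differences, one entry
% per rotation step, e.g. the output of the sketch in the Methods section.
function [depth, width] = catchment_measures(diffs)
    baseline     = mean(diffs);        % overall level of the curve (assumed baseline)
    [trough, at] = min(diffs);         % value and position of the minimum

    % Depth: how far the minimum drops below the baseline.
    depth = baseline - trough;

    % Width: contiguous angular span around the minimum over which the curve
    % stays below the level halfway between the trough and the baseline
    % (wrap-around at 0/360 degrees is ignored for simplicity).
    level = trough + depth / 2;
    lo = at; hi = at; n = numel(diffs);
    while lo > 1 && diffs(lo - 1) < level, lo = lo - 1; end
    while hi < n && diffs(hi + 1) < level, hi = hi + 1; end
    width = (hi - lo + 1) * 360 / n;
end

Under these definitions a deep, wide basin means the correct heading can be recovered from a large range of starting orientations, which is the property compared across resolutions in figures 5 and 6.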
The statistics treated the depth and width of the area independently. Regarding depth, both comparisons showed a similar trend of increasing depth as information was reduced, and both showed a significant difference when the R2 resolution was compared with the high resolution. This suggests that a gradual reduction would not significantly affect performance, and may explain why R2 cells are not sufficient for vision on their own and are supported by other cells. Furthermore, when the paired photos come from different perspectives, the high resolution also differs significantly from the low resolution. Nevertheless, since the fly's vision is closer to the low setting than to the high one, we can assume that recovering the perspective is not significantly harder using only R2 cells than using the whole system, and conclude that overall performance would not change much with this filtering of information. As for the width of the catchment area, there is a slight decrease as information is reduced, but not enough to constitute a significant difference. The R2 columns suggest the area is slightly narrower than at the other two information levels, but not enough to conclude that performance would be harmed by the reduction. Since the main question of this project is whether navigational performance based on R2 cells alone matches that of the entire visual system, we can conclude that in this computational model R2 cells are an important predictor. While the depth of the catchment area is affected by sharp resolution changes, the width does not follow the same trend. R2 cells perform similarly to the low-resolution setting; more importantly, recovering the point of view works better at low resolution. This implies that the fly can navigate better than we can when former snapshots are not aligned with the current view. This conclusion is supported by Zeil et al. (2003), who noted that insects do not only take snapshots of a scene but also perform learning flights, in which they move backwards while facing the target and pivot around it. This allows the insect to eliminate shadow contours, so that future matching is based only on motion-defined outlines. This work was based only on image differences, but it has long been known that insects use landmarks on top of the matching process (Collett and Zeil, 1997). Although Zeil et al. (2003) showed that matching alone is sufficient for view-based homing, landmarks are also used to signal the relative orientation of two objects; the example given in their work describes the difficulty of locating a nest entrance on one or the other side of a small landmark. The model used in this work treats moving people in the photos as added noise, but there was no assessment of how the number and location of people in a photo could serve as live landmarks. Further research could examine the R4d cells in the same context, allowing a comparison between the two cell types. As mentioned above, R2 and R4d cells form the outer layer of the ellipsoid body, so it would be interesting to compare their behaviour in different scenes and assess whether one of them is more useful in particular circumstances. Information is costly, whether we analyse a biological or a computational system. Neurons consume a great deal of energy, and much of the acquired information never crosses the bottleneck present in each organism.
In an artificial system such as a robot, sensitive sensors are expensive and detailed information demands greater resources, for example a larger hard drive as well as frequent clearing of that drive. Computational models such as this one are therefore important for economical engineering. We can use them to design low-budget robots that are more reliable and less expensive than guide dogs (the writer owns one dog that was unsuccessfully trained); in that way we can also work around the current gap between behavioural genetics and our ability to perform in vivo experiments. As for the main question, whether more information equals better performance, the answer is no, but there is also a point at which the reduction becomes too great to be useful.

Figures:

Figure 1: the ellipsoid body ring layers (Seelig and Jayaraman, 2013).

Figure 2: catchment area. An ideal catchment area is deep and wide; the depth and width parameters are shown in red.

Figure 3: processing levels of the photos. (a) Raw photo of the library square. (b) Greyscale, high-resolution level. (c) Low-resolution level. (d) R2-resolution level.

Figure 4: orientation error (mean error in degrees, 0-80, for the high, low and R2 resolutions). (a) Comparison of photos from the same perspective; the error increases as the information level is reduced. (b) Pairs of photos from different perspectives; the low setting is significantly smaller than the high level.

Table 1: paired t-tests, orientation error.
(a) Same perspective. The error increases with reduction of resolution; the R2 level is significantly higher than the other resolutions.
Resolutions compared   p value   t statistic
High-Low               0.645     t(18) = -0.469
High-R2                0.027     t(18) = -2.424
Low-R2                 0.016     t(18) = -2.650
(b) Different perspectives of the same location. The low setting is significantly smaller than the high setting.
Resolutions compared   p value   t statistic
High-Low               0.049     t(5) = 2.591
High-R2                0.355     t(5) = -1.019
Low-R2                 0.067     t(5) = -2.329

Figure 5: depth of the catchment area (0-0.30, for the high, low and R2 resolutions). (a) Same perspective; depth increases with detail reduction, and the differences are significant only between the high and R2 levels. (b) Different perspective; depth increases with detail reduction, and the high setting is significantly different from the other two levels.

Table 2: depth of the catchment area.
(a) Same perspective. The R2 catchment area is significantly deeper than at the high setting.
Resolutions compared   p value   t statistic
High-Low               0.658     t(18) = -1.96
High-R2                0.01      t(18) = -2.856
Low-R2                 0.338     t(18) = -0.984
(b) Different perspective. The high-setting catchment area is significantly shallower than at the other processing levels.
Resolutions compared   p value   t statistic
High-Low               0.013     t(5) = -3.817
High-R2                0.006     t(5) = -4.622
Low-R2                 0.4       t(5) = -0.918

Figure 6: width of the catchment area (degrees, 0-60, for the high, low and R2 resolutions). (a) Same perspective; a nonsignificant reduction of the width follows the reduction of the resolution. (b) Different perspective; a nonsignificant reduction of the width follows the reduction of the resolution.
Table 3: width of the catchment area.
(a) Same perspective. No significant differences between the categories.
Resolutions compared   p value   t statistic
High-Low               0.645     t(18) = -0.468
High-R2                0.317     t(18) = 1.028
Low-R2                 0.147     t(18) = 1.515
(b) Different perspective. No significant differences between the categories.
Resolutions compared   p value   t statistic
High-Low               0.571     t(5) = 0.606
High-R2                0.545     t(5) = 0.648
Low-R2                 0.65      t(5) = 0.482

Reference list

Collett, T.S., Zeil, J., 1997. The selection and use of landmarks by insects, in: Lehrer, M. (Ed.), Orientation and Communication in Arthropods, EXS. Birkhäuser, Basel, pp. 41–65.
Graham, P., Philippides, A., 2012. Insect-inspired vision and visually guided behavior, in: Bhushan, B. (Ed.), Encyclopedia of Nanotechnology. Springer Netherlands, pp. 1122–1127.
Neuser, K., Triphan, T., Mronz, M., Poeck, B., Strauss, R., 2008. Analysis of a spatial orientation memory in Drosophila. Nature 453, 1244–1247. doi:10.1038/nature07003
Ofstad, T.A., Zuker, C.S., Reiser, M.B., 2011. Visual place learning in Drosophila melanogaster. Nature 474, 204–207. doi:10.1038/nature10131
Paulk, A., Millard, S.S., van Swinderen, B., 2013. Vision in Drosophila: seeing the world through a model's eyes. Annu. Rev. Entomol. 58, 313–332. doi:10.1146/annurev-ento-120811-153715
Pfeiffer, K., Homberg, U., 2014. Organization and functional roles of the central complex in the insect brain. Annu. Rev. Entomol. 59, 165–184. doi:10.1146/annurev-ento-011613-162031
Seelig, J.D., Jayaraman, V., 2013. Feature detection and orientation tuning in the Drosophila central complex. Nature 503, 262–266. doi:10.1038/nature12601
Takemura, S., Bharioke, A., Lu, Z., Nern, A., Vitaladevuni, S., Rivlin, P.K., Katz, W.T., Olbris, D.J., Plaza, S.M., Winston, P., Zhao, T., Horne, J.A., Fetter, R.D., Takemura, S., Blazek, K., Chang, L.-A., Ogundeyi, O., Saunders, M.A., Shapiro, V., Sigmund, C., Rubin, G.M., Scheffer, L.K., Meinertzhagen, I.A., Chklovskii, D.B., 2013. A visual motion detection circuit suggested by Drosophila connectomics. Nature 500, 175–181. doi:10.1038/nature12450
Zeil, J., 2012. Visual homing: an insect perspective. Curr. Opin. Neurobiol. 22, 285–293. doi:10.1016/j.conb.2011.12.008
Zeil, J., Hofmann, M.I., Chahl, J.S., 2003. Catchment areas of panoramic snapshots in outdoor scenes. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 20, 450–469.
Zhu, Y., 2013. The Drosophila visual system. Cell Adhes. Migr. 7, 333–344. doi:10.4161/cam.25521