>> Hannes Gamper: Okay, good morning, everyone. Welcome to this talk. It's my pleasure to introduce Supreeth Krishna Rao. He's been with us as a summer intern over the past 12 weeks, working on ultrasound. He's a master's student in Robotics Engineering at Worcester Polytechnic Institute, and without further ado, the floor is yours. >> Supreeth Krishna Rao: Thank you, Hannes. Good morning, everybody -- the audio team, Dr. David Heckerman, Mark Heath, Dr. Ivan, Hannes and everyone. So, basically, today I will be presenting the work that I did with my mentor, Hannes Gamper, at Microsoft Research, Redmond, Washington, over the last 12 weeks. We attempted to develop hardware and algorithms for an ultrasound Doppler radar, and the primary objective was imaging range and velocity profiles of objects of interest in the field of view. So let's see in a more formal way what the objective was. Given a system, we wanted to image and estimate right around the system, that is, a 360-degree horizontal field of view. In that field of view, we wanted to measure and estimate targets' position and velocity relative to the user. And why would anyone do that? Probably for situations wherein, with increasing technology, or with more people with disabilities, we need more intelligent systems actively sensing the ever-changing environment for us. And this is one typical case, wherein a user is happily listening to some music while walking on a busy road. That's definitely going to come in the very near future, and he need not really worry about the vehicles approaching him, because such a system aims to warn him about these targets. So the underlying principle for this system was basically the Doppler effect and time of flight, and just to give you a quick illustration of what the Doppler effect really captures: when you transmit a signal, there is a change in frequency when there is relative motion between the source and the receiver. So, in this case, the source would be the user, the object would be the car or the target, and again, the receiver is present with the user. So let's take a look at why we chose ultrasound to achieve this task -- the whys and why nots. Firstly, ultrasound is very low power, and since it's not very high frequency -- it's about 40 kilohertz -- high-frequency components are not pumped into the circuit, so the electronics are really cheap when you're designing an ultrasound system. And the small form factor is a great advantage, because it offers a great choice for mobile devices; as you can see in the later slides, an ultrasound transducer is millimeters in size. It's really small, so even after using an array of transducers, the form factor will not really have blown up. And the most important thing, again, is that it's outside the human-perceptible range, so if the device is being used by the user, it won't really interfere with his normal day-to-day activities. And it works very well indoors and outdoors, well lit or dark -- you don't need light, as against, for example, cameras. So it doesn't need illumination, it's active sensing, and it works equally well indoors and outdoors. And array signal processing can be leveraged to get a 360-degree field of view, which we have in this system. 
And the responses of these transducers and receivers are not super directional, so we need beamforming to really improve the spatial angular resolution of our sensing, and we can leverage active time-of-flight sensing -- that is basically how the transducers work -- to get a range or depth estimate. And again, this is not really available in passive sensing devices, like cameras. And to cap it all, we attempt to use the Doppler effect to get even velocity profiles of targets in the field of view, and this is every frame -- this is not across several frames. So every frame, we have an estimate of the velocity and the distance. That was the attempt. So the immediate question to ask, again, is why not use cameras? Well, we don't get depth information, and also we need illumination, because it's not active sensing. Okay, we do get depth information using stereo cameras, but why not that? They require a global shutter, and this often shoots up the price of the cameras and the entire setup, and stereo vision cameras require precise alignment and calibration. Firstly, this is really time consuming, and even after sufficient calibration, the depth resolution or the depth accuracy is not really fantastic. So one might say, to avoid these calibration issues, why not use a commercial stereo camera setup, like Bumblebee or Asus? They actually require very computationally intensive processing algorithms, like sum of absolute differences or sum of squared differences, and these often necessitate a GPU, and that again shoots up the power requirements and the cost. And again, no 360-degree field of view. Well, to get a 360-degree field of view, we can use LIDARs, but they're expensive and power hungry. How about Kinect? Kinect manages to solve many of these issues, apart from the 360-degree field of view, but it consumes 36 watts of power, and it requires a fan to keep it from overheating. Keeping all of these in mind, we decided to use ultrasound, and we'll see more about that. So we actually have to be objective and take a look at the other side of the coin also: the limitations and challenges of ultrasound. I guess the list is pretty long compared to the previous slide, so the challenges are a lot more. Firstly, calibration is still required, and it often demands anechoic conditions. An off-the-shelf researcher who wants to use such an array or such transducers can't really get access to costly anechoic chambers. That's still there. And sensor responses are basically very wide angle, and this kind of affects the spatial angular resolution, so that necessitates a lot of signal processing. And frequency responses are, again, frequency and temperature dependent, because these are transducers. Reflections are one more challenge, because you have multipath, multiple and specular reflections. Again, that necessitates a lot of signal processing. So we are moving closer and closer to some major issues that we experienced: basically maintaining a good signal-to-noise ratio, achieving practical frame rates and making sure that the device operates over a wide range. Some of the constraints that control these parameters are power, pulse width and the attenuation of ultrasound in air. So, for power, basically, increasing the power would increase the distance and the range, but that would again result in overheating, so one can't really crank up the power beyond a point. 
And pulse width improves SNR, but again, it kills both frame rate and range, and of course, we have more than 3 dB per meter of attenuation of ultrasound signals in air above 100 kilohertz. So, basically, it calls for some sort of optimization between all of these. So these are the questions at hand: how to increase SNR, achieve good directivity and a high frame rate, and prevent overheating. And again, estimate velocity in a single shot. That has not really been done. >>: So [indiscernible] concept flying around, down in like [indiscernible] bats, basically, so what's their range? >>: Bats, they are in the same range, at 40, 45 kilohertz, and can go five-ish, 10 meters max. >>: Ten meters. That's pretty good. >>: We went further than that. >>: I see. Okay. >> Supreeth Krishna Rao: So I'll just walk you through the organization of this talk very quickly. We present a literature review and then take a look at the problem formulation, then give some more background about the Doppler effect and how we can use it for signal design and estimation, and then we will conclude the talk by presenting the current approach, progress to date and results. So one can notice that, right from the early 1950s to 2012, 2013 and now 2015, researchers have invested a lot of time and resources in trying to make use of ultrasound, so it basically tells a story, because man has not yet rejected ultrasound; for all these years, continuous progress has been happening and more and more applications have been coming up. More recently, you can see that ultrasound imaging was actually considered for HCI tasks like gesture recognition, and activity, speaker, gender, age and gait recognition, all of these in smart-home environments. So ultrasound has stayed with us for a long time, and micro-Doppler signatures are particularly interesting applications wherein, basically, every small part of a moving object results in a Doppler shift, and that is quite unique. That is what the researchers have reported, and probably feeding these signatures into a deep neural network can help us learn them and do recognition. And from a range detection perspective, again, work has been happening since 1997, up until 2014. And in fact, the last one that you see here was the prototype developed by the previous intern, and we made use of it extensively for our preliminary evaluation and testing; this was again carried out in the same group, under Dr. Ivan Tashev. So let us take a look at the problem formulation, that is, the Doppler radar. Like I mentioned before, this illustrates the concept of Doppler radar. To be more formal, the received signal is stretched or compressed when the target is approaching or receding, and ultimately, this is what we aim to measure. So typically, a tone or a pulse or a chirp is emitted to ensonify the surroundings, and we make use of a chirp signal that goes from 38 kilohertz to 42 kilohertz. The reflection off a target moving at some velocity is represented by s_R(t), and you can see that the signal is actually stretched and delayed, so the stretch gives us an estimate of the velocity of the object. 
The delay gives us the distance, basically, the time of flight. The stretch factors are calculated as the ratio (c + v)/(c - v), where the direction of the velocity determines the plus or minus, and the time delay is basically the time required by the ultrasound to reach the target range and get back. So, essentially, we are trying to estimate these two factors -- the stretch factor and the time delay -- to determine the range and velocity. So, from the literature review, this one particular paper gave us some insights about signal design and estimation. They proposed something called the wideband cross-ambiguity function, which is basically a 2D representation that couples the delay and stretch factors. So, effectively, as you can see there, it's basically a cross-correlation between the received signal and a time-delayed and stretched version of the transmitted signal. And some points to be noted: for all practical purposes, the integral is over -T/2 to +T/2, basically, because our system is band limited, so we don't need to do the full integral. And like I mentioned, it's basically a 2D representation of the correlation, and the ultimate task is to estimate these two coordinates in that 2D representation. Some insights from the same paper about signal design: we see that LFM stands out, as it gives better resolution for estimating the two parameters. This is the Doppler stretch domain, and that is the time delay domain, and you can see that the Gaussian signal would be very, very ambiguous, as against LFM. Some more results in favor of LFM from the literature: we see that we get better resolution using the LFM. So we implemented the signal. What you see here is the time-domain version and the frequency domain; the frequency sweeps from 38 kilohertz to 42 kilohertz, and the pulse width is about five milliseconds. Some of the parameters that we considered while designing the signal to be transmitted: we are sampling at 192 kilohertz. Our center frequency is about 40 kilohertz, because our transducers' resonant frequency is around that range, around 40 kilohertz. The max range we are considering is about nine meters currently -- this can be changed -- and the pulse width is about five milliseconds. One interesting thing to note here is the sequence width: it's 10,076 samples, which represent the duration the sound takes to do a round trip to the max range. The sequence width basically decides the max range, and all of this depends on the speed of sound. So, taking a look at the implemented pulse train that we are pulsing out from the -- thank you. >>: Does it work? >> Supreeth Krishna Rao: It kind of does. You'll see the results. Thank you. >>: I can't wait for the answer. >> Supreeth Krishna Rao: So, looking at the pulse being transmitted, this represents one block. We have eight channels -- basically eight channels of transmitters and receivers -- and 20 frames, so one, two, and that goes up to 20. And we make use of PlayGraph to interface with the sound card. We generate the signal in MATLAB, and we interface with the device using PlayGraph to play and record the signals. 
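To make the transmit signal design above concrete, here is a minimal sketch -- not the actual MATLAB/PlayGraph code used in the project -- of generating one 5 ms LFM chirp from 38 to 42 kilohertz at 192 kilohertz sampling and padding it to the round-trip sequence width for a 9-meter maximum range. The speed of sound value (343 m/s) and the variable names are assumptions for illustration.

    import numpy as np
    from scipy.signal import chirp

    # Parameters taken from the talk; speed of sound is an assumed value.
    fs = 192_000                      # sampling rate, Hz
    f0, f1 = 38_000.0, 42_000.0       # chirp start/stop frequencies, Hz
    pulse_width = 5e-3                # pulse duration, seconds
    max_range = 9.0                   # maximum range, meters
    c = 343.0                         # speed of sound in air, m/s (temperature dependent)

    # One LFM pulse: linear frequency sweep from f0 to f1 over pulse_width.
    t = np.arange(int(round(pulse_width * fs))) / fs
    pulse = chirp(t, f0=f0, t1=pulse_width, f1=f1, method='linear')

    # Sequence width: samples covering the round trip to max_range.
    round_trip_samples = int(np.ceil(2 * max_range / c * fs))   # ~10,076 samples
    frame = np.zeros(round_trip_samples)
    frame[:pulse.size] = pulse        # pulse at the start, then a listening window

With these numbers, the round-trip window works out to roughly 10,076 samples, matching the sequence width quoted in the talk.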
So, going into some more detail on the estimation framework, that is the wideband cross-ambiguity function; I just want to remind you that it is basically a 2D representation of the transmitted and the received signals. So if you can see here, this is actually the stretch factor, like I mentioned before. We decided to go for the following stretch factors based on these constraints: we decided to measure velocities of objects of plus or minus 20 meters per second. That is about 45 miles per hour, and directly deriving from that formula, we get the stretch factors. With regard to the number of stretches, that basically decides the resolution of the ambiguity function that you saw. So, for example, for a number of stretch factors of about 31, this is an illustration of the stretched LFM. You can see that across the 31 stretches it is stretched in time. Implementing the ambiguity function and testing it on the LFM for the ideal case, that is, zero Doppler shift and zero delay, this is the response that we got, and as you can see, it's pretty high resolution in this domain, that is, in the stretch factor domain. And we tested our initial algorithms on the previous prototype that was available and built in this group, and this is how it looks. Basically, you have an array of transducers and microphones, and testing on this setup gave us our first slide with some results. As you can see, the plot to the left is basically the cross-correlation. That gives us the measurement of the distance, so you can see that there is a strong peak at around 0.6 meters, and you can see exactly that this cross here, which represents the maximum of the function, is again at around that range, that is, 0.6 meters. One more thing to be noted is the speed. Since this is a single-shot estimation of both the velocity and the delay -- the range -- the speed is more to the negative side, because the object was actually approaching the setup in this case. So, basically, what I mean is -- well, this is actually the setup that we were considering. There was a Kinect facing the setup, and this plane surface was moved backward and forward in one dimension, across this dimension, and this represents the depth map of the Kinect. So we monitored the depth value at the center of an ROI that represented our object, and that's how we got this plot. So if you look at the overlaid data of the Kinect depth estimate and the depth reported by our system, you can see that the slope, which represents the velocity, is quite consistent. Of course, there is an offset, caused by the arrangement between the Kinect and our setup. That can be corrected for. So, diving into the hardware design, these are the electric transducers that we use, and this is how the prototype looks. As you can see, it's basically a low-form-factor device of about 50 millimeters diameter and a height of about 100 to 110 millimeters, and we have the receiver array and the transducer array here. So, taking a look at the system architecture: we have our microphones, we have preamplifiers, we make use of an RME interface to basically mix the signals that are sent and received, and it's interfaced to the computer through FireWire, and that goes, again, into a preamplifier, and there is our speaker array. 
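Here is a minimal, heavily simplified sketch of the wideband cross-ambiguity search described above: the received frame is cross-correlated against time-stretched copies of the transmitted chirp, and the peak over stretch and delay gives single-shot velocity and range estimates. This is an illustrative reconstruction, not the project's implementation; the ±20 m/s stretch grid, the 343 m/s speed of sound, the sign convention and the helper names are assumptions.

    import numpy as np
    from scipy.signal import resample, correlate

    def wbcaf_estimate(tx, rx, fs, c=343.0, v_max=20.0, n_stretch=31):
        """Peak search over a wideband cross-ambiguity surface (illustrative sketch).

        tx: transmitted LFM pulse, rx: one received frame (same sampling rate fs).
        Returns (velocity in m/s, range in m) at the surface maximum.
        """
        velocities = np.linspace(-v_max, v_max, n_stretch)
        stretches = (c + velocities) / (c - velocities)   # stretch factors, as in the talk

        best = (-np.inf, 0.0, 0)                          # (peak value, velocity, lag)
        for v, s in zip(velocities, stretches):
            # Time-stretch the template; the stretch factor changes its length.
            template = resample(tx, max(2, int(round(len(tx) * s))))
            # Cross-correlate the received frame with the stretched template.
            xc = correlate(rx, template, mode='valid')
            lag = int(np.argmax(np.abs(xc)))
            if np.abs(xc[lag]) > best[0]:
                best = (np.abs(xc[lag]), v, lag)

        _, v_hat, lag_hat = best
        range_hat = c * (lag_hat / fs) / 2.0              # two-way travel time
        return v_hat, range_hat

In practice this search would run per beam and per block on the matched-filtered data; it is only meant to illustrate the single-shot coupling of delay and stretch that the ambiguity function provides.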
So this looks more intuitive than the previous slide. Basically, this is our test and calibration setup. We have our device, and for calibration we made use of a B&K microphone, and as you can see, this is the previous prototype that was used. So, taking a quick look at the IRGUI that we made use of, which was developed by Dr. Mark Thomas: we use certain regions of this GUI to send a test signal at around 40 kilohertz, and then we measure the responses and update the gains, and this is basically the impulse response recorded, with the magnitude response to the right. So how was this actually done? To calibrate our speakers, we kept this microphone at one meter distance and we pinged through all of our speakers, and the responses recorded here were what you saw, for one channel that is. And to calibrate the receiver setup, we used a transducer from this setup, we again pinged, and then we measured the responses using our receivers and recorded the impulse responses. So these are some of the directivity patterns that we observed from the measurements for the transmitter. They don't look really directional, do they? >>: They are directional, but not uniform. >> Supreeth Krishna Rao: Yes, so although they are directional, they are not really directional to within the 15 degrees, plus or minus 7.5 degrees, that we are looking at. And you can see the transmitter frequency responses. One more challenge that we face -- by the way, this is after calibration -- you can see that they're not very well matched, so this is something that kind of created some delay in our progress, but of course you can see that the resonant frequency is somewhere close to 40 kilohertz. >>: What about the area between 45 and 50 kilohertz? This is where they match in the plot, out of the rays analysis. >> Supreeth Krishna Rao: Yes. So the receiver directivity pattern looks something like this. Again, they're quite directional, better than our transmitters, and more uniform on the one side. >>: This is due to the separation from the [indiscernible], right? >>: Rather than the natural directivity of the microphones itself. They're omni at all these frequencies, but even just the can that it's made from is going to have consequences. >> Supreeth Krishna Rao: So one can note that the receiver -- so this is actually the transmitter that we used from the previous configuration, developed by the previous intern, Ivan Dokmanic, and these are the receiver responses. They are quite well matched. That is pretty good, better than this. So all this necessitates beamforming -- that's the long story short. All this necessitates beamforming, because we need better spatial angular resolution, and we made use of the BFGUI tool, developed here, again, by Dr. Mark Thomas at the Audio and Acoustics Research Group at MSR. This is used for generating the beamforming weights. For example, this shows the method we used for beamforming, MVDR closed form, this shows the array setup for our speakers, and this is the measured data, basically the dimensions. So here you can see the directivity pattern, and this is exported and stored for later use, which we will see. The same goes for the receiver beamforming weights: we evaluated the MVDR closed-form method, and this is what it looks like. >>: You used an omnidirectional directivity pattern for this? 
>> Supreeth Krishna Rao: Yes, we used omnidirectional, and that made us go for another method of beamforming that we'll see shortly, because, firstly, our transducers were not really omnidirectional. So these are the results from the previous two GUI slides that you saw. This is for the receiver. It looks pretty good, but it still has a lot of side lobes that really catch many reflections, and with respect to transmitter beamforming, I don't need to say much -- it's not very great. So all this made us choose a different beamforming weight estimation framework, which we derived through numerical estimation. We had our acoustic channel impulse responses, and this was the desired beam pattern that we were targeting for the receiver and for the speaker, and we did a numerical estimation: we basically fit the curves to get this, so we are forcing -- >>: This is beam pattern synthesis, right? >>: Based on the measured data, rather than -- >> Supreeth Krishna Rao: Yes. >>: Okay. >> Supreeth Krishna Rao: So, as you can see, the MVDR is represented in green, the desired is blue, and the curve-fitted response is in red, so it is quite evident that the red ones perform much better than the green ones, that is, the MVDR classical estimation of the weights. The same goes for the receiver responses -- pretty good. The red ones perform pretty well, and these are all the 24 beams, by the way. So for beamforming, we are making use of 24 beams, rotating with a step size of 15 degrees from 0 to 345 degrees, and these are all the 24 beams that you see. So, taking a look at the flow chart of this entire system, this is basically a recap of what we saw. With the IRGUI tool we generated the acoustic channel responses, and we design and generate the desired pattern -- I should probably mention a little bit about that. This was basically the signal, like I mentioned, that generates this pattern. So we generate that pattern, and then we take the pseudo-inverse -- it's basically a multiplication of the pseudo-inverse of the acoustic channel matrix, that is H, the acoustic channel response, with the desired pattern, so this gives us a curve fit, and ultimately we arrive at the beamforming weights for this particular recording. So if we do another recording -- another measurement of the acoustic channel responses -- we need to regenerate the weights. So once we get the beamforming weights, let's again traverse from here: we generate the eight-channel, 20-block signals in MATLAB, we play them through PlayGraph, and then we segment the received signal into a 4D matrix in the frequency domain -- we convert it to the frequency domain -- so the dimensions are NFFT, number of mics, that is eight, number of speakers, that is eight, and number of blocks, that is 20 in our case. So it's basically a 4D matrix. And we perform bandpass filtering on the received signal to reject all the noise and signals beyond our frequency range of interest, and after that we do beamforming for the transmitters, so we have the weights from here and the received signal, and then we basically do offline beamforming. This was something observed during the previous internship: leveraging the LTI -- linear time-invariant -- nature of the system, the previous intern suggested that transmitter beamforming can also be performed offline, which significantly improves the frame rate. 
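A rough sketch of the curve-fitting step described above -- computing beamforming weights by applying the pseudo-inverse of the measured acoustic channel matrix to a desired pattern -- might look like the following. This is a simplified, assumed formulation: the matrix shapes, the placeholder channel data, the desired-pattern construction and the names are illustrative, not the project's code.

    import numpy as np

    def fit_beam_weights(H, desired):
        """Least-squares beam pattern synthesis for one frequency bin.

        H:       (n_directions, n_elements) measured acoustic channel responses,
                 one row per look direction.
        desired: (n_directions,) desired beam pattern, e.g. 1 in the main lobe
                 and small values elsewhere.
        Returns the (n_elements,) weight vector minimizing ||H w - desired||.
        """
        return np.linalg.pinv(H) @ desired

    # Illustrative use for one of the 24 beams (15-degree steps, 0 to 345 degrees).
    angles = np.deg2rad(np.arange(0, 360, 15))
    n_elements = 8
    rng = np.random.default_rng(0)
    H = (rng.standard_normal((angles.size, n_elements))
         + 1j * rng.standard_normal((angles.size, n_elements)))  # placeholder channel data
    desired = np.zeros(angles.size)
    desired[0] = 1.0                                              # beam toward 0 degrees
    w = fit_beam_weights(H, desired)

In the actual system, weights like these would presumably be computed per frequency bin for each of the 24 beams, and, as noted in the talk, regenerated whenever the acoustic channel responses are re-measured.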
And then we perform microphone beamforming, as you saw before, and then we perform matched filtering to again get rid of unnecessary reflections and noise, and we perform a background subtraction, which you will be seeing in the next slide. So, some preliminary results from this new device look something like this: the raw map of 360 degrees, across 24 beams, and we do postprocessing to remove the noise using a naive approach, which is basically background subtraction. So you can see that -- I'll play a quick video here. So this was a walking detection test, to see if the system really detects a person approaching and slowly walking past the device. Exactly. Thank you, Hannes. Among so much clutter -- our environment was quite cluttered. Ideally, this should have been done in the anechoic chamber, but we received the device pretty much towards the end of the internship, when about three to four weeks were left, and there were heat sink issues and we had to install a heat sink, so the device was going back and forth between the hardware lab and us. So all of this didn't let us move the entire setup to an anechoic chamber and make better measurements. Well, amidst all this clutter, there is some happy news still. You can see that, for the walking detection test, this was the response. You can see. Of course, some postprocessing still needs to be done. We are currently just using live background subtraction. We can make use of particle filtering, [indiscernible] filtering or any confidence-based tracking approach to basically fix on the object of interest. >>: Actually, I think it would be interesting to see the video before background subtraction. >> Supreeth Krishna Rao: Oh, right, right. Do you want me to play the video again? >>: Before background subtraction, the raw video, as well. >> Supreeth Krishna Rao: Right. >>: It's right there. >> Supreeth Krishna Rao: So this is actually how it looks before background subtraction. You can notice that there are responses all over. Play that again. So there are responses pretty much everywhere, so finding a major source of reflection is pretty hard, especially given our cluttered environment; as you saw here, pretty much several objects right around the device are at the same distance, so that is why we get -- that's why we get -- >>: So this, you're calling it a naive form of background subtraction. It's only working because the sensor is fixed, correct? >> Supreeth Krishna Rao: Absolutely, absolutely. >>: So if the sensor were moving, you would need something much more sophisticated. >> Supreeth Krishna Rao: Yes, we would need tracking, probably optical flow, or particle filtering or something like that. >>: So you show reflections here and the positions in meters. What about the speed? Can you detect the speed of the moving object and use this to help? >> Supreeth Krishna Rao: Speed? Could you please come again? >>: The speed of the moving object. From the videos we saw, it's just reflections, which is distance, the time delay. >> Supreeth Krishna Rao: Yes. >>: How well does speed detection work? >> Supreeth Krishna Rao: So we were battling issues with the device until yesterday, so probably in a couple of days we might be able to really find out how the ambiguity function behaves on this new device. 
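As a sketch of the naive postprocessing just discussed -- subtracting a static background from each range-versus-beam map so that only moving reflectors remain -- something like the following would work for a fixed sensor. The map shape (24 beams by range bins) follows the talk, while the frame-averaging choice and the names are assumptions.

    import numpy as np

    def subtract_background(frames, n_background=5):
        """Naive background subtraction on a sequence of range-beam maps.

        frames: array of shape (n_frames, n_beams, n_range_bins), e.g. 24 beams.
        The first n_background frames (assumed static) define the background;
        anything at or below the background level is suppressed.
        """
        background = np.median(frames[:n_background], axis=0)
        foreground = frames - background[None, :, :]
        return np.clip(foreground, 0.0, None)   # keep only excess reflection energy

    # Example with dummy data: 20 frames, 24 beams, 500 range bins.
    frames = np.abs(np.random.default_rng(1).standard_normal((20, 24, 500)))
    moving_only = subtract_background(frames)

This only works while the device itself is stationary, as pointed out in the discussion; a moving sensor would need tracking, for example optical flow or particle filtering, instead.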
>>: Because, technically, another good criterion for filtering is to subtract everything that doesn't move and leave only the moving objects whose speed is above a certain threshold, and that will eventually provide a much cleaner image. >> Supreeth Krishna Rao: Yes, that's exactly what radars make use of, because they have threshold velocity ranges, and any object that is not moving in that range of velocities is rejected as noise, so that's what even the radars do. Yes, so a lot of postprocessing remains; it is yet to be done, and we'll figure out soon how the ambiguity function works, in probably a couple of days. >>: You have one more day. >> Supreeth Krishna Rao: Yes, not a couple. Yes. So it says here: thanks to my beloved mentor, Hannes Gamper. He helped me a lot, and none of this would have really happened without his guidance and support. And Dr. Mark Thomas -- he gave us several insights during the project to debug several issues with the hardware and the algorithms. The GUI tools that we saw -- the calibration, the impulse response recording, the beamforming tools -- all of them were developed by Dr. Mark. And thank you, Dr. Ivan, for your continuous encouragement and ever-smiling approach, and sincere thanks to Dr. David Heckerman. Basically, this project came into being because of him; he wanted this device to be prototyped and tested and the algorithms to be built, so thank you so much. And especially the hardware lab members, Alex Ching and Jason Goldstein -- they really helped us through all the hardware issues, right from development through testing. That's about it. Thank you so much. Any questions? >>: So, back to the beamformer synthesis for the loudspeaker array and the microphone array. Did you guys use the phase information? Did you get any phase information in order to mask complete noise? >>: Yes, so the beam matching actually uses both phase and magnitude. But the question is whether we should perhaps ignore the phase in some regions rather than use it for the whole 360 degrees. >>: I mean the phase response of the transducers. >>: Yes. >>: Another note here is that, technically, what is interesting to see is the multiplication of those two -- the joint directivity pattern of the transmitter beam and the receiver beam -- which actually brings another interesting idea: can we do a joint synthesis of the transmitting and receiving beamformers in a way that maximizes the directivity? It doesn't help if we have a big side lobe of the transmitter where we also have a lot from the receiver. >>: There was a time when we were deciding what the geometry would be, where we would maybe even interlace -- have transmitter, receiver, transmitter, receiver -- and arrange them in a way so that we could try to control all these aliasing artifacts. It seemed easier to create two independent arrays, because the analysis of those is very much more straightforward. And the other advantage is that by keeping them separate, the microphones could be crammed into a 50-millimeter diameter, which -- >>: You've got substantially lower aliasing from the microphone array than from the loudspeaker array. >>: Another thing here is, of course, that we actually measured the impulse responses in the setup he was showing, with truncation of the impulse response, obviously, going into that. 
An anechoic chamber might help for calibration purposes, and if we were to move away from the resonance, then perhaps we would also get -- >>: So pretty much, from what I saw, the calibration in your work ranges slightly above the resonance frequency, 44 to 50 kilohertz. You equalize the transmitters, you [indiscernible]. The response is far here. This is your work area. It's not from that. It's here. >>: Yes. >> Supreeth Krishna Rao: Yes, we'll definitely take that into account and retest, recalibrate. Any more questions? >>: So I guess I didn't quite understand: what is the spatial resolution of objects that you can detect? Are we talking mostly about cars and people and walls, or anything smaller than that? >> Supreeth Krishna Rao: Can you increase the volume? >>: Sure. How is the spatial resolution? >> Supreeth Krishna Rao: Oh, you mean -- >>: What size objects can you detect? Do you have to look at something the size of a human and larger, or can you look at smaller objects? >> Supreeth Krishna Rao: So several factors come into play, like the range at which the objects are -- that then decides the resolution -- and a direct, straightforward, frank answer is we still don't know, because we got the device just about two and a half, three weeks back, and getting the device set up, calibrated and getting some initial results was itself quite a challenge. But at least the map resolution is 15 degrees, as you saw, and with regard to how many bins of those maps a real-world object occupies, we still need to figure that out. We still need to test that. >>: So at 192 kilohertz, one sample times two [indiscernible] is 1.8. Do you regard this as your resolution in this particular [indiscernible]? >>: Forty-five kilohertz is about eight [indiscernible]. >>: But bats do it with 45-kilohertz chirps. They can capture insects, so it's possible. By the way, the brain of a bat is not that much more computationally powerful than a modern computer. >>: The advantage, I guess, the bat has is that both the bat and the insect are suspended in air. >>: No reflections, yeah. >>: And we want to do the same thing. >>: Well, if you attach the device to a drone. >> Supreeth Krishna Rao: Well, the drone itself might induce some vibrations. >>: True. The [indiscernible] to the drone. We could certainly take it outside. This was a setup that was necessitated mostly by time constraints, but if we were to take this outside and repeat the experiments there, I think we would get rid of a lot of reflections and then would get perhaps a better idea of how this might perform when there's less clutter. >> Supreeth Krishna Rao: Yes, and if it's an open space, we would definitely not get such responses -- here, there's pretty much an object right around the system at very close range, and the problem is that the human being is just one of these, at the same range. And also, the signal-to-noise ratio is not very great in such a cluttered environment. You don't know whether the reflection is from the human being, or whether it's hitting a closer object and then coming back; by then, the human being might have moved past. >>: You don't even need open space. You might choose the atrium, which is large enough, basically, to be equivalent to open space, except for the floor. >> Supreeth Krishna Rao: Right. >>: Okay, if there are no more questions -- >>: Very good work. You made progress in that direction. >> Supreeth Krishna Rao: Thank you. 
>>: Thanks for building on top of Ivan's work, and hopefully next year another intern will start to push this further. >> Supreeth Krishna Rao: Right, sir. Okay. >>: Thank you.