>> Ivan Tashev: Good afternoon, everyone. It's the traditional five minutes past the hour, so we can start. Today we have Professor Kainam Thomas Wong from Hong Kong Polytechnic University. He's an associate professor there; he took his Bachelor's degree at the University of California, Los Angeles, and his Ph.D. from Purdue University. He's going to talk today about acoustic velocity sensors and what kind of beamforming and sound capture we can achieve with them. Without further ado, Thomas, you have the floor.

>> Kainam Thomas Wong: As Ivan mentioned, my talk is about acoustic sensor arrays. The objective we have is beamforming, and specifically frequency-invariant beamforming. What I mean by this term is that the beamformer weights do not depend on frequency: not on the frequency of the incoming signals, not on the frequency of the incoming interference, and not on the frequency of the noise. And we can realize this because of a special kind of sensor that we are going to use. Not a microphone, not the [indiscernible], not a uniform circular array, but a special kind of sensor called the acoustic velocity-sensor triad. So basically, we have three velocity sensors in this system.

This is the outline of my talk. First, what is a velocity-sensor triad? What are its advantages for beamforming? And then a specific algorithm for adaptive beamforming using the velocity-sensor triad. Now, adaptive beamforming: different people use this term to mean different things. What I mean here is not a fixed beam pattern, not spatial matched filtering, but that some part of the data model is unknown to the algorithm and changes over time, so the algorithm itself adapts to the external environment. And then there will be some data from a jury trial, using simulated speech data. This presentation is based on a paper that appeared just last month in JASA.

Okay. So what is a velocity-sensor triad? This is the array manifold of a velocity-sensor triad. We have three components in the triad, and this is the measurement model. If the source comes in with elevation angle theta and azimuth angle phi, this will be the measurement -- I mean, the array manifold. Now, notice that u, v, w turn out to be just the direction cosines along the X axis, the Y axis, and the Z axis. This is just a property of the velocity sensor, and we have three of them.

Now, notice some interesting properties of this array manifold. It depends on the elevation angle and also on the azimuth angle. Very importantly, there is no frequency, no wavelength here -- unlike the ULA, the uniform linear array, or the uniform circular array. Also, this array manifold is applicable for the far field, with the source very far away, or with the source relatively close to the sensor itself. Again, there's no R here: the R, the separation between the emitter and the sensor, does not appear in the array manifold. Those properties -- that f does not appear and that R does not appear -- will be really, really advantageous for adaptive beamforming, as we will see in a moment.

Let me just skip over this page and show you a picture of the velocity sensor. Okay. So it's [indiscernible]. The velocity sensor for air acoustics is actually commercially available, for example from a company in Illinois. This is the spec sheet, just downloaded from the company website; the vertical axis is the [indiscernible] response, the horizontal axis is just frequency.
According to this spec sheet, it has a pretty constant response from 100 hertz to maybe three or four kilohertz, within 3 dB. So this kind of inexpensive velocity sensor would be good enough, perhaps, for speech processing. But that's just one example; there are other commercial models of velocity sensor available. A professor from the University of Illinois Urbana-Champaign actually has a do-it-yourself kind of vector sensor. So this is one, two, three velocity sensors and one microphone. The X axis is here, the Y axis would be here, the Z axis would be here. They are not exactly co-located, but relative to the wavelength, they're almost co-located. And in here there's a pressure sensor, which is optional.

Now, there are also commercial products available for the vector sensor; one doesn't have to do it yourself. This is a company in the Netherlands, and this is a picture I downloaded from the company website. According to the company, the frequency range goes from one tenth of a hertz to 20 kilohertz -- so the human hearing range for a very young person -- within one dB. That's what the company claims. So the velocity-sensor triad is practical. It is commercially available, it's been implemented, and we can actually buy it.

What we have been talking about is for air acoustics, but the velocity sensor actually has a long history, going back over a century according to some references, according to some books. Mostly it has its roots in underwater acoustics, for defense purposes. This is the underwater-acoustic version, the vector hydrophone; this is just the external case, and the black thing is just the support. For example, a boat may tow an array of this type of vector hydrophone behind it, or behind a submarine.

Okay. So far, we have seen that the velocity-sensor triad is available; if we have the money, we can just go and buy it. So what are its advantages? Well, we hinted at this a while ago. The advantages lie in the simplicity of its array manifold: simplicity in the sense that it is independent of frequency and independent of R. Now, when we say independent of frequency, this is an idealization; when we look at that product's [indiscernible] response, the [indiscernible] does drop off after four kilohertz and so on. But it is independent of frequency in the sense that, since the three components are the same, the drop-off would be similar for all three. In that sense, it is independent of frequency.

There are actually four advantages that I'm going to talk about. The first two are really simple; we mentioned the first one already. One tiny unit gives us the azimuth and also the elevation angle. It doesn't have to spread out on an array grid like a ULA. It can be very, very compact.

The third advantage is the one I would like to dwell on a little longer. The array manifold does not depend on frequency, so when we do beamforming, the beamformer weights also do not have to depend on frequency. We can make them depend on frequency, but that's optional; they do not have to. Now, this is in contrast to a ULA, a linear array of uniformly spaced microphones or omnidirectional pressure sensors. If we just consider a three-element ULA, this would be the ULA array manifold familiar to, I think, all of us. The frequency appears through the wavelength here.
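For reference, the two manifolds being contrasted here can be written out explicitly. These are standard textbook forms rather than equations copied from the slides, so treat them as a reconstruction. For the velocity-sensor triad, with elevation angle theta and azimuth angle phi:

    a(theta, phi) = [u, v, w]^T
                  = [sin(theta)cos(phi), sin(theta)sin(phi), cos(theta)]^T

with no frequency f and no emitter range R anywhere. For a three-element ULA with inter-sensor spacing d, propagation speed c, and signal frequency f:

    a_ULA(theta, f) = [1, exp(-j*2*pi*f*d*sin(theta)/c), exp(-j*4*pi*f*d*sin(theta)/c)]^T

where f enters through the phase terms -- that is, through the wavelength.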
So suppose the ULA has a look direction of 30 degrees, and we set the beamforming weights for two kilohertz, with the speed of sound in air, and then we plot the spatial matched filter beam pattern. If the signal indeed comes in at two kilohertz, we get the purple curve, and that is right, because we get a peak at 30 degrees. But if the frequency is 500 hertz, very much within the human speech range, the peak migrates significantly to the left, such that at 30 degrees it actually has an attenuated [indiscernible] response. And if the frequency rises to 3.5 kilohertz, at 30 degrees we again may get only 0.75, and so on. So basically, what I try to show here is that for a ULA, or similar kinds of [indiscernible], UCA, URA, uniform rectangular array, things like that, because the frequency appears in the array manifold, we really need a lot of signal processing to account for the array manifold's dependency on frequency.

This is for an L-shaped array. Similarly, the frequency appears through the wavelength. I give this example because with an L-shaped array we can get the azimuth angle as well as the elevation angle -- but still, the frequency appears. So if we have a beam pattern with a look direction of elevation angle 30 degrees and azimuth 30 degrees at a frequency of two kilohertz, this would be the beam pattern. But with the same beamforming weights and a frequency of 500 hertz, the beam pattern changes quite a bit. And if the frequency becomes 4,000 hertz, the beam pattern is again quite different. This is the intended location of the peak, but here, where the peak should be, we actually get a null. So the frequency dependency of the array manifold could be a really big problem.

>>: [indiscernible].

>> Kainam Thomas Wong: Right. The reason we have this kind of phenomenon is that the pressure sensors are not co-located. And, as you mentioned, if we cannot use phase, but use time instead, then it would be wideband processing, and the time and the direction would be coupled together. But for the vector sensor, the look direction and the frequency axis can be decoupled, so we can handle the two separately. And that gives us more versatility.

>>: Still, why does -- okay, you have estimated the delay-and-sum weights for 2,000 hertz, and that's fine. But if we talk phases, this is mostly processed in the frequency domain. We can estimate the proper weights for every single frequency band, and then [indiscernible] the maximum will always be at the desired location. This is not a substantial amount of computation, which would be a problem for [indiscernible].

>> Kainam Thomas Wong: I would agree with you that this kind of system can be used in the way you mention. Now, if it is adaptive signal processing -- if some of the interferers or sources are moving around, and if you want the CPU to be really, really inexpensive -- then the vector sensor would be one alternative. How cheap a CPU is cheap enough, I think that's a system development question, so you would know much more than I do. It probably depends on how cheap is cheap enough, and also on how adaptive the [indiscernible] needs to be.

>>: What this velocity sensor measures is actually the speed [inaudible].

>> Kainam Thomas Wong: Right. The proper term, I guess, is the acoustic particle velocity field vector.
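Returning to the ULA frequency-mismatch effect described a moment ago: it is easy to reproduce numerically. The following is an illustrative sketch, not the speaker's code; the three-element geometry, the half-wavelength spacing at 2 kHz, and the exact numbers are assumptions, so the values and the direction of the peak migration will not match the slide exactly.

    import numpy as np

    C = 343.0                 # speed of sound in air, m/s
    F0 = 2000.0               # design frequency, Hz
    D = C / F0 / 2            # half-wavelength spacing at 2 kHz (an assumption)
    LOOK = np.radians(30.0)   # intended look direction

    def ula_steering(theta, f, n=3):
        """Steering vector of an n-element ULA at frequency f and angle theta."""
        return np.exp(-1j * 2 * np.pi * f * D * np.arange(n) * np.sin(theta) / C)

    # Spatial-matched-filter weights, designed once, for 2 kHz only.
    w = ula_steering(LOOK, F0) / 3

    angles = np.radians(np.linspace(-90, 90, 721))
    for f in (500.0, 2000.0, 3500.0):
        pattern = np.abs([w.conj() @ ula_steering(a, f) for a in angles])
        peak_deg = np.degrees(angles[np.argmax(pattern)])
        at_look = np.abs(w.conj() @ ula_steering(LOOK, f))
        print(f"{f:6.0f} Hz: peak at {peak_deg:+6.1f} deg, "
              f"response at 30 deg = {at_look:.2f}")

At the design frequency, the peak sits at 30 degrees with unit response; at the other two frequencies, the peak wanders away from 30 degrees and the look-direction response is attenuated, which is exactly the effect being described.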
As for how one measures this particle velocity -- maybe I'll skip over the details. One way is to have two microphones and take [indiscernible] the difference. That's one way. Another way is to measure the velocity directly, by optical methods, by thermal methods, and also by other mechanical methods.

>>: Even though you can correct it computationally, just the acquisition can be a problem sometimes. You have to build a lot more dynamic range into the front end to capture the data, because of the frequency effects and the [indiscernible] effects there too. There's a computational [indiscernible] too.

>> Kainam Thomas Wong: And also, for some applications, the interference can be -- I mean, some of the sources may have a known frequency band. Maybe it's not speech; maybe it's something else. For example, one thing this is used for is to detect snipers, or to detect an unexpected sound indicating that something abnormal, something that should not happen, is happening. So suppose the [indiscernible] is human speech, but we want to reject some impulsive interference whose frequency band may be at some unknown location; the AVS might give us some additional versatility.

>>: [inaudible].

>> Kainam Thomas Wong: So I would totally agree that the AVS is not categorically superior; it offers some positives and also some negatives. Whether the AVS is [indiscernible] for a particular product is really a system development judgment. Okay. So basically, because we decouple the look direction from the frequency axis, it could be computationally simpler if we use the vector sensor -- Ivan just mentioned that. Speech, music, and background noises are really broadband, and the bandwidth may sit at unpredictable locations, vary over time, and typically be [indiscernible] unknown. So if we can decouple the look-direction coordinate from the frequency coordinate, it might give us some additional advantage.

So the velocity-sensor triad, which I sometimes call the vector sensor, has a beam pattern that does not depend on f and also does not depend on R. Here's another advantage: for the ULA in the near field, if R is the distance between the [indiscernible] and the sensor, the array manifold would depend on R. But for the velocity-sensor triad, R does not appear. So another advantage is that the same beam pattern can be used regardless of how far away or how close the emitter is from the sensor system.

Okay. So how about using the velocity-sensor triad for adaptive beamforming? I guess many people here are beamforming experts, so I'll go through this really quickly. The scenario in my subsequent discussion is a conference room. There are potentially up to six simultaneously active speakers, and the sensor in this case is placed on the ceiling. This is just the simulation scenario for the subsequent pages. This is the collected data at time t: a particular speaker is the desired speaker, and then on the diagram [indiscernible] would be five -- five interfering speakers -- plus some noise. So I form a three-by-three spatial covariance matrix. Why three by three? Because we have a triad; we have three velocity sensors, so it's three by three. And this is the minimum power distortionless response (MPDR) beamformer. Some people may be a little unfamiliar with this terminology, but it's basically like minimum variance distortionless response, MVDR.
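The MPDR weights being described have a standard closed form, w = R^{-1} a / (a^H R^{-1} a). Here is a minimal sketch of that computation for the triad -- a generic textbook implementation, not the speaker's code; the gamma argument anticipates the diagonal loading discussed a few slides later.

    import numpy as np

    def triad_steering(theta, phi):
        """Velocity-sensor-triad manifold: direction cosines only, no f, no R."""
        return np.array([np.sin(theta) * np.cos(phi),
                         np.sin(theta) * np.sin(phi),
                         np.cos(theta)])

    def mpdr_weights(snapshots, theta, phi, gamma=0.0):
        """MPDR weights from a 3 x T block of triad snapshots.

        gamma is an optional diagonal-loading factor (zero = plain MPDR).
        """
        a = triad_steering(theta, phi)
        R = snapshots @ snapshots.conj().T / snapshots.shape[1]  # 3x3 sample covariance
        r_inv_a = np.linalg.solve(R + gamma * np.eye(3), a)
        return r_inv_a / (a.conj() @ r_inv_a)  # unit gain along the look direction

Because the steering vector contains no frequency, this single three-element weight vector serves the whole band -- that is the frequency-invariance claim of the talk.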
The only difference is that in MVDR, this matrix is supposed to reflect the true statistics, whereas in MPDR it is just the empirically collected data. The idea behind it is basically the same as MVDR. So in MPDR beamforming, the weight vector will ensure no distortion along the look direction, the tuned direction, while minimizing the beamformer's overall output power.

Now, remember, in the triad we only have three sensors, and we have a very stressful situation here. We have the desired user, the blue-purple mark here; this is the azimuth angle, that is the elevation angle. And we have five interferers, so we actually have six people talking together, at the same time. This is a very stressful situation. The left-hand side and the right-hand side are basically the same thing; the right is just a contour map. Now, the SOI admittedly is not at a peak, but if you look at the nulls -- a null here, another null there -- the two nulls are placed near the five interferers. So the beamformer does improve the SINR, the signal-to-interference-plus-noise ratio. I emphasize this is a very stressful situation, because we have more emitters than sensors, of which we have three. I also have to admit, you know, among my [indiscernible] scenarios, this is the best-looking one.

Okay. This is just to summarize the advantages of using the velocity-sensor triad. One set of beamforming weights for all frequencies. One set of beamforming weights without regard to how close the speaker is. One set of beamforming weights regardless of the interfering sources' distances from the sensor system. No need for any prior information on the time-frequency structure of the signal, of the [indiscernible], or of the interference. So simplicity would be its primary advantage. Whether that simplicity is worthwhile in any particular system, that would be a system development kind of judgment.

Okay. Now, how about if the look direction is unknown? Well, there's something called MUSIC. MUSIC has nothing to do with music; it's just an acronym for a parameter estimation method. Through this kind of method, we can actually estimate the desired look direction. The assumption behind this method is that the desired speaker is the loudest -- which I hope is the case, unless somebody tries to shout over the desired speaker. So here the vertical axis is the estimation bias in degrees; this is one degree. In this particular scenario, with interference present, if the SINR is somewhere around 10 or 15 dB, the estimate would be correct to within one degree.

Now, how about if the MUSIC method does not estimate precisely enough, or if the velocity-sensor triad is tuned by some machine or human, but tuned imprecisely? What would happen? Well, that's the problem of mispointing the beamformer. If the beamformer is mispointed -- I'm sorry, I don't have the numbers with me on this slide, but I have them in the paper -- then the desired user could indeed be nulled. But there's a signal processing method called diagonal loading. It's a very simple method: just add an extra identity matrix here, scaled by the loading factor gamma. Then the desired speaker will not be nulled anymore.

>>: [indiscernible]?

>> Kainam Thomas Wong: According to the papers I have read, and I have read most of the papers there, it's kind of ad hoc. They don't have a theory for it.
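To see why the loading factor gamma matters, here is a small simulated mispointing experiment, reusing triad_steering and mpdr_weights from the sketch above. The angles, the 5-degree pointing error, the noise level, and gamma = 1 are arbitrary illustrative choices, not values from the paper.

    import numpy as np
    rng = np.random.default_rng(0)

    true_dir = (np.radians(60), np.radians(20))  # actual desired speaker
    look_dir = (np.radians(60), np.radians(25))  # beamformer mispointed by 5 degrees

    T = 10000
    s = rng.standard_normal(T)                   # desired signal, unit power
    x = np.outer(triad_steering(*true_dir), s)   # triad snapshots
    x += 0.01 * rng.standard_normal((3, T))      # weak sensor noise (high SNR)

    for gamma in (0.0, 1.0):
        w = mpdr_weights(x, *look_dir, gamma=gamma)
        gain = abs(w.conj() @ triad_steering(*true_dir))
        print(f"gamma = {gamma}: gain toward the true source = {gain:.2f}")

Without loading, the mispointed MPDR treats the strong desired source as interference and largely nulls it; with loading, most of the gain toward the true direction is restored, which is the behavior described on the slide.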
So I guess, for a certain kind of application scenario, people will try ahead of time what range of loading typically works for that class of scenario. And, of course, for the right-hand-side diagram, I tried many, many values and finally this one looked the best.

>>: But still, the question is: if the [indiscernible] system uses a given set of sensors and makes a mistake, isn't it better to make the same mistake when you do the capturing? So if the localizer thinks the sound source is right there, most [indiscernible] not the correct location, most probably the beamformer should point where the system thinks it is, because for some reason the sensors are not identical [indiscernible] capture the sound.

>> Kainam Thomas Wong: I think pointing error may come from many different causes. Maybe the cause you mention is that the array itself is not calibrated, such that when you say "incorrect", maybe what you mean by "correct" is the nominal case where the array is perfectly calibrated: an ideal ULA where the sensors are isotropic, with [indiscernible] gain and spaced at half a wavelength or whatever -- a perfectly idealized version. That gives one beamforming vector. And then the actual physical ULA that we have, which would be uncalibrated -- the microphones may not have equal gain, they may not be located at exactly the nominal positions -- that non-ideal ULA would give us another set of beamformer weights. And we should not pretend the two are equal. That, I would totally agree with.

But the mispointing here is a different kind of mispointing. This mispointing would exist whether we have the ideal ULA or the uncalibrated, imperfect ULA, simply because MUSIC, or any other parameter estimation algorithm, has bias. And the bias could be quite significant if the SINR is bad enough. Or the tuning may be done by a person, and the person may be a little sloppy and not manually tune to the correct direction. So I'm talking about a different kind of mispointing, not about whether the array is calibrated. Even if the array is calibrated, the algorithm can have estimation bias -- because of noise, because of interference, because of other kinds of imperfections.

Okay. Now, the jury trial is not very realistic, I would be the first to admit. I did not actually have the money to buy a vector sensor for that system.

>>: Which costs, by the way, $15,000.

>> Kainam Thomas Wong: And my graduate student is trying to build one, but is still building it; the amplifier is not that easy to do, apparently. So what we did, for the sake of publishing the paper, is just download some files from the internet -- people reading news, people reading some book or something -- and basically put them into the data model here and then through the signal processing.
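The simulation pipeline just described -- downloaded speech pushed through the anechoic data model, plus additive noise -- can be sketched as follows, again reusing the functions above. Everything concrete here (the angles, the sample rate, the noise level, and the random stand-ins for the speech files) is a placeholder, since the talk does not give the actual values.

    import numpy as np
    rng = np.random.default_rng(1)

    # One desired speaker plus five interferers, at made-up (theta, phi) angles.
    directions = [(1.0, 0.3), (0.8, 1.5), (1.2, 2.2),
                  (0.9, 3.0), (1.1, 4.1), (0.7, 5.2)]

    T = 16000  # one second at 16 kHz
    speech = [rng.standard_normal(T) for _ in directions]  # stand-ins for recordings

    # No reverberation, matching the talk: each source passes through the manifold only.
    x = sum(np.outer(triad_steering(th, ph), s)
            for (th, ph), s in zip(directions, speech))
    x = x + 0.05 * rng.standard_normal(x.shape)  # additive white Gaussian noise

    # MPDR output toward the desired speaker: the single channel played to the jury.
    w = mpdr_weights(x, *directions[0])
    y = w.conj() @ x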
Now, the jury is real; they are actual humans. The jury evaluation is on a scale from zero to ten, so 11 possible marks. If it is zero, one, two, or three, the speech is totally unintelligible; if it's seven, eight, nine, or ten, it is totally intelligible. But of course, a ten would have very good sound quality, beyond just being intelligible. So here I have a 15-member jury. The vertical axis is the average score; the horizontal axis is the SINR into the beamformer. I have three curves. ISO, the black curve, is the isotropic single sensor -- not a ULA, just one sensor. This curve is the AVS, but using spatial matched filtering: just a spatial matched filter with a look direction toward the desired speaker. The interference will affect the SMF beamformer's output, but not its weights. The MPDR, as expected, has the best performance. As a matter of fact, at SINRs where the single microphone and the matched-filter AVS both score basically a one, the MPDR speech is already intelligible, at a seven.

Now, that was the case for three speakers. Remember, we have a triad, so: three speakers, including the desired speaker -- the desired speaker plus two interferers. In this case, we have the desired speaker and five interferers, so it's very stressful. Still, the MPDR gives us a little help, by about a score of one. Because the situation is really stressful, the gain over spatial matched filtering is just that one point. May I remind you, it is a very stressful situation: six speakers simultaneously active.

>>: What kind of noise [indiscernible] do you measure it with?

>> Kainam Thomas Wong: Just white Gaussian. Additive white Gaussian noise.

>>: And in the corpus generation, you guys would [indiscernible], so we can say that this is a no-reverberation case.

>> Kainam Thomas Wong: No reverberation. So it is really not that convincing. But this is just to show that, at least under this ideal situation, there could be some improvement.

>>: Do you have an audio example? May we hear what it sounds like?

>> Kainam Thomas Wong: Unfortunately, when I was coming here today, I remembered that I should have put the sound samples in, because we are talking about speech. I actually do not have the proper sound samples on my laptop. Also, because this was done in Hong Kong, the samples are in Chinese: the students have limited English fluency, and I didn't want English language problems affecting the result. And it's much easier to find people to listen to Chinese speech in Hong Kong. So unfortunately, I don't have them. I should, but I don't. I'm sorry. But this is really a toy scenario, I totally agree.

>>: [inaudible] not have a source that's the same as your original one you're trying to focus on [inaudible].

>> Kainam Thomas Wong: I totally agree; that's a very good criticism. I mean, I'm an academic, so my business is publishing papers, and my publisher was happy with it. Part of the reason I came to Microsoft is perhaps to learn from you what kind of realistic problems I should deal with -- yes, with reverberation; to develop a real system, we would really need to look into that. For publishing a paper, this toy scenario could get through, fortunately. On the other hand, reverberation is, in a way, a somewhat different dimension, because the purpose of this paper is just to demonstrate that this kind of velocity-sensor triad can separate the frequency dimension and the radial dimension from the look direction. That was actually the only thesis of this paper. So there are a lot of questions not addressed by it.

>>: I totally agree. So, in general: you used a theoretical model of the gradient sensors, assuming that they are perfectly identical. How robust or sensitive is this particular implementation of MPDR [indiscernible]?

>> Kainam Thomas Wong: Right. I have research work in progress with a mathematician to look into that problem.
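While that analysis is in progress, the robustness question can at least be phrased as a quick Monte Carlo sketch, perturbing the manifold from the sketches above. Every distribution and magnitude below -- the 5% gain error, and the 2-degree orientation error modeled crudely as an angle shift -- is invented for illustration, not taken from the paper.

    import numpy as np
    rng = np.random.default_rng(2)

    theta, phi = np.radians(60), np.radians(20)
    a_nominal = triad_steering(theta, phi)

    retained = []
    for _ in range(1000):
        gains = 1 + 0.05 * rng.standard_normal(3)      # per-channel gain mismatch
        tilt = np.radians(2) * rng.standard_normal(2)  # rough proxy for orientation error
        a_actual = gains * triad_steering(theta + tilt[0], phi + tilt[1])
        # Normalized matched-filter gain when the weights assume the nominal manifold:
        retained.append(abs(a_nominal @ a_actual) /
                        (np.linalg.norm(a_nominal) * np.linalg.norm(a_actual)))

    print(f"mean look-direction gain retained: {np.mean(retained):.3f}")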
If the channels mismatch -- the gain, the phase, and the location -- I mean, we have an array of such triads, and a triad may not be at its nominal location. And if I have two triads, how about if their orientations are not identical? Just shifted by one or two degrees -- it's possible. Then how would that degrade performance? So I have research ongoing with a mathematician, and we model that mismatch statistically: if the gain mismatch has, say, a bias of zero, but is stochastic and Gaussian distributed, then how would it affect the direction finding? How would it affect the beamforming? We are trying to derive a nice, clean equation to show exactly what the degradation is. So we're working on that.

>>: One note on this experiment. Even for the six speakers -- the left plot shows that you have a correct implementation of the beamformer, and with three microphones it can place [indiscernible] towards the desired direction and [indiscernible] desired speakers. Perfect. But then once you go to multiple speakers, even --

>> Kainam Thomas Wong: We have multiple speakers here.

>>: In the six-speaker scenario, even in this case, I would go with processing [indiscernible] estimation, the MVDR beamformer [indiscernible]. The reason for this is that speech is a very sparse signal: you can have six speakers talking, but it's unlikely to have six speakers in the same frequency bin. So each frequency bin, processed separately with frequency-dependent weights, will adaptively place the [indiscernible] towards those two speakers which are [indiscernible], turning the six-speaker scenario into the results of the two-speaker scenario.

>> Kainam Thomas Wong: Right, I would agree that would be an alternative -- that would be a very good method. The trade-off here is very simple, in terms of the computation. So, is that simplicity worthwhile? It depends on the system development philosophy, I guess. It may not be worthwhile, or it might be, depending on the specification.

>>: Not much CPU is required. I think one mobile phone could easily drive at least ten of those beamformers in real time.

>> Kainam Thomas Wong: Okay.

>>: It's a dual-core CPU in today's telephones.

>> Kainam Thomas Wong: Okay.

>>: And, of course, the [indiscernible].

>> Kainam Thomas Wong: So, other comments?

>>: More questions, I think [indiscernible] to begin with, so feel free to interject and ask questions, please.

>>: I have a question about the low-frequency performance, if you were to make a [inaudible]. As a measure of the velocity, it's measuring the [indiscernible] gradient of the pressure, which makes it much more sensitive -- if you had a source that could produce a constant pressure amplitude and swept it across frequency, the sensor is going to be much more sensitive at high frequency, because you have a much greater pressure gradient. If you wanted to make a wideband beamformer, that may be seven or eight octaves, which means that somewhere along the chain you have to have compensation of maybe 50 dB. In other words, you've got to boost the low frequencies by 50 dB relative to the very high frequencies. So that's going to limit your low-frequency performance, and it's also going to be limited by the noise. Now, as I understand it, these devices aren't made by -- each is measuring the [indiscernible] loss?

>> Kainam Thomas Wong: I'm not too sure how the transducer works.
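As a sanity check on the 50 dB figure just mentioned: for a plane wave p(x, t) = P exp(j(2*pi*f*t - 2*pi*f*x/c)), the spatial pressure gradient has magnitude |dp/dx| = (2*pi*f/c) * P, proportional to frequency. So a gradient-based sensor seeing flat pressure amplitude across, say, eight octaves sees a sensitivity spread of a factor of 2^8 = 256, i.e. 20*log10(256), about 48 dB -- consistent with the "maybe 50 dB" of compensation the questioner estimates.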
>>: It doesn't have to be a noisy process, which means that you're [indiscernible] on your low-frequency performance. Do you have any comments on how practical these are?

>> Kainam Thomas Wong: I really don't have any comment on that. My background is signal processing, so I don't know much about those implementation issues. That's why --

>>: The [indiscernible] I have seen is pretty much a ten-millimeter pipe with a tiny wire going through, which is [indiscernible] heated, and then they measure the [indiscernible], which means how cool or how hot the wire is -- that's it.

>>: That's proportional to the pressure gradient.

>>: Yes. And to get the [indiscernible] frequency response as they show, you have to [indiscernible] filter.

>>: Which is a heck of a lot for a [indiscernible]. So it's not really frequency invariant; it's frequency invariant over a bandwidth. There are going to be [indiscernible].

>> Kainam Thomas Wong: Right, right. Thanks for the comment. I really don't know much about the physics [indiscernible] behind it, but I think that would be a very interesting research topic for a signal processing person like me: how to correct for it by signal processing.

>>: I think it's for this reason that people are using spherical microphone arrays, where you're not measuring the pressure at one point but at some distance away from the center of the system. And that helps to deal with the low-frequency problems.

>> Kainam Thomas Wong: Okay. Thanks for the comment.

>>: So, still considering the sensor per se: if you have the sensors spaced out at an increased distance, let's say [indiscernible] four or five centimeters, so the sensors are a couple of centimeters away from each other, then in this case you can use the differences in the times of arrival, in the phases. With this particular design, you put the sensors close to each other, which means the only cue you have for the direction of arrival is the magnitudes, because the sensors have very specific patterns. Can you comment on this? What do you gain and what do you lose?

>> Kainam Thomas Wong: I haven't really compared the two approaches. One obvious thing is that time-difference-of-arrival would need more than one location; that point is very obvious. As for computation power -- you mentioned it's not a big factor at all -- the AVS, I mean this method, would save a little bit of computation power; how important that saving might be, I don't know.

>>: When you put them at a distance, we can use both the differences in magnitude and in phase. From the moment you put them together, the only cue you have is the magnitude -- that's it. And this pretty much limits whatever directivity pattern you make to a first-order directivity pattern. So the best you can do is [indiscernible] the directivity pattern [indiscernible] isotropic ambient noise fields. That's it. While with the [indiscernible] microphone array, you can go a little bit further than that.

>> Kainam Thomas Wong: Right. The directivity pattern here -- I don't know what you might have in mind -- is the spatial matched filter pattern. With eigenstructure signal processing, in adaptive beamforming, we can make the main lobe much narrower than the spatial matched filter's main lobe. So with some adaptive signal processing beamforming techniques, we can actually make the main lobe much, much sharper than this. This is spatial matched filtering, kind of.
>>: So this is the equivalent of the [indiscernible] beamformer. It's pretty much a delay-and-sum beamformer. And you cannot have it go much beyond that [indiscernible]. And that's your directivity pattern. That's it -- four sensors.

>> Kainam Thomas Wong: The kind of beamforming you mention -- I mean, this would be a subclass of the kind of beamforming you mention. Even with just a delay-and-sum beamformer, if we pick the delays and the weights properly, we can make the main beam width much sharper than the matched-filtering kind of beam width.

>>: [indiscernible] the sensitivity of any one of the velocity sensors is a figure of eight [indiscernible]. And you're never going to get a directivity that's any sharper than a figure of eight. That's your limit. In fact, the sharpest you can do is to create a [inaudible].

>> Kainam Thomas Wong: Right.

>>: That's the physical limit. You need to have a higher-order sensor in order to get sharper directivity.

>> Kainam Thomas Wong: The kind of thinking that I have -- maybe it is not correct, but just for brainstorming -- is that, yes, that is the [indiscernible] of an individual sensor, but we have several of them. And if we pick the summing weights wisely, the entire array's composite beam pattern can be much sharper. It's a little like the ULA of [indiscernible] isotropic sensors: each individual sensor has no directivity at all, but if we have a ULA and pick the beam weights wisely, the entire array can have a somewhat sharper main lobe. So I don't know if we're talking about the same thing.

>>: We're actually going to have a chat later, so maybe we can go through that. So if you've got these three figure-of-eights that are -- say you've got a case where you've got an incoming wave that is coming from this direction. You're at the nulls of two of them. So the only sensor that's going to give you any information is the one whose main lobe is aligned [indiscernible]. That's the limit of your directivity.

>> Kainam Thomas Wong: But how about if --

>>: The other two sensors become useless in that case.

>>: Okay.

>>: Then you get --

>>: Yeah, and that's it, you go to 6 dB. If you have the omni mic, you can go a little bit farther.

>>: That's why he's [indiscernible], because if he has another one, the interference is going to have signal of three.

>>: That's correct.

>>: It puts a null there, essentially. There's a place where the gain's going to be huge.

>>: You're absolutely -- you've got to do better than that.

>>: But I'm saying, if you still have the figure-of-eight beam [indiscernible] in such a way that the null is on your interference --

>>: Okay, yeah. The null is steeper than the lobe, yeah.

>>: So [indiscernible].

>>: Assuming [indiscernible], yeah. If you have one interfering source that you want to point a null at, then yeah, that's different. If you had one source and you wanted to measure the directivity of the resulting system, it's never going to be any greater than the directivity [indiscernible].

>> Kainam Thomas Wong: Right, right.

>>: [indiscernible] have you steer it.

>> Kainam Thomas Wong: Correct, right, right. My question is: even for a ULA of isotropic sensors, if we have one source and we want to steer towards that source by wisely picking the beamformer weights, we can still have a very high gain towards the source look direction.

>>: Yes.
>> Kainam Thomas Wong: So if we also use some kind of beamformer, and pick the beamformer weights wisely for the three components in the velocity-sensor triad, can we actually have a sharper beam towards the [indiscernible] direction? As for the isotropic --

>>: [indiscernible].

>>: [inaudible].

>> Kainam Thomas Wong: For the ULA case --

>>: [inaudible].

>> Kainam Thomas Wong: Right, but that ULA beam would still be narrower than any one individual isotropic sensor's gain. So for the ULA case, by doing the signal processing wisely, we can add a little more gain beyond the gain of an individual sensor.

>>: So you're talking about an array of this type of sensors, right?

>> Kainam Thomas Wong: Just one of them.

>>: I think the triad is on an orthogonal coordinate system; the ULA sensors are all the same. So if you combine multiple things on the same coordinate axis, you can get benefits by combining them if they're orthogonal.

>>: So if they have spatial diversity, then this doesn't happen.

>> Kainam Thomas Wong: But actually, there are two spatial dimensions -- azimuth and elevation. So there are actually two independent coordinates, spatially speaking, azimuth and elevation, and we have three sensors. Now, I would agree with you: if the [indiscernible] happens to be parallel to the X axis or the Z axis, the other two would give us essentially zero response, so no matter what beamformer weights we pick, there is basically no effect. But how about the more general situation? If it is not [indiscernible] to the X, Y, or Z axis, we have three sensors but two direction-of-arrival coordinates, azimuth and elevation.

>>: But you've added another degree of freedom to your source. So just because it's in 3D doesn't make it any -- imagine you had two figures of eight. You can steer [indiscernible], whatever, you can steer that to any angle you like. But the directivity index doesn't change with angle. It doesn't matter if it's completely in line with one or completely in line with the other; the directivity index stays the same. And that directivity index is limited by the directivity pattern of any one of the sensors.

>> Kainam Thomas Wong: But when we do beamforming with the velocity-sensor triad, we are not just rotating it; we are also making one of them larger relative to the others. This is not --

>>: It's the same thing. You put in a normalization term, and that's the rotation.

>> Kainam Thomas Wong: Okay. I need to look more into it. But --

>>: How do you achieve more [indiscernible] directivity when you only have first-order directional [indiscernible]? If you have them in two different locations -- but this is at one point in space.

>> Kainam Thomas Wong: My problem with the discussion right now is that I would need a mathematical definition of the directivity; I mean, I need the precise definition. My impression is that we might be using the same term slightly differently.

>>: So the pattern gain is a function of the direction and the orientation?

>> Kainam Thomas Wong: Yes, yes, I understand that. But when we talk about the triad, then the directivity of the entire triad versus the directivity of an individual velocity sensor -- I don't quite follow your reasoning there, but I'll think more about it.

>>: Maybe you can [indiscernible] this afternoon.

>> Ivan Tashev: Any more questions?

>>: Yes.
You showed a whole bunch of commercial velocity triads. What processing do they use, compared to the method you presented?

>> Kainam Thomas Wong: Actually, I only showed one commercial one, the one from the Netherlands company. And I think they basically just make the product; they are not into devising beamforming algorithms, as far as I know. There are some papers that use that sensor in some field testing, open-air and indoor testing. Those people, some of them are affiliated with the manufacturer, some of them are not. Offhand, I don't remember what kind of algorithms they use. For the UIUC one, from the University of Illinois Urbana-Champaign, that professor and his graduate student basically just built a system. They just built a system, not to --

>>: Those underwater hydrophones seemed to be quite well developed, or well thought out. Do you know what processing they use?

>> Kainam Thomas Wong: MVDR. Most of them.

>> Ivan Tashev: More questions? Let's thank our speaker today.

>> Kainam Thomas Wong: Thank you.