>> Zicheng Liu: Good afternoon. It's my pleasure to introduce the last speaker for today. It's
Associate Professor Thushara Abhayapala from Australian National University. And the talk he's
going to present is non-spherical microphone arrays for spherical harmonic analysis of 3-D
spatial sound fields. The work is by Thushara himself and his student Ashta Gupta
[phonetic]. You can have the floor.
>> Thushara Abhayapala: Thank you. So, first of all, I have to say that this work is by myself
and my student Ashta Gupta. Unfortunately she couldn't make it. She got sick and had
surgery, so she wasn't fit enough to fly on the due date.
So I'm from ANU, the Australian National University, which is located in Canberra, Australia. The talk
is on non-spherical microphone arrays for spherical harmonic analysis of 3-D spatial sound
fields. It's a rather long title, so let me introduce it bit by bit.
So if I look at the outline, first off, I will introduce the spherical harmonic analysis of sound fields,
and then I'll go to spherical microphone arrays, and then see what is wrong with them, why we
sometimes can't use spherical arrays, and then go into the non-spherical concept.
Okay. So I guess the basics first. I'll just try to explain a little bit about spherical harmonics. We're
looking at a spherical coordinate system. So the thing here is we've got a point in space, say,
given by three coordinates, I mean, three things: R, theta and phi. So theta is the elevation,
measured from the z-axis to the point; phi is the azimuth, the angle in the XY plane; and R is the
distance from the origin to the point.
So what are the spherical harmonics? Spherical harmonics are functions of angles, two
angles, elevation and azimuth. They're denoted by, say, Y and [inaudible]. And they're just like the
Fourier -- the exponential functions which you use to represent a function over time.
So these spherical harmonics -- actually, they are the equivalent in 3-D, on a sphere. The
spherical harmonics are orthonormal on a sphere. N and M are integers. N is always
non-negative, and M goes from minus N to plus N. These functions are simply a basis set, and
they're orthonormal on the sphere. The orthonormality relationship is given by that
equation there.
And, actually, what they are, they are just functions of -- I mean, basically think of trigonometric
functions. In the last equation you see the definition of the spherical harmonics.
This function is called the associated Legendre function and the rest is just the exponential
function. And then you get this normalization term, so that when you integrate the product of
two harmonics with different indices over a sphere you get the orthonormality
relationship.
Okay. So simply they are just a set of basis functions in 3-D, on a sphere. They do the same
thing as the exponential functions do for periodic signals in one dimension.
Okay. So why are these things good? Because any function on a sphere or in 3-D, like if [inaudible],
can be expressed by this basis set. So these spherical harmonics act as just a basis
function set.
So having a sum of these guys with the coefficients, you can represent basically any arbitrary
function on a sphere or in 3-D. And then using that orthonormality relationship, you can
calculate these coefficients by just multiplying your function by the complex conjugate of the
spherical harmonic and then integrating over the sphere.
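For reference, the relations described here can be written out as follows. The slides are not part of the transcript, so this is a sketch under one common normalization; the exact convention on the slides (for example the sign used for negative M) and the symbol $a_{nm}$ for a generic coefficient are assumptions.

```latex
\begin{align}
Y_{nm}(\theta,\phi) &= \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-|m|)!}{(n+|m|)!}}\;
                       P_{n|m|}(\cos\theta)\, e^{jm\phi},
                       \qquad n \ge 0,\; -n \le m \le n, \\
\int_{0}^{2\pi}\!\!\int_{0}^{\pi} Y_{nm}(\theta,\phi)\, Y^{*}_{n'm'}(\theta,\phi)\,
      \sin\theta \, d\theta\, d\phi &= \delta_{nn'}\,\delta_{mm'}, \\
f(\theta,\phi) &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} a_{nm}\, Y_{nm}(\theta,\phi), \\
a_{nm} &= \int_{0}^{2\pi}\!\!\int_{0}^{\pi} f(\theta,\phi)\, Y^{*}_{nm}(\theta,\phi)\,
          \sin\theta \, d\theta\, d\phi .
\end{align}
```

The last line follows from the second: multiplying the expansion by $Y^{*}_{nm}$ and integrating removes every term except the one with matching indices.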
So that's pretty much similar to the Fourier series for a one-dimensional signal. So if I just look
at a couple of spherical harmonics, this is the two zero, that means second order and
zero degree. This one is the real part and this one is the imaginary part. And then there is a
series of these things.
I'll just show you another one. This is like the fifth or the third order spherical harmonic, which
looks like that. So here this is azimuth and this is elevation. And then this is just the real part and
the imaginary part. So basically these spherical harmonics have some shapes in
3-D.
So if you give me any function in 3-D, you can basically combine these things with the
appropriate weights to make any shape, any function in 3-D. So in that sense they are
pretty much basis functions.
Okay. So before going into the more detailed talk, let me just show you something else, which is the
spherical Bessel functions. The spherical Bessel functions are defined using the ordinary Bessel
functions, but of half-integer order. So the order is N plus a half, and then there is
this factor in front of the Bessel function. These
are just plots of a few of the spherical Bessel functions here.
So the zeroth spherical Bessel function starts at one, comes down and goes up and then oscillates
around zero.
So pretty much, when the argument goes to infinity, they're just like a sinusoid. And the
first one starts from zero, goes up, comes down and oscillates around zero, and the
next one starts, again, from zero and comes up.
So the one thing you should notice is that the higher orders start later on. So this one stays at zero
and then comes up late. So if I say N equals 10, they start very late, and so on.
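The "higher orders start later" property can be checked numerically. The sketch below is not from the talk: it evaluates the spherical Bessel function from its power series using only the standard library, and the helper `onset`, its threshold of 0.01 and its grid step are illustrative assumptions.

```python
import math


def spherical_jn(n, x, terms=60):
    """Spherical Bessel function j_n(x) from its power series.

    Adequate for moderate arguments (roughly x <= 20); not a general-purpose routine.
    """
    term = 1.0
    for i in range(1, n + 1):
        term *= x / (2 * i + 1)  # builds x**n / (2n+1)!!
    total = term
    for k in range(1, terms):
        # ratio of consecutive series terms
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return total


def onset(n, threshold=0.01, step=0.01, x_max=20.0):
    """Smallest grid point x where |j_n(x)| first exceeds the threshold (hypothetical helper)."""
    x = step
    while x <= x_max:
        if abs(spherical_jn(n, x)) > threshold:
            return x
        x += step
    return None


if __name__ == "__main__":
    for order in (0, 1, 2, 5, 10):
        print("order", order, "becomes significant near x =", onset(order))
```

Running it prints an onset that grows with the order, which is the behaviour used later to truncate the expansion and to choose radii.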
So this is one of the properties we will use later on in the talk. Okay. So the purpose of
introducing the spherical harmonics and the spherical Bessel functions is that I want to talk a bit
more about propagation. Why is propagation important? Because in all of acoustics and audio,
everything propagates. These are basically propagating signals.
So they are actually governed by the wave equation [phonetic]. Generally the wave equation is
valid for a homogeneous field. A homogeneous field means the same medium throughout. Having
diffraction and reverberation is okay because, after the reverberation, as long as you look at some
region where you have the same medium, the solution to the equation is still
valid.
So what we're doing is basically looking at the basic solutions to the wave
equation, and then we use them as a set of building blocks to solve the other problems.
Okay?
The basic solutions to the wave equation are given by this term. So the second term is the spherical
harmonic, and the first term is the Bessel function [phonetic]. So this is if you have the incoming mode.
What that means is you define some region in space, and if all the sources are scattered
outside of that region, then you get solutions in this mode.
And if you've got some sources inside, and you're looking outside, then you get this second
mode. Okay? But if you have a mixed thing, that means there are sources outside and
inside and you're looking at some region in the middle, then you have terms which combine -- I
mean, the first one and the second one all together.
Now, the other thing is that K is the wave number. The wave number is given by 2 pi times
frequency divided by C, which is the speed of sound propagation. So the thing I want you to
notice from here is that the wave equation does have basic solutions, or fundamental solutions,
and they are either the first one here or the second one down below.
>>: What is H?
>> Thushara Abhayapala: H is the spherical Hankel function.
>>: Is it off?
>> Thushara Abhayapala: Okay. Thanks. H is the spherical Hankel function, which is
defined by the spherical Bessel function plus J times the spherical Neumann
function. Right.
So, generally, because in the slide before I showed that these terms are the basic solutions,
you can make a complicated wave field, with all these scatterers and
everything, out of them.
So any general sound field can be represented by the sum of these guys with appropriate
weightings. This is the case if the sources are outside, and this is the case if the sources are
inside and you look outside. In this case you just have the spherical Hankel function
rather than the Bessel function. The weights are frequency
dependent. It could be a broadband signal. The K represents the frequency, because the wave
number is proportional to the frequency.
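Written out, the two cases look like this. This is a sketch of the standard form rather than a copy of the slide: $\alpha_{nm}(k)$ follows the coefficient name used later in the talk, while $\beta_{nm}(k)$ for the exterior case is an assumed symbol, and which kind of Hankel function applies depends on the time convention, which the transcript does not state.

```latex
\begin{align}
\text{sources outside the region:}\quad
S(r,\theta,\phi;k) &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n}
    \alpha_{nm}(k)\, j_n(kr)\, Y_{nm}(\theta,\phi), \\
\text{sources inside, observed outside:}\quad
S(r,\theta,\phi;k) &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n}
    \beta_{nm}(k)\, h_n(kr)\, Y_{nm}(\theta,\phi), \\
k &= \frac{2\pi f}{c}.
\end{align}
```

Here $j_n$ is the spherical Bessel function, $h_n$ the spherical Hankel function, $f$ the frequency and $c$ the speed of sound.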
Okay? So as an example, if you look at a plane wave, we all use this to model a plane
wave. Here Y hat is a unit vector along the direction of propagation from a particular origin and
X is a vector to a point from the origin. And the second factor just gives you
the exponential term. So we are all familiar with this way of representing a plane wave.
So if I just look at this part, which depends on the delay, in a sense, with respect to the origin,
Y hat is just the unit vector; it's a particular direction. We can express the plane wave using
that set of basis functions.
So here you've got this bit, which is my mode, the basis function, which has the spherical
harmonic and the Bessel function, and then you've got the alpha N M, which are the weights. In this
case, because it's a plane wave, they are just given by that.
So the message is that a plane wave can be represented in this framework. Okay. How about a
point source, or a near-field source? In the near field, what we have is a spherical wave, which
is given by E to the power of JK times the distance from the source to the observation point,
divided by the distance from the source to the observation point.
So, again, there's a mathematical identity to express this using that expansion. Here again
you've got the summation, and this is my basis set and then that's the coefficient, right? So
whether it's a plane wave or a point source, you can use the same representation to express your
sound field.
So if you have more than one plane wave, a lot of plane waves, a lot of point sources, you can still
use it because superposition works. You just use them.
So the message is basically that we've got these modes, the basis functions, and you can represent
anything using them.
>>: So why would you represent it that way? Just because it's a basic solution?
>> Thushara Abhayapala: One thing with the --
>>: Once you get an infinite series.
>> Thushara Abhayapala: I'll come back to that infinite series thing in a second. So why you do
that is because these are orthogonal to each other. These basis sets are orthogonal. And
anything, if you decompose it into something orthogonal, then you can do things with it. You can
do clever things with those things, right?
So the story so far is that the spherical harmonics, together with the spherical Bessel functions,
form a basis set for the sound field. That's point number one. And point number two is that any
arbitrary sound field, far field, near field, anything, can be decomposed into this basis set, which
is a nice thing.
So if you look at array signal processing problems such as beamforming, DOA estimation, or spatial
sound field recording for reconstruction later on, you can analyze the problems using these
spherical harmonics, because what you do is you take the wave field, use a bunch of microphones
or sensors and record the sound field.
So whatever the wave field, if you can decompose it into the orthogonal basis set, then rather
than applying your algorithm directly to the microphone outputs, you can apply things to the
coefficients of your basis set, because now you've got an orthogonal set. Initially you don't have an
orthogonal set. By doing that you can do clever things.
Normally these techniques are called, in the literature, spherical harmonic analysis. And we
believe it suits a great deal of signal processing algorithms. It's basically a framework, a nice
framework to do this.
So in the literature you can find numerous works which solve these [inaudible]
DOA estimation type problems, or even sound field recording and reconstruction problems, using
spherical harmonic analysis.
So, for example, in conventional beamforming, you just have these weights or filters there and
then sum them and get your output.
But you can do the same thing with spherical harmonic analysis by first converting these
signals into the harmonic domain, or you can say eigenspace domain, and doing the processing
there. Why can you do that? Because any arbitrary beam pattern, because it's a
function of space, can be expressed as a sum of spherical harmonics. So because you can
do that, once you know these coefficients, you can multiply the spherical harmonic coefficients
obtained from your microphones by the corresponding beam pattern coefficients
to get any beam pattern.
>>: You would actually beamform on these modes, as opposed to doing your beamforming in the
frequency domain or something.
>> Thushara Abhayapala: Yes. First, you decompose into these modes. These modes
could be frequency dependent as well. Then you do the beamforming there, because suppose you
have a particular beam pattern. You can represent that by a sum of a number of these modes, or
spherical harmonics. So if you decompose your incoming signal into these modes,
what you have to do is multiply each mode by the corresponding mode of the beam
pattern, and you get your output. You can do beamforming very easily in this framework.
So what's the problem? That's all great. I mean, people have done these things before in the
literature and there are a lot of papers out there on how to solve these problems using
spherical harmonic analysis.
But to do that, given a sound field and using microphone arrays, you need to
calculate, or decompose your wave field into, the spherical harmonic coefficients. So the question
arises: how do you do that? What's the best way of arranging your sensors to do
this? And so on.
So one way of doing that is to use spherical microphone arrays. So why spherical microphone
arrays? Because we're dealing with spherical harmonics. I mean, I think I have now probably said
"spherical" 500 times, right? Because it's a sphere and you've got spherical symmetry, you
should be able to use that.
If I look back at this equation again, this is my sound field, and these are the coefficients of the
sound field. So in a sense these coefficients pretty much characterize all of your sound field within
a given region. If you mathematically, say, multiply this side by these orthogonal functions and
integrate over a sphere, you can calculate the coefficients, provided you don't have any zeros here.
I'll come back to that point later.
So to do this, of course, in practice you need to approximate this integration by some kind
of sampling, since you would otherwise need to know the signal over the whole sphere. So what you
do in a spherical microphone array is put microphones on a sphere at different positions and then
approximate that integration by a summation to calculate your spherical harmonic coefficients.
So there are two main types of spherical arrays. One is based on an open sphere, which myself
and Darren Moore proposed in 2002, where you simply put the microphones in, like, free space.
So then there is less scattering and things like that.
The other approach was given by Yensma [phonetic] and Gary [inaudible] at the same conference.
Their approach is based on a rigid sphere. The Eigenmike is based on that work. So
what they have is a rigid sphere, and rather than having an open sphere, they have the scattering
included in the term. With the open sphere, the Bessel zeros are a problem. What a Bessel zero
means is that for the open sphere you get this term, and if this term goes to zero, you basically
can't calculate this coefficient because there is no power in that mode. But with the rigid sphere
you get a scattering term in there and then you have a nonzero term there.
>>: I think it was WASPAA 2005, someone else actually proposed a hemisphere on
a hard surface.
So it's like doubling the number of sensors.
>> Thushara Abhayapala: Yeah, [inaudible] that's right. So, okay, open spheres have a problem,
and with rigid spheres the problem is that for low frequencies you need a larger array size, and you
can't build large arrays. And then for the high frequencies you need to have a small array, and so on
and so forth. You also need a strict orthogonality condition. Remember, I showed that orthogonality
condition for the spherical harmonics. To satisfy that when you approximate the integration by a
summation, you need to have certain points on the sphere, a certain geometry. And
sometimes it's pretty hard to --
>>: It's [inaudible].
>> Thushara Abhayapala: Then on top of that you have calibration and stuff like that. I mean, look,
spherical microphone arrays are a nice thing, right? I proposed such a
thing before and I like the whole thing. But it has a touch of inflexibility built in, because for all
of this you need to have a sphere. So that kind of brings us to the main theme of this talk.
>>: So you want $30,000...
>> Thushara Abhayapala: So the other thing is, okay, as a solution to these two problems,
people -- [inaudible] proposed concentric spheres, so you have spheres in the middle and an
open sphere on the outside, and so on.
So you can get rid of some of these problems. But, still, you need to have a spherical array. So
the aim of this talk is pretty much: can we calculate the spherical
harmonic coefficients with a bit more flexibility in the design? So now I'm going
into that part.
So let's look at a circular picture. So this is the same sound field on a particular circle, positioned at
R_Q, theta_Q [phonetic], that means you have an elevation angle theta_Q and a distance R_Q. And
you've got a circle there. So if I have that circle I can still write this equation. It's the same
equation, but now I fix R_Q and theta_Q, and phi is my variable along the circle.
And the other thing is that instead of the spherical harmonics, I have just expanded them using
E_M of phi, which is denoted by this. This is just the exponential function and this is the Legendre
function, which is given by that.
Because now I'm fixing theta_Q, I want to write it out in that way.
So going back to that question about truncation: remember, previously we had an infinite
summation. Actually this can be truncated because there is this Bessel function there, and
the Bessel functions of high order are pretty much zero here. So depending on
what you have in the argument, you can say, okay, I need to consider up to this mode, up to, say,
N equals 5, and all the other terms are pretty much 0.
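A small numerical check of this truncation argument, using the plane wave from earlier as the test field. This sketch is not from the talk: it relies on the standard identity exp(j kr cos(gamma)) = sum over n of (2n+1) j^n j_n(kr) P_n(cos(gamma)), where gamma is the angle between the propagation direction and the observation point, and all function names are illustrative.

```python
import cmath


def spherical_jn(n, x, terms=80):
    """Spherical Bessel function j_n(x) from its power series (moderate x only)."""
    term = 1.0
    for i in range(1, n + 1):
        term *= x / (2 * i + 1)  # builds x**n / (2n+1)!!
    total = term
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return total


def legendre(n, t):
    """Legendre polynomial P_n(t) by the three-term recurrence."""
    p_prev, p = 1.0, t
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p


def plane_wave_truncated(kr, cos_gamma, order):
    """Modal sum of the plane wave, keeping orders 0..order only."""
    return sum(
        (2 * n + 1) * (1j ** n) * spherical_jn(n, kr) * legendre(n, cos_gamma)
        for n in range(order + 1)
    )


def truncation_error(kr, cos_gamma, order):
    """Absolute error between the exact plane wave and its truncated expansion."""
    exact = cmath.exp(1j * kr * cos_gamma)
    return abs(exact - plane_wave_truncated(kr, cos_gamma, order))


if __name__ == "__main__":
    for order in range(0, 13, 2):
        print(order, truncation_error(2.0, 0.3, order))
```

For a fixed product kr, the error falls off quickly once the order passes the point where the spherical Bessel functions of that argument become negligible, which is the reason a finite N is enough.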
So using that fact you can truncate this series. Okay. So with the circular array, that's the
equation I have. I can multiply both sides by E to the power minus J M phi, which is just the
exponential function, and then integrate to get this equation, right. So this is sort of the key
equation I'm going to use for the rest of the talk.
So the right-hand side is just your signal on the circle, multiplied by the exponential function
and then integrated.
So if you look at this equation, what I have on the
left-hand side is a weighted sum of the sound field coefficients, these alpha N M of K. So
given a sound field, what I need is to calculate these guys, right?
So this side is just a weighted sum of these coefficients, and we can write this equation for a
number of M, from minus N to plus N, where capital N is the truncation, right?
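A reconstruction of the key equation from the description above, offered as a sketch since the slide itself is not in the transcript. It assumes the spherical harmonic is factored as $Y_{nm}(\theta,\phi) = \mathcal{P}_{n|m|}(\cos\theta)\,E_m(\phi)$, with a normalized exponential $E_m$ and a normalized associated Legendre function $\mathcal{P}_{n|m|}$; the symbol $\mathcal{P}$ and the exact normalization are assumptions.

```latex
\begin{align}
E_m(\phi) &= \frac{1}{\sqrt{2\pi}}\, e^{jm\phi}, \qquad
\mathcal{P}_{n|m|}(\cos\theta) = \sqrt{\frac{2n+1}{2}\,\frac{(n-|m|)!}{(n+|m|)!}}\;
                                 P_{n|m|}(\cos\theta), \\
S(r_q,\theta_q,\phi;k) &\approx \sum_{n=0}^{N}\sum_{m=-n}^{n}
    \alpha_{nm}(k)\, j_n(k r_q)\, \mathcal{P}_{n|m|}(\cos\theta_q)\, E_m(\phi), \\
\sum_{n=|m|}^{N} \alpha_{nm}(k)\, j_n(k r_q)\, \mathcal{P}_{n|m|}(\cos\theta_q)
    &= \int_{0}^{2\pi} S(r_q,\theta_q,\phi;k)\, E_{-m}(\phi)\, d\phi,
    \qquad m = -N,\dots,N .
\end{align}
```

The last line uses $\int_0^{2\pi} E_{m'}(\phi)\,E_{-m}(\phi)\,d\phi = \delta_{mm'}$, so each circle yields one linear equation per azimuthal index $m$ in the unknown coefficients $\alpha_{nm}(k)$.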
So what we try to do is extract these coefficients by somehow carefully placing these circles in
3-D. Okay. So this section is on sampling of the circles. So far I just said circles, right, but you
need to know the sound field on these circles, so what is the best way to do the sampling? Again,
because of the properties of the Bessel function, for a particular R_Q, for that particular
radius, you can truncate the series at N_Q. That means the
highest-order exponential function you have there is N_Q. So that means you can just use the
sampling theorem to say that you can sample this circle using, say, 2 N_Q plus 1
points.
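The sampling claim can be illustrated with a short sketch: a signal on the circle containing azimuthal modes up to N_Q is sampled at 2 N_Q + 1 equally spaced angles, and a discrete sum recovers each mode coefficient. The function names and test coefficients are illustrative assumptions, and plain exponentials are used without the normalization factor that the talk's E_M may carry.

```python
import cmath
import math


def sample_circle(coeffs, n_q):
    """Sample s(phi) = sum_m c_m * exp(j*m*phi) at 2*n_q + 1 equally spaced azimuths.

    coeffs maps each azimuthal index m (|m| <= n_q) to its complex coefficient c_m.
    """
    count = 2 * n_q + 1
    phis = [2 * math.pi * i / count for i in range(count)]
    samples = [
        sum(c * cmath.exp(1j * m * phi) for m, c in coeffs.items()) for phi in phis
    ]
    return phis, samples


def azimuthal_modes(phis, samples, n_q):
    """Discrete version of (1/2pi) * integral of s(phi) * exp(-j*m*phi) d(phi)."""
    count = len(samples)
    return {
        m: sum(s * cmath.exp(-1j * m * phi) for s, phi in zip(samples, phis)) / count
        for m in range(-n_q, n_q + 1)
    }


if __name__ == "__main__":
    true_coeffs = {-3: 0.5 + 0.2j, 0: 1.0, 2: -0.7j, 3: 0.1}
    phis, samples = sample_circle(true_coeffs, 3)
    for m, value in azimuthal_modes(phis, samples, 3).items():
        print(m, value)
```

With fewer than 2 N_Q + 1 points, modes whose indices differ by the number of points would alias onto each other, which is why the count is tied to the truncation order at that radius.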
So pretty much that's what I do here. So let's keep that one. Okay. So I titled this one
uninspired least squares [phonetic]. The reason for that is I've now got these circles, right?
And I have a number of microphones, so I have all the modes coming from those circles. So if I
want to calculate my alphas, what I might do is just throw in a couple of circles arbitrarily, right,
at different theta_Q and different R_Q, and then set up the least squares solution [phonetic]. To
design an Nth-order array, we need to know N plus one, squared, spherical harmonic coefficients.
If I throw in a number of circles, for each circle we can write this equation, and then I can set up
this system of simultaneous equations, right?
But the problem is that to solve for my coefficients alpha_M -- actually, this has to be alpha_M --
I need to invert this J. And to invert J, it has to be a nonsingular matrix. But if you
just put in arbitrary circles, some of these terms could go to zero, right? If that happens,
this is going to be a singular matrix.
Now, if you have a singular matrix, then this thing doesn't work. Basically if you try to do it
arbitrarily, this is not going to work. So that's why I said it's the uninspired least squares solution.
So the goal of this work is to make an informed decision on where to place these circles, to
try to be more clever.
So as I said here, if you do the uninspired least squares you just calculate the alphas by
that. But because we're choosing the circles arbitrarily, this could have singularities.
Okay. So why can you have singularities? In this one I just plot these associated Legendre
functions. This is the magnitude in dB, and this is just the angle theta, which is the variable.
So as you can see, some of these Legendre functions go to zero at some points, some angles.
So if you happen to have your circle at those particular angles, you basically don't get any signal
in that particular mode. And that means the matrix has zero elements there. And
if you happen to have more zero elements in these things, the matrix could be
singular and then you might not be able to invert it.
In the same way, the Bessel functions could go to zero as well, and then these terms could go to zero
again and there will be singularities.
So the aim is, okay, how can we do a better job than just putting circles everywhere? To do that,
we again inspect this equation. In this equation we've got the sound field on a circle, and then we
multiply both sides by the exponential function and integrate to get this equation, right?
Consider a single sensor at the origin. So in this case we could say theta_Q and R_Q equal 0, 0. At
the origin, the Bessel functions are 0 except the zeroth-order Bessel function. So that means at the
origin you just have the zeroth-mode harmonic, alpha zero zero, and all the other harmonics are 0.
So if you choose a sensor at the origin, then with respect to that origin you basically have a single
mode, which is the DC term. That's obvious, right? So you can just calculate that from the single
measurement at the origin.
Okay. That's the first point. The second point: here I plot, again, P_NM of theta, that means the
associated Legendre function. On the left-hand side I plot the associated Legendre functions where
N plus M is even, and in this one N plus M is odd.
So if you look at that, at 90 degrees, when N plus M is odd, it has a zero. All of them have a
zero. What does that mean? The elevation is 90 degrees. That means the XY plane, right?
On the whole XY plane, you don't get any of the odd harmonics. Odd means N plus M odd. So if
you have microphones on the XY plane, you don't get any energy in the odd harmonics.
Whereas for the even harmonics you get a very good signal; they pretty much have high
energies.
So that means by putting microphones on the XY plane, you can get only the even harmonics; you
can't get any of the odd harmonics. In a sense it's good and bad. Bad because you don't get the odd
ones, but good because you get only the even ones, and the whole idea is to extract coefficients.
So if you put microphones on the XY plane you don't have to worry about the odd harmonics,
okay. So that's the second point. The third point is that when theta equals 0 degrees, that means up
there, on the z-axis, the Legendre functions are 0 for M not equal to 0.
They're nonzero only if M equals 0. That means at all the points on the z-axis, the only
harmonics available are those with M equals 0. So the three key points I'm showing here are: at the
origin you just have the DC term, or the zeroth mode. On the XY plane you just have the even
harmonics, not the odd harmonics. On the z-axis, you have only the M equals 0 harmonics. So these
are the three things. What we're trying to do is exploit these three properties, because in previous
work on spherical arrays you need to have a sphere.
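The second and third properties can be verified numerically with a short sketch. It is not from the talk: the associated Legendre function is computed with the usual upward recurrence (including the Condon-Shortley sign, which does not affect where the zeros are), and the helper `vanishes` with its tolerance is an illustrative assumption.

```python
import math


def assoc_legendre(n, m, t):
    """Associated Legendre function P_n^m(t) for 0 <= m <= n and |t| <= 1."""
    # Start from P_m^m(t) = (-1)^m (2m-1)!! (1 - t^2)^(m/2)
    pmm = 1.0
    if m > 0:
        s = math.sqrt((1.0 - t) * (1.0 + t))
        for i in range(1, m + 1):
            pmm *= -(2 * i - 1) * s
    if n == m:
        return pmm
    pm1 = t * (2 * m + 1) * pmm  # P_{m+1}^m(t)
    if n == m + 1:
        return pm1
    # Upward recurrence in the degree for fixed order m
    for l in range(m + 2, n + 1):
        pmm, pm1 = pm1, ((2 * l - 1) * t * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1


def vanishes(n, m, theta_deg, tol=1e-12):
    """True if P_n^|m|(cos(theta)) is numerically zero at this elevation (hypothetical helper)."""
    t = math.cos(math.radians(theta_deg))
    return abs(assoc_legendre(n, abs(m), t)) < tol


if __name__ == "__main__":
    for n in range(4):
        for m in range(-n, n + 1):
            print((n, m), "zero on XY plane:", vanishes(n, m, 90.0),
                  "zero on z-axis:", vanishes(n, m, 0.0))
```

For orders up to five, every mode with N plus M odd vanishes at 90 degrees and every mode with M not equal to 0 vanishes at 0 degrees, matching the two placement rules stated above.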
So the idea is, rather than having a sphere, to use these three key properties in some smart
way to design a non-spherical array which is equivalent to a spherical array.
So just to, again, summarize these three key points. If you have a zero -- so this is the
order of the array you want to have. That means up to, say, fifth order you want to calculate the
spherical harmonic coefficients. Up to the fifth order you need 36 coefficients,
and of these 36, 21 are even and 15 of the coefficients, sorry,
are odd.
So normally you would say, okay, there is an equal number of even and odd things,
but in this case, because of the structure, N goes from zero to capital N and M goes from minus
N to plus N, it turns out there are more even coefficients than odd coefficients. Even means N
plus M even. Odd means N plus M odd. Because you simply have this triangular structure, when
you look at these numbers there are more even things.
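The counts quoted here follow from a short calculation. For a fixed $n$, the values of $m$ with $n+m$ even are $m=-n,-n+2,\dots,n$, which is $n+1$ values, leaving $n$ values with $n+m$ odd:

```latex
\begin{align}
\text{total} &= \sum_{n=0}^{N} (2n+1) = (N+1)^2, \\
\text{even } (n+m \text{ even}) &= \sum_{n=0}^{N} (n+1) = \frac{(N+1)(N+2)}{2}, \\
\text{odd } (n+m \text{ odd}) &= \sum_{n=0}^{N} n = \frac{N(N+1)}{2}.
\end{align}
```

For $N=5$ this gives $36$ coefficients in total, $21$ even and $15$ odd, so the even set always exceeds the odd set by $N+1$.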
So, okay, how are we going to exploit these three
properties to design a hybrid array? What we can do is start with the origin.
With a single microphone at the origin you get the DC term, which is good. Easy. Next
we're going to have a number of circles on the XY plane, right? Because then you can calculate the
even coefficients.
So what we do is, for an Nth-order system, you place N by two, if N is even, or N plus 1 by two,
if N is odd, circles on the XY plane [phonetic].
We choose the radii according to some rule, because with a smaller radius you don't get the
high orders. In one sense that's good: you can guarantee that the lower orders are present, but not
the high orders, so you can easily calculate those coefficients. We use this rule, saying R_Q is equal
to, say, 2 over K_0. K_0 is some nominal frequency, because in the end we want
the array to operate over a broad band. But here we choose R_Q according to a particular
frequency which is within the band.
I've got a paper which describes this method, and there's also a [inaudible] journal version
as well. I'll probably skip the details of how to choose the radii, but the idea is you choose them in
such a way that for a particular radius you've got all the lower orders available there but not the
high orders.
So at the next radius you get more of the higher orders coming up. That way you can actually control
what happens within the matrix which we need to invert.
Okay. So this is all about the choice of radius, again using that Bessel plot. So with the XY plane, now
we have this equation. Previously, with the uninspired least squares, we put these circles
everywhere and then wrote our equation. But here we are only putting circles on the XY plane, so
we can actually write this for the XY plane. And because on the XY plane there are no odd
coefficients, we can drop those coefficients because they're not there.
So you can write this equation and then calculate alpha_M^E, E meaning even. Alpha_M^E
is just every second coefficient, the ones which are even.
So basically you can solve this to find the alpha_M even, or the even coefficients. So the trick there
is, by choosing -- I mean, now these Legendre functions are nonzero, and then also by choosing the
radii, we make sure that each row has at least one nonzero element, and in a sense, the way we
choose it, this matrix is going to be close to a lower triangular matrix. So in that way you can
invert it as well.
And also there are values for all these numbers on the lower triangular side of things. I mean, you
might have things up there but that's okay as long as it has energy at these other points.
Okay. So what I'm saying here is, yeah, by doing that on the XY plane you can make sure that
the matrix you need to invert has a good condition number. For the M equals 0 case,
of course you can write that equation again and do your math again.
Right. So that pretty much covers, from the total number of spherical harmonics that we
want to evaluate, all the even coefficients.
Now to get some more coefficients, we're going to put sensors on the z-axis. So remember, on
the z-axis, the spherical harmonic components with M equals 0 are nonzero and the others are 0. So
using that property we're going to put sensors on the z-axis to calculate a few harmonic
coefficients.
And, again, there is a rule for where to put the sensors relative to the origin, again, to make sure
that you have enough signal energy at those points for the particular harmonic you need to calculate.
And because you already know some of the coefficients, you can put them on the other side
of the equation, and then you can form your matrix equation and solve it.
Okay. So that's all good. So you've got the XY plane, with a number of circles on it. You've got some
sensors on your z-axis, right? Now, there are still a few more coefficients that you need to
calculate. So let's look at the other ones. For a given order, say a fourth-order array, you
need a total of 25 coefficients. From the XY plane you can calculate 15 of them and from the
z-axis you can calculate two more. That leaves eight more to be calculated, right? So to
do that, for the rest of the modes, the rest of the spherical harmonic components, you need to
have more spherical -- I'm sorry, circular arrays which are not on the XY plane or on the
z-axis, okay.
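The bookkeeping in this step can be reproduced by enumerating the modes. The sketch below is only an illustration of the counting: the function name and dictionary keys are made up here, and the even modes with M equal to 0 are attributed to the XY plane and origin, as in the talk, even though they are also nonzero on the z-axis.

```python
def coefficient_budget(order):
    """Split the (order + 1)**2 coefficients by where the talk proposes to measure them."""
    modes = [(n, m) for n in range(order + 1) for m in range(-n, n + 1)]
    # n + m even: observable on the XY plane (alpha_00 also from the origin sensor)
    even = [(n, m) for n, m in modes if (n + m) % 2 == 0]
    # n + m odd with m == 0: observable with sensors on the z-axis
    odd_axis = [(n, m) for n, m in modes if (n + m) % 2 == 1 and m == 0]
    # remaining odd modes: need circles on planes parallel to the XY plane
    odd_rest = [(n, m) for n, m in modes if (n + m) % 2 == 1 and m != 0]
    return {
        "total": len(modes),
        "xy_plane_and_origin": len(even),
        "z_axis": len(odd_axis),
        "parallel_circles": len(odd_rest),
    }


if __name__ == "__main__":
    print(coefficient_budget(4))
```

For a fourth-order array this reproduces the 25, 15, 2 and 8 quoted above.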
So that's where we're going to put circles on parallel planes. So the question is, where
can I place these parallel circles? Again, in this one I plot the associated
Legendre functions for the odd ones, and you can see, okay, there's a 0 there, but then there is
another 0 there for some of them and there's another one there.
So you need to somehow avoid those points. So maybe you can put it here. Right? To be a bit
more precise, in this one I have actually done something else. Here I plot the associated
Legendre functions where N minus M is equal to 1 and where N minus M is equal to three. You
can see this is one group and this is another group, and for N minus M equal to five you've got
another group. Of course that's a very high order, and to build such an array could be impractical
as well.
So here, actually, if you draw a line at, say, minus 2 dB [phonetic], you can find a range of angles
where you can actually put these circles. Because if you put one within that range, you are pretty
much going to cover all these modes with that circle. That means there is energy there, and the
same in the other one as well.
So, for example, I mean this table shows to catch all the modes of where the difference is one,
you need to have circles either within that range or this range and this basically covers this, these
are the harmonics like two camera two three four five so on. When M equals -- M minus N is
equal to three, these are the possible ranges and then these are the components you need to
cover.
Now, remember, only if you're looking for a high order microphone array do you have to worry about
this, the higher difference. Because if you're just looking at a third order array, you just need to
worry about these things.
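A minimal sketch of this angle selection, not taken from the talk: it evaluates the associated Legendre functions with the standard three-term recurrence and reports, for each mode with a given n minus m, how far below its own peak the function sits at a candidate elevation. Reading "minus 2 dB" as relative to each mode's own peak is an assumption, and the function names are hypothetical.

```python
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, using the standard upward recurrence in n."""
    s = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * s            # P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
    if n == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm            # P_{m+1}^m
    for l in range(m + 2, n + 1):
        pmm, pm1 = pm1, (x * (2 * l - 1) * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1


def relative_levels_db(order, diff, elevation_deg, steps=1800):
    """Level of each mode with n - m == diff at one elevation, in dB below its own peak."""
    x0 = math.cos(math.radians(elevation_deg))
    levels = {}
    for m in range(1, order - diff + 1):   # m = 0 modes are left to the z axis
        n = m + diff
        peak = max(abs(assoc_legendre(n, m, math.cos(math.pi * i / steps)))
                   for i in range(steps + 1))
        value = abs(assoc_legendre(n, m, x0))
        levels[(n, m)] = 20.0 * math.log10(value / peak) if value > 0 else float("-inf")
    return levels


for mode, level in relative_levels_db(5, 1, 60.0).items():
    print(mode, round(level, 2))
```

Calling `relative_levels_db(5, 3, elevation)` for a few elevations gives the same kind of check for the modes where the difference is three.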
Okay. So once you have decided where to put your circle, then you can look at, again, what's the
radius I'm going to put it at. What's the distance from the origin to that particular circle? Again,
you could make an educated choice so that you are guaranteed that there wouldn't be any Bessel zero
coming into the picture.
So using that, as before, you can write this equation for the harmonics and then you can set up
this system of linear equations. And then the idea is that by choosing the circles so
that they're not on zeros, you can make sure that the inverse of this matrix does exist. I mean, you
can invert it by choosing that way.
Okay. So the summary so far is, the whole method is based on placing circles. And the
performance of the system is based on how you choose these angles and the distances from
the origin such that you get a nonsingular matrix to be inverted.
So when you choose the radii, we relate them back to a nominal frequency, k-nought. So the
question is how well these matrices behave within a desired bandwidth. Because you need to have
a broadband operation.
So suppose the desired bandwidth is an octave. Octave means a doubling of frequency. So, for
example, say k_L to 2 k_L. So recall we had r_q in terms of k-nought. So the challenge is whether this
is going to be operable over one octave. So, again, we look at this Bessel function. The
thing is we choose k-nought, and most of the time we choose the r_q as some
number divided by k-nought. So that means that in a sense for a particular order we actually are
operating on this Bessel function. So it has to have values which are greater than, I mean well
over, the noise threshold.
So, I mean, if the threshold is here, you need to operate in that range. That's the kind of effective
bandwidth you will get.
So you need to choose k-nought such that in effect the range you have on the Bessel function,
on each of the Bessel functions, is within an octave. So what actually we've shown in those papers is
that you choose k-nought equal to the lower end multiplied by exponential one over two. So
exponential one is equal to 2.7. So that means 1.35 times k_L, and then you pretty much have an
array which operates over an octave. So this one is just putting some numbers on the two ends of
that band so that it actually operates on that thing, so that means you don't get any zeros, and
you are well above the noise threshold, so you can actually calculate the harmonic coefficients for
those particular frequencies as well.
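The following sketch, which is an illustration rather than the method from the papers, checks one part of this argument. With k-nought set to e/2 times k_L and a radius of some number divided by k-nought, the product k r runs from 2/e to 4/e times that number over the octave. The code computes the spherical Bessel function from its power series and lists the orders whose function keeps one sign, that is, has no zero, over that range. It does not check the noise threshold. The numbers 2, 4 and 5 are the ones used in the talk's example; the function names are hypothetical.

```python
import math


def spherical_jn(n, x, terms=60):
    """Spherical Bessel j_n(x) from its power series (adequate for moderate x).

    j_n(x) = x^n * sum_k (-x^2 / 2)^k / (k! * (2n + 2k + 1)!!)
    """
    double_fact = 1.0
    for j in range(1, 2 * n + 2, 2):
        double_fact *= j                       # (2n + 1)!!
    term = x ** n / double_fact
    total = term
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return total


def zero_free_orders(scale, max_order, samples=400):
    """Orders n for which j_n(k r) has no sign change over k_L..2 k_L, with r = scale / k0."""
    lo = scale * 2.0 / math.e                  # k_L * r when k0 = (e / 2) * k_L
    hi = 2.0 * lo                              # 2 k_L * r
    good = []
    for n in range(max_order + 1):
        vals = [spherical_jn(n, lo + (hi - lo) * i / samples) for i in range(samples + 1)]
        if all(v > 0 for v in vals) or all(v < 0 for v in vals):
            good.append(n)
    return good


for scale in (2.0, 4.0, 5.0):
    print(scale, zero_free_orders(scale, 5))
```

The smallest circle avoids zeros for every order in this check, but the high-order functions are small there, which is one way to see why larger circles are also needed.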
Okay. So let's see an example. So in this one we look at a simulation; we haven't built
this array. So we look at a fifth order array, because in the literature on these spherical harmonical
arrays, looking at all these numbers of points on the sphere, people look at up to third order. So
they haven't looked at the fifth order. So we can do the third order as well. But here I'm just
showing a fifth order one, which is more than, of course, third order. So in this case we
placed three circles. So okay, first of all, for the fifth order array you need 36 coefficients.
So we place three circles on the XY plane, at two over k-nought, four over k-nought and the third one at five
over k-nought. Then we put seven, 11 and 13 microphones on each of them. The other thing I
want to mention is the design actually can work for any octave, it doesn't matter which octave it
is. And the number of microphones you need is the same.
It could be, say, from 3,000 to 6,000 or 6,000 to 12,000. So the design works for whatever the band is. But, of
course, depending on the actual band you need to have a different size array because, of course,
when the frequency goes up, the dimensions get smaller, but the number stays the
same.
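To put rough numbers on this scaling, here is a small sketch that is not from the talk. It assumes k-nought equals e/2 times k_L, circle radii of 2, 4 and 5 over k-nought, and a speed of sound of 340 (taken here as metres per second). The function name `circle_radii` is hypothetical.

```python
import math


def circle_radii(f_low_hz, scales=(2.0, 4.0, 5.0), c=340.0):
    """Radii in metres of the XY-plane circles for an octave starting at f_low_hz."""
    k_low = 2.0 * math.pi * f_low_hz / c       # wavenumber at the lower band edge
    k0 = k_low * math.e / 2.0                  # nominal wavenumber, about 1.36 * k_low
    return [s / k0 for s in scales]


for f_low in (3000.0, 6000.0):
    radii_cm = [round(100.0 * r, 2) for r in circle_radii(f_low)]
    print(f_low, radii_cm)
```

Moving the design up one octave halves every radius while the number of microphones stays the same.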
Okay. So we've got three circles on the XY plane, five sensors on the z axis and two more
circles parallel to the z axis at 60 degrees and 30 degrees.
So the total number of sensors is 54. So here you've got seven, 11 and 13 on the
XY plane and five sensors on the z axis and 11 and 13 on the other two circular arrays. I'm using
two more sensors than the minimum number necessary on these circles so you get better results,
basically. So we need 54 sensors for this fifth order array. It's pretty big, because in a sense
you're looking for 36 coefficients anyway. So I mean the absolute bare minimum should be 36
sensors if you don't consider noise and everything else. So this array would operate over
one octave.
So for the particular simulation we chose 3,000 to 6,000 hertz, and the
speed of propagation is 340. And then the simulation was done at a
40 dB signal to noise ratio at each sensor; at each sensor we add noise 40 dB below the
signal at the sensor.
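One way to add noise at a fixed per-sensor signal to noise ratio is sketched below. This is an assumption about how such a simulation could be set up, not the authors' code: complex Gaussian noise is scaled to each sensor's own signal magnitude, and the helper names are hypothetical.

```python
import cmath
import math
import random


def add_sensor_noise(samples, snr_db, rng):
    """Add complex Gaussian noise whose power is snr_db below each sample's own power."""
    noisy = []
    for s in samples:
        # Standard deviation per real/imaginary part for the requested SNR.
        sigma = abs(s) * 10.0 ** (-snr_db / 20.0) / math.sqrt(2.0)
        noisy.append(s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))
    return noisy


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between clean and noisy complex samples."""
    signal_power = sum(abs(c) ** 2 for c in clean)
    noise_power = sum(abs(n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)
clean = [cmath.exp(1j * 0.1 * i) for i in range(20000)]   # unit-amplitude samples
noisy = add_sensor_noise(clean, 40.0, rng)
print(round(measured_snr_db(clean, noisy), 1))
```

Using a seeded `random.Random` keeps a run repeatable, which helps when comparing results at different SNR levels.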
So to test the arrays, for example, one test is you have a plane wave coming into the array. What we
did is, we rotated the plane wave over 360 degrees. For each angle we calculated the
harmonic coefficients and we stored them for all 360 degrees. When you plot that you should get
basically spherical harmonics, or the spherical harmonic decomposition of a plane wave. So here
I'm just showing you the fifth order and the fifth degree results. So this is alpha five, five, the
weighting for a plane wave, against azimuth and elevation. So this is the theoretical one. If you don't
have any noise, this is what it should look like if the array had done a perfect job.
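For reference, a sketch of the noise-free alpha five, five follows. It assumes one common convention, which the talk does not spell out: the plane-wave coefficient is 4 pi times i to the n times the conjugate of the spherical harmonic at the arrival direction, with orthonormal harmonics and the Condon-Shortley phase. The function name is hypothetical.

```python
import cmath
import math


def alpha_55_plane_wave(elevation_deg, azimuth_deg):
    """Noise-free coefficient for n = 5, m = 5 of a unit plane wave (assumed convention)."""
    theta = math.radians(elevation_deg)
    phi = math.radians(azimuth_deg)
    # Y_5^5 = -(3/32) * sqrt(77/pi) * sin(theta)^5 * exp(5j * phi)
    norm = (3.0 / 32.0) * math.sqrt(77.0 / math.pi)
    y55 = -norm * math.sin(theta) ** 5 * cmath.exp(5j * phi)
    return 4.0 * math.pi * (1j ** 5) * y55.conjugate()


for elevation in (30.0, 60.0, 90.0):
    print(elevation, round(abs(alpha_55_plane_wave(elevation, 0.0)), 4))
```

Under this convention the magnitude depends only on elevation, as the fifth power of the sine, and the azimuth changes only the phase.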
Now, this is at 3,000 hertz at 40 dB SNR, and this is at 4,500 hertz at 40 dB SNR as well, and this
is at 6,000 hertz at 40 dB SNR. So as you can see, there's a little bit of degradation at the low
end and there would be more at the high end as well. And this is somewhat in the middle
frequency.
And I mean it's the same way -- this is the fifth order and fifth mode which is the most difficult to
estimate, because you don't get enough energy for the lower circles and so on and so forth. But
the lower order modes are more accurate than these ones.
So to apply the whole thing for beam forming, we calculate these weights. We have the
design, so we can calculate the weights. And then we can do a simple beam former with the weights.
Our beam has a direction of 90 degrees, comma 90. So that means theta 90, or elevation 90, and
azimuth 90. This is the response at actually 3,000 hertz, at an SNR of 40 dB. This is what it looked
like.
And this is at 4,500 hertz. This has less disturbance compared with 3,000 hertz because,
remember, the k-nought we chose is pretty much close to this frequency; it's a little less than mid-band. And
then at 6,000 hertz this is what you get. So plotting them together, 3,000, 4,500 and 6,000,
all at 40 dB, using a 54 microphone fifth order array.
So to analyze the noise impact, we just look at a particular elevation angle, at 90 degrees. And we ran
the simulation for three different SNR levels and plotted them on top of each other. So I think
for that particular angle there's less disturbance due to the noise.
So the conclusion, or kind of the take-home message, is that spherical harmonics are a great tool to
analyze array signal processing problems, whether it's DOA or beam forming or sound field analysis.
But to do that you need to calculate or decompose a given wave field into coefficients, the
spherical harmonical coefficients. Spherical microphone arrays are a good way to do that, but they do
have some inflexibility. So in this, what we've done is show an alternate way, placing circles
on the XY plane and some sensors on the z axis and another pair of circles, to do this same thing.
>> Zicheng Liu: Thank you. Questions?
>>: Actually, I have a couple. If you go back to the beam former shape, here, the other one, for
4,500 hertz. If we just apply straight delay-and-sum, how would it look, worse than
this? And the sound capturing of video such also, my microphone big number of elements.
>> Thushara Abhayapala: I think you pretty much get the same thing for that particular
frequency. But with delay-and-sum beam forming, it's pretty hard to do, say, frequency-invariant
beam forming. And this one would be in the modal domain, the spherical harmonical domain. You can
easily do frequency-invariant beam forming. Of course, for a particular thing, you can do delay-and-sum
beam forming, right. But this one, because you have parameters, you can do adaptive beam
forming as well, because you need to just adapt these things, these coefficients, rather than just
doing delay-and-sum beam forming. So you get more flexibility when you look at the whole array
signal processing problem in the modal domain or spherical harmonic domain.
>>: I want to know if it's a good framework to design in, actually the position of the microphones.
In this particular case, how many elements will you actually need to cover the [inaudible] of that,
the five octaves.
>> Thushara Abhayapala: Yes. I mean, exactly, so in this one, you know, we've got this symmetric
structure. So for this particular one octave you need 54 microphones, but to add one
more octave you don't need another 54, because you can reuse some of these circles again. So
it's a nested array concept, because say that you do the next design for the next octave, right,
then there will be overlapping circles and you can use, say, a couple of those circles again.
>>: I had just one question. So the basic idea is that you actually design the beam form by
placing circles on the different type of geometries. So, for instance, you have a sphere. You
have a circle, so on, according to what's in that system you span your weight equation into.
>> Thushara Abhayapala: Yeah. In a sense. I mean --
>>: You measured to go up and you'll see [inaudible] coming in to place you do your design
menu by the sampling. And then you naturally, of course, you need you to have some writing of
those samples. So this means that you build your model essentially on the way, the creation, like
normally we just throw elements and usually on a linear array and then we just went to give them.
But you do, you can look from a [inaudible] point of view and the design needs some placement,
planning on the placement and their placements.
>> Thushara Abhayapala: In a sense the design of the placements is to decompose. After you
decompose, you can forget about all that. The other part is decoupled from your design: all the
tricks and things you apply, all the techniques you have applied for normal array processing,
you can apply to these coefficients.
So it's kind of two operations. Here you just simply decompose. What you get is simply a beam
space framework. So in the beam space you can apply anything else. I mean, in a sense they're
decoupled, right, what you get is some orthogonal like [inaudible].
>>: Can I ask my question? And then how sensitive are you to these when you've taken those
samples, because you can't calibrate every other mental facing wave. You have a lot of
variations and the sides of three or four dbs. And a lot of variations again. So how sensitive it
gets when you map those, because that's really --
>> Thushara Abhayapala: Sure, all these placements are now based on these functions, and they're not
like rapidly going down or up. They go slowly. So that means small variations are not a big
deal.
So it's not that we are just sitting on a high point and then the next point is a low. To go to a low
point, it's sort of going through a curve. So moving a little bit here and there doesn't add a lot of --
of course, that will add a few things, I mean, if you were looking from the next coefficient and so
forth.
But it doesn't destroy the whole thing, yeah. That adds a bit of robustness to the calibration
errors. Yeah.
>> Zicheng Liu: More questions? So let's thank our speaker for the talk.
[Applause]