>> Mark Thomas: Good morning everyone. I'd like you to help me welcome Henna
Tahvanainen from Aalto University. She completed her M.Sc in audio and acoustics technology
in 2012 and she is currently a PhD student in the Virtual Acoustics Team where she works on
the measurement and analysis of concert halls. Without further ado, you have the floor.
>> Henna Tahvanainen: All right. Thanks, Mark. First of all, good morning, everyone and
thanks for having me here. My name is Henna and I'm going to talk about my research group at
Aalto University and then I'm going to talk about perceptual evaluation of concert halls. And
then I'm going to talk about my PhD topic, which is actually the seat dip effect in concert halls.
Before I start, I'm just going to explain a bit who I am. I have a couple of passions. One of my
passions is music. I play the Finnish string instrument, kantele. That's actually what drove me
into acoustics in the first place. My other passion is science or physics and I wanted to find a
way to combine these two, so I did my Master's thesis on the simulation of this instrument and
now I continue with my PhD in concert hall acoustics. The third passion I have is disseminating
information, so I'm also a teacher at Aalto University and the Secretary-General for the
Acoustical Society of Finland among other things. I also organize gatherings where students can
meet representatives from the acoustics industry in Finland. Very briefly on Aalto University. It
was established in 2010 when three universities with long histories decided to join forces and
create a university that would have technology, economics and arts together. It's located right
next to Helsinki, the capital of Finland, in Espoo, and the student enrollment is about 20,000 to
22,000 students. Acoustics-related research is actually done in two departments at the
University due to historical reasons, but there are six professors altogether and around 40
people working on acoustic related issues. I represent the department of computer science
where we have the virtual acoustics team which consists of two professors. We have a couple
of postdocs and seven PhD students and a few masters students. Our main goal is to try to
understand how the quality of sound is modified by room acoustics. We have done concert
halls, auditoriums, home theaters, studios, and recently also cars. We would like to find links
between architecture and perception of sound and possibly new metrics to measure acoustic
qualities of concert halls, for example. We all know that there are standards for evaluating the
concert hall sound, but from more recent perceptual studies we have found that
the objective parameters don't actually correlate very well with perception, and I'll be
talking about that a bit later. The third goal of the team is actually developing room acoustics
modeling methods, mainly finite difference time domain methods, and we have used them both in
rooms and, more recently, for HRTF modeling. The two professors, the first
is my professor, Tapio Lokki and he has a slightly bigger team. We have two postdocs working
on measurement and analysis of concert halls and more in general, microphone arrays and also
the prediction of nonlinear aspects in concert hall acoustics. Also, the first three PhD
students, we kind of work on the early reflections in concert halls, trying to understand the
physics and perceptual relevance of those. We have one PhD student who is actually working
more on audio augmented reality and navigation applications, and we also have two masters
students working on different things. For example, we're trying to find out if we could use thermal
cameras in concert halls to try to see how strongly people feel about music, for example, in
certain spaces. The second team, which mostly focuses on room acoustics modeling, is slightly smaller,
but one of the interesting things the team recently released is an open-source GPU-based
finite difference time domain solver which works especially well for small rooms, so if you are into
that I highly recommend checking it out. It is available on GitHub. That's the team and the
research done in the virtual acoustics team, more or less. And now I'm going to move into a bit
more detail about what my group is doing and also where my research fits in, namely
perceptual evaluation of concert hall acoustics. In a concert we can have up to 100 musicians
on the stage and 2000 listeners. Of course, when we want to build a new hall, we want to build
it so people love to come and listen to music. We have all of these objective parameters that
we can use while we are designing the hall, but they don't always correspond to the perception
of the concert hall. This was already started by [indiscernible] around 50 years ago, who began
classifying concert halls based on questionnaires. This work was also continued by Barron in the
'90s with British concert halls. The problem with asking questions is that you can only compare
one concert at a time and one hall. But we all know that the auditory memory is a bit shady at
times, so if you evaluate a different concert the next day you might not actually be able to
compare them very well. Of course, there is a question of interpretation of different questions.
It's very important how you design the questionnaire for the listener. Our idea is that it might
be very nice to hop in real time from one hall to another, have the same orchestra play all the
time so there wouldn't be different members playing on different days, have the same music
and have the same playing style, timing and level. Oftentimes, orchestras practice in a certain
hall and they learned the acoustics of the hall and then they adjust their playing according to
that as well. We would like to have this kind of standardized way of listening to concert halls.
We have roughly two options. We could provide a device for teleportation for people to hop on
and off between different concerts or we can try to simulate a symphony orchestra and then
bring that to a listening room and have people listen to that. This is our approach. Our main
research tools that we use are the spatial impulse responses that we have recently been
recording all over Europe with our loudspeaker orchestra. Here you can see the orchestra on
stage in one of the halls. We use a six-microphone probe to record the spatial impulse
responses. Then we need some musical instruments for our orchestra, so we have anechoic
recordings of 14 musical instruments and we have a listening room in our lab that has 24
channels in 3-D with a kind of self-made absorption all over the room. If I talk a bit about the
tools, I can show you a small video about the loudspeaker orchestra and what the measurement
setup looks like. We went around Europe on our concert tour with the loudspeaker orchestra
and this is the setup in the [indiscernible] music [indiscernible]. We have 32 loudspeakers set
on the stage in the positions that would simulate a symphony orchestra and then we have also
manipulated the directivities of the loudspeakers by combining two loudspeakers for some of
them on the floor so that they would better correspond to the directivities of the instruments. The
loudspeaker orchestra needs musical instruments as well, so what we did is we took 14
professional musicians one by one in an anechoic room. They had the conductor video for
tempo of playing and then they had the sheet music, of course. And they could listen to a piano
track on open headphones while they were playing. Of course, we took only one instrument
per instrument section, so we had to also create some methods for creation of section sound
for the string instruments. For example, we did some live tracking of [indiscernible] motions of
violin players in an orchestra, and you can manipulate it a bit by changing the tuning of the
violin a little bit, so you get the sensation of different individual instruments. The last tools
that we needed were analysis and visualization tools and here is one example of a plot that we
frequently use in our work. We basically sum all of the loudspeakers on the stage so that we
can mimic how the symphony orchestra would be at the receiver position. This is the impulse
response, the time-frequency development of the impulse response in the room. We take a
rectangular window of the impulse response over a certain time span and then we take the
frequency response of that. Here we start with a window ending 20 milliseconds after the direct sound, then we
extend it to 30 milliseconds after the direct sound, and we keep increasing the time window 10
milliseconds at a time and then the red curve you see here is the frequency response of the full
impulse response. We can quite nicely see the development of the frequency response over
time here in the hall and then that helps us in the analysis and I will show a bit later some
examples of that. The second tool that we use often: since we are recording with a six-microphone
probe, we also want to know the spatiotemporal response of a room. For that we
use our own method, which is called the spatial decomposition method, where you basically
estimate the direction of arrival for each sample in an impulse response. Here is one example
of a single source on the stage, a single loudspeaker. And then, again, we have the different time
windows for the impulse response, so you can see quite nicely how the spatiotemporal impulse
response is developing here. And you can quite nicely see where some individual reflections are
coming from. This is the direct sound, and then you have a reflection off the back
wall, where there is some kind of construction in this concert hall that gives you this reflection.
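As an aside, the time-frequency plots described above are straightforward to reproduce. The sketch below is a hypothetical minimal implementation, not the group's actual analysis code: it assumes a sampled impulse response and computes the magnitude spectra of rectangular windows that start at the direct sound and grow in 10-millisecond steps, plus the spectrum of the full response.

```python
import numpy as np

def windowed_spectra(ir, fs, t0, t_end=0.2, step=0.01, first=0.02):
    """Magnitude spectra of growing rectangular windows of an impulse
    response, starting at the direct sound (time t0 in seconds).
    The first window ends 20 ms after the direct sound, then the window
    grows in 10 ms steps up to t_end; the full response is also returned."""
    n0 = int(round(t0 * fs))
    nfft = len(ir)  # zero-pad every window to a common frequency grid
    freqs = np.fft.rfftfreq(nfft, 1.0 / fs)
    spectra = []
    t = first
    while t <= t_end + 1e-9:
        seg = ir[n0:n0 + int(round(t * fs))]      # rectangular window
        mag = np.abs(np.fft.rfft(seg, nfft))
        spectra.append(20 * np.log10(mag + 1e-12))  # in dB
        t += step
    full = 20 * np.log10(np.abs(np.fft.rfft(ir[n0:], nfft)) + 1e-12)
    return freqs, spectra, full

# demo on a synthetic, exponentially decaying noise "impulse response"
fs = 8000
rng = np.random.default_rng(0)
ir = rng.standard_normal(2 * fs) * np.exp(-np.arange(2 * fs) / (0.5 * fs))
freqs, spectra, full = windowed_spectra(ir, fs, t0=0.0)
```

Plotting each curve in `spectra` against `freqs`, with `full` on top, gives the same kind of overlay as the figures shown in the talk.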
These are the analysis tools, but for perceptual evaluation we also need tools to create
the listening tests. One of the ways to do direct comparisons with room acoustic qualities is
actually taken from the food industry. You can listen to concert halls a bit like you are tasting
wines. You can do this kind of vocabulary elicitation which I will explain in a bit. There is a very
nice overview article of this in Physics Today. If you don't have time for anything else, take a bit
of time and read this. It's very informative and very nicely written. Yeah, so tasting music is like
wine. One of the techniques that you use in wine tasting and in other food tasting or consumer
product evaluation is called individual vocabulary profiling. This is also what we use for the
concert halls. It's very nice for the assessors because you can create your own attributes for
listening. You don't have complicated questionnaires or instructions of what to do. You can
just listen and decide where you hear differences. First you listen and you develop your own
attributes. You say I think this one is loud. This one sounds muddy. This one sounds like it has
a lot of reverberation or this one sounds like it's very distant, for example. You can develop
these attributes. Then in the second phase you take these attributes and you take the concert
halls and you compare the concert halls with your own attributes. What we did is we had a
simple AB test with different concert halls and different positions and we asked which one of the
samples has more of the muddiness that you described. From there on we can do an analysis
and some clustering which I will show in a bit. This approach assumes that there is some kind
of common characteristics in the halls that are perceived the same way or in a similar manner
by the assessors, but they might describe it with different words, so this is where the grouping
in the clustering and different kinds of clusters for the attributes come in. Here's an example of
how we created the workflow for the listening test using the IVP. First of all, it's altogether
eight hours of listening which is kind of tough for the listener, so they do it in chunks. First
there's a preference test. This is a simple preference test to get to know the samples and then
we screen the people for audiometry. Then in the second phase you create these attributes
and one of the ways to do that is to do an AB test where you ask the listener or the assessor to
find the sample that is different and then describe the difference in their own words. Then there
might be a lot of different attributes that come up from different assessors, so at some point
they have to discuss with the examiner a bit and then they decide on a group of two to four
attributes and then they do the AB comparison between the different halls. This is just one way
of running this. Here you can see that I have two different music samples in this test that I will
actually describe now. This is one of our latest works; it's actually currently in review. We
did this wine tasting test with six concert halls in Europe: three shoebox halls and three
non-shoebox halls, namely two vineyard-type halls, the Berlin Philharmonie and the Music Centre in
Finland, and then more of a fan-shaped hall, the Philharmonie in Cologne. We took three
different seating positions: one in the front stalls, one in the parterre off to the side and then one in the
balcony. Where there wasn't a balcony it was just further back at the same distance. We had
two different musical samples: an excerpt from Bruckner with a lot of horns, a lot of brass
instruments in general, very powerful. And then an excerpt from Beethoven's Seventh Symphony,
which had more strings and was played piano, so that we could fish out the
differences between the concert halls, and the music does make a difference in whether you like
the acoustics or not. We took 28 subjects: professional musicians, amateur musicians and
active concertgoers. We had six halls, like I said before, which amounts to 15 pairs, and because
we had three seats in each hall and we wanted to compare all of them together, we had 45
pairs altogether for the listeners. Here you can see pictures of the positions that we chose.
One thing to note is that when we did the measurements the rooms were
unoccupied, so in some cases this altered the properties of the concert hall compared to
when it is completely occupied. We noticed in particular in the Musikverein in Vienna that
because the seats are hard wood the reverberation time was a lot longer than it is actually
when the hall is fully seated, so that may have affected the results. But otherwise you see that
the seats have some [indiscernible] materials so they tried to also imitate the absorption of
humans in a way. The front stalls position and then one in the parterre off to the side; you can
also see, quite interestingly, how these have exactly the same physical distance, but they look
completely different. And then there is the one at the balcony. It doesn't look exactly like a
concert hall, our listening room, but nevertheless, we have covered the loudspeakers with a
curtain, so there are no visual cues, and you use a touch screen to navigate the test. Of course,
we didn't tell the subjects exactly what they were listening to. We just explained to them the
instructions and that was it. Here I have an example of how the paired comparison looks. Here
you can hear the sound. [music]. You can quite seamlessly hop between the samples and listen
and then decide for yourself in this case which one you like better. Which attributes did people
actually come up with and how would we group them? We did this kind of classification with
the help of maximum likelihood. Of course, when we ask people which one they like better or
which one has more of an attribute, we just get binary data. From this we can derive scale values
and then do the classification based on maximum likelihood. The groups that showed
up in both samples were loudness or bass, so how much sound there is and how bassy it sounds.
Reverberance, so the subjective impression of reverberation or how wide the sound is. And
then clarity, so how well you can hear the different instruments, for example. Proximity, which
is how intimate, how close the sound is for you, or alternatively how distant it is. And then finally
brightness, so how brilliant the sound is for you. You can see that the groups or the terms
differ a bit within the samples but this is, of course, completely dependent upon the music
style. But these groups are very similar to groups that have been found in previous studies as
well. We could say that, we could argue that concert hall acoustics can be described with these
more or less five groups. We also found some correlation between the groups and the
objective parameters, as you can see on the top. Mostly, these attributes
can be explained by some part of the loudness or the strength parameter at different
frequencies and also the lateral energy that you have within the room. Then clarity can
somewhat be explained by the ratio between the early and the late sound, the C80, at different frequencies.
What is interesting, and this has also been found previously, is that there is no
clear objective parameter at the moment that would describe proximity, what it
consists of in the objective sense. Some years back we also did a similar kind of listening test
with a different methodology for nine Finnish concert halls and we ended up with very similar
groups with clarity, reverberance, loudness, bassiness, proximity, so very similar terms. This
was done with hierarchical clustering, so the method was a bit different. We found these
attributes, but what about preference? Which halls did people actually like? Which ones had
the best acoustics? If we look at all of the data together of all of these people, we find that
people most likely prefer the Berlin Konzerthaus, which is a kind of standard shoebox hall
with quite a lot of bass. Interestingly, this preference can be correlated mostly with the
attribute of proximity, which we cannot really explain in the objective terms yet, so this is
something that we will definitely look into in the future. What is it actually acoustically that
defines proximity? However, since we have untrained subjects and we also know that the
preference might not be straightforward, we actually also did a kind of latent class
segmentation. This was a method of trying to find out whether there are different groups
within the data. We did find three preference groups. The first one is about 23 percent of the
data, and we couldn't quite characterize these people or this preference group. They liked the Berlin
Konzerthaus the most, but we couldn't find any correlation with the attribute groups. It's the kind
of group that we really can't explain that well. But the second group is more
interesting: you see here on the left you have all the shoebox halls, and
they seem to prefer the shoebox halls over the non-shoebox halls. For this we found a
significant correlation with the attributes of loudness, width, reverberance and proximity. This
group tends to like loud, wide and enveloping reverberant sound that feels like it's close. Then
the third group, here we can see that this group clearly prefers the Cologne Philharmonie, but
also the other non-shoebox halls. For this we found that there is significant correlation
between clarity and definition, so how well you can distinguish the different elements from the
stage. And we have found significant negative correlation with reverberance, loudness and
width, so these guys like a clear sound. In essence we have two preference groups. We have
the people that like the shoeboxes, whose sound is loud and reverberant, and the people that like
the more brutally honest sound of the non-shoeboxes. In the shoebox, you have a loud and wide sound with a lot of enveloping reverberation
and you also have more bass and more high frequencies, which are somehow related to the
intimacy of the sound, and it feels like you are inside the music somehow. A lot of people
describe it that way. But then the people that prefer the non-shoeboxes, they feel like they are
looking at the music. The sound is clear and defined and less reverberant, and there is less bass
and less high-frequency content, so the sound is also a bit more distant. This is what we gathered also
from the attributes. Also, from the acoustics point of view the shoebox halls have a lot more
lateral reflections which contribute to the loud and wide sound. I will actually talk about that a
bit more now because I am moving into my research. This is kind of the main thing that we do
in a way with this wine tasting, but I am looking at a smaller subset of phenomena in
concert hall acoustics, which is the seat dip effect. It's actually very much related to these
differences between the shoebox and the non-shoebox halls. The seat dip effect, how does it
manifest itself in the frequency response of a concert hall? First of all, you see that this is the
time-frequency plot of two different concert halls: a shoebox hall, the Berlin Konzerthaus, and a
vineyard hall, the Berlin Philharmonie. Here you can see the frequency response 20
milliseconds after the direct sound, and this is the frequency response of the full impulse
response, and here you can see the time-frequency development of both of these halls. If we
look at the 20 milliseconds after the direct sound, we notice that there is a dip in the frequency
response, mostly at low to mid frequencies. In an average shoebox hall the dip looks a bit like
this. It's quite steep and it's wide. It means that it actually takes away a lot of the bass and
especially mid frequencies in the direct sound. In the non-shoebox halls it typically looks like
this. It's a very narrow dip and quite steep, but usually at a lower frequency than in the shoebox
halls. The main attenuation frequency and also the width differ; this is how it manifests itself.
Then we see that there is also something which helps correct the seat dip effect:
in the full response there is no longer a seat dip, here at least.
>>: What is the [indiscernible] of the one on the right?
>> Henna Tahvanainen: This is the 20 millisecond and then this is every 10 milliseconds until
200 milliseconds and then the full which is around 2 seconds. We see that there is some kind of
increase in the energy at the low frequencies and typically it's a lot higher in the shoebox halls
than in the non-shoebox halls. In the non-shoebox halls it tends to be this dip that stays in the
full response as well. In a way, if you think about what I said about the brutally honest
sound of the non-shoebox halls, you can see it here: already in the 20-millisecond window the
frequency response kind of has its shape already, and the energy just keeps on increasing, but
the shape stays the same, so the frequency content is already there in the direct sound in a
way. But in the shoebox halls you see that the shape of the 20-millisecond
frequency response is actually quite different from the final one. That means that
some frequency content adds up over time, so in a way the sound is more lively in that sense as
well. This is all started off by the seat dip effect. Where does it actually come from, this dip at
20 milliseconds? It has to be something that is almost immediate to the direct sound. What it
actually is, is destructive interference between the direct sound and the
reflection that comes from the sound that bends between the seats. It's essentially
destructive interference between a signal and its delayed copy. This happens when the
sound from the source travels at a very low angle to the seats, to the plane formed by the seat
tops. Then it will bend between the seats, reflect off the floor and interfere with the
direct sound at the receiving position. Here you can see an example. Here is the frequency
response of the direct sound and the 20 milliseconds after the direct sound. This is in the stalls
and here you have one in the front seat of the balcony where you don't see this dip. It
definitely has to do with the seats.
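The interference mechanism is easy to sketch numerically. The following is a hypothetical two-path model of my own, with assumed round numbers rather than measured values: a direct sound plus one slightly longer, slightly attenuated floor path produces exactly this kind of notch.

```python
import numpy as np

# Two-path interference sketch: direct sound plus a delayed, slightly
# attenuated copy (the reflection that bends between the seats).
# All numbers are illustrative assumptions, not measurements.
c = 343.0             # speed of sound in air, m/s
extra_path = 1.8      # assumed extra travel distance of the reflection, m
tau = extra_path / c  # delay of the reflection relative to the direct sound
r = 0.9               # assumed reflection strength (slightly lossy)

# restrict to low frequencies so only the main notch is in range
f = np.linspace(20.0, 200.0, 20000)
# combined magnitude response |1 + r * exp(-j*2*pi*f*tau)| in dB
mag_db = 20 * np.log10(np.abs(1 + r * np.exp(-2j * np.pi * f * tau)))

f_notch = f[np.argmin(mag_db)]  # deepest cancellation, near c / (2 * extra_path)
```

With an extra path of 1.8 m (roughly twice a 0.9 m seat height), the deepest cancellation lands just below 100 Hz, consistent with the main attenuation region mentioned below.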
>>: When people are seated in the hall do you see a reduction of it or an increase?
>> Henna Tahvanainen: Actually, because we are dealing with such low frequencies here, the
audience doesn't have a big effect on the main attenuation at around 100 to 200 Hz.
>>: It doesn't really matter if people are there or not?
>> Henna Tahvanainen: No. In this case it doesn't matter. The main attenuation actually
depends on the effective seat height. When we have the direct sound and the reflection from
between the seats, the path difference between the direct sound and the reflection
should be half a wavelength, so the seat height should
correspond to one fourth of the wavelength at the attenuation frequency. And then at higher frequencies,
because the distance difference is not so long, we also get reflections from the seat tops, and
this causes some destructive interference at higher frequencies, like at 1 kHz or something like that.
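That quarter-wavelength relation gives a quick back-of-the-envelope estimate. A small sketch of my own, with assumed round seat heights and c = 343 m/s:

```python
# Quarter-wavelength estimate of the main seat-dip attenuation frequency.
# The reflected path is roughly two seat heights longer than the direct
# path, and destructive interference needs a half-wavelength difference,
# so lambda = 4*h and f0 = c / (4*h).
c = 343.0  # speed of sound, m/s
for h in (0.4, 0.6, 0.9):  # assumed effective seat heights in metres
    f0 = c / (4 * h)
    print(f"effective seat height {h:.1f} m -> main attenuation near {f0:.0f} Hz")
```

With typical effective seat heights of roughly half a metre to a metre, this lands in the 100 to 200 Hz region quoted above.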
There are different seat designs in concert halls and, of course, you can then change the
attenuation frequency depending on your seat design. With an open seat, like the ones you are
sitting on now, there is some air between the seats, so the sound actually reflects from
the bottom of the seat back as well as from the floor. Then there are these kinds of closed
seats where the seat back extends all the way down to the floor, and then you just get reflections
like this between the seats. In some cases where you have a raked floor, the rake will block the
seat back a bit and you will have a reflection from here. The main attenuation frequency will
depend on the seat height.
>>: Does that mean closed is better?
>> Henna Tahvanainen: I can show that here. With the closed seats, what will happen is it will cause
this kind of narrow dip, because when you look at this as a closed seat, this is an example of a
closed seat reflection. The paths that the sound can travel are more or less of the same length.
Imagine that this actually has to be a raked floor; I don't have a picture of that right now. But
with a raked floor that has a reflection like that, the reflected sound actually is directed
more upwards, so it doesn't go through the second bending, so to speak. So you end up having
a lot of paths for the destructive sound that are the same length, so you end up having a very
narrow dip like that. Whereas, when you have an open seat, especially with a flat floor, like
most of the shoebox halls have, you get a lot of different path lengths, and that means that the
destructive interference is also spread over different frequencies. But yes, to answer your
question of which one is better, is this better than this? We cannot say; we haven't actually rated the
whole effect. This is what I'm also trying to explore a bit. The answer to your question,
throughout time since the effect was discovered in '64, has been to try to completely
remove the effect. You can rake the floor and you can move the frequency like I showed you.
You can move the main attenuation to a lower frequency which, maybe, is not that bad. We
are not that sensitive at low frequencies, so maybe that would be a good idea. Or, since we know
that it's due to reflections from the floor between the seats, what if we had absorbers or
resonant pits between the seats? Maybe that would remove the effect altogether. Okay, yeah.
That's very nice. It works to some extent. It will move it to a different frequency, but there is
actually perceptual evidence that we want the seat dip effect to be there. We
want the direct sound to lack some low frequencies so that they can come in later and they will
sound more clear. They will not sound as muddy when they come a bit later than in the direct
sound. I have here two shoebox halls that have this kind of wide seat dip effect and two
non-shoebox halls that have a more narrow-band effect. I did a listening test. I put low-frequency
instruments, which I assumed would be heavily affected by the seat dip effect, to play
in these concert halls, and I invited people to do a listening test. I asked them to do this
typical comparison test that we went through previously, and I asked them about the level of
bass. Which one has more bass? And if we look here you can see the answers. We see that
the two shoebox halls are almost always chosen to have more bass, which you wouldn't
immediately think with such a heavy lack of bass in the direct sound. So actually trying
to move the seat dip effect to a lower frequency may not be such a good idea after all. We
might actually want to have something like this. There's also another piece of evidence that
shows this in other research, where you had a direct sound of noise that
was missing the low frequencies, covering approximately this kind of shape, and then an
echo you could hear that was broadband. But when you played these together, the echo
was perceived as heavily low-passed, so you could hear the bass in the echo even though it was
broadband. This kind of suggests that we might actually have an enhanced perception of bass if
it's missing in the direct sound.
>>: [indiscernible] frequency has been down on the very [indiscernible] lack of [indiscernible]
>> Henna Tahvanainen: Yeah, here, definitely. This may not also be what we want, in fact. At
the very low frequencies we don't want this to happen.
>>: [indiscernible]
>> Henna Tahvanainen: Yeah, they can. I'm not sure which answer works for that question.
Yes, they can. There is this whole missing fundamental thing that, for example, here if I was
missing the fundamental of the double bass I might get it from the harmonics. But in this case
you have to consider the whole acoustics of the concert hall. Here there is just so
much more energy coming at the later stage than here, so I guess that overrules it when you
compare them. This is actually something that also has implications for how the orchestra
should play in different kinds of concert halls. What I'm looking into now is
intentionally asynchronous playing. I have different instrument groups. I have
the low-frequency instruments, which are the double bass and the tuba; then I have the
middle, so the cello, mostly; and then I have the treble, which is all
of the other violins and horns, all that stuff. I manipulated our loudspeaker orchestra and I put
them to play all synchronously and then also asynchronously, so the bass would play first and
then the middle and then the treble, and then also the other way around. I did an online
listening test with the ordered samples and I asked people which playing they liked the best.
And then I also took two different concert halls: a shoebox hall where the
low frequencies arrive a bit later, and then a non-shoebox hall where the shape of the frequency
response stays the same, so you have all of the frequency content in the direct sound
already. Then I asked people which condition they preferred. Of course, I didn't tell them
what I was asking. What we found out was that in the shoebox hall most people prefer the
option where the bass instruments play a bit before everything else. If they play completely
synchronously at the same time, you notice that the bass actually comes in a little bit later, so it
might be masked more by what is already there. If they play a bit before, you can actually get a
bit more bass out of the sound. Whereas in the non-shoebox hall, where the shape of the
frequency response doesn't change that much, there wasn't a significant difference between the
synchronous and the bass-playing-first conditions. Of course, people didn't really like it when the
treble instruments or violins played first, and after that the middle and the double
bass. This is not a natural condition per se, but I just wanted to have it as a reference point.
Now I'm running more listening tests on this in our lab with the spatial loudspeaker system,
because this was just online, as a brief introduction to it.
>>: [indiscernible] while they are playing.
>> Henna Tahvanainen: There is actually a very nice detail about this. I don't know if you know
[indiscernible]; he is already deceased. He was the conductor of the [indiscernible]
Philharmonic. In a video where he explains to the orchestra how to play, he actually tells the basses
to enter a bit earlier than the rest of the instruments. So there is some kind of anecdotal
evidence that conductors are doing this already in a way, but this is not linked with the concert
hall acoustic properties.
>>: I would imagine the results [indiscernible] everything on the stage, right? They are not
[indiscernible] around it?
>> Henna Tahvanainen: No.
>>: So it could just be from experience, I guess.
>> Henna Tahvanainen: Yeah. That's what they oftentimes do as well. And there might also be
other reasons why they advise the basses to play a little in advance that might
not just be this effect. Okay. I think that concludes my part of the presentation. The seat dip
effect is still very much an ongoing research topic. We're not exactly sure what would be the best
thing or what kind of recommendation we should give the architects, what kind of seats to
build and so on. But we are working on that. And we are also working towards more wine tasting and
other tasting methods for concert halls. If you have any more questions?
>> Mark Thomas: Thank you. [applause].
>>: I've got two questions.