>> Mark Thomas: Good morning everyone. I'd like you to help me welcome Henna Tahvanainen from Aalto University. She completed her M.Sc. in audio and acoustics technology in 2012 and she is currently a PhD student in the Virtual Acoustics Team, where she works on the measurement and analysis of concert halls. Without further ado, you have the floor. >> Henna Tahvanainen: All right. Thanks, Mark. First of all, good morning, everyone, and thanks for having me here. My name is Henna and I'm going to talk about my research group at Aalto University, then about the perceptual evaluation of concert halls, and then about my PhD topic, which is actually the seat dip effect in concert halls. Before I start, I'm just going to explain a bit who I am. I have a couple of passions. One of my passions is music. I play the Finnish string instrument, the kantele. That's actually what drove me into acoustics in the first place. My other passion is science, or physics, and I wanted to find a way to combine these two, so I did my Master's thesis on the simulation of this instrument and now I continue with my PhD in concert hall acoustics. The third passion I have is disseminating information, so I'm also a teacher at Aalto University and the Secretary-General of the Acoustical Society of Finland, among other things. I also organize gatherings where students can meet representatives from the acoustics industry in Finland. Very briefly on Aalto University. It was established in 2010 when three universities with long histories decided to join forces and create a university that would bring technology, economics and arts together. It's located in Espoo, right next to Helsinki, the capital of Finland, and the student enrollment is about 20,000 to 22,000 students. Acoustics-related research is actually done in two departments at the university due to historical reasons, but there are six professors altogether and around 40 people working on acoustics-related issues. 
I represent the Department of Computer Science, where we have the Virtual Acoustics Team, which consists of two professors, a couple of postdocs, seven PhD students and a few master's students. Our main goal is to try to understand how the quality of sound is modified by room acoustics. We have done concert halls, auditoriums, home theaters, studios, and recently cars as well. We would like to find links between architecture and the perception of sound, and possibly new metrics to measure the acoustic qualities of concert halls, for example. We all know that there are standards for evaluating concert hall sound, but from more recent perceptual studies we have found that the objective parameters don't actually correlate very well with perception, and I'll be talking about that a bit later. The third goal of the team is developing room acoustics modeling methods, mainly finite-difference time-domain methods, and we have used them both in rooms and, more recently, for HRTF modeling. Of the two professors, the first is my professor, Tapio Lokki, and he has a slightly bigger team. We have two postdocs working on the measurement and analysis of concert halls and, more generally, on microphone arrays and on the prediction of nonlinear aspects of concert hall acoustics. The three first PhD students work on the early reflections in concert halls, trying to understand their physics and perceptual relevance. We have one PhD student who is working more on audio augmented reality and navigation applications, and we also have two master's students working on different things. For example, we're trying to find out if we could use heat cameras in concert halls to see how strongly people feel about music in certain spaces. 
The second team, mostly focused on room acoustics modeling, is slightly smaller, but one of the interesting things the team recently released is an open-source GPU-based finite-difference time-domain solver which works especially well for small rooms, so if you are into that, I highly recommend checking it out. It is available on GitHub. That's the team and the research done in the Virtual Acoustics Team, more or less. Now I'm going to move into a bit more detail about what my group is doing and where my research fits in: the perceptual evaluation of concert hall acoustics. In a concert we can have up to 100 musicians on the stage and 2,000 listeners. Of course, when we build a new hall, we want to build it so that people love to come and listen to music. We have all of these objective parameters that we can use while we are designing the hall, but they don't always correspond to the perception of the concert hall. Starting with [indiscernible] around 50 years ago, concert halls have been classified based on questionnaires. This work was also continued by Barron in the '90s with British concert halls. The problem with asking questions is that you can only compare one concert and one hall at a time. But we all know that auditory memory is a bit shady at times, so if you evaluate a different concert the next day you might not actually be able to compare them very well. And of course, there is the question of how listeners interpret the questions; it's very important how you design the questionnaire. Our idea is that it would be very nice to hop in real time from one hall to another, have the same orchestra play all the time so there wouldn't be different members playing on different days, have the same music, and have the same playing style, timing and level. Oftentimes orchestras practice in a certain hall, learn the acoustics of the hall and then adjust their playing accordingly. 
We would like to have this kind of standardized way of listening to concert halls. We have roughly two options. We could provide a teleportation device for people to hop on and off between different concerts, or we can try to simulate a symphony orchestra, bring that to a listening room and have people listen to it. This is our approach. The main research tools that we use are the spatial impulse responses that we have recently been recording all over Europe with our loudspeaker orchestra. Here you can see the orchestra on stage in one of the halls. We use a six-microphone probe to record the spatial impulse responses. Then we need some musical instruments for our orchestra, so we have anechoic recordings of 14 musical instruments, and we have a listening room in our lab that has 24 channels in 3-D, with a kind of self-made absorption all over the room. To say a bit more about the tools, I can show you a small video about the loudspeaker orchestra and what the measurement setup looks like. We went around Europe on our concert tour with the loudspeaker orchestra, and this is the setup in the [indiscernible] music [indiscernible]. We have 32 loudspeakers set on the stage in positions that simulate a symphony orchestra, and we have also manipulated the directivities of the loudspeakers by combining two loudspeakers for some of them on the floor so that they would better reproduce the directivities of the instruments. The loudspeaker orchestra needs musical instruments as well, so what we did is we took 14 professional musicians one by one into an anechoic room. They had a conductor video for the tempo of playing and they had the sheet music, of course. They could also listen to a piano track on open headphones while they were playing. Of course, we took only one instrument per instrument section, so we also had to create some methods for the creation of section sound for the string instruments. 
For example, we did some live tracking of [indiscernible] motions of violin players in an orchestra, and you can manipulate it a bit by changing the tuning of the violin a little, so you get the sensation of different musical instruments. The last tools that we needed were analysis and visualization tools, and here is one example of a plot that we frequently use in our work. We basically sum all of the loudspeakers on the stage so that we can mimic how the symphony orchestra would sound at the receiver position. This is the impulse response, the time-frequency development of the impulse response in the room. We take a rectangular window of the impulse response over a certain time window and then we take the frequency response of that. Here we start 20 milliseconds after the direct sound, then go up to 30 milliseconds after the direct sound, and then we keep increasing the time window 10 milliseconds at a time; the red curve you see here is the frequency response of the full impulse response. We can quite nicely see the development of the frequency response over time in the hall, and that helps us in the analysis; I will show some examples of that a bit later. The second tool that we often need, since we are recording with a six-microphone probe, is one that gives us the spatiotemporal response of a room. For that we use our own method, which is called the spatial decomposition method, where you basically estimate the direction of arrival for each sample in an impulse response. Here is one example of a single source on the stage, a single loudspeaker. And then, again, we have the different time windows for the impulse response, so you can see quite nicely how the spatiotemporal impulse response is developing here. You can also see quite nicely where some individual reflections are coming from. 
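As a rough illustration of this windowed analysis, a minimal sketch of my own (the function and parameter names are not from the group's actual tools) computes the spectrum of growing rectangular windows of a measured impulse response, plus the full-response spectrum that corresponds to the red curve:

```python
import numpy as np

def windowed_spectra(ir, fs, t_direct=0.0, windows_ms=(20, 30, 40, 50)):
    """Magnitude spectra of growing rectangular windows of an impulse
    response, each window starting at the direct sound.

    ir: 1-D impulse response, fs: sample rate in Hz,
    t_direct: arrival time of the direct sound in seconds."""
    start = int(t_direct * fs)
    n_fft = 2 ** 14
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    spectra = {}
    for w_ms in windows_ms:
        stop = start + int(w_ms / 1000.0 * fs)
        segment = ir[start:stop]
        # small epsilon avoids log of zero in silent bands
        spectra[w_ms] = 20 * np.log10(np.abs(np.fft.rfft(segment, n_fft)) + 1e-12)
    # the "red curve": spectrum of the full impulse response
    full = 20 * np.log10(np.abs(np.fft.rfft(ir[start:], n_fft)) + 1e-12)
    return freqs, spectra, full
```

Plotting each window's spectrum over frequency then shows how the response fills in over time, as in the figures described above.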
This is the direct sound, and then you have a reflection off the back wall, where there is some kind of construction in this concert hall that gives you this reflection. These are the analysis tools, but for perceptual evaluation we also need tools to create the listening tests. One of the ways to do direct comparisons of room acoustic qualities is actually taken from the food industry. You can listen to concert halls a bit like you are tasting wines. You can do this kind of vocabulary elicitation, which I will explain in a bit. There is a very nice overview article on this in Physics Today. If you don't have time for anything else, take a bit of time and read it; it's very informative and very nicely written. Yeah, so: tasting music like it is wine. One of the techniques used in wine tasting and in other food tasting or consumer product evaluation is called individual vocabulary profiling. This is also what we use for the concert halls. It's very nice for the assessors because you can create your own attributes for listening. You don't have complicated questionnaires or instructions about what to do. You can just listen and decide where you hear differences. First you listen and you develop your own attributes. You say: I think this one is loud. This one sounds muddy. This one sounds like it has a lot of reverberation, or this one sounds very distant, for example. You can develop these attributes. Then in the second phase you take these attributes and you compare the concert halls using your own attributes. What we did is we had a simple AB test with different concert halls and different positions, and we asked which one of the samples has more of the muddiness that you described. From there we can do an analysis and some clustering, which I will show in a bit. 
This approach assumes that there are some common characteristics in the halls that are perceived the same way, or in a similar manner, by the assessors, but that they might describe them with different words; this is where the grouping into different kinds of clusters for the attributes comes in. Here's an example of the workflow we created for the listening test using individual vocabulary profiling. First of all, it's altogether eight hours of listening, which is kind of tough for the listener, so they do it in chunks. First there's a preference test. This is a simple preference test to get to know the samples, and we also screen the people with audiometry. Then in the second phase they create these attributes, and one way to do that is an AB test where you ask the listener, or the assessor, to find the sample that's different and then describe the difference in their own words. A lot of different attributes might come up from different assessors, so at some point they discuss with the examiner a bit and decide on a group of two to four attributes, and then they do the AB comparison between the different halls. This is just one way of running this. Here you can see that I have two different music samples in this test, which I will describe now. This is one of our latest works; it's actually currently in review. We did this wine-tasting test with six concert halls in Europe: three shoebox halls and three non-shoebox halls, with a vineyard type of hall, the Berlin Philharmonie and the Music Centre in Helsinki, and then a more fan-shaped hall, the Philharmonie in Cologne. We took three different seating positions: one in the front, one in the parterre off to the side, and one in the balcony. Where there wasn't a balcony, it was just further back at the same distance. We had two different musical samples: an excerpt from Bruckner with a lot of horns, a lot of brass instruments in general, very powerful. 
And then an excerpt from Beethoven's Seventh Symphony, which had more strings and was played piano, softly, so that we could fish out the differences between the concert halls; the music does make a difference in whether you like the acoustics or not. We took 28 subjects: professional musicians, amateur musicians and active concertgoers. We had six halls, like I said before. That amounts to 15 hall pairs, and because we had three seats in each hall and we wanted to compare all of them, we had 45 pairs altogether for the listeners. Here you can see pictures of the positions that we chose. One thing to note is that when we did the measurements the rooms were unoccupied, so in some cases this altered the properties of the concert hall compared to when it is completely occupied. We noticed in particular in the Musikverein in Vienna that, because the seats are hard wood, the reverberation time was a lot longer than it actually is when the hall is fully seated, so that may have affected the results. But otherwise you see that the seats have some [indiscernible] materials, so they try to imitate the absorption of humans in a way. Here is the front stall position, and then the one in the parterre off to the side; you can see quite interestingly how these have exactly the same physical distance, but they look completely different. And then there is the one at the balcony. It doesn't look exactly like a concert hall, our listening room, but nevertheless, we have covered the loudspeakers with a curtain so there are no visual cues, and you use a touch screen to maneuver the test. Of course, we didn't tell the subjects exactly what they were listening to. We just explained the instructions to them and that was it. Here I have an example of how the paired comparison looks. Here you can hear the sound. [music]. You can quite seamlessly hop between the samples, listen, and then decide for yourself, in this case, which one you like better. 
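Those pair counts are easy to verify: six halls give C(6,2) = 15 hall pairings, and comparing the same seat position across each pairing, for three seats, gives 45 paired comparisons. A tiny sketch (the hall and seat labels are placeholders, not the real codes used in the study):

```python
from itertools import combinations
from math import comb

halls = ["H1", "H2", "H3", "H4", "H5", "H6"]   # placeholder codes for the six halls
seats = ["front", "parterre side", "balcony"]

hall_pairs = comb(len(halls), 2)   # number of pairwise hall comparisons
# every hall pair, compared at the same seat position
ab_pairs = [(a, b, s) for a, b in combinations(halls, 2) for s in seats]
print(hall_pairs, len(ab_pairs))   # 15 45
```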
Which attributes did people actually come up with, and how would we group them? We did this kind of classification with the help of maximum likelihood. Of course, when we ask people which one they like better, or which one has more of some attribute, we just get binary data. From this we can get scale values out, and then we can do classification based on maximum likelihood. The groups that showed up in both samples were: loudness or bass, so how much sound there is and how bassy it sounds. Reverberance, so the subjective impression of reverberation, or how wide the sound is. Then clarity, so how well you can hear the different instruments, for example. Proximity, which is how intimate, how close the sound is to you, or alternatively how distant it is. And finally brightness, so how brilliant the sound is to you. You can see that the groups or the terms differ a bit between the samples, but this is, of course, completely dependent on the music style. These groups are very similar to groups that have been found in previous studies as well. We could argue that concert hall acoustics can be described with these, more or less, five groups. We also found some correlation between the groups and the objective parameters, as you can see at the top. Mostly, these attributes can be explained by some part of the loudness, or the strength parameter, at different frequencies, and also by the lateral energy that you have within the room. Clarity can somewhat be explained by the ratio between the early and the late sound, the C80, at different frequencies. What is interesting, and this has also been found previously, is that there is at the moment no clear objective parameter that would describe proximity, what it consists of in the objective sense. 
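One standard way to turn such binary paired-comparison data into scale values is a Bradley-Terry-type maximum-likelihood model; the following is my own minimal sketch of the idea, not the analysis code used in the study:

```python
import numpy as np

def bradley_terry(wins, n_iter=200):
    """Maximum-likelihood scale values from paired-comparison counts.

    wins[i, j] = number of times sample i was chosen over sample j.
    Uses the classic minorization-maximization (Zermelo) update:
        p_i <- W_i / sum_j  n_ij / (p_i + p_j)
    where W_i is i's total wins and n_ij the comparisons between i and j."""
    n = wins.shape[0]
    p = np.ones(n)
    for _ in range(n_iter):
        for i in range(n):
            num = wins[i].sum()
            den = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                      for j in range(n) if j != i)
            p[i] = num / den
        p /= p.sum()          # fix the arbitrary overall scale
    return np.log(p)          # log-abilities serve as interval-scale values
```

Samples chosen more often end up with higher scale values, which can then be correlated against the objective parameters.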
Some years back we also did a similar kind of listening test, with a different methodology, for nine Finnish concert halls, and we ended up with very similar groups: clarity, reverberance, loudness, bassiness, proximity, so very similar terms. That was done with hierarchical clustering, so the method was a bit different. We found these attributes, but what about preference? Which halls did people actually like? Which ones had the best acoustics? If we look at all of the data from all of these people together, we find that people most likely prefer the Berlin Konzerthaus, which is a kind of standard shoebox hall with quite a lot of bass. Interestingly, this preference correlates mostly with the attribute of proximity, which we cannot really explain in objective terms yet, so this is something that we will definitely look into in the future. What is it, acoustically, that defines proximity? However, since we have untrained subjects and we also know that preference might not be straightforward, we also did a kind of latent class segmentation. This is a method for finding out whether there are different groups within the data. We did find three preference groups. The first one is about 23 percent of the data. This group liked the Berlin Konzerthaus the most, but we couldn't find any correlation with the attribute groups; it's the kind of group that we really can't explain that well. The second group is more interesting: you see here on the left you have all the shoebox halls, and this group seems to prefer the shoebox halls over the non-shoebox halls. For this group we found a significant correlation with the attributes of loudness, reverberation and proximity. This group tends to like a loud, wide and enveloping reverberant sound that feels like it's close. 
Then the third group: here we can see that this group clearly prefers the Cologne Philharmonie, but also the other non-shoebox halls. For this group we found a significant correlation with clarity and definition, so how well you can distinguish the different elements on the stage, and a significant negative correlation with reverberance, loudness and width, so these people like a clear sound. In essence, we have two preference groups: the people that like the shoeboxes, with their loud and reverberant sound, and the people that like the non-shoeboxes, with their clear, brutally honest sound. In the shoebox halls you have a loud and wide sound with a lot of enveloping reverberation, and you also have more bass and more high frequencies, which are somehow related to the intimacy of the sound; it feels like you are inside the music somehow. A lot of people describe it that way. But the people that prefer the non-shoeboxes feel like they are looking at the music. The sound is clear and defined and less reverberant, and there is less bass and less high frequency, so the sound is also a bit more distant. This is what we gathered from the attributes as well. Also, from the acoustics point of view, the shoebox halls have a lot more lateral reflections, which contribute to the loud and wide sound. I will actually talk about that a bit more now, because I am moving into my own research. This wine tasting is, in a way, the main thing that we do, but I am looking at a smaller subset of phenomena within concert hall acoustics, which is the seat dip effect. It's actually very much related to these differences between the shoebox and the non-shoebox halls. The seat dip effect: how does it manifest itself in the frequency response of a concert hall? First of all, you see that this is the time-frequency plot of two different concert halls: a shoebox hall, the Berlin Konzerthaus, and a vineyard hall, the Berlin Philharmonie. 
Here you can see the frequency response 20 milliseconds after the direct sound, this is the frequency response of the full impulse response, and here you can see the time-frequency development of both of these halls. If we look at the first 20 milliseconds after the direct sound, we notice that there is a dip in the frequency response, mostly at low to mid frequencies. In an average shoebox hall the dip looks a bit like this: it's quite steep and it's wide. It means that it actually takes away a lot of the bass, and especially the mid frequencies, from the direct sound. In the non-shoebox halls it typically looks like this: a very narrow dip, also quite steep, but usually at a lower frequency than in the shoebox halls. So the main attenuation frequency and the width: this is how it manifests itself. Then we see that there is also something which helps correct the seat dip effect: in the final response the seat dip is no longer there, at least here. >>: What is the [indiscernible] of the one on the right? >> Henna Tahvanainen: This is the 20 milliseconds, and then this is every 10 milliseconds until 200 milliseconds, and then the full response, which is around 2 seconds. We see that there is some kind of increase in the energy at the low frequencies, and typically it's a lot higher in the shoebox halls than in the non-shoebox halls. In the non-shoebox halls the dip tends to stay in the full response as well. In a way, if you think about what I said about the brutally honest sound of the non-shoebox halls, you can see it here: already at 20 milliseconds the frequency response kind of has its shape, and the energy just keeps on increasing, but the shape stays the same, so the frequency content is already there in the direct sound, in a way. But in the shoebox halls you see that the shape of the 20-millisecond frequency response is actually quite different from the final one. 
That means that some frequency content adds up over time, so in a way the sound is more lively in that sense as well. This is all started off by the seat dip effect. Where does it actually come from, this dip at 20 milliseconds? It has to be something that is almost immediate to the direct sound. What it actually is, is destructive interference between the direct sound and the reflection that comes from the sound that bends between the seats. It's essentially destructive interference between a signal and its delayed copy. This happens when the sound from the source travels at a very low angle to the level formed by the seat tops. Then it will bend between the seats, reflect off the floor, and interfere with the direct sound at the receiving position. Here you can see an example. Here is the frequency response of the direct sound and the 20 milliseconds after the direct sound. This is in the stalls, and here you have one in the front seat of the balcony, where you don't see this dip. So it definitely has to do with the seats. >>: When people are seated in the hall, do you see a reduction of it or an increase? >> Henna Tahvanainen: Actually, because we are dealing with such low frequencies here, the audience doesn't have a big effect on the main attenuation at around 100 to 200 Hz. >>: It doesn't really matter if people are there or not? >> Henna Tahvanainen: No. In this case it doesn't matter. The main attenuation actually depends on the effective seat height. When we have the direct sound and the reflection from between the seats, the path of the reflection should be half a wavelength longer than that of the direct sound for the interference to be destructive, so the seat height should correspond to one quarter of the wavelength at the main attenuation frequency. 
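This quarter-wavelength relation gives a quick estimate of the main attenuation frequency: the reflection's extra path, down to the floor and back, is roughly twice the effective seat height, and the interference is destructive where that equals half a wavelength. A small sketch (the 0.8 m height is just an illustrative value, not a measurement from the talk):

```python
def seat_dip_frequency(seat_height_m, c=343.0):
    """Main attenuation frequency of the seat dip effect.

    The floor reflection travels an extra path of ~2h (down and back up),
    and destructive interference occurs when 2h = lambda / 2,
    i.e. h = lambda / 4, so f = c / (4 h)."""
    return c / (4.0 * seat_height_m)

# an effective seat height of 0.8 m puts the dip near 107 Hz,
# consistent with the 100-200 Hz range mentioned above
print(round(seat_dip_frequency(0.8)))   # 107
```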
Then, at higher frequencies, because the path difference is not so long, we also get reflections from the seat tops, and this causes some destructive interference at higher frequencies, around 1 kHz or so. There are different seat designs in concert halls and, of course, you can change the attenuation frequency depending on your seat design. With an open seat, like the ones you are sitting on now, there is some air under the seat back, so the sound reflects from the bottom of the seat back as well as from the floor. Then there are these kinds of closed seats, where the seat back extends all the way down to the floor, and then you just get reflections like this between the seats. In some cases where you have a raked floor, the rake will block the seat back a bit and you will have a reflection from here. The main attenuation frequency will depend on the seat height. >>: Does that mean closed is better? >> Henna Tahvanainen: I can show that here. With closed seats, what will happen is that they cause this kind of narrow dip, because, if you look at this example of a closed-seat reflection, the paths that the sound can travel are more or less of the same length. Imagine that this is actually a raked floor; I don't have a picture of that right now. With a raked floor that has a reflection like that, the reflected sound is directed more upwards, so it doesn't go through the second bending, so to speak. So you end up having a lot of paths for the destructively interfering sound that are the same length, and you end up with a very narrow dip like that. Whereas when you have an open seat, especially with a flat floor, like most of the shoebox halls have, you get a lot of different path lengths, which means the destructive interference is spread over different frequencies. But yes, to answer your question of which one is better: is this better than this? 
We can't say; we haven't actually rated the whole effect. This is what I'm also trying to explore a bit. As for the answer to your question: ever since the effect was discovered in '64, the answer has been to try to completely remove it. You can rake the floor, and you can move the main attenuation to a lower frequency, like I showed you, which is maybe not that bad; we are not that sensitive at low frequencies. Maybe that would be a good idea. Or, since we know that it's due to reflections from the floor between the seats, what if we had absorbers or resonant pits between the seats? Maybe that would remove the effect altogether. Okay, yeah, that's very nice; it works to some extent, and it will move the dip to a different frequency. But there is actually perceptual evidence that we want the seat dip effect to be there. We want the direct sound to lack some low frequencies so that they can come in later, and then they will sound clearer. They will not sound as muddy when they come a bit later than in the direct sound. I have here two shoebox halls that have this kind of wide seat dip effect and two non-shoebox halls that have a more narrowband effect. I did a listening test. I took low-frequency instruments, which I assumed would be heavily affected by the seat dip effect, and put them to play in these concert halls, and I invited people to do a listening test. I asked them to do the typical paired comparison test that we went through previously, and I asked them about the level of bass: which one has more bass? If we look here, you can see the answers. We see that the two shoebox halls are almost always chosen as having louder bass, which you wouldn't immediately expect with such a heavy lack of bass in the direct sound. So actually, trying to move the seat dip effect to a lower frequency may not be such a good idea after all. We might actually want to have something like this. 
There's also another piece of evidence for this from other research, where you had a direct sound of noise that was missing the low frequencies, covering approximately this kind of shape, and then an echo that you could hear, which was broadband. When you played these together, the echo was perceived as heavily low-passed, so you could hear the bass in the echo even though it was broadband. This suggests that we might actually have an enhanced perception of bass if it's missing in the direct sound. >>: [indiscernible] frequency has been down on the very [indiscernible] lack of [indiscernible] >> Henna Tahvanainen: Yeah, here, definitely. This may not be what we want, in fact. At the very low frequencies we don't want this to happen. >>: [indiscernible] >> Henna Tahvanainen: Yeah, they can. I'm not sure which answer works for that question. Yes, they can. There is this whole missing-fundamental thing: for example, if I was missing the fundamental of the double bass here, I might get it from the harmonics. But in this case you have to consider the whole acoustics of the concert hall. There is just so much more energy coming at the later stage here than there, so I guess that overrules it when you compare them. This actually also has an implication for how the orchestra should play in different kinds of concert halls. What I'm looking into now is intentional asynchronous playing. I have different instrument groups. I have the low-frequency instruments, which are the double bass and the tuba; then I have the tenor, so the cello, mostly; then I have the middle; and then I have the treble, which is all of the rest, violins and horns, all that stuff. 
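One way to picture this kind of instrument-group manipulation is as mixing group stems with small relative onset offsets before auralization. This is a hypothetical sketch of my own (the group names, offsets and helper function are illustrative, not the actual test setup):

```python
import numpy as np

def mix_with_offsets(stems, offsets_ms, fs=48000):
    """Mix instrument-group stems with relative onset offsets.

    stems: dict of group name -> 1-D signal
    offsets_ms: dict of group name -> onset time in ms
                (smaller value = that group enters earlier)."""
    shifts = {k: int(v / 1000.0 * fs) for k, v in offsets_ms.items()}
    base = min(shifts.values())
    shifts = {k: s - base for k, s in shifts.items()}  # earliest group starts at 0
    length = max(len(x) + shifts[k] for k, x in stems.items())
    out = np.zeros(length)
    for k, x in stems.items():
        out[shifts[k]:shifts[k] + len(x)] += x
    return out

# "bass first" condition: the basses enter 50 ms before everyone else
# (the 50 ms value is illustrative, not from the study)
# mix = mix_with_offsets(stems, {"bass": -50.0, "middle": 0.0, "treble": 0.0})
```

In a real test, the stems would be the anechoic instrument recordings convolved with the hall's spatial impulse responses before mixing.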
I manipulated our loudspeaker orchestra and put them to play fully synchronously and also asynchronously, so the bass would play first, then the middle, then the treble, and also the other way around. I did an online listening test with the ordered samples and asked people which playing they liked best. I also took two different concert halls: a shoebox hall where the low frequencies arrive a bit later, and a non-shoebox hall where the shape of the frequency response is already the same in the direct sound, so you have all of the frequency content there already. Then I asked people which condition they preferred. Of course, I didn't tell them what I was asking. What we found out was that in the shoebox hall, most people prefer the option where the bass instruments play a bit before everything else. If they play completely synchronously, at the same time, the bass actually comes in a little bit later, so it might be masked more by what is already there. If they play a bit before, you can actually get a bit more bass out of the sound. Whereas in the non-shoebox hall, where the shape of the frequency response doesn't change that much, there wasn't a significant difference between the synchronous condition and the bass-playing-first condition. Of course, people didn't really like it when the treble instruments, the violins, played first, with the bass only coming in after that. This is not a natural condition per se, but I just wanted to have it as a reference point. Now I'm running more listening tests on this in our lab with the spatial loudspeaker system, because the online test was just a brief introduction to it. >>: [indiscernible] while they are playing. >> Henna Tahvanainen: There is actually a very nice detail about, I don't know if you know him, [indiscernible]. He is already deceased; he was the conductor of the [indiscernible] Philharmonic. 
In that video, when he explains to the orchestra how to play, he actually tells the basses to enter a bit earlier than the rest of the instruments. So there is some anecdotal evidence that conductors are doing this already, in a way, although it is not linked to the concert hall's acoustic properties. >>: I would imagine the results [indiscernible] everything on the stage, right? They are not [indiscernible] around it? >> Henna Tahvanainen: No. >>: So it could just be from experience, I guess. >> Henna Tahvanainen: Yeah, that's what they oftentimes do as well. And there might also be other reasons why they advise the basses to play a little in advance, reasons that might not just be this effect. Okay, I think that concludes my part of the presentation. The seat dip effect is still very much an ongoing research topic. We're not exactly sure what the best recommendation to give the architects would be, what kind of seats to build and so on, but we are working on that. And we are also working towards more wine tasting and other such evaluation methods for concert halls. Are there any more questions? >> Mark Thomas: Thank you. [applause]. >>: I've got two questions.