A3D Contiguous time-frequency energized sound-field: reflection-free listening space supports integration in audiology

Joe Hayes, Chief Technology Officer, Acoustic3D Holdings Ltd, joe.hayes@acoustic3D.com
Jon South, Acoustician, Acoustic3D Holdings Ltd, jon.south@acoustic3d.com
Malcolm Duffield, CEO, Vastigo Limited, malcolm.duffield@vastigo.com
Professor Dr Phil Graham, Queensland University of Technology, p.graham@qut.edu.au

Abstract

In this paper we present a model of ‘treated sound’ (sound played back through loudspeakers) that diverges from conventional psychoacoustic understandings. We show that if sound is diffused into a listening space using a mathematically ‘pure’ diffusion, the creation of specular reflections that might carry false but meaningful (i.e. audible and functional) acoustic cues, commonly called “room noise”, is effectively prevented. Reducing the processing demand on the listener’s audiology enables the listener to better identify and locate sources of sound in space. Because the acoustic energy presented to the ear is Acoustic Wavelet-encoded at the same time as it is diffused, it forms a contiguous phase-sound field that we argue better suits the integration period of human audiology, thereby enhancing sonic perception of recorded material.

The nature of Acoustic Reflections

In the following section, we analyse measurements obtained from both conventional and A3D loudspeakers. The speakers were energised with a unit impulse source (UIS) signal, and microphone measurements were made of the resulting direct sound and early reflections from nearby surfaces. Within this paper, these measurements are referred to as Speaker Impulse Reflection Responses (SIRR). The reference listening room was a medium-sized, long ‘L’-shaped room, chosen to represent a typical home listening environment. The speakers were located adjacent to a wall near the apex of the ‘L’.
A sample of 2 metres length* was captured for each system by an instrument microphone (Audio Technica 4031), on-axis at tweeter height and at an identical distance of 1 metre.

*In an effort to relate measurements to spatial environments, time-based measurements are converted to the corresponding distance sound travels at 343 m/s, e.g. 5.8 ms = 2 m.

Contrast

Figures 1 through 6 compare and contrast a conventional loudspeaker (Rogers BBC Monitor) and an A3D loudspeaker (NewAudio AS8) in terms of their performance in this ‘typical listening space’.

Figure 1 shows the SIRR for the conventional Rogers loudspeaker. The UIS response is seen localised at 0 cm, with three clearly visible specular reflections (at approximately 80, 105 and 155 cm after the click). Whilst these would not necessarily be heard as echoes, our audiology would still have to process such sounds (and elect to ignore or incorporate them) as cues in order to establish our spatial perception.

Figure 1 - Rogers BBC Monitor loudspeaker SIRR as measured in the reference room

Figure 2 depicts a Wavelet Energy Transform (WET) of the above SIRR. Mathematically, we are performing a Continuous Wavelet Transform (CWT), which lets us look at the wavelet decomposition of the signal and its energy content in the wavelet domain. The energy reflected back into the listening space appears as three stalactite-like structures, with apexes at approximately 80, 105 and 155 cm after the click. The click itself also has a stalactite-like structure, which reveals an approximately 10 cm mis-alignment in the excitation force between the low frequencies (up to approximately 3,200 Hz) and the high frequencies (above 3,200 Hz).
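The time-to-distance convention used for all of these measurements can be sketched in a few lines of Python. This is only an illustration of the stated conversion at 343 m/s; the function names `ms_to_m` and `cm_to_ms` are our own and do not come from the measurement software.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # value stated in the paper's footnote


def ms_to_m(time_ms):
    """Distance (metres) that sound travels in `time_ms` milliseconds."""
    return time_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S


def cm_to_ms(distance_cm):
    """Time (milliseconds) that sound needs to travel `distance_cm` centimetres."""
    return distance_cm / 100.0 / SPEED_OF_SOUND_M_PER_S * 1000.0


print(round(ms_to_m(5.8), 2))    # 1.99, i.e. the 2 m sample length
print(round(cm_to_ms(10.0), 2))  # 0.29
```

On this conversion, the reflections at approximately 80, 105 and 155 cm arrive roughly 2.3, 3.1 and 4.5 ms after the click, and the approximately 10 cm mis-alignment between the low and high frequencies corresponds to about 0.29 ms.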
This mis-alignment is due to the offset between the woofer and the tweeter on a flat front panel in a classic passive-crossover design such as that of the Rogers.

Figure 2 - Rogers BBC Monitor loudspeaker Wavelet Energy Transform (WET) measured in the reference room

Figure 3, a slice through the WET at 11,142 Hz, clearly indicates the reflected energy at approximately 80, 105 and 155 cm after the click. The 80 cm reflection contains the most energy, whilst the 105 cm and 155 cm reflections contain about equal energy. Distinct reflections like these in the time domain are typical of the coloration added by a listening room.

Figure 3 - Rogers BBC Monitor loudspeaker WET at scale 7 (11,142 Hz) as measured in the reference room

Acoustic3D Holdings Ltd Commercial in confidence Version 1B Page 2 of 6

Figures 4 through 6, however, show the same SIRR measurement for the A3D speaker, with its ‘Diffusion at Source’ design. Early specular reflections, as seen with the Rogers, might be expected at 80 cm, 105 cm and 155 cm after the incident arrival (0 cm), but are almost completely absent. Thus the listener’s audiology does not have to process them as cues (false or otherwise). This, together with listeners’ stated opinions of A3D speakers as ‘clearer’, would appear to support both the first and second parts of our ‘diffusion’ hypothesis.

Figure 4 - A3D speaker SIRR measured in the same room

Figure 5 depicts a WET that has a consistently darker, ‘less-correlated’ area after the original click, representing the diffuse residual energy. This diffuse energy enables the room to remain ‘live’ sounding.

Figure 5 - A3D speaker IRR measured in same room

Figure 6, the WET profile centred on 11,142 Hz, shows the absence of reflected energy at approximately 80 cm, 105 cm and 155 cm.
Thus the hypothesised psychoacoustic contamination effect of the listening room appears to have been successfully removed. There is a much smaller energy presence at approximately 30 cm, which likely corresponds to a small amount of internal reflection coming back through the cone of the source itself, given that the internal width of the Satellite used in the A3D system is of the order of 297 mm.

Figure 6 - A3D speaker IRR measured in same room

In Figure 6 above there are no obvious audible room reflections in the SIRR. The WET (Figure 5) also shows how diffuse (i.e. non-correlated) the residual energy coming from the listening room is, as evidenced by the WET’s much darker bars.

A significant difference between the Rogers and A3D speakers is the form that the radiated click energy takes going into the listening space. The Rogers speakers, like other conventional speakers, rely on the uncontrolled electro-mechanical response of their components in the presentation of sound to the acoustic environment. It is proposed that A3D speakers present a controlled, wavelet-encoded response to the acoustic environment. As wavelets offer superior time-frequency localisation, they produce a complete and contiguous phase-spatial image that conventional speakers cannot. Morlet wavelets are particularly well suited to this application: as Gaussian-windowed sinusoids they have a simple analytic form, and they work well in reproducing complex energy paths in the reproduced sound field. These complex energy paths, along with the anti-specular-reflection properties, contribute significantly to the perception of a ‘real’ acoustic sound field when audio is reproduced on A3D-equipped speakers.
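For readers who wish to experiment with this kind of analysis, the following is a minimal Python sketch of a Morlet-based wavelet energy map applied to a synthetic impulse response. It is not the processing used to produce Figures 2 to 6. The sample rate `FS`, the `n_cycles` width parameter, the reflection amplitudes and all function names are assumptions chosen purely for illustration, and the test signal is an idealised click with three perfect specular reflections rather than measured data.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, as in the paper's time-to-distance conversion
FS = 48_000             # Hz, assumed sample rate (not stated in the paper)


def morlet(t, freq_hz, n_cycles=6.0):
    """Complex Morlet wavelet: a complex sinusoid under a Gaussian envelope."""
    sigma = n_cycles / (2.0 * np.pi * freq_hz)   # envelope width in seconds
    w = np.exp(-t**2 / (2.0 * sigma**2)) * np.exp(2j * np.pi * freq_hz * t)
    return w / np.sqrt(np.sum(np.abs(w) ** 2))   # normalise to unit energy


def wavelet_energy(signal, fs, freqs_hz, n_cycles=6.0):
    """Squared magnitude of a Morlet CWT, shape (len(freqs_hz), len(signal))."""
    energy = np.empty((len(freqs_hz), len(signal)))
    for i, f in enumerate(freqs_hz):
        sigma = n_cycles / (2.0 * np.pi * f)
        half = int(np.ceil(4.0 * sigma * fs))    # truncate envelope at 4 sigma
        t = np.arange(-half, half + 1) / fs
        kernel = np.conj(morlet(t, f, n_cycles))[::-1]
        # Zero-pad so the 'valid' convolution yields one coefficient per sample.
        coeffs = np.convolve(np.pad(signal, half), kernel, mode="valid")
        energy[i] = np.abs(coeffs) ** 2
    return energy


def synthetic_sirr(fs, length_m=2.0,
                   reflections=((0.80, 0.4), (1.05, 0.25), (1.55, 0.25))):
    """Unit click at 0 cm plus ideal reflections given as (distance_m, amplitude)."""
    h = np.zeros(int(round(length_m / SPEED_OF_SOUND * fs)))
    h[0] = 1.0
    for distance_m, amplitude in reflections:
        h[int(round(distance_m / SPEED_OF_SOUND * fs))] += amplitude
    return h


def local_peaks_cm(profile, fs, threshold):
    """Distances (cm) of interior local maxima of `profile` above `threshold`."""
    idx = [k for k in range(1, len(profile) - 1)
           if profile[k] > threshold
           and profile[k] > profile[k - 1]
           and profile[k] >= profile[k + 1]]
    return [k / fs * SPEED_OF_SOUND * 100.0 for k in idx]


freqs = np.array([3200.0, 11142.0])
h = synthetic_sirr(FS)
wet = wavelet_energy(h, FS, freqs)
# Slice through the 11,142 Hz row, analogous to the profile views in the text.
peaks_cm = local_peaks_cm(wet[1], FS, threshold=0.01 * wet[1].max())
print([round(p, 1) for p in peaks_cm])
```

In this idealised case the 11,142 Hz slice shows energy peaks within a centimetre of the three reflection distances. Removing entries from `reflections` mimics, very crudely, a response in which the specular reflections are absent.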
Integration in human perception

"A 1,000-Hz tone sounds like 1,000 Hz in a 1-second tone burst, but an extremely short burst sounds like a click. The duration of such a burst also influences the perceived loudness. Short bursts do not sound as loud as longer ones... A pulse 3 ms long must have a level about 15dB higher to sound as loud as a 0.5 second (500 millisecond) pulse. Tones and random noise follow roughly the same relationship in loudness vs. pulse length. The 100-ms region is significant... Only when the tones or noise bursts are shorter than this amount must the sound-pressure level be increased to produce loudness equal to that of long pulses or steady tones or noise. This 100 ms appears to be the integrating time, or the time constant, of the human ear." [1]

In simple terms, the A3D wavelet-encoded sound appears to give acoustic energy the ‘time to make its presence felt’. We might deduce from Everest’s observations, and those of others, that our audiology, possibly the basilar membrane itself, needs some time to respond to the presence of patterns in sound. It would also appear likely that our audiological processes derive the zero time (or centre time) of such wave packets in order to extract spatial information from such cues.

[1] The Master Handbook of Acoustics - F. Alton Everest

If so, this would appear to further support listeners’ anecdotal evidence that A3D speakers provide greater spatial perception (through clearer, contiguous and consistent cues?). Ultimately, however, we do not listen to clicks; we listen to sound and music. Impulse responses are nonetheless important, because they allow us to characterise Linear Time Invariant systems, which loudspeakers and acoustic environments (generally) are.
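As a rough check on the figures Everest quotes, the sketch below assumes the ear behaves as an ideal energy integrator over a 100 ms window. That model is our simplification for illustration only; it is not taken from Everest, and the function name `level_increase_db` is hypothetical. Under this assumption, a pulse shorter than the window needs its level raised by 10·log10(window / duration) dB to deliver the same integrated energy.

```python
import math


def level_increase_db(pulse_ms, integration_ms=100.0):
    """Extra level (dB) a short pulse needs to match a long one, assuming the
    ear sums energy perfectly over `integration_ms` (an idealisation)."""
    if pulse_ms <= 0:
        raise ValueError("pulse_ms must be positive")
    if pulse_ms >= integration_ms:
        return 0.0  # pulse already fills the integration window
    return 10.0 * math.log10(integration_ms / pulse_ms)


print(round(level_increase_db(3.0), 1))  # 15.2
print(level_increase_db(500.0))          # 0.0
```

The 15.2 dB obtained for a 3 ms pulse is consistent with the "about 15dB" in the quotation, which suggests that simple energy summation over roughly 100 ms is a reasonable first approximation. It says nothing about how spatial cues are integrated.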
How our audiology integrates such cues is clearly an area that would benefit from further research, but it is also clear that this kind of research is the domain of neuroscience, not loudspeaker design.

A3D Alignment

Figure 7 - 11,142Hz Wavelet Energy Transform Plot

Figure 7 above shows the detailed view of the 11,142 Hz WET for the A3D Satellite. This response is typical of a zero-phase wavelet, also known as a ‘wave packet’. We can see that the regression of the Morlet wavelet is a very good fit. This energy coherency is what provides the contiguous energy patterns within the reverberant sound field created by A3D. As the cues are coherent, both in time and in their position in space, human audiology can now integrate a longer passage of cues, thus better matching our a priori maps of what ‘real’ sound is like. By extending the perceptual integration of a single event, we suspect the listener will increase the attribution of ‘reality’ to that sound.

The effect of listening room reflections, we hypothesise, is to ‘interrupt’ the contiguous listening window (100 ms, per Everest), effectively cutting short the ability of the human ear to completely integrate any passage of an event train of cues. Consequently the perception of loudness is reduced and the ‘reality’ in a spatial context is compromised.

Conclusion

A3D is a technology platform for loudspeaker designers to re-engineer the way a loudspeaker energizes an acoustic space. The authors hypothesize that by using zero-phase wavelets two major transformations occur:

A - The ‘contamination’ effect created by the listening space reflections is significantly reduced, giving the listener the ability to decipher longer integrals of contiguous ‘as recorded’ temporal cues.
B - Because wavelets are localized in both time and frequency, they produce a contiguous spatial image over time and space, whereas conventional speakers do not.

Applications are likely twofold:

To provide a clearer listening experience for consumer audio devices placed in a listening space. Devices such as TVs, in-car audio, hi-fi, home theatre systems and public address systems can benefit from improved clarity.

Application to more ‘personal’ consumer electronics (smartphones, tablets, etc.) would produce a contiguous energy sound field that takes on the qualities of the audio signal, even when the listening space contribution may be insignificant. This is, of course, also applicable to the first group of products.

It is further likely to have applications in VR and AR, given their likely need for accurate spatial cues.

Footnotes:

A3D is a range of patented technologies created and owned by Acoustic3D Holdings Ltd. Academic enquiries should be directed to Joe Hayes.

Vastigo Ltd is the Licensor of A3D technologies. Commercialisation enquiries should be directed to Malcolm Duffield.

A3D equipped newaudio ‘Emergence AS8’ loudspeakers can be found online at http://www.newaudio.com.au