Audio Engineering Society
Convention Paper 9019
Presented at the 136th Convention
2014 April 26–29 Berlin, Germany

This Convention paper was selected based on a submitted abstract and 750-word precis that have been peer reviewed by at least two qualified anonymous reviewers. The complete manuscript was not peer reviewed. This convention paper has been reproduced from the author’s advance manuscript without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.
Subjective Evaluation of High Resolution Recordings in PCM and DSD Audio Formats

Atsushi MARUI¹, Toru KAMEKAWA¹, Kazuhiko ENDO², and Erisa SATO²

¹ Faculty of Music, Tokyo University of the Arts, 1-25-1 Senju, Adachi, Tokyo, 120-0034, Japan
² TEAC Corporation, 1-47 Ochiai, Tama, Tokyo, 206-8530, Japan

Correspondence should be addressed to Atsushi MARUI (marui@ms.geidai.ac.jp)

ABSTRACT

High-resolution audio production and consumption are attracting increasing attention, supported by the release of relatively affordable audio recorders from multiple manufacturers and by the broader bandwidth of the Internet. However, differences in audio quality between high-resolution audio formats are still not well known, especially between the different formats available on these recorders. To evaluate differences in the subjective impression of sounds recorded in high-resolution audio formats, three formats, PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz), recorded with multiple studio-quality audio recorders, were evaluated in a double-blind A-B comparison listening test. Six sound programs evaluated by forty-six participants on eight attributes revealed statistically significant differences between PCM and DSD, but not between the two sampling frequencies of DSD (2.8 MHz and 5.6 MHz).

1. INTRODUCTION

While the music industry actively releases perceptually coded versions of almost all new music, high-resolution audio production and consumption are also attracting increasing attention. This is supported by broader Internet bandwidth, which makes music distribution over the Internet practical, and by relatively affordable high-resolution sound recorders from several manufacturers. Nevertheless, despite the use of high-resolution formats in the industry, differences in audio quality between high-resolution audio formats are still not well known, especially between the different audio formats available on these sound recorders.
Meyer and Moran [1] reported that they were unable to reject the null hypothesis that listeners could not differentiate between SACD and 44.1 kHz/16 bit playback. The source materials used in that test are not described well enough to know whether the result was due to the difference in the playback formats. A similar experiment with a different approach was conducted by Woszczyk et al. [2], who concluded that a higher sampling rate (eight times the CD sampling rate) was judged to have a higher degree of fidelity to the analog reference. While these two reports compare different formats and/or sampling rates, Blech and Yang compared PCM and DSD having the same bit rate (2.8224 MHz against 176.4 kHz/24 bit) and found that listeners were not able to discriminate between the two systems [3].

These results were obtained from discrimination tasks in which listeners chose which sound stimulus was different from, or the same as, the other stimuli, based on a global impression of the stimuli including spatial, spectral, and temporal aspects. These aspects can be evaluated independently, but this was not done because those studies focused simply on whether listeners are able to discriminate between different formats.

Our aim in this paper is to document the stimuli and the method used in evaluating differences in the subjective impression of sounds recorded using different high-resolution audio formats, especially PCM and DSD. Three audio formats recorded with multiple studio-quality sound recorders were assessed in a subjective listening test on multiple attributes. The formats used in the test were PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz). These three formats were chosen because they are among the highest resolutions available in most consumer and professional audio recorders currently on the market.
Also, the numbers of bits per second are comparable: DSD (2.8 MHz) is the lowest of the three at 2,822,400, DSD (5.6 MHz) is the highest at 5,644,800, and PCM (192 kHz/24 bit) is in the middle at 4,608,000. However, the authors are aware that the processes of converting analog to digital and vice versa differ between PCM and DSD, and that a direct comparison of the number of bits is not very helpful for understanding differences in audio quality.

The sound sources were recorded simultaneously using four recorder models, and comparisons between different recorders were included in the test. However, only the portion relevant to the comparison of audio formats is discussed in this paper, since benchmarking is not our intention.

Section 2 explains the recording and preparation of the stimuli. Section 3 describes the listening environment, the participants, and the test design. Section 4 presents the results of the statistical analysis of the data, followed by the conclusion in Section 5.

2. STIMULI

2.1. Recording Devices and Formats

The recorders used in the recording session were TASCAM DA-3000, TASCAM HS-2000, and KORG MR-2000S for PCM (192 kHz/24 bit); TASCAM DA-3000, TASCAM DV-RA1000HD, and KORG MR-2000S for DSD (2.8 MHz); and TASCAM DA-3000 and KORG MR-2000S for DSD (5.6 MHz) (also shown in Table 1). Although the actual model names of the recorders are given here so that readers know the exact procedure used in the stimuli preparation, we do not discuss the results of comparisons between different recorder models within the same audio format, since this is beyond the focus of this report. Only the data relevant to the comparison of the three audio formats recorded using the same recorder model are discussed in the following sections. Two models in this study recorded all three formats (DA-3000 and MR-2000S); of these, only sources recorded using the DA-3000 were used.
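The per-channel bit rates quoted in the Introduction can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the nominal DSD rates of 2.8 MHz and 5.6 MHz are 64 and 128 times the 44.1 kHz CD sampling rate with one bit per sample, and the function names are hypothetical.

```python
# Assumption: DSD rates are integer multiples of the 44.1 kHz CD sampling rate.
CD_SAMPLE_RATE_HZ = 44_100


def pcm_bits_per_second(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bits per second for one channel of linear PCM."""
    return sample_rate_hz * bits_per_sample


def dsd_bits_per_second(oversampling: int) -> int:
    """Bits per second for one channel of 1-bit DSD at a multiple of 44.1 kHz."""
    return oversampling * CD_SAMPLE_RATE_HZ


rates = {
    "DSD (2.8 MHz)": dsd_bits_per_second(64),
    "PCM (192 kHz/24 bit)": pcm_bits_per_second(192_000, 24),
    "DSD (5.6 MHz)": dsd_bits_per_second(128),
}

# Print the formats from the lowest to the highest bit rate.
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name}: {rate:,} bit/s per channel")
```

Sorted this way, PCM (192 kHz/24 bit) falls between the two DSD rates, as stated in the text.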
Figure 1 shows the frequency and time responses of the analog-to-digital-to-analog conversion for the three formats. The responses were measured with a swept-sine signal (Optimized Aoshima’s Time-Stretched Pulse [4]) using an RME Fireface UC audio interface operated with Apple Logic X software at 192 kHz/24 bit. The response of the audio interface was compensated.

Format                  Recorder
PCM (192 kHz / 24 bit)  TASCAM DA-3000
                        TASCAM HS-2000
                        KORG MR-2000S
DSD (2.8 MHz)           TASCAM DA-3000
                        TASCAM DV-RA1000HD
                        KORG MR-2000S
DSD (5.6 MHz)           TASCAM DA-3000
                        KORG MR-2000S

Table 1: Eight audio recorders used for recording music/speech performances.

Fig. 1: Left panel shows frequency responses of the audio recorder for three formats used in the test: DSD (2.8 MHz) (top), DSD (5.6 MHz) (middle), and PCM (192 kHz/24 bit) (bottom). The abscissa is frequency (Hz) and the ordinate is power (dB). Curves for DSD (2.8 MHz) and PCM (192 kHz/24 bit) are shifted by 1 dB for readability. Right panel shows time responses of the audio recorder for three formats used in the test: DSD (2.8 MHz) (top), DSD (5.6 MHz) (middle), and PCM (192 kHz/24 bit) (bottom). The abscissa is time in samples (192 kHz sampling rate) and the ordinate is amplitude. Amplitudes were scaled to [−1, +1) range. Curves for DSD (2.8 MHz) and PCM (192 kHz/24 bit) are shifted by 1 for readability.

2.2. Recording of Source Material

Recording took place on April 27th, 2013, in Studio A and Studio B on the Senju Campus of Tokyo University of the Arts. Jazz musicians (a trio of piano, bass, and drums), vocalists (two females and one male), voice actors (two females and one male), and a classical pianist participated in the recording. The performers were recorded in separate sessions. All performers and the recording engineer were highly skilled professionals and were paid for their participation.

The microphones (AKG C414, C451, D112, DPA 4006, 4011, 4015, Neumann U87Ai, Royer R-121, Sanken CO-100K, Shure SM57, and Sony C-38B) were connected to one of four microphone amplifiers (that of the Trident S80 mixing console, Millennia HV-3D-8, John Hardy M-1, or API 512c) and mixed to two channels on the Trident S80 mixing console. The resulting two-channel mix was split in parallel into eight two-channel lines and sent to the eight recorders. The recorders were operated to record the music/speech performances simultaneously. Calibration was done so that a +4.0 dBu 1 kHz sine wave input registered −16 dBFS, achieving equal recording levels within ±0.1 dB as measured with an NTI XL2 Sound Level Meter. No effects processing was applied and no digital processing was used in the course of recording except for the analog-to-digital conversion in each recorder.

2.3. Stimuli Preparation

Although there were 19 recorded sources (2 jazz trio performances, 6 percussion, 6 speeches, 4 vocals, and 2 piano performances), only six of them were used in the following subjective listening test:

• drums solo
• triangle solo
• speech (male, in Japanese language)
• vocal solo (female)
• jazz trio
• classical piano

These six were selected because they have a wide spectral and temporal variety, which was thought likely to reveal differences between the audio formats. Stimuli such as speech were added for their familiarity to the listeners.
Among these, the triangle solo and the speech were recorded using only one microphone (Sanken CO-100K for the triangle and Neumann U87 for the speech), and thus the two channels in these recordings are identical. The recorded materials were trimmed to about 10 to 15 seconds each on the respective audio recorder, without level adjustment or fade in/out, to minimize the effect of signal processing. No processing other than this trimming was applied to any of the sounds in the course of making the stimuli. The frequency responses and amplitudes of the stimuli are shown in Figures 3 and 4.

3. LISTENING TEST

3.1. Listening Environment

Listening tests were conducted at two separate sites: a listening room at TEAC Corporation (Site A) and the Sound Production Studio on the Senju Campus of Tokyo University of the Arts (Site B). Site A is a room used for critical listening and product evaluation, with relatively little reverberation. Site B is a mixing studio conforming to ITU-R BS.1116 [5] that is often used for listening tests and evaluation of audio materials.

Two TASCAM DA-3000 units (from the same production lot, with the same firmware version installed) were used for playback of all the stimuli. They were set to master and slave modes for playback synchronization. Hence, the same type of digital-to-analog converter was used for all stimuli. Outputs from the DA-3000 units were sent to a remote-controllable monitor switcher (operating in the analog domain), which enabled a listener to switch between the two playback sources.

Two loudspeakers were positioned for standard stereo playback according to ITU-R BS.775 [6], 2.70 m (≈ 8.86 feet) from the listening position (Figure 2). Two Genelec 1032A were used at Site A and two Genelec 8050A at Site B. A stereo volume controller was installed as a precaution against exposing the participants to loud noise. Because no loud noise was emitted by accident, the level was kept constant throughout the experiment.
An Esoteric C-02 preamplifier was used at Site A and a Tomoca TCC-100ST at Site B as the volume controller.

[Figure: DA-3000 (master, cascade out) and DA-3000 (slave, cascade in), each sending L/R outputs to a remote-controlled switcher, followed by the volume controller and two loudspeakers placed 2.70 m (8.86 feet) from the listener at a 60° angle.]

Fig. 2: Signal path in listening test setup.

3.2. Participants

A total of 46 listeners (30 at Site A and 16 at Site B) with normal hearing participated in the test. Participants at Site A were selected from people not involved in the development or evaluation of the recording devices. Participants at Site B were students and professors in the Sound Recording program with timbral ear training experience.

3.3. Test Design

A double-blind two-interval, two-alternative forced choice method (pairwise A-B comparison) was used for the listening test. A listener was presented with a pair of stimuli and asked to listen carefully for similarities and dissimilarities while freely switching between them, and then to choose which of the two stimuli had the higher sensation or impression for a given attribute. The eight attributes used in the test are:

• image width,
• image depth,
• image definition,
• timbral brightness,
• timbral richness,
• temporal separability,
• overall quality, and
• overall preference.

All attributes were presented in Japanese, the listeners’ native language. Choices were made on all eight attributes for one pair of stimuli before moving on to the next pair. The stimulus pairs were presented in a different random order for each participant. The stimuli in a pair were also assigned randomly to the two playback systems.

In order to reduce the duration of the test, only 10 pairs were tested for each source material. The comparisons included in the test (also shown in Table 2) are:

• three pairs among the three recorders in PCM (192 kHz/24 bit): DA-3000, HS-2000, and MR-2000S,
• three pairs among the three recorders in DSD (2.8 MHz): DA-3000, DV-RA1000HD, and MR-2000S,
• a pair of the two recorders in DSD (5.6 MHz): DA-3000 and MR-2000S, and
• three pairs among the three formats PCM (192 kHz/24 bit), DSD (2.8 MHz), and DSD (5.6 MHz) on the DA-3000.

Ten comparisons for each of the six programs resulted in 60 trials.

The test began after instructions and a training session were given. The listeners were allowed to take a break at any time. The test was done individually for each participant and took approximately 1.5 to 2 hours. Listening tests were conducted between August 5th and 30th, 2013, and the listeners were compensated for their participation.

Condition       Stimulus 1      Stimulus 2
PCM (192 kHz)   DA-3000         HS-2000
                DA-3000         MR-2000S
                HS-2000         MR-2000S
DSD (2.8 MHz)   DA-3000         DV-RA1000HD
                DA-3000         MR-2000S
                DV-RA1000HD     MR-2000S
DSD (5.6 MHz)   DA-3000         MR-2000S
DA-3000         DSD (5.6 MHz)   PCM (192 kHz)
                DSD (2.8 MHz)   PCM (192 kHz)
                DSD (5.6 MHz)   DSD (2.8 MHz)

Table 2: Ten comparison pairs in the test. For each comparison, six programs were presented resulting in 60 trials.

4. RESULTS AND DISCUSSION

For the reasons discussed earlier, only the results of comparisons between the three audio formats are presented in this section. No noticeable differences were found in the response data between the two sites; therefore, the data from the two sites were pooled. A binomial test was used to analyze the test results. The p-values from the binomial test for each combination of source, comparison pair, and attribute are shown in Table 3.
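As an illustration of the analysis step, the sketch below computes a one-sided binomial p-value for a single A-B comparison. It is not the authors' analysis script: it assumes that each listener contributes one choice per comparison and attribute, that the chance probability is 0.5, and that the p-value is the upper-tail probability. The function name and the vote counts are hypothetical.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_chance: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance**k * (1.0 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


n_listeners = 46  # number of participants reported in the paper

# Hypothetical numbers of listeners choosing the left-hand format.
for votes_for_left in (23, 24, 34):
    p = binomial_upper_tail(votes_for_left, n_listeners)
    print(f"{votes_for_left}/{n_listeners} chose the left-hand format: p = {p:.3f}")
```

Under these assumptions, 24 of 46 votes gives p ≈ 0.441 and 23 of 46 gives p ≈ 0.559, so a format needs considerably more than half of the votes before the result falls below a strict significance level.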
Each p-value is the probability of obtaining the observed number of choices for the left-hand-side format in the “comparison” column, or more, if that format actually gave the same level of sensation or impression on the given attribute as the right-hand-side format in the same row. The smaller the p-value, the stronger the statistical evidence that the left-hand side has a higher level of sensation or impression on the given attribute. For example, the Drums stimulus in DSD (5.6 MHz) has p = .001 when compared against PCM (192 kHz/24 bit) in overall preference. This suggests that the DSD version was statistically significantly preferred over the PCM version. The last row (“combined”) shows the result of the binomial test with all sources combined. In the following discussion, a statistical significance level of α = .01 is used. It is indicated with two or three asterisks in Table 3.

Spatial attributes (width, depth, and definition) were not significantly different for stimuli with monophonic content (triangle and speech). Comparisons on these attributes were statistically significant for the Vocal and Jazz Trio stimuli, and a subset of the attributes was found to be significantly different for the Drums and Piano stimuli. This result is somewhat to be expected, since monaural stimuli cannot reveal differences in spatial attributes between the formats.

For timbral attributes, richness showed significant differences for all comparisons between DSD and PCM, but brightness showed only one significant difference, between DSD (5.6 MHz) and PCM for the Drums stimulus. Although the authors expected that brightness could be used to discriminate between the formats, the opposite result was obtained. “Sharpness,” an attribute synonymous with brightness, is related to the spectral centroid with weight on high frequencies [7, 8]. The results suggest that participants were not able to hear spectral differences in the high-frequency region.
On the other hand, although very subtle, the difference between the frequency response curves of DSD and PCM is larger below 31.5 Hz than in the high-frequency range above 16 kHz, disregarding the spectral noise above 32 kHz in DSD (2.8 MHz) (Figure 1). Participants may have relied on low-frequency content to discriminate between the formats. This is supported by the fact that the Triangle stimulus, which has less low-frequency content, was not highly significant in the comparisons.

The attribute temporal separability was found to be not significant for all stimuli. Overall quality and preference showed a similar tendency, with participants choosing DSD (5.6 MHz) more often than PCM (192 kHz/24 bit) for the Drums, Speech, Vocal, and Jazz Trio stimuli. Recall that the participants were asked to choose which of the two stimuli had the higher sensation or impression for a given attribute. Therefore, DSD was judged to have higher quality and was preferred over PCM for most of the stimuli.

There were no significant differences between DSD (2.8 MHz) and DSD (5.6 MHz) for any attribute in any of the source materials at the α = .01 level.

The combined result shows the binomial test with the response data from all six stimuli pooled. Statistically significant differences between PCM and DSD are seen in all attributes for both 2.8 MHz and 5.6 MHz. On the other hand, significant differences between 2.8 MHz and 5.6 MHz DSD were not found.

5. CONCLUSION

In order to evaluate differences in the subjective impression of sounds recorded using high-resolution audio formats, three audio formats recorded with multiple studio-quality audio recorders were evaluated in a double-blind A-B comparison listening test. The three formats were PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz).
They were chosen because they are among the highest resolutions currently available in most consumer and professional audio recorders. The three formats were compared by 46 participants on six sound programs and eight attributes. The binomial test applied to the data from the pairwise comparison experiment revealed statistically significant differences between PCM and DSD, but not between the two sampling frequencies of DSD (2.8 MHz and 5.6 MHz). Although some stimuli (monaural sounds such as Triangle and Speech) and some attributes (such as brightness and temporal separability) were not useful for discriminating between the formats, stimuli having broad spectra and clear temporal transients (such as Vocal, Jazz Trio, and Piano) and attributes such as spatial width, spatial depth, and timbral richness could be used to discriminate between DSD and PCM. Overall quality and preference showed a similar tendency in favor of DSD (5.6 MHz) over PCM (192 kHz/24 bit).

The authors were careful in preparing the stimuli and in conducting the experiment. Nevertheless, which physical aspects the participants were listening to when discriminating between the formats is still not fully understood. It is our hope that this presentation contributes to the understanding of the qualities of high-resolution audio formats.

6. REFERENCES

[1] E. Brad Meyer and David R. Moran. Audibility of CD-standard A/D/A loop inserted into high-resolution audio playback. Journal of the Audio Engineering Society, 55(9):775–779, September 2007.

[2] Wieslaw Woszczyk, Jan Engel, John Usher, Ronald Aarts, and Derk Reefman. Which of the two digital audio systems best matches the quality of the analog system? In Proceedings of AES 31st International Conference, London, UK, June 2007. Audio Engineering Society.

[3] Dominik Blech and Min-Chi Yang. DVD-Audio versus SACD: Perceptual discrimination of digital audio coding formats. In Proceedings of 116th Convention, Berlin, Germany, May 2004. Audio Engineering Society.

[4] Nobuharu Aoshima. Computer-generated pulse signal applied for sound measurement. Journal of the Acoustical Society of America, 69(5):1484–1488, May 1981.

[5] International Telecommunication Union. Rec. ITU-R BS.1116-1: Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems, October 1997.

[6] International Telecommunication Union. Rec. ITU-R BS.775-3: Multichannel stereophonic sound system with and without accompanying picture, August 2012.

[7] G. von Bismarck. Timbre of steady sounds: A factorial investigation of its verbal attributes. Acustica, 30:146–159, 1974.

[8] G. von Bismarck. Sharpness as an attribute of the timbre of steady sounds. Acustica, 30:159–172, 1974.

[Figure: six panels (Drums Solo, Vocal Solo (female), Triangle Solo, Jazz Trio, Speech (male), Classical Piano), each plotting Power (dB), 0 to 90, against Frequency (Hz), 16 to 32k.]

Fig. 3: Frequency responses of six stimuli used in the listening test. Power on vertical axis is not in a physical scale, but relative levels of six stimuli are preserved. Plots were generated from PCM (192 kHz/24 bit) version of the stimuli.
[Figure: six panels (Drums Solo, Vocal Solo (female), Triangle Solo, Jazz Trio, Speech (male), Classical Piano), each plotting Amplitude, −1 to 1, against Time (sec), 0 to 16.]

Fig. 4: Amplitude plots of six stimuli used in the listening test. Plots were generated from PCM (192 kHz/24 bit) version of the stimuli.

                      Spatial                             Timbral                 Temporal      Overall
Source    Comparison  Width       Depth       Definition  Richness    Brightness  Separability  Quality     Preference
Drums     DSD5 PCM    0.001 ***   0.024 *     0.005 **    0.226       0.005 **    0.087 .       0.000 ***   0.001 ***
          DSD2 PCM    0.011 *     0.024 *     0.024 *     0.002 **    0.024 *     0.226         0.005 **    0.011 *
          DSD5 DSD2   0.146       0.024 *     0.146       0.011 *     0.146       0.011 *       0.048 *     0.048 *
Triangle  DSD5 PCM    0.013 *     0.441       0.151       0.052 .     0.027 *     0.151         0.231       0.231
          DSD2 PCM    0.092 .     0.092 .     0.441       0.006 **    0.151       0.092 .       0.092 .     0.013 *
          DSD5 DSD2   0.231       0.671       0.559       0.987       0.231       0.329         0.052 .     0.441
Speech    DSD5 PCM    0.092 .     0.006 **    0.027 *     0.151       0.151       0.052 .       0.027 ***   0.006 **
          DSD2 PCM    0.231       0.001 ***   0.908       0.013 *     0.027 *     0.151         0.013 *     0.027 *
          DSD5 DSD2   0.231       0.769       0.849       0.671       0.151       0.151         0.441       0.441
Vocal     DSD5 PCM    0.001 ***   0.001 ***   0.002 **    0.000 ***   0.231       0.092 .       0.000 ***   0.000 ***
          DSD2 PCM    0.000 ***   0.002 **    0.013 *     0.000 ***   0.151       0.441         0.027 *     0.013 *
          DSD5 DSD2   0.027 *     0.671       0.441       0.849       0.441       0.329         0.329       0.441
Trio      DSD5 PCM    0.000 ***   0.000 ***   0.006 **    0.000 ***   0.151       0.441         0.001 ***   0.002 **
          DSD2 PCM    0.006 **    0.092 .     0.151       0.001 ***   0.052 .     0.231         0.092 .     0.052 .
          DSD5 DSD2   0.994       0.973       0.908       0.948       0.671       0.769         0.849       0.908
Piano     DSD5 PCM    0.001 ***   0.002 **    0.001 ***   0.329       0.329       0.151         0.052 .     0.092 .
          DSD2 PCM    0.000 ***   0.092 .     0.000 ***   0.151       0.092 .     0.151         0.092 .     0.027 *
          DSD5 DSD2   0.329       0.092 .     0.559       0.329       0.908       0.973         0.769       0.559
Combined  DSD5 PCM    0.000 ***   0.000 **    0.000 ***   0.000 ***   0.000 ***   0.003 **      0.000 ***   0.000 ***
          DSD2 PCM    0.000 ***   0.000 ***   0.000 ***   0.000 ***   0.007 **    0.010 **      0.000 ***   0.000 ***
          DSD5 DSD2   0.034 *     0.244       0.475       0.858       0.172       0.093 .       0.058 .     0.142

Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Table 3: p-values from binomial test for each combination of source, comparison pair, and attribute. A p-value shows the probability of how likely that a left hand side format on “comparison” column (e.g., DSD5) is chosen as to have the same level of sensation on a given attribute to the right hand side format (e.g., PCM) on the same row. Smaller the p-value is, statistically more significant that the left hand side has higher level of sensation on the given attribute. The symbols “DSD5,” “DSD2,” and “PCM” each denote DSD (5.6 MHz), DSD (2.8 MHz), and PCM (192 kHz/24 bit), respectively. The last row (“combined”) shows the result of binomial test with all sources combined.
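As a reading aid for the significance codes printed with Table 3, the sketch below maps a p-value to its code and checks it against the α = .01 level used in the discussion. The thresholds are taken from the legend; treating each interval as closed on the right is an assumption, and the function names are hypothetical.

```python
def significance_code(p_value: float) -> str:
    """Map a p-value to the code in the Table 3 legend (right-closed intervals assumed)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    if p_value <= 0.1:
        return "."
    return ""


def is_significant(p_value: float, alpha: float = 0.01) -> bool:
    """True when the p-value is at or below the chosen significance level."""
    return p_value <= alpha


# A few p-values of the size that appears in Table 3.
for p in (0.001, 0.010, 0.024, 0.087, 0.226):
    print(f"p = {p:.3f} {significance_code(p):<3} significant at .01: {is_significant(p)}")
```

With α = .01, only entries carrying two or three asterisks count as significant, which matches how the text reads the table.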