Audio Engineering Society
Convention Paper 9019
Presented at the 136th Convention
2014 April 26–29 Berlin, Germany
This Convention paper was selected based on a submitted abstract and 750-word precis that have been peer reviewed
by at least two qualified anonymous reviewers. The complete manuscript was not peer reviewed. This convention
paper has been reproduced from the author’s advance manuscript without editing, corrections, or consideration by the
Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request
and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see
www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct
permission from the Journal of the Audio Engineering Society.
Subjective Evaluation of High Resolution
Recordings in PCM and DSD Audio Formats
Atsushi MARUI¹, Toru KAMEKAWA¹, Kazuhiko ENDO², and Erisa SATO²
¹ Faculty of Music, Tokyo University of the Arts, 1-25-1 Senju, Adachi, Tokyo, 120-0034, Japan
² TEAC Corporation, 1-47 Ochiai, Tama, Tokyo, 206-8530, Japan
Correspondence should be addressed to Atsushi MARUI (marui@ms.geidai.ac.jp)
ABSTRACT
High-resolution audio production and consumption are attracting increasing attention, supported by the release of relatively affordable audio recorders from multiple manufacturers and by the broader bandwidth of the Internet. However, differences in audio quality between high-resolution audio formats are still not well known, especially between the different formats available on these audio recorders. In order to evaluate the differences in subjective impression between sounds recorded using high-resolution audio formats, three audio formats, PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz), recorded with multiple studio-quality audio recorders were evaluated in a double-blind A-B comparison listening test. Evaluations of six sound programs by forty-six participants on eight attributes revealed statistically significant differences between PCM and DSD but not between the two sampling frequencies (2.8 MHz and 5.6 MHz) of DSD.
1. INTRODUCTION
While the music industry actively releases perceptually coded versions of almost all new music, high-resolution audio production and consumption are also attracting increasing attention. This trend is supported by the broader bandwidth of the Internet, which makes music distribution over the Internet practical, and by the production of relatively affordable high-resolution-capable sound recorders by several manufacturers. Nevertheless, in spite of the use of high-resolution formats in the industry, differences in audio quality between high-resolution audio formats are still not well known, especially between the different audio formats available on these sound recorders.
Meyer and Moran [1] reported that they were unable to reject the null hypothesis that listeners could not differentiate between SACD and 44.1 kHz/16 bit playback. The source materials used in their test are not described well enough to know whether the result was due to the difference in playback formats. A similar experiment with a different approach was conducted by Woszczyk et al. [2], who concluded that the higher sampling rate (8 times the CD sampling rate) was chosen as having a higher degree of fidelity to the analog reference. While these two reports compare different formats and/or sampling rates, Blech and Yang compared PCM and DSD at comparable bit rates (2.8224 MHz against 176.4 kHz/24 bit) and found that listeners were not able to discriminate between the two systems [3]. These results were obtained from discrimination tasks in which listeners choose which sound stimulus is different from or the same as the other stimuli, considering the global impression of the stimuli including spatial, spectral, and temporal aspects. These aspects can be evaluated independently, but this was not done because the focus of those studies was simply to discover whether listeners are able to discriminate between different formats.
Our aim in this paper is to document the stimuli and the method used in evaluating the differences in subjective impression between sounds recorded using different high-resolution audio formats, especially between PCM and DSD. Three audio formats recorded with multiple studio-quality sound recorders were assessed in a subjective listening test on multiple attributes. The formats used in the test were PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz). These three formats were chosen because they are some of the highest resolutions available in most of the consumer or professional audio recorders currently on the market. Also, the numbers of bits per second are comparable to each other: DSD (2.8 MHz) is the lowest of the three at 2,822,400, DSD (5.6 MHz) is the highest at 5,644,800, and PCM (192 kHz/24 bit) is in the middle at 4,608,000. However, the authors are aware that the processes of converting analog to digital and vice versa differ between PCM and DSD, and that a direct comparison of the number of bits is not very helpful for understanding the differences in audio quality. The sound sources were recorded simultaneously using four recorder models, and comparisons between different recorders were included in the test, but only the portion relevant to the comparison of audio formats is discussed in this paper since benchmarking is not our intention.
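The bit-rate figures quoted above can be reproduced with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes the figures are per channel, that DSD carries 1 bit per sample, and that "2.8 MHz" and "5.6 MHz" are the nominal names for 64 and 128 times 44.1 kHz. The function and variable names are ours, not the authors'.

```python
# Per-channel bit rates of the three formats (illustrative sketch).
# Assumptions: DSD is a 1-bit stream; "2.8 MHz" and "5.6 MHz" are the
# nominal names for 64 x 44.1 kHz and 128 x 44.1 kHz.

def bits_per_second(sampling_rate_hz: int, bits_per_sample: int) -> int:
    """Raw bit rate of one audio channel."""
    return sampling_rate_hz * bits_per_sample

FORMATS = {
    "DSD (2.8 MHz)": (64 * 44_100, 1),
    "PCM (192 kHz/24 bit)": (192_000, 24),
    "DSD (5.6 MHz)": (128 * 44_100, 1),
}

rates = {name: bits_per_second(fs, bits) for name, (fs, bits) in FORMATS.items()}

# Print the formats from lowest to highest bit rate.
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name:22s} {rate:>10,d} bit/s")
```

Under these assumptions the PCM format sits between the two DSD rates, as stated in the text, although the raw bit count says nothing about how the bits are used by each conversion process.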
Section 2 explains the preparation of the stimuli. Section 3 describes the listening environment, the participants, and the test method. Section 4 presents the results of the statistical analysis of the data, followed by the conclusion in Section 5.
2. STIMULI
2.1. Recording Devices and Formats
The recorders used in the recording session were TASCAM DA-3000, TASCAM HS-2000, and KORG MR-2000S for PCM (192 kHz/24 bit); TASCAM DA-3000, TASCAM DV-RA1000HD, and KORG MR-2000S for DSD (2.8 MHz); and TASCAM DA-3000 and KORG MR-2000S for DSD (5.6 MHz) (also shown in Table 1).
Although the actual model names of the recorders used in the test are revealed here so that readers know the exact procedure used in preparing the stimuli, we do not discuss the results of comparisons between different recorder models using the same audio format, since that is beyond the focus of this report. Only the data relevant to the comparison of the three audio formats recorded using the same recorder model are discussed in the following sections. Two models in this study recorded all three formats (DA-3000 and MR-2000S); of these, only the sources recorded using the DA-3000 were used.
Figure 1 shows the frequency and time responses of the analog-to-digital-to-analog converters for the three formats. The responses were measured with a swept-sine signal (Optimized Aoshima’s Time-Stretched Pulse [4]) using an RME Fireface UC audio interface operated with Apple Logic X software at 192 kHz/24 bit. The response of the audio interface was compensated for.
2.2. Recording of Source Material
Recording took place on April 27th, 2013, in Studio A and Studio B on the Senju Campus of Tokyo University of the Arts. Jazz musicians (a trio of piano, bass, and drums), vocalists (two females and one male), voice actors (two females and one male), and
Format                    Recorder
PCM (192 kHz / 24 bit)    TASCAM DA-3000
                          TASCAM HS-2000
                          KORG MR-2000S
DSD (2.8 MHz)             TASCAM DA-3000
                          TASCAM DV-RA1000HD
                          KORG MR-2000S
DSD (5.6 MHz)             TASCAM DA-3000
                          KORG MR-2000S
Table 1: Eight audio recorders used for recording music/speech performances.
[Figure 1. Left panel: Power (dB) versus Frequency (Hz), 16 Hz to 32 kHz. Right panel: Amplitude versus Time (samples, Fs = 192 kHz). Each panel shows curves for DSD (2.8 MHz), DSD (5.6 MHz), and PCM (192 kHz/24 bit).]
Fig. 1: The left panel shows the frequency responses of the audio recorder for the three formats used in the test: DSD (2.8 MHz) (top), DSD (5.6 MHz) (middle), and PCM (192 kHz/24 bit) (bottom). The abscissa is frequency (Hz) and the ordinate is power (dB). The curves for DSD (2.8 MHz) and PCM (192 kHz/24 bit) are shifted by 1 dB for readability. The right panel shows the time responses of the audio recorder for the same three formats, in the same order from top to bottom. The abscissa is time in samples (192 kHz sampling rate) and the ordinate is amplitude. Amplitudes were scaled to the range [−1, +1). The curves for DSD (2.8 MHz) and PCM (192 kHz/24 bit) are shifted by 1 for readability.
a classical pianist participated in the recording. The performers were recorded in separate sessions. All performers and the recording engineer were highly skilled professionals and were paid for their participation.
The microphones (AKG C414, C451, D112, DPA 4006, 4011, 4015, Neumann U87Ai, Royer R-121, Sanken CO-100K, Shure SM57, and Sony C-38B) were each connected to one of the following microphone amplifiers: those of the Trident S80 mixing console, Millenia HV-3D-8, John Hardy M-1, or API 512c. The signals were mixed to two channels on the Trident S80 mixing console. The resulting two-channel mix was split in parallel into eight two-channel lines and sent to the eight recorders.
The recorders were operated so as to record the music/speech performances simultaneously. The recorders were calibrated so that a +4.0 dBu, 1 kHz sine wave input registered −16 dBFS, achieving equal recording levels within an error of ±0.1 dB as measured with an NTI XL2 Sound Level Meter. No effect processing was applied and no digital processing was used in the course of recording except for the analog-to-digital conversion in each recorder.
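The calibration point described above fixes the relationship between analog input level and digital full scale. The Python sketch below works through that arithmetic. It assumes the standard definition 0 dBu = 0.7746 V RMS (1 mW into 600 Ω); the function names are illustrative and are not taken from any tool used in the study.

```python
import math

# 0 dBu is defined as the voltage that dissipates 1 mW in 600 ohms.
DBU_REFERENCE_VOLTS = math.sqrt(0.001 * 600)  # about 0.7746 V RMS

def dbu_to_volts_rms(level_dbu: float) -> float:
    """Convert a level in dBu to an RMS voltage."""
    return DBU_REFERENCE_VOLTS * 10 ** (level_dbu / 20)

def full_scale_dbu(cal_input_dbu: float, cal_reading_dbfs: float) -> float:
    """Analog level (dBu) that would read 0 dBFS, given one calibration point."""
    return cal_input_dbu - cal_reading_dbfs

cal_volts = dbu_to_volts_rms(4.0)            # the +4 dBu calibration tone
headroom_point = full_scale_dbu(4.0, -16.0)  # level that would reach 0 dBFS
tolerance_ratio = 10 ** (0.1 / 20)           # amplitude ratio equal to 0.1 dB

print(f"+4 dBu calibration tone = {cal_volts:.3f} V RMS")
print(f"0 dBFS corresponds to {headroom_point:+.1f} dBu")
print(f"0.1 dB corresponds to an amplitude ratio of {tolerance_ratio:.4f}")
```

In other words, the calibration leaves 16 dB of headroom above the +4 dBu reference, and the ±0.1 dB matching tolerance corresponds to an amplitude difference of roughly 1% between recorders.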
2.3. Stimuli Preparation
Although there were 19 recorded sources (2 jazz trio
performances, 6 percussion, 6 speeches, 4 vocals, and
2 piano performances), only six of them were used
in the following subjective listening test:
• drums solo
• triangle solo
• speech (male, in Japanese language)
• vocal solo (female)
• jazz trio
• classical piano
These six were selected because they cover a wide variety of spectral and temporal characteristics, which was thought likely to reveal the differences between the audio formats. Stimuli such as speech were added for their familiarity to the listeners. Among these, the triangle solo and the speech were each recorded using only one microphone (Sanken CO-100K for the triangle and Neumann U87 for the speech), and thus the two channels in those recordings are identical.
The recorded materials were trimmed to about 10 to 15 seconds each on the respective audio recorder. To minimize the effect of signal processing, no level adjustment or fade in/out was applied, and no processing other than this trimming was applied to any of the sounds in the course of making the stimuli. The frequency responses and amplitudes of the stimuli are shown in Figures 3 and 4.
3. LISTENING TEST
3.1. Listening Environment
Listening tests were conducted at two separate sites: a listening room at TEAC Corporation (Site A) and the Sound Production Studio on the Senju Campus of Tokyo University of the Arts (Site B). Site A is a room used for critical listening and product evaluation, with relatively little reverberation. Site B is a mixing studio conforming to ITU-R BS.1116 [5] that is often used for listening tests and evaluation of audio materials.
Two TASCAM DA-3000 units (from the same production lot and with the same firmware version installed) were used for playback of all the stimuli. They were set to master and slave modes for playback synchronization. Hence, the same model of digital-to-analog converter was used for all stimuli played back. The outputs from the DA-3000 units were sent to a remote-controllable monitor switcher (operating in the analog domain), which enabled a listener to switch between the two playback sources.
Two loudspeakers were positioned in the standard stereo playback arrangement according to ITU-R BS.775 [6], each 2.70 m (≈ 8.86 feet) from the listening position (Figure 2). Two Genelec 1032A were used at Site A and two Genelec 8050A were used at Site B. A stereo volume controller was installed as a precaution against exposing the participants to loud noise. Because no loud noise was emitted by accident, the playback level was kept constant throughout the experiment. An Esoteric C-02 preamplifier was used as the volume controller at Site A and a Tomoca TCC-100ST at Site B.
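The loudspeaker layout described above can be checked with simple geometry. The sketch below assumes the 60° angle between the loudspeakers, as seen from the listening position, that is marked in Figure 2 (the usual BS.775 stereo arrangement of ±30°). The variable and function names are illustrative.

```python
import math

METERS_PER_FOOT = 0.3048  # exact, by definition of the international foot

def speaker_spacing(distance_m: float, subtended_angle_deg: float) -> float:
    """Distance between two loudspeakers that sit at equal distance from the
    listener and subtend the given angle at the listening position."""
    half_angle = math.radians(subtended_angle_deg / 2)
    return 2 * distance_m * math.sin(half_angle)

distance_m = 2.70
spacing_m = speaker_spacing(distance_m, 60.0)  # equilateral triangle
distance_ft = distance_m / METERS_PER_FOOT

print(f"listener to loudspeaker: {distance_m:.2f} m = {distance_ft:.2f} ft")
print(f"loudspeaker spacing:     {spacing_m:.2f} m")
```

With a 60° angle the listener and the two loudspeakers form an equilateral triangle, so the spacing between the loudspeakers equals the listening distance.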
3.2. Participants
A total of 46 listeners with normal hearing (30 at Site A and 16 at Site B) participated in the test. Participants at Site A were selected from people not involved in the development or evaluation of the recording devices. Participants at Site B were students and professors in the Sound Recording program with experience in timbral ear training.
[Figure 2: block diagram with the labels DA-3000 (master), DA-3000 (slave), cascade out, cascade in, L/R outputs, switcher, remote, volume controller, and two loudspeakers, each marked 2.70 m (8.86 feet) from the listener, with an angle of 60°.]
Fig. 2: Signal path in listening test setup.
3.3. Test Design
A double-blind two-interval two-alternative forced choice method (pairwise A-B comparison) was used for the listening test. A listener was presented with a pair of stimuli and asked to listen carefully for similarities and dissimilarities while freely switching between them, and then to choose which of the two stimuli had the higher sensation or impression related to a given attribute. The eight attributes used in the test are:
• image width,
• image depth,
• image definition,
• timbral brightness,
• timbral richness,
• temporal separability,
• overall quality, and
• overall preference.
All attributes were presented in Japanese, the listeners’ native language. The choices were made on all eight attributes for one pair of stimuli before moving on to the next pair. The stimulus pairs were presented in a different random order for each participant. The two stimuli in a pair were also assigned randomly to the two playback systems.
In order to reduce the duration of the test, only 10 pairs were tested for each source material. The comparisons included in the test (also shown in Table 2) are:
• three pairs among the three recorders in PCM (192 kHz/24 bit): DA-3000, HS-2000, and MR-2000S,
• three pairs among the three recorders in DSD (2.8 MHz): DA-3000, DV-RA1000HD, and MR-2000S,
• a pair of the two recorders in DSD (5.6 MHz): DA-3000 and MR-2000S, and
• three pairs among the three formats PCM (192 kHz/24 bit), DSD (2.8 MHz), and DSD (5.6 MHz) on DA-3000.
Ten comparisons for each of the six programs resulted in 60 trials.
The test began after instructions and a training session were given. The listeners were allowed to take a break at any time. The test was conducted individually for each participant and took approximately 1.5 to 2 hours. The listening tests were conducted between August 5th and 30th, 2013, and the listeners were compensated for their participation.
4. RESULTS AND DISCUSSION
For the reasons discussed earlier, only the results of comparisons between the three audio formats are presented in this section.
Condition        Stimulus 1        Stimulus 2
PCM (192 kHz)    DA-3000           HS-2000
                 DA-3000           MR-2000S
                 HS-2000           MR-2000S
DSD (2.8 MHz)    DA-3000           DV-RA1000HD
                 DA-3000           MR-2000S
                 DV-RA1000HD       MR-2000S
DSD (5.6 MHz)    DA-3000           MR-2000S
DA-3000          DSD (5.6 MHz)     PCM (192 kHz)
                 DSD (2.8 MHz)     PCM (192 kHz)
                 DSD (5.6 MHz)     DSD (2.8 MHz)
Table 2: Ten comparison pairs in the test. For each comparison, six programs were presented, resulting in 60 trials.
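The trial structure summarized in Table 2 can be sketched in code. The Python example below is a minimal illustration, not the software used in the experiment: it builds the 10 comparison pairs for each of the six programs, randomly assigns the two stimuli of each pair to the two playback systems, and shuffles the trial order for one participant. The paper does not state whether the order was randomized across all 60 trials or within each program; the sketch shuffles all 60. All names, and the use of a seeded random generator, are assumptions made for illustration.

```python
import itertools
import random

PROGRAMS = ["drums solo", "triangle solo", "speech", "vocal solo",
            "jazz trio", "classical piano"]

PCM, DSD2, DSD5 = "PCM (192 kHz/24 bit)", "DSD (2.8 MHz)", "DSD (5.6 MHz)"

# Recorder models available for each format (Table 1).
RECORDERS = {
    PCM: ["DA-3000", "HS-2000", "MR-2000S"],
    DSD2: ["DA-3000", "DV-RA1000HD", "MR-2000S"],
    DSD5: ["DA-3000", "MR-2000S"],
}

def comparison_pairs():
    """Return the 10 comparisons as pairs of (recorder, format) stimuli."""
    pairs = []
    # Recorder-versus-recorder comparisons within each format: 3 + 3 + 1 pairs.
    for fmt, recorders in RECORDERS.items():
        for a, b in itertools.combinations(recorders, 2):
            pairs.append(((a, fmt), (b, fmt)))
    # Format-versus-format comparisons on the DA-3000: 3 pairs.
    for fmt_a, fmt_b in itertools.combinations([PCM, DSD2, DSD5], 2):
        pairs.append((("DA-3000", fmt_a), ("DA-3000", fmt_b)))
    return pairs

def trials_for_participant(seed):
    """Build one participant's randomized trial list."""
    rng = random.Random(seed)  # hypothetical: one seed per participant
    trials = []
    for program in PROGRAMS:
        for first, second in comparison_pairs():
            # Randomly decide which stimulus goes to which playback system.
            system_a, system_b = rng.sample([first, second], 2)
            trials.append({"program": program, "A": system_a, "B": system_b})
    rng.shuffle(trials)  # different presentation order for each participant
    return trials

trials = trials_for_participant(seed=1)
print(len(comparison_pairs()), "pairs,", len(trials), "trials")
```

Randomizing both the trial order and the assignment to playback systems keeps either playback unit from being systematically associated with one format or recorder.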
No noticeable differences were found in the response data between the two sites; therefore, the data from the two sites were pooled.
A binomial test was used to analyze the test results. The p-values from the binomial test for each combination of source, comparison pair, and attribute are shown in Table 3. Each p-value indicates how likely the observed responses would be if the left-hand format in the “comparison” column had the same level of sensation or impression on a given attribute as the right-hand format in the same row. The smaller the p-value, the more statistically significant the evidence that the left-hand format has a higher level of sensation or impression on the given attribute. For example, the Drums stimulus in DSD (5.6 MHz) has p = .001 when compared against PCM (192 kHz/24 bit) in overall preference. This suggests that the DSD version was statistically significantly preferred over the PCM version. The “combined” rows show the results of the binomial test with all sources combined. In the following discussion, a statistical significance level of α = .01 is used. Results significant at this level are indicated with two or three asterisks in Table 3.
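To make the analysis concrete, the following Python sketch computes a one-sided binomial p-value from a count of choices. It is an illustration under assumptions, not the authors' analysis script. It assumes a null hypothesis that either format is chosen with probability 0.5 and a one-sided alternative that the left-hand format is chosen more often, which is consistent with p-values close to 1 appearing in Table 3. The counts used are hypothetical, since the paper reports only p-values.

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided binomial test: probability of observing at least `successes`
    choices of the left-hand format in `trials` trials if each choice is
    made with probability `p_null`."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

N_LISTENERS = 46  # number of participants reported in Section 3.2

# Hypothetical counts of listeners choosing the left-hand format.
for chosen in (24, 28, 32, 36):
    p = binomial_p_value(chosen, N_LISTENERS)
    verdict = "significant" if p < 0.01 else "not significant"
    print(f"{chosen}/{N_LISTENERS}: p = {p:.3f} ({verdict} at alpha = .01)")
```

Under these assumptions, a bare majority of 24 out of 46 gives a p-value of about .44, and at least 32 choices of the same format are needed to fall below α = .01.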
The spatial attributes (width, depth, and definition) were not significantly different for stimuli with monophonic content (Triangle and Speech). Comparisons on these attributes were statistically significant for the Vocal and Jazz Trio stimuli, and a subset of the attributes were found to be significantly different for the Drums and Piano stimuli. This result is to be expected, since monaural stimuli are unlikely to reveal spatial differences between the formats.
For the timbral attributes, richness showed significant differences for all comparisons between DSD and PCM, but brightness showed only one significant difference, between DSD (5.6 MHz) and PCM for the Drums stimulus. Although the authors expected that brightness could be used to discriminate between the formats, the opposite result was obtained. “Sharpness,” an attribute synonymous with brightness, is related to the spectral centroid with a weight on high frequencies [7, 8]. The results suggest that the participants were not able to hear the spectral differences in the high-frequency region. On the other hand, although very subtle, the difference between the frequency response curves of DSD and PCM is larger below 31.5 Hz than in the high-frequency range above 16 kHz, disregarding the spectral noise above 32 kHz in DSD (2.8 MHz) (Figure 1). The participants may have relied on the low-frequency content to discriminate between the formats. This interpretation is supported by the Triangle stimulus, which has less low-frequency content and did not yield highly significant results in the comparison.
The attribute temporal separability was found to be not significant for any of the stimuli.
Overall quality and overall preference showed a similar tendency: participants chose DSD (5.6 MHz) more often than PCM (192 kHz/24 bit) for the Drums, Speech, Vocal, and Jazz Trio stimuli. Recall that the participants were asked to choose which of the two stimuli had the higher sensation or impression for a given attribute. Therefore, for most of the stimuli, DSD was judged to have higher quality than PCM and was preferred over it.
There were no significant differences between DSD (2.8 MHz) and DSD (5.6 MHz) for any attribute in any of the source materials at the α = .01 level.
The combined results show the outcome of the binomial test with the response data from all six stimuli pooled. Statistically significant differences between PCM and DSD are seen in all attributes, for both 2.8 MHz and 5.6 MHz. On the other hand, no significant differences were found between the 2.8 MHz and 5.6 MHz versions of DSD.
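One reason a pooled analysis can reach significance where individual stimuli do not is the larger number of responses. The sketch below illustrates this with hypothetical counts, since the paper does not report raw counts. The same proportion of choices, about 61%, is tested once with 46 responses and once with 6 × 46 = 276 pooled responses, using a one-sided binomial test against a chance level of 0.5. The function and the counts are assumptions for illustration only.

```python
from math import comb

def one_sided_binomial_p(successes: int, trials: int) -> float:
    """P(X >= successes) for X ~ Binomial(trials, 0.5), computed exactly
    with integer arithmetic before the final division."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2**trials

# Hypothetical: 28 of 46 listeners choose the left-hand format for one
# stimulus, and the same proportion holds for each of the six stimuli.
single_n, single_k = 46, 28
pooled_n, pooled_k = 6 * single_n, 6 * single_k

p_single = one_sided_binomial_p(single_k, single_n)
p_pooled = one_sided_binomial_p(pooled_k, pooled_n)

print(f"single stimulus: {single_k}/{single_n}, p = {p_single:.4f}")
print(f"pooled stimuli:  {pooled_k}/{pooled_n}, p = {p_pooled:.6f}")
```

In this hypothetical case the single-stimulus result is not significant at α = .01 while the pooled result is, even though the proportion of choices is identical.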
5. CONCLUSION
In order to evaluate the differences in subjective impression between sounds recorded using high-resolution audio formats, three audio formats recorded with multiple studio-quality audio recorders were evaluated in a double-blind A-B comparison listening test. The three formats are PCM (192 kHz/24 bits), DSD (2.8 MHz), and DSD (5.6 MHz). They were chosen because they are some of the highest resolutions currently available in most consumer or professional audio recorders.
The three formats were compared by 46 participants on six sound programs and eight attributes. The binomial test applied to the data from the pairwise comparison experiment revealed statistically significant differences between PCM and DSD but not between the two sampling frequencies (2.8 MHz and 5.6 MHz) of DSD.
Although there were stimuli (such as the monaural sounds Triangle and Speech) and attributes (such as brightness and temporal separability) that were not useful for discriminating between the formats, stimuli having broad spectra and clear temporal transients (such as Vocal, Jazz Trio, and Piano) and attributes such as spatial width, spatial depth, and timbral richness could be used to discriminate between DSD and PCM. Overall quality and preference showed a similar tendency in favor of DSD (5.6 MHz) over PCM (192 kHz/24 bit).
The authors were careful in preparing the stimuli and in conducting the experiment. Nevertheless, which physical aspects the participants were listening to when discriminating between the formats is still not fully understood. It is our hope that this presentation contributes to the understanding of the qualities of high-resolution audio formats.
6. REFERENCES
[1] E. Brad Meyer and David R. Moran. Audibility of a CD-standard A/D/A loop inserted into high-resolution audio playback. Journal of the Audio Engineering Society, 55(9):775–779, September 2007.
[2] Wieslaw Woszczyk, Jan Engel, John Usher, Ronald Aarts, and Derk Reefman. Which of the two digital audio systems best matches the quality of the analog system? In Proceedings of AES 31st International Conference, London, UK, June 2007. Audio Engineering Society.
[3] Dominik Blech and Min-Chi Yang. DVD-Audio versus SACD: Perceptual discrimination of digital audio coding formats. In Proceedings of 116th Convention, Berlin, Germany, May 2004. Audio Engineering Society.
[4] Nobuharu Aoshima. Computer-generated pulse signal applied for sound measurement. Journal of the Acoustical Society of America, 69(5):1484–1488, May 1981.
[5] International Telecommunication Union. Rec. ITU-R BS.1116-1: Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems, October 1997.
[6] International Telecommunication Union. Rec. ITU-R BS.775-3: Multichannel stereophonic sound system with and without accompanying picture, August 2012.
[7] G. von Bismarck. Timbre of steady sounds: A factorial investigation of its verbal attributes. Acustica, 30:146–159, 1974.
[8] G. von Bismarck. Sharpness as an attribute of the timbre of steady sounds. Acustica, 30:159–172, 1974.
[Figure 3: six panels (Drums Solo, Vocal Solo (female), Triangle Solo, Jazz Trio, Speech (male), Classical Piano), each plotting Power (dB), 0 to 90, versus Frequency (Hz), 16 Hz to 32 kHz.]
Fig. 3: Frequency responses of six stimuli used in the listening test. Power on vertical axis is not in a physical
scale, but relative levels of six stimuli are preserved. Plots were generated from PCM (192 kHz/24 bit) version
of the stimuli.
[Figure 4: six panels (Drums Solo, Vocal Solo (female), Triangle Solo, Jazz Trio, Speech (male), Classical Piano), each plotting Amplitude, −1 to 1, versus Time (sec), 0 to 16.]
Fig. 4: Amplitude plots of six stimuli used in the listening test. Plots were generated from PCM (192 kHz/24 bit) version of the stimuli.
                          Spatial                                Timbral                   Temporal     Overall
Source    Comparison      Width        Depth        Definition   Richness     Brightness   Separability Quality      Preference
Drums     DSD5 vs PCM     0.001 ***    0.024 *      0.005 **     0.226        0.005 **     0.087 .      0.000 ***    0.001 ***
          DSD2 vs PCM     0.011 *      0.024 *      0.024 *      0.002 **     0.024 *      0.226        0.005 **     0.011 *
          DSD5 vs DSD2    0.146        0.024 *      0.146        0.011 *      0.146        0.011 *      0.048 *      0.048 *
Triangle  DSD5 vs PCM     0.013 *      0.441        0.151        0.052 .      0.027 *      0.151        0.231        0.231
          DSD2 vs PCM     0.092 .      0.092 .      0.441        0.006 **     0.151        0.092 .      0.092 .      0.013 *
          DSD5 vs DSD2    0.231        0.671        0.559        0.987        0.231        0.329        0.052 .      0.441
Speech    DSD5 vs PCM     0.092 .      0.006 **     0.027 *      0.151        0.151        0.052 .      0.027 ***    0.006 **
          DSD2 vs PCM     0.231        0.001 ***    0.908        0.013 *      0.027 *      0.151        0.013 *      0.027 *
          DSD5 vs DSD2    0.231        0.769        0.849        0.671        0.151        0.151        0.441        0.441
Vocal     DSD5 vs PCM     0.001 ***    0.001 ***    0.002 **     0.000 ***    0.231        0.092 .      0.000 ***    0.000 ***
          DSD2 vs PCM     0.000 ***    0.002 **     0.013 *      0.000 ***    0.151        0.441        0.027 *      0.013 *
          DSD5 vs DSD2    0.027 *      0.671        0.441        0.849        0.441        0.329        0.329        0.441
Trio      DSD5 vs PCM     0.000 ***    0.000 ***    0.006 **     0.000 ***    0.151        0.441        0.001 ***    0.002 **
          DSD2 vs PCM     0.006 **     0.092 .      0.151        0.001 ***    0.052 .      0.231        0.092 .      0.052 .
          DSD5 vs DSD2    0.994        0.973        0.908        0.948        0.671        0.769        0.849        0.908
Piano     DSD5 vs PCM     0.001 ***    0.002 **     0.001 ***    0.329        0.329        0.151        0.052 .      0.092 .
          DSD2 vs PCM     0.000 ***    0.092 .      0.000 ***    0.151        0.092 .      0.151        0.092 .      0.027 *
          DSD5 vs DSD2    0.329        0.092 .      0.559        0.329        0.908        0.973        0.769        0.559
Combined  DSD5 vs PCM     0.000 ***    0.000 **     0.000 ***    0.000 ***    0.000 ***    0.003 **     0.000 ***    0.000 ***
          DSD2 vs PCM     0.000 ***    0.000 ***    0.000 ***    0.000 ***    0.007 **     0.010 **     0.000 ***    0.000 ***
          DSD5 vs DSD2    0.034 *      0.244        0.475        0.858        0.172        0.093 .      0.058 .      0.142
Significance codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Table 3: p-values from the binomial test for each combination of source, comparison pair, and attribute. Each p-value indicates how likely the observed responses would be if the left-hand format in the “Comparison” column (e.g., DSD5) had the same level of sensation on a given attribute as the right-hand format (e.g., PCM) in the same row. The smaller the p-value, the more statistically significant the evidence that the left-hand format has a higher level of sensation on the given attribute. The symbols “DSD5,” “DSD2,” and “PCM” denote DSD (5.6 MHz), DSD (2.8 MHz), and PCM (192 kHz/24 bit), respectively. The “Combined” rows show the results of the binomial test with all sources combined.