Liaison Statement To: ITU SG 16 Q8, 9, 10/16 For Information

Liaison Statement
To: ITU SG 16 Q8, 9, 10/16
For Information
Source: IETF CODEC Working Group, Real-Time Applications and Infrastructure Area
(RAI)
Date: 9 November 2011
Speech and Audio Coding Standardization
The IETF codec working group would like to thank the ITU for their liaison statement on 25 August 2011,
providing comments on the Opus codec. We have taken the comments provided in that document and
addressed many of them in the most recent revision of the Opus specification, which you can find at
(http://datatracker.ietf.org/doc/draft-ietf-codec-opus/). A second working group last call was issued
on 31 October, completing on 19 November, coincident with the conclusion of the Taipei IETF meeting.
Going through each of the comments in the liaison, here is how they have been addressed:
SG16 experts are unsure about the maturity level of the specification and wonder
when a final standard will be delivered. It is stated that "the design team believed
the codec was complete by June 2011, consequently, the codec group issued a
WGLC for the codec on July 8, 2011". However, since that date, several patches
and bug fixes have been sent on the IETF reflector, which suggests that the last-call agreement on the codec was based upon an unstable version.
In the IETF process, the purpose of the working group last call is to solicit additional comments, which
often result in further modifications of the specification. The last call is issued when the chairs believe
that no substantive issues remain, and it serves to test that assumption. As such, it is quite appropriate
and reasonable for document authors to make minor changes, as well as more substantive ones based
on comments.
Misalignments between the specification text and the C-code implementation
have also been noted: some algorithmic features performed in the C-code
implementation, e.g., warped LPC, are not described in the text, whilst some
algorithmic features described in the specification text are not implemented in
the source code, e.g., the switching between SILK and CELT at speech/music and
music/speech frame transitions.
Thank you for pointing out these errors. Warped LPC is mentioned at a high level on page 135 of the
specification. The details are specified in the code, which is the normative specification of the algorithm.
As for switching between SILK and CELT, the encoder will switch automatically, but not based on
music/speech detection. Such a thing could be implemented, but since it is encoder-side and not
normative, it does not need to be part of the reference implementation. The group discussed this and
agreed that it does not make sense to include it within the specification.
The "readme" file is not in agreement with the "help" output of the executable
command line (probably the readme has been written for an older software
version?);
This has been fixed.
The C-code still contains some "TODO" comments;
All the todos that needed to be done have been done; the others have been changed to NOTE
comments or removed.
Parts of the C-code seems to be either unreachable or remain unoptimized: We
believe that a significant amount of work still needs to be done to derive an
efficient implementation without useless additional complexity;
Actually, the code has been run and tested across many different platforms, and the latest version
includes test vectors (by reference) which the group believes test all aspects of the decoder.
This is also a reference implementation and does not, by design, include platform-specific assembly or
other optimizations, which are out of scope for this work.
The portability of the current version is rather limited. Speech and audio coding
standards are expected to have a wide portability so that they can be used in a
wide range of environments. The OPUS codec software seems to have been
natively developed for Linux (or Cygwin) and does not seem to be easily portable
to other platforms. For instance, it cannot be compiled directly on another
platform with a different compiler such as DOS/Microsoft Visual Studio and
building a Microsoft Visual Studio project will require various modifications to the
C-code;
The autoconf and MSVC projects do make things easier on a broader set of systems, but they are about
half the size of the whole codec, so they are not really suitable for inclusion in the draft. They are,
however, available in the SCM repository linked from http://opus-codec.org/. The web interface there
will build tarfile snapshots from the repository.
Testing has actually been done on:
Linux (x86, x86_64, IA64, PPC, ARMv7) (GCC 4.7, 4.5, 3.x depending on the platform)
Linux with LLVM compiler (x86, x86_64)
NetBSD (x86)
FreeBSD (x86)
Solaris 10 (UltraSPARC)
Win32 via MinGW
Win32 via LCC-Win32
Win32 via OpenWatcom
Dos32 via OpenWatcom
IBM S/390
VAX (MicroVAX 3900, via SIMH, really; it's quite slow)
Test vectors to check the compliance with the OPUS standard are missing: Speech
& audio coding standards should have a minimum set of Test Vectors to check
whether the generated executable works properly and any implementation
complies with the expected standardized format;
Agreed. The latest version includes test vectors.
The auxiliary functionalities required for VoIP, e.g. time shortening/stretching,
are not provided together with the codec. An important justification for the
formalization of the IETF Codec WG was that these functionalities were stated to
be very crucial for VoIP quality and are not provided in the codecs from other
SDOs.
This was discussed as part of the working group last call. Consensus from the group is that these kinds of
algorithms are non-normative and do not need to be included as part of the specification itself. Rather,
the decoder includes control parameters which allow a jitter buffer implementation to do this. A pointer
to a jitter buffer implementation which does such warping (the Google webRTC code) was included as
an informative reference.
The understanding of SG16 experts was that the primary objective of the IETF
Codec WG was to develop a codec which is royalty-free and easily distributable,
as given in guidelines (http://datatracker.ietf.org/doc/draft-ietf-codec-guidelines/), and this was the main motivation behind using royalty-free codecs
to define the quality requirement references, as given in Section 5.1 of codec
requirements (http://datatracker.ietf.org/doc/draft-ietf-codec-requirements/). It
is unfortunate that this objective seems to not have been achieved. We believe
that the choice of the codecs for quality requirement references were not
appropriate and have subsequently been shown to be somewhat misguiding.
These requirements should have been set with regard to standardized codecs
based on their technical merits rather than their royalty status.
The IETF itself cannot make decisions on the validity of patent claims. This is for the judgment of the
members of the working group to make on their own. The group will decide, as part of its final working
group last call, whether participants believe the document to be ready for publication based on our
goals for the working group. Our goals and objectives for the work remain the same.
According to test results provided in another IETF deliverable referred in your LS
(http://datatracker.ietf.org/doc/draft-valin-codec-results/), OPUS appears to
have some promising quality. Yet, this deliverable does not include any formal test
results based on a test plan designed with appropriate standardized testing
methodologies. Moreover, it is a compilation of various tests conducted for
different purposes using older versions of the codec. Therefore, it is difficult to
assess the quality of the final version of OPUS codec which enters WGLC.
The testing document has now been accepted as a working group item
(http://datatracker.ietf.org/doc/draft-ietf-codec-results/). The group has discussed the disposition of
the older test results, and has agreed to move them to an appendix. The document now includes some
testing done since WGLC, and will grow to include additional testing that gets performed after issuance
of the final codec specification.