TR41.3.5/99-05-07

Standard Method for Measuring Transmission Performance of Hands-Free Telephone Sets

P1329 Draft 22 April 1999

Copyright 1999 by the Institute of Electrical and Electronics Engineers, Inc.
345 East 47th Street, New York, NY 10017 USA
All rights reserved.

This is an unapproved draft of a proposed IEEE standard, subject to change. Permission is hereby granted for IEEE Standards Committee participants to reproduce this document for purposes of IEEE standardization activities. Permission is also granted for member bodies and technical committees of ISO and IEC to reproduce this document for purposes of developing a national position. Other entities seeking permission to reproduce portions of this document for these or other uses must contact the IEEE Standards Department for the appropriate license. Use of information contained in this unapproved draft is at your own risk.

IEEE Standards Department
Copyright and Permissions
445 Hoes Lane, PO Box 1331
Piscataway, NJ 08855-1331 USA

Copyright 1999 IEEE. All rights reserved. This is an unapproved IEEE Standards Draft, subject to change.

STANDARD METHOD FOR MEASURING TRANSMISSION PERFORMANCE OF HANDS-FREE TELEPHONE SETS

IEEE Standards documents are developed within the Technical Committees of the IEEE Societies and the Standards Coordinating Committees of the IEEE Standards Board. Members of the committees serve voluntarily and without compensation. They are not necessarily members of the Institute. The standards developed within IEEE represent a consensus of the broad expertise on the subject within the Institute as well as those activities outside of IEEE that have expressed an interest in participating in the development of the standard.

Use of an IEEE Standard is wholly voluntary. The existence of an IEEE Standard does not imply that there are no other ways to produce, test, measure, purchase, market, or provide other goods and services related to the scope of the IEEE Standard.
Furthermore, the viewpoint expressed at the time a standard is approved and issued is subject to change brought about through developments in the state of the art and comments received from users of the standard. Every IEEE Standard is subjected to review at least once every five years for revision or reaffirmation. When a document is more than five years old and has not been reaffirmed, it is reasonable to conclude that its contents, although still of some value, do not wholly reflect the present state of the art. Users are cautioned to check to determine that they have the latest edition of any IEEE Standard. Comments for revision of IEEE Standards are welcome from any interested party, regardless of membership affiliation with IEEE. Suggestions for changes in documents should be in the form of a proposed change of text, together with the appropriate supporting comments. Interpretations: Occasionally questions may arise regarding the meaning of portions of standards as they relate to specific applications. When need for interpretation is brought to the attention of IEEE, the Institute will initiate action to prepare appropriate responses. Since IEEE Standards represent a consensus of all concerned interests, it is important to ensure that any interpretation has also received the concurrence of a balance of interests. For this reason IEEE and the members of its technical committees are not able to provide an instant response to interpretation requests except in those cases where the matter has previously received formal consideration. Comments on standards and requests for interpretations should be addressed to: Secretary, IEEE Standards Board 445 Hoes Lane P.O. Box 1331 Piscataway, NJ 08855-1331 USA IEEE Standards documents are adopted by the Institute of Electrical and Electronics Engineers without regard to whether their adoption may involve patents on articles, materials, or processes. 
Such adoption does not assume any liability to any patent owner, nor does it assume any obligation whatever to parties adopting the standards documents.

INTRODUCTION

(This introduction is not a part of IEEE Standard P1329-199x, IEEE Standard Methods for Measuring Transmission Performance of Hands-Free Telephone Sets.)

This standard has been developed in response to a widely expressed need by the telecommunications industry for a standard, comprehensive method for testing the transmission performance of hands-free telephone sets. This standard is also in agreement with the ITU-T (formerly CCITT) test arrangements and calibration procedures.

The IEEE will maintain this standard current with the state of the technology. Comments on this standard and suggestions for additional material that should be included are invited. Comments should be addressed to: Secretary, IEEE Standards Board, The Institute of Electrical and Electronics Engineers, Inc., 445 Hoes Lane, Piscataway, NJ 08855.

Work on this new standard began in 1992. The standard was prepared by the Subcommittee on Telephone Instrument Testing of the Transmission and Access Systems Committee of the IEEE Communications Society (formerly the IEEE Communications Technology Group).

At the time this standard was approved, the members of the Working Group of the Subcommittee on Telephone Instrument Testing were as follows:

John Bareham, Roger Britt, Chandru Butani, Cliff Chamney, Paul Coverdale, Steve Graham, James Gurnavage, Roger Gutzwiller, Glenn Hess, Roger Hunt, Frederick M. Kruger, Ron Magnuson, Stephen Rittmueller, Terry Spencer, Joseph Sternalio, Christopher Struck, Stephen Whitesell, Robert Young

At the time this standard was approved, the members of the Subcommittee on Telephone Instrument Testing were as follows:

John Bareham, Chair
Glenn Hess, Vice Chair
Steve Graham, Secretary

Roger Britt, Chandru Butani, Cliff Chamney, Paul Coverdale, James Gurnavage, Roger Gutzwiller, Frederick M. Kruger, Ron Magnuson, Stephen Rittmueller, Terry Spencer, Christopher Struck, Stephen Whitesell, Robert Young

When the IEEE Standards Board approved this standard on ... it had the following membership:

TABLE OF CONTENTS

1. Overview
1.1. Scope
1.2. Purpose
1.3. Contents of Standard
1.4. How To Use This Standard
2. References
3. Definitions
4. Acronyms and Abbreviations
5. Test Methods
5.1 General
5.2 Fast Fourier Transform (FFT) and Cross Spectrum Analysis
5.2.1 Dual-Channel FFT
5.2.2 Single-Channel FFT
5.2.3 Maximum Length Sequence (MLS) Analysis
5.3 Real-Time Filter Analysis (RTA)
5.3.1 Dual-Channel Real-Time Filter Analysis
5.3.2 Single-Channel Real-Time Filter Analysis
5.4 Sine-Based Analysis
5.4.1 Discrete Tone (Stepped Sine)
5.4.2 Swept Sine
5.4.3 Time Delay Spectrometry (TDS)
5.5 Free Field Techniques
5.6 Method Comparative Summary
Signal Type
6. Test Signals
6.1 General
6.2 Classifications
6.3 Modulation Types
6.3.1 Square Wave Modulation
6.3.2 Sine Wave Modulation
6.3.3 Pseudo-Random Modulation
6.4 Deterministic Signals
6.4.1 Sine Wave
6.4.2 Pseudo-Random
6.5 Random Signals
6.5.1 White Noise
6.5.2 Pink Noise
6.6 Speech-Like Signals
6.6.1 Simulated Speech
6.6.1.1 P.50 Artificial Voice
6.6.1.2 P.59 Artificial Conversational Speech
6.6.1.3 Simulated Speech Generator (SSG)
6.6.2 Synthesized Speech
6.6.3 Real Speech
6.7 Compound Signals
6.7.1 Sequential Presentation
6.7.1.1 Composite Source Signal
6.7.2 Simultaneous Presentation
6.7.2.1 TDS Sweep with P.50 Noise Bursts
6.7.2.2 TDS Sweep with Artificial Voice
6.7.2.3 TDS Sweep with Real Speech
6.7.2.4 TDS Sweep with Random or Pseudorandom Noise
6.7.2.5 Pseudorandom Noise with P.50 Noise Bursts
6.7.2.6 Pseudorandom Noise with Artificial Voice
6.7.2.7 Pseudorandom Noise with Real Speech
6.7.2.8 Pseudorandom Noise with Random or Pseudorandom Noise
6.7.2.9 Sine Wave with Notched Real Speech
6.8 Test Signal Bandwidth
6.9 Signal Parameter Summary
Test Signal
6.10 Signal Comparative Summary
7. Test Equipment, Environment, and Impairments
7.1 General
7.2 Test Equipment
7.2.1 Measuring Microphones
7.2.2 Mouth Simulator
7.2.3 Head And Torso Simulator (HATS)
7.2.4 Standard Circuits for Transmission and Voice Switching Measurements
7.2.5 Standard Circuits for Acoustic Echo Canceller Measurements
7.3 Test Environment
7.3.1 Background Noise Level
7.3.2 Anechoic Chamber
7.3.3 Simulated Free-Field
7.3.4 Test Table
7.3.5 Test Room Characteristics
7.4 Impairments
7.4.1 Network Impairments
7.4.1.1 Test Loops
7.4.1.2 Loop Current
7.4.1.3 Termination Impedance
7.4.1.4 Network Noise
7.4.2 Acoustic Impairments
7.4.2.1 Nearby Reflecting Surfaces
7.4.2.2 Hoth Room Noise
7.4.2.3 Room Reverberation
7.5 Post Processing
8. Test Calibration
8.1 Measurement Bandwidth and Resolution
8.2 Send
8.2.1 Acoustic Test Spectrum
8.2.2 Acoustic Test Level
8.2.3 Mouth Simulator Calibration Procedure
8.2.4 HATS Calibration Procedure
8.3 Receive and Echo
8.3.1 Electrical Test Spectrum
8.3.2 Electrical Test Level
9. Transmission Measurements
9.1 General
9.1.1 Measurement Bandwidth and Resolution
9.1.2 Recommended Test Methods
9.2 Positioning of HFT and Test Transducers
9.2.1 Desktop Hands-Free
9.2.2 Open Listening
9.2.3 Non Desk-Top Hands-Free
9.2.4 HATS Positioning
9.3 Send
9.3.1 General
9.3.2 Special Instructions For Microphones
9.3.3 Frequency Response
9.3.4 Noise
9.3.5 Input-Output Linearity
9.3.6 Distortion
9.3.6.1 Test Signal
9.3.6.2 Suitability Test
9.3.6.3 Distortion Measurement
9.3.7 Loudness Rating Applications
9.3.8 Conversion of Measured Data to ISO R 10 Format
9.3.9 Mid-Band Average Send Sensitivity
9.3.10 Send Directionality
9.3.10.1 Measurement Procedure
9.3.10.2 Data Presentation
9.4 Receive
9.4.1 General
9.4.2 Frequency Response
9.4.3 HATS DRP to ERP Correction
9.4.4 Noise
9.4.5 Input-Output Linearity
9.4.6 Distortion
9.4.6.1 Test Signal
9.4.6.2 Suitability Test
9.4.6.3 Distortion Measurement
9.4.7 Loudness Rating Applications
9.4.7.1 Corrected Receive Loudness Rating Using a Free-Field Microphone
9.4.7.2 Corrected Receiver Loudness Rating When Using HATS
9.4.8 Mid-Band Average Receive Sensitivity
9.4.9 Receive Directionality
9.5 Digital Only
9.5.1 Echo Path Response
10. Voice Switching Measurements
10.1 General
10.2 Classification
10.3 Switching Parameters
10.4 Test Conditions
10.4.1 Signal Levels
10.4.2 Loop Lengths
10.4.3 Noise Levels
10.5 Test Parameters
10.5.1 Threshold Level
10.5.2 Build-Up Time
10.5.3 Hang-Over Time
10.5.4 Switching Time and Thresholds Between Two Active States
10.5.5 Take-Over (Break-Through) Time
11. Acoustic Echo Canceller Measurements
11.2 Test Signals
11.3 Test Conditions
11.4 Round Trip Echo Path Delay (EPD)
11.5 Echo Return Loss (ERLST) – Single Talk
11.5.1 Echo Return Loss, Temporally Weighted – Single Talk (ERLTST)
11.5.2 Echo Return Loss, Segmental – Single Talk (ERLSST)
11.5.3 Weighted Terminal Coupling Loss – Single Talk (TCLWST)
11.6 Convergence Time (Tc)
11.7 Echo Return Loss, Temporally Weighted – Double Talk (ERLTDT)
11.8 Send Speech Attenuation During Double Talk (ADT_S)
11.8.1 Send Speech Attenuation During Double Talk vs Time (ASTDT)
11.8.2 Send Speech Attenuation During Double Talk, Conversational Average (ASADT)
11.9 Receive Speech Attenuation During Double Talk (ARDT)
11.9.1 Receive Speech Attenuation During Double Talk vs Time (ARTDT)
11.9.2 Receive Speech Attenuation During Double Talk, Conversational Average (ARADT)
11.10 Send Speech Front End Clipping Time During Double Talk (TSFDT)
11.11 Receive Speech Front End Clipping Time During Double Talk (TRFDT)
Annex A Simulated Speech Generator
Annex B Composite Source Signal
Annex C ITU-T Recommendation P.50 Noise Bursts Over TDS Sweep
Annex D Hoth Room Noise
Annex E Useful Conversion Procedures
Annex F Recommended Test Bed
Annex G Detailed Test Methodology For Temporally Weighted ERL
Annex H ERLt Test Algorithm
Annex I Double Talk Testing
Annex J Acoustic Echo Path Tutorial
Annex K HFT Microphones
Annex L 1/3 Octave Passbands
Annex X DRP TO ERP Corrections For HATS Receive Measurements

Standard Method for Measuring Transmission Performance of Hands-Free Telephone Sets

1. Overview

Objective or subjective methods can be used to measure hands-free telephone (HFT) transmission performance. This standard discusses objective procedures utilizing a sound source, laboratory microphone, and test instruments to characterize transmission performance. Subjective procedures are particularly applicable for rating overall communication connections involving the real voice and real ear of human subjects.
Hands-free telephones can be evaluated by purely objective methods, provided the results agree with the desirable performance characteristics established by subjective testing.

Hands-free telephones present a complex problem for accurate and repeatable measurement of device characteristics. To meet these measurement goals, the reader must be aware of the non-linear and time-variant characteristics of an HFT. This standard describes test signals and corresponding analysis methods, which can be chosen to ensure that the HFT is in a well-defined operating state during testing. Execution of this standard provides a means of determining the operational characteristics of an HFT under conditions encountered during normal operation.

1.1. Scope.

This standard provides the techniques for objective measurement of electroacoustic and voice switching characteristics of analog and digital hands-free telephones. Because of the varied characteristics of HFTs and the environments in which they operate, not all of the test procedures in this standard are applicable to all HFTs. Application of the test procedures to atypical HFTs must be determined on an individual basis.

1.2. Purpose.

The purpose of this standard is to provide practical methods for making laboratory measurements of the transmission and voice switching characteristics of HFTs so that their performance may be evaluated on a standardized basis.

1.3. Contents of Standard.

This is a brief summary of the sections contained in the standard. The primary measurement procedures appear in the sectioned portion of the document. The attached annexes contain additional information or details of procedures referred to from within the relevant section.

Sections 2, 3, and 4 provide references, definitions, and acronyms that will be useful in executing the tests of this standard. These sections provide a background in the terminology used for HFT testing.
Section 5 details the most common analysis techniques used to make the electroacoustic and switching measurements. The section explains the advantages and disadvantages of each technique in relation to the stimulus signals chosen for testing.

Section 6 details the test signals used to place the HFT in a well-defined state for the measurements. This section explains the use of these signals to maintain a stable HFT state as well as provide a suitable measurement signal.

Section 7 details the test equipment and the test environment. The test equipment includes the interfaces, analyzers and transducers. The test environment includes the room characteristics and furniture used during testing.

Section 8 details the calibration procedure needed to ensure that the equipment is in a known state. Calibration of the electrical interfaces and acoustic transducers is explained.

Section 9 details the various send and receive transmission test procedures. This section includes procedures for positioning the HFT for testing and generating reports of the transmission characteristics.

Section 10 details the procedures and parameters involved in testing the voice switching characteristics of half-duplex and full-duplex HFTs. Various environmental conditions are explained in reference to the operation of a voice switched HFT.

Section 11 details the procedures and parameters unique to testing an HFT equipped with an acoustic echo canceller (AEC). Characteristics unique to AEC HFTs are presented. Also, environment and interface conditions specific to AECs are described.

1.4. How To Use This Standard. The following sequence is suggested when referencing the contents of this standard prior to setting up a test sequence:

1. Choose Acoustic Environment [section 7].
Electroacoustic measurements on hands-free telephones must be made in an environment with a low background noise level. In addition, either the test environment or the test method should minimize or eliminate the influence of reflections. Section 7.6 provides guidance in the selection and set-up of the test environment. Under certain circumstances, measurements may have to be made in a "real room environment," that is, in a room having acoustic characteristics similar to those expected to exist in a room in which the hands-free telephone may be used. Be sure to record for the test report the test room environmental conditions of temperature, humidity, and barometric pressure, in addition to the ambient acoustic environment (overall noise level and 1/3 octave band sound levels are recommended).

2. Choose test signal based on HFT behavior [section 6]. Selection of the correct test signal(s) is critical since different hands-free telephones may respond differently to many of the possible test signals. The choice will be a balance between one that correctly stimulates the detection circuits of the HFT, and one that is suitable for the specific measurement.

3. Choose analysis method (based on test signal) [section 5]. Each analysis technique has inherent advantages and disadvantages. A particular method may be better suited for use with certain stimulus signals. In fact, certain methods rely upon the use of a synchronized or otherwise unique signal. Please note that within a given test, different test signal types (e.g., continuous sine, pulsing, alternating tones, pink noise, tone burst, simulated speech, etc.) may be required.

4. Set up hardware, fixtures, etc. [section 7]. Different equipment will be required depending upon whether analog or digital sets are being tested and which test signals and analysis methods are chosen.
The first part of section 7 provides guidance in the selection of the test equipment required to generate and process the selected test signal(s).

5. Calibrate the system [section 8]. Follow the checkout and calibration procedures provided by the instrument manufacturer(s), or existing relevant standards, to verify proper operation and calibration of all equipment, as appropriate.

6. Perform transmission measurements [section 9]. Procedures are provided for measuring parameters affecting the send and receive performance of hands-free telephone sets. Procedures for measuring electrical characteristics also are provided in this section.

7. Perform voice switching measurements [sections 10 & 11]. Refer to section 10 to select the procedures and parameters to be used in testing the voice switching characteristics of hands-free telephones. The manner in which the HFT handles the transmission of signals in both directions can be classified into different groups. This section provides guidance in test set-up, test environment, test equipment and procedures. Section 11 provides guidance for the testing of the unique characteristics of hands-free telephones with acoustic echo cancellation (AEC) circuitry. The methods described will effectively evaluate AEC performance.

For each test, describe the test environment, position of HFT, test signal, analysis method, frequency range and resolution, and other relevant conditions.

2. References

This standard shall be used in conjunction with the following publications. When the following standards are superseded by an approved revision, the revision shall apply, but the impact on results should be determined.
[1] ANSI S1.1-1994, American National Standard Acoustical Terminology (Including Mechanical Shock and Vibration) (footnote 1)
[2] ANSI S1.4-1983, American National Standard Specification for Sound Level Meters
[3] ANSI S1.6-1984 (R1990), Preferred Frequencies, Frequency Levels, and Band Numbers for Acoustical Measurements
[4] ANSI S1.11-1986, Specification for Octave-Band and Fractional-Octave-Band Analog and Digital Filters
[5] ANSI S1.12-1967 (R1986), American National Standard Specifications for Laboratory Standard Microphones
[6] ANSI S3.36-1985, American National Standard Specification for a Manikin for Simulated In-Situ Airborne Acoustic Measurements
[7] ANSI/IEEE Standard 100-1988, IEEE Standard Dictionary of Electrical and Electronics Terms (footnote 2)

Footnote 1: ANSI and ISO publications are available from the Sales Department, American National Standards Institute, 11 West 42nd Street, 13th Floor, New York, NY 10036 USA. (Tel: 212-642-4900 Fax: 212-302-1289)

Footnote 2: IEEE publications are available from the Institute of Electrical and Electronics Engineers, Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, USA. (Tel: 908-981-0060 Fax: 908-981-9667)
[8] ANSI/IEEE Standard 269-1992, IEEE Standard Methods for Measuring Transmission Performance of Analog and Digital Telephone Sets
[9] ANSI/IEEE Standard 661-1992, IEEE Standard Method for Determining Objective Loudness Ratings of Telephone Connections
[10] ANSI/IEEE Standard 743-1984, IEEE Standard Methods and Equipment for Measuring the Transmission Characteristics of Analog Voice Frequency Circuits
[11] British Standard 6789, Apparatus with One or More Particular Functions for Connection to the British Telecommunications Public Switched Telephone Network, Part 2, Specification for Apparatus with Loudspeaking Facilities, 1984 [Need a reference, similar to footnote 2 & 3, for obtaining this publication.]
[12] European Telecommunications Standards Institute Draft pr1-ETS 300 245-3, Integrated Services Digital Network (ISDN); Technical Characteristics of Telephony Terminals, Part 3, PCM A-Law Loudspeaking and Handsfree Function, 1994 [Need a reference, similar to footnote 2 & 3, for obtaining this publication.]
[13] ISO 3: Preferred Numbers, Series of Preferred Numbers (footnote 2)
[14] ISO/DIS 266 Acoustics, Preferred Frequencies for Measurements (Revision of ISO 266:1975)
[15] ITU-T Recommendation G.122
[16] ITU-T Recommendation G.167
[17] ITU-T Recommendation P.34, Transmission Characteristics of Hands-Free Telephones, 1993 (footnote 3)
[18] ITU-T Recommendation P.50, Artificial Voices, Telephone Transmission Quality, 1989
[19] ITU-T Recommendation P.51, Artificial Mouths, 1993
[20] ITU-T Recommendation P.56, Objective Measurement of Active Speech Level, 1993
[21] ITU-T Recommendation P.57, Artificial Ears, 1993
[21] ITU-T Recommendation P.58, Head and Torso Simulator for Telephonometry, 1993
[22] ITU-T Recommendation P.59, Artificial Conversational Speech, 1993
[23] ITU-T Recommendation P.76, Determination of Loudness Ratings; Fundamental Principles, Telephone Transmission Quality, Geneva, 1989
[24] ITU-T Recommendation P.79, Calculation of Loudness Ratings for Telephone Sets, 1993
[25] ITU-T Recommendation P.340
[26] ITU-T Recommendation P.501

Footnote 3: ITU-T (formerly CCITT) publications are available from the ITU-T General Secretariat, International Telecommunications Union, Sales Section, Place des Nations, CH-1211, Geneve 20, Switzerland/Suisse. (Tel: +41-22-730-5285 Fax: +41-22-730-5194) ITU-T publications are also available in the United States from National Technical Information Service, Department of Commerce, 5285 Port Royal Road, Springfield, VA 22161. (Tel: 703-487-4650 Fax: 703-321-8547)

3. Definitions

These definitions apply specifically to measurements of the transmission performance of hands-free telephone sets, and may not be applicable to other disciplines. For definitions not covered, see ANSI S1.1-1994 [1] and ANSI/IEEE Standard 100-1988 [7].

3.1 acoustic echo canceller (AEC).
A circuit or algorithm designed to eliminate acoustic echoes and prevent howling due to acoustic feedback from loudspeaker to microphone.

3.2 acoustic input. The free field sound pressure level developed by a mouth simulator at the mouth reference point. Also see sound pressure level.

3.3 acoustic output. The sound pressure level developed at the measuring microphone. Also see sound pressure level.

3.4 attenuation range (aH). The difference in level in dB between maximum inserted switched loss and the full removal of that switched loss in a particular transmission direction.

3.5 automatic gain control (AGC). A circuit or algorithm that varies gain as a function of the input signal amplitude.

3.6 build-up time (TR). Time from the input signal going above the threshold level up to 50% of the complete removal of the insertion loss.

3.7 convergence time (Tc). The time required to reach within 3 dB of maximum echo return loss, or 25 dB loss, whichever occurs first.

3.8 double-talk (DT). Two talkers speaking simultaneously in opposite transmission directions.

3.9 echo path delay (EPD). The total delay of the echo path from the receive electrical test point to the send electrical test point, excluding any delay in the test equipment.

3.10 echo return loss (ERL). The loss in the echo path from the receive electrical test point to the send electrical test point. ERL can be weighted in time, frequency or both.

3.11 echo return loss, segmental double talk (ERLSTD). As per annex I with receive electrical test point and Sin simultaneously active. [This definition will need to be clarified without using the annex reference.]

3.12 echo return loss, segmental single talk (ERLSST). The echo return loss from the receive electrical test point to the send electrical test point as measured in section 11.6.2.2. The acoustic echo canceller is in normal operation and the mouth simulator is inactive. The algorithm features a fixed 32 ms frame size averaged over 320 ms.
[This definition will need to be clarified without using the section reference.]

3.13 echo return loss, temporally weighted double talk (ERLTDT). As per section 11.6.4.1 with the receive electrical test point and mouth simulator simultaneously active. [This definition will need to be clarified without using the section reference.]

3.14 echo return loss, temporally weighted single talk (ERLTST). The echo return loss from the receive electrical test point to the send electrical test point as measured in annex G. The acoustic echo canceller is in normal operation, and there is no signal coming from 50TP. The algorithm features a perceptually based non-fixed window size. [This definition will need to be clarified without using the annex reference.]

3.15 echo path response. The output at the send electrical test point due to an input at the receive electrical test point. It is a measure of acoustic, vibration, and electrical coupling from the receive circuit to the send circuit.

3.16 feed circuit. An arrangement for supplying DC power to a hands-free telephone set and an AC path between the hands-free telephone and a terminating circuit.

3.17 frequency response. Electrical, acoustic, or electroacoustic sensitivity (output/input) or gain as a function of frequency.

3.18 full duplex. An operating condition which allows simultaneous communication in both send and receive directions with 3 dB or less switched loss in either direction. Classified as Type 1 in ITU-T Recommendation P.340 [25].

3.19 half duplex. An operating condition which allows communication in either the send or the receive direction, with more than 20 dB switched loss in either direction. Classified as Type 3 in ITU-T Recommendation P.340.

3.20 hands-free reference point (HFRP).
The point on the reference axis of the mouth simulator, 50 cm in front of the lip plane.

3.21 hands-free telephone (HFT). A device for connection to a telephone network capable of two-way voice communication without close coupling to the user’s mouth or ear.

3.22 hands-free telephone test circuit. An assembly consisting of a hands-free telephone set(s) and interface(s) as may be required to realize simulated partial telephone connections.

3.23 hang-over time (TH). Time from the input signal going below the threshold level up to 50% of the complete insertion of the switched loss.

3.24 head and torso simulator (HATS). A device that accurately reproduces the sound transmission and pick-up characteristics of the median head and torso of adult humans. See ANSI S3.36-1985 [6], ITU-T Recommendation P.58 [22] (1993), and IEC 959 (1990).

3.25 interface. A device placed between the line output of a digital hands-free telephone set and the test equipment. The device performs at least one of the following functions: simulation of a normal network connection, control of the hands-free telephone, or access for the reference codec to the digital voice signal.

3.26 Loop. See recursive.

3.27 loudspeaking telephone. A telephone with hands-free receive but not hands-free send capability.

3.28 mouth reference point (MRP). The point on the reference axis of the mouth simulator, 25 mm in front of the lip plane.

3.29 mouth simulator. See ITU-T Recommendation P.51 [19], “Artificial Mouth.”

3.30 near-field test point (NFTP). The acoustical measurement point located 1 cm directly above the center of the HFT’s loudspeaker along the axis of the speaker.

3.31 noise discriminator/noise guard/noise monitor. A circuit or algorithm intended to discriminate between speech and noise.
It can affect switching, transmission, and/or noise performance.

3.32 non-linear devices. Non-linear processing elements, including AGC circuits, noise guards, etc.

3.33 open listening. A mode of telephone communication in which a handset is used in the normal position for send. The incoming signal is received simultaneously by the handset and loudspeaker.

3.34 partial duplex. An operating condition which allows simultaneous communication in both send and receive directions with 3 to 20 dB switched loss in either direction. Classified as Type 2 in ITU-T Recommendation P.340.

3.35 receive. The acoustic output of a hands-free telephone due to an electrical input.

3.36 receive attenuation during double talk (ARDT). Attenuation in the receive path, seen at 50TP, inserted during double talk. The send talker initiates the double talk.

3.37 receive electrical test point (RETP). The electrical measurement point of a battery feed circuit, reference codec, or wireless reference base station for signals applied to the hands-free telephone in the receive direction. For further discussions on electrical interfaces, see IEEE Standard 269-1992 [8]. For further discussions on wireless interfaces, see appropriate wireless standards.

3.38 Receive Loudness Rating Directionality (RLRD). Receive loudness rating versus angle around the HFT, normalized to the loudness rating at 50TP.

3.39 receive speech front end clipping during double talk (CRDT). The length of time that speech undergoes syllabic clipping, as seen at 50TP, just after the onset of double talk. The receive talker initiates the double talk.

3.40 Recommended Test Position (RTP). An acoustic test point other than 50TP that corresponds to the most appropriate user position for non-standard desktop and non-desktop applications. This may be specified by the HFT manufacturer.

3.41 Recursive. See Loop.

3.42 reference codec.
For the purposes of this standard, a well-defined analog-to-digital and digital-to-analog converter for testing digital telephones using analog test equipment. See IEEE Standard 269-1992 [8], section 5, for details on a reference codec.

3.43 reference volume control setting. The volume control position resulting in a nominal receive loudness rating.

3.44 send. The electrical output of a hands-free telephone due to an acoustic input.

3.45 send attenuation during double talk (ASDT). Attenuation in the send path, seen at the send electrical test point, inserted during double talk. The receive talker initiates the double talk.

3.46 send electrical test point (SETP). The electrical measurement point of a battery feed circuit, reference codec, or wireless reference base station for signals coming from the hands-free telephone in the send direction. For further discussions on electrical interfaces, see IEEE Standard 269-1992. For further discussions on wireless interfaces, see appropriate wireless standards.

3.47 send front-end syllabic clipping during double talk (CSDT). The length of time that speech undergoes syllabic clipping, as seen at the send electrical test point, just after the onset of double talk. The send talker initiates the double talk.

3.48 send noise level. The send noise level of an HFT is measured in units of dBm, psophometrically weighted (dBmp), according to the method described in ITU-T O.41 (10/94). See Annex E for details.

3.49 Send Loudness Rating Directionality (SLRD). Send loudness rating versus angle around the HFT, normalized to the loudness rating at 50TP.

3.50 single-talk (ST). One talker speaking while the opposite transmission direction is silent.

3.51 sound pressure level.
The sound pressure level, in decibels, of a sound is 20 times the logarithm to the base 10 of the ratio of the pressure of the sound to the reference pressure. For this standard, the reference pressure is normally 1 pascal (Pa), and sound pressure levels are expressed in dB re 1 Pa (dBPa). When a reference pressure of 20 µPa is used, the sound pressure level is expressed as dBSPL. Unless otherwise indicated, RMS values of pressure are used. Most telephony acoustic measurements are referenced to 1 Pa. However, measurements such as receive noise and room noise are generally referenced to 20 µPa. Note: 0 dBPa = 94 dBSPL, 0 dBSPL = 20 µPa, 1 Pa = 1 N/m^2. An A-weighted [2] sound pressure level in dB (dBSPL, A-weighted) is often abbreviated as dBA.

3.52 spectrum analyzer. An instrument that measures the power of a signal in multiple frequency bands. The frequency bands may be constant bandwidth (e.g., an FFT analyzer), or constant percentage bandwidth (e.g., a real-time filter analyzer).

3.53 switching time (TS). Time taken to switch from one transmission direction to the other in alternating single talk conversation.

3.54 take-over time (TT). Time taken to switch from one transmission direction to the other in double-talk conversation. The signal in the first direction is continuously applied while the interrupting signal is applied in the opposite direction. TT is measured from the application of the interrupting signal to 50% removal of loss in that direction.

3.55 threshold level (ITH). The minimum signal level necessary for removing insertion loss.

3.56 two-wire transmission. A transmission scheme where the send and receive signals are carried in one pair of wires.

3.57 weighted terminal coupling loss (TCLw). Long term, time averaged echo return loss, weighted in the frequency domain per ITU-T Recommendation G.122 [15].

3.58 50 cm test point (50TP). The acoustic test point 50 cm from the HFT.
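The level conventions in definition 3.51 can be checked numerically. The following is a minimal sketch, not part of the standard's procedures; the function names are illustrative assumptions, and only the two reference pressures (1 Pa and 20 µPa) come from the definition.

```python
import math

P_REF_DBPA = 1.0      # reference pressure for dBPa, in pascals (from 3.51)
P_REF_DBSPL = 20e-6   # reference pressure for dBSPL, in pascals (20 micropascals)


def pressure_to_dbpa(p_rms_pa):
    """Level of an RMS pressure in dB re 1 Pa."""
    return 20.0 * math.log10(p_rms_pa / P_REF_DBPA)


def pressure_to_dbspl(p_rms_pa):
    """Level of an RMS pressure in dB re 20 micropascals."""
    return 20.0 * math.log10(p_rms_pa / P_REF_DBSPL)


def dbpa_to_dbspl(level_dbpa):
    """Shift a dBPa level to dBSPL by the ratio of the two reference pressures."""
    return level_dbpa + 20.0 * math.log10(P_REF_DBPA / P_REF_DBSPL)


# The offset between the two scales; 3.51 rounds this to 94 dB.
offset_db = dbpa_to_dbspl(0.0)
print(round(offset_db, 2))
```

The computed offset is 20 log10(1 / 20e-6), which is why 0 dBPa is quoted as 94 dBSPL.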
4. Acronyms and Abbreviations

For frequency response, use the commonly accepted letter “H” as follows:

HS(f) = send frequency response (in dB V/Pa)
HR(f) = receive frequency response (in dB Pa/V)
HEP(f) = echo path frequency response (in dB V/V)

For spectra, use the letter “G.” This corresponds to common usage, especially in two-channel FFT analysis literature. The analysis bandwidth must be specified:

GERP(f) = spectrum at Ear Reference Point (in dBPa)
GHFRP(f) = spectrum at Hands-Free Reference Point (in dBPa)
G50TP(f) = spectrum at 50 cm Test Point (in dBPa)
GMRP(f) = spectrum at Mouth Reference Point (in dBPa)
GRETP(f) = spectrum at Receive Electrical Test Point (in dBV)
GSETP(f) = spectrum at Send Electrical Test Point (in dBV)

For levels measured over a wide band, with the bandwidth to be specified, use the letter “L.” This corresponds to common usage in sound level measurements, as specified in ANSI S1.1:

LERP = level at Ear Reference Point (in dBPa)
LHFRP = level at Hands-Free Reference Point (in dBPa)
L50TP = level at 50 cm Test Point (in dBPa)
LMRP = level at Mouth Reference Point (in dBPa)
LRETP = level at Receive Electrical Test Point (in dBV)
LSETP = level at Send Electrical Test Point (in dBV)

For specially calculated sensitivities:

SR = average receive sensitivity (in dB Pa/V)
SR0 = normalized receive sensitivity (in dB SPL/V at 1 meter)
Ss = average send sensitivity (in dB V/Pa)

5. Test Methods

5.1 General. Various analysis techniques are available for electroacoustic measurements. Each technique has inherent advantages and limitations. A particular method may be better suited for use with certain stimulus signals. Certain methods, in fact, rely upon the use of a synchronized or otherwise unique stimulus signal.
The following section describes the most common techniques and their application to measurements of loudspeaking and hands-free telephones.

5.2 Fast Fourier Transform (FFT) and Cross Spectrum Analysis. The Fourier Transform is a mathematical operation that decomposes a time signal into its complex frequency components. The Inverse Fourier Transform reverses the process, reconstructing the time signal from its Fourier components. By applying the FFT algorithm to a sampled time signal, a spectrum can be computed. This is a parallel analysis resulting in a narrow band (constant bandwidth) frequency spectrum. Low frequency resolution may be limited. Here, blocks of time data are analyzed. Care must be taken in the proper windowing of the data (e.g., Hanning, Flat-top, etc.), overlap processing, and the number of averages to ensure an accurate analysis. The number of spectral lines and the record length determine the frequency resolution. The frequency range and time resolution are inversely related. Because the data is discrete, the highest frequency that can be measured is determined by the sampling frequency. Some degree of data processing is usually available in both the time domain and in the frequency domain. An FFT analyzer may also have a zoom capability, for increased frequency resolution across a restricted bandwidth.

5.2.1 Dual-Channel FFT. A dual-channel FFT analyzer performs simultaneous measurements of the hands-free telephone input and output. This type of measurement is optimized for system analysis. Most FFT analyzers calculate the frequency response from the cross spectrum and either the input or output autospectrum. In this way, different response estimators can be used to minimize noise at the system input or output.
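One common response estimator of the kind described in 5.2.1 divides the averaged cross spectrum by the averaged input autospectrum (often called H1), which suppresses noise that is uncorrelated with the input and appears at the system output. The sketch below is an illustration under stated assumptions, not an analyzer implementation: the "system" is a hypothetical flat gain of 0.5 with added output noise, a naive DFT stands in for an FFT, and no windowing or overlap is applied because the simulated system has no memory.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform; adequate for short illustrative blocks."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def h1_estimate(input_blocks, output_blocks):
    """Averaged cross spectrum divided by averaged input autospectrum, per bin."""
    n = len(input_blocks[0])
    sxx = [0.0] * n   # accumulated input autospectrum
    sxy = [0j] * n    # accumulated cross spectrum, conj(X) * Y
    for x, y in zip(input_blocks, output_blocks):
        spec_x, spec_y = dft(x), dft(y)
        for k in range(n):
            sxx[k] += abs(spec_x[k]) ** 2
            sxy[k] += spec_x[k].conjugate() * spec_y[k]
    return [sxy[k] / sxx[k] for k in range(n)]


# Hypothetical device under test: gain 0.5 plus uncorrelated output noise.
random.seed(1)
GAIN = 0.5
BLOCK_LENGTH, BLOCK_COUNT = 32, 32
inputs, outputs = [], []
for _ in range(BLOCK_COUNT):
    x = [random.gauss(0.0, 1.0) for _ in range(BLOCK_LENGTH)]
    y = [GAIN * v + random.gauss(0.0, 0.05) for v in x]
    inputs.append(x)
    outputs.append(y)

h1 = h1_estimate(inputs, outputs)
h1_db = [20.0 * math.log10(abs(h)) for h in h1]
```

Increasing `BLOCK_COUNT` reduces the scatter of the estimate around the true gain, which is the averaging benefit the text refers to.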
The cross spectrum also enables other functions such as coherence, phase, group delay, coherent power and non-coherent power to be computed. Extensive data processing is normally available in both the time domain and in the frequency domain. It is possible to improve measurement S/N by averaging and delay compensation; however, this method is limited in measuring harmonic distortion.

5.2.2 Single-Channel FFT. Without cross spectrum capabilities, the system input and output must be measured separately. These response measurements require control of the excitation spectrum and/or a two-pass analysis. The measurement S/N due to noise at the system input or output is therefore not improved. Any post-processing features available will apply only to the directly measured spectra, not to the response function. This method is also limited in measuring harmonic distortion.

5.2.3 Maximum Length Sequence (MLS) Analysis. The MLS technique (footnote 4) employs a large (typically 16K), single-channel FFT and a well-defined pseudorandom pulse excitation. The length of the excitation signal is equal to the correlation length, eliminating leakage. The MLS excitation and analysis are inherently synchronized. The received response signal is cross-correlated with the MLS signal to obtain the time response. An FFT is then used to obtain the frequency response. This also enables computation of coherence, phase, group delay, coherent power and non-coherent power. Some non-linear analysis capabilities and post-processing are available. This method does allow improvements of measurement S/N.

5.3 Real-Time Filter Analysis (RTA). Real-time analysis is essentially a parallel filter bank, usually implemented digitally. This results in a constant percentage (logarithmic) frequency resolution. The analysis is carried out in parallel and the signal is processed continuously. The filters should be 1/12 or 1/24 octave and fulfill the ANSI S1.11-1986 [4] standard.
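To make the constant percentage bandwidth of fractional-octave filters concrete, the sketch below computes band edges and a simple set of band centers. It assumes plain base-2 spacing referenced to 1000 Hz; this is an illustrative simplification, and it does not reproduce the exact midband frequencies, nominal labels, or filter tolerances that ANSI S1.11 specifies. The function names are hypothetical.

```python
import math


def band_edges(center_hz, fraction):
    """Lower and upper edge of a 1/fraction-octave band around center_hz."""
    half_band = 2.0 ** (1.0 / (2.0 * fraction))
    return center_hz / half_band, center_hz * half_band


def band_centers(start_hz, stop_hz, fraction):
    """Centers spaced 1/fraction octave apart (base 2), referenced to 1000 Hz."""
    # Index of the first center at or above start_hz (small guard for rounding).
    k = math.ceil(fraction * math.log2(start_hz / 1000.0) - 1e-9)
    centers = []
    f = 1000.0 * 2.0 ** (k / fraction)
    while f <= stop_hz * (1.0 + 1e-9):
        centers.append(f)
        k += 1
        f = 1000.0 * 2.0 ** (k / fraction)
    return centers


lower_hz, upper_hz = band_edges(1000.0, 12)
centers_twelfth = band_centers(1000.0, 2000.0, 12)
# Relative bandwidth is the same for every band of a given fraction.
relative_bandwidth = (upper_hz - lower_hz) / 1000.0
```

For 1/12 octave bands the relative bandwidth is a little under 6 percent of the center frequency, regardless of where the band sits.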
The statistical accuracy of real-time measurements is usually determined by the averaging time or confidence level. This type of analysis is optimized for single-port acoustical measurements (i.e., no control of the system input).

5.3.1 Dual-Channel Real-Time Filter Analysis. Two channels enable simultaneous measurements of the system input and output and direct computation of the frequency response (output/input). This method does provide limited harmonic distortion capability, and some direct post-processing of the data.

5.3.2 Single-Channel Real-Time Filter Analysis. A single-channel real-time analyzer requires separate measurements of the system input and output. Response measurements will require control of the excitation spectrum and/or a two-pass analysis. This method requires the test system to be time invariant, and is limited in measuring harmonic distortion. Some direct post-processing of the data may be available.

Footnote 4: D. D. Rife and J. Vanderkooy, “Transfer-Function Measurement with Maximum Length Sequences”, J. Audio Eng. Soc., Vol. 37, No. 6, (June 1989).

5.4 Sine-Based Analysis. Sinusoidal excitation provides a high measurement S/N ratio and high degree of frequency selectivity. The analysis is performed serially using either a quadrature or RMS detector. This often includes a tracking filter for noise suppression and selective measurements of distortion components. The quadrature detector multiplies the response signal by a synchronized (and appropriately delayed) sine and cosine signal. This enables measurement of the complex, steady-state frequency response (i.e., magnitude and phase, real and imaginary parts). Complex averaging algorithms can be employed to improve the measurement S/N ratio.
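The quadrature detection described in 5.4 can be sketched in a few lines. This is a minimal illustration, assuming a sampled response, a cosine phase reference, and an averaging span that covers a whole number of cycles; the function name, sample rate, and test tones are hypothetical and not taken from the standard.

```python
import math


def quadrature_detect(response, freq_hz, sample_rate_hz):
    """Return (magnitude, phase_rad) of the component of response at freq_hz.

    Phase is relative to a cosine reference starting at the first sample.
    """
    n = len(response)
    in_phase_sum = 0.0
    quadrature_sum = 0.0
    for i, value in enumerate(response):
        angle = 2.0 * math.pi * freq_hz * i / sample_rate_hz
        in_phase_sum += value * math.cos(angle)
        quadrature_sum += value * math.sin(angle)
    # Averaging over whole cycles acts as the low-pass filter; the factor 2
    # restores the peak amplitude of the detected component.
    in_phase = 2.0 * in_phase_sum / n
    quadrature = 2.0 * quadrature_sum / n
    return math.hypot(in_phase, quadrature), math.atan2(-quadrature, in_phase)


# Hypothetical response: a 1 kHz tone (amplitude 0.25, phase 0.6 rad) plus an
# unrelated 1.5 kHz component that the detector should reject.
SAMPLE_RATE_HZ = 8000.0
response = [
    0.25 * math.cos(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE_HZ + 0.6)
    + 0.10 * math.cos(2.0 * math.pi * 1500.0 * i / SAMPLE_RATE_HZ)
    for i in range(800)
]
magnitude, phase_rad = quadrature_detect(response, 1000.0, SAMPLE_RATE_HZ)
```

Because the detector returns a complex quantity (here as magnitude and phase), repeated measurements can be averaged as vectors, which is what allows complex averaging to improve S/N.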
The use of an RMS detector requires a separate phase meter to obtain phase information.

5.4.1 Discrete Tone (Stepped Sine). Discrete tone testing allows a measurement to be performed at precisely defined frequencies. These frequencies can be at the ANSI/ISO [3, 13, 14] preferred numbers or in other user-defined formats. In this case, the frequency interval (not resolution) should be stated. In addition to frequency response measurements, intermodulation and difference frequency distortion testing are usually carried out using this method. Phase and group delay information is also provided. These tests normally require an anechoic room, although tone-burst techniques can be used with gating to obtain simulated free field results, and measurement S/N can be improved using complex averaging.

5.4.2 Swept Sine. This technique is similar to discrete tone testing, but instead employs a continuous linear or logarithmic sine sweep excitation. The measurement is typically slow due to sweep rate limitations. Swept sine testing is usually performed using analog instrumentation. In addition to frequency response measurements, harmonic distortion is well suited to this method. This method requires an anechoic room, although tone-burst techniques can be used with gating to obtain simulated free field results. The frequency may not be constant throughout the analysis.

5.4.3 Time Delay Spectrometry (TDS). TDS (footnote 5) utilizes a linearly swept sine excitation signal that is synchronized to the measuring instrument. With this signal, a one-to-one relationship between time and frequency is established and simulated free field measurements can be performed. The measured response signal is multiplied with an appropriately delayed version of the excitation. This, in turn, is fed to a filter and detector. The frequency resolution of a TDS measurement is narrow band (constant bandwidth) with frequency. This is, by design, a selectable measurement parameter.
Therefore, low frequency resolution may be limited. As with other simulated free field techniques, this is determined by the effective time window, i.e., the time between the arrival of the direct sound and the arrival of the first reflection. This method is also well suited for harmonic distortion, and provides phase, group delay, and time response information. TDS may be implemented using analog or digital processing. In the latter case, refinements and corrections for deterministic errors in the measurement process may also be incorporated. It is possible to improve measurement S/N through complex averaging or delay compensation. This method allows post-processing of the data.

Footnote 5: R. Heyser, “Acoustical Measurements by Time Delay Spectrometry”, J. Audio Eng. Soc., Vol. 15, No. 10, (October 1967).

5.5 Free Field Techniques. Gating employs a time capture during the measurement, effectively windowing the measured response. The frequency resolution is the reciprocal of the gating time.

Post-process time windowing enables the direct sound in a measurement to be separated from reflections, producing a simulated free field condition. In this case, the frequency resolution is the reciprocal of the applied time window. Both gating and post-process windowing can be used on measurements in ordinary rooms. As discussed previously, MLS and TDS are inherently simulated free field techniques. The time windowing may be performed as a part of the data collection or as a post-processing window operation.

The frequency resolution available in an anechoic chamber is largely determined by the measurement technique employed. Anechoic chambers are limited at low frequencies by the size of the open space available and the depth of the absorptive material on the walls, floor and ceiling.
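The relationship in 5.5 between the reflection-free time window and the achievable frequency resolution can be worked through with simple geometry. The sketch below is illustrative only: it assumes a speed of sound of 343 m/s, a single reflecting plane, and a hypothetical layout with the source and microphone 0.5 m apart and both 1.0 m from that plane. None of these values are requirements of the standard.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C


def first_reflection_path_m(separation_m, height_m):
    """Path length via one reflecting plane, with source and microphone
    both height_m from the plane and separation_m apart."""
    return math.sqrt(separation_m ** 2 + (2.0 * height_m) ** 2)


def reflection_free_window_s(direct_path_m, reflected_path_m,
                             speed_m_s=SPEED_OF_SOUND_M_S):
    """Time between arrival of the direct sound and the first reflection."""
    return (reflected_path_m - direct_path_m) / speed_m_s


def frequency_resolution_hz(window_s):
    """Resolution is the reciprocal of the gating or windowing time."""
    return 1.0 / window_s


direct_m = 0.5
reflected_m = first_reflection_path_m(direct_m, 1.0)
window_s = reflection_free_window_s(direct_m, reflected_m)
resolution_hz = frequency_resolution_hz(window_s)
```

With this layout the usable window is roughly 4.6 ms, giving a resolution of about 220 Hz, which illustrates why low frequency resolution is limited when gating is used in an ordinary room.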
5.6 Method Comparative Summary. The table below identifies various test methods described previously. The corresponding test signals and conditions are shown for each method. The various signal classifications will be described in section 6.

Ref.    Test Method            Deterministic  Random   Speech-Like  Compound  Anechoic Chamber
                               Signal         Signal   Signal       Signal    Needed?
FFT/Cross Spectrum
5.2.1   Dual-Channel FFT       Y              Y        Y            Y         *
5.2.2   Single-Channel FFT     Y              Y        Y            Y         Y
5.2.3   Max. Length Seq.       R              N        N            Y         *
Real-Time Filter
5.3.1   Dual-Channel RTA       Y              Y        Y            Y         Y
5.3.2   Single-Channel RTA     Y              Y        Y            Y         Y
Sine-Based
5.4.1   Discrete Tone          Y              N        N            N         Y
5.4.2   Swept Sine             Y              N        N            Y         Y
5.4.3   Time Delay Spectr.     R              N        N            Y         *

Y  Test method is appropriate for this signal.
N  Should not be used.
R  Required signal with this test method.
*  Anechoic chamber is required unless simulated free field methods are used.

Table 1 - Signal Compatibility with Test Method
Note, a “Yes” entry indicates the test method is appropriate for the specific test signal, while “No” implies it should not be used. The “Yes/No” entry depends on the test signal used. The “Optional” entry may be required depending on the test method, and the column entries for the chamber indicate its need.
6. Test Signals
6.1 General. The test signal must place the hands-free telephone in a well-defined reproducible state for the period of the measurement. It must ensure that the transfer function of the unit remains stable during the measurement period, and yet provide a suitable signal for the specific measurement. The choice of the signal will be a balance between one that correctly stimulates the detection circuits in the hands-free telephone, and one that is suitable for the specific measurement.
6.2 Classifications.
The various types of signals are divided into several groups as discussed below. The classical measurement signals can be separated into deterministic signals and continuous random signals. More complex random signals include modulated random signals and speech-like signals that characterize human speech. Finally, there are compound signals composed of two sources, one for biasing the unit into a stable state, and the other being the actual test signal itself.
6.3 Modulation Types. Several types of modulation may be applied to deterministic or random signals. This is done in order to approximate the syllabic rhythm of real speech. Test signals may be modulated in various ways to correctly stimulate a hands-free phone, depending on the signal processing actually used in the phone. For example, a modulated noise signal is often an appropriate stimulus for a send circuit with a noise-guard feature. In the presence of a continuous signal over a few hundred milliseconds in duration, the noise-guard process reduces gain substantially. On the other hand, a continuous noise signal is often an appropriate stimulus for a receive circuit with automatic gain control (AGC).
6.3.1 Square Wave Modulation. Square wave modulation is an on-off pattern. The recommended pattern is 250 ms ON and 150 ms OFF, ±10 ms. This pattern is common in many international hands-free testing methods. It is close to the modulation rate of real speech. Other timing patterns may be used after confirming that the maximum measured response has been reached. In some cases, a periodic pulse pattern of this type will not correctly activate the telephone circuit. In such cases, a randomly varied pulse pattern may be used. The average “on” and “off” times should approximate 250 ms and 150 ms respectively. With this type of modulation, all measurements are to be performed during the “on” part of the pattern. For other types of modulation, the signal is to be measured during the entire presentation time.
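The recommended on-off pattern can be sketched as a gate sequence that multiplies the test signal, with only the “on” samples passed to the analysis. This is a minimal illustration, not a prescribed implementation; the function names, the sample rate, and the use of whole-sample counts for the on and off times are assumptions.

```python
def square_wave_gate(num_samples, sample_rate_hz, on_s=0.250, off_s=0.150):
    """Return a list of 1.0/0.0 gate values for a periodic on-off pattern.

    Defaults follow the recommended 250 ms ON / 150 ms OFF pattern.
    Durations are rounded to whole samples (an assumption of this sketch).
    """
    on_n = int(round(on_s * sample_rate_hz))
    off_n = int(round(off_s * sample_rate_hz))
    period_n = on_n + off_n
    return [1.0 if (i % period_n) < on_n else 0.0 for i in range(num_samples)]


def modulate(signal, gate):
    """Apply the gate to the signal sample by sample."""
    return [s * g for s, g in zip(signal, gate)]


def on_samples(signal, gate):
    """Select only the samples presented during the ON part of the pattern."""
    return [s for s, g in zip(signal, gate) if g > 0.0]


# Hypothetical example: 0.8 s of a constant-amplitude signal at 1 kHz sampling.
gate = square_wave_gate(800, 1000)
burst_signal = modulate([1.0] * 800, gate)
measured = on_samples(burst_signal, gate)
```

A randomly varied pattern, as mentioned above, could be built the same way by drawing each on and off duration around the 250 ms and 150 ms averages.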
6.3.2 Sine Wave Modulation. Sine wave modulation may be used to produce a simple and smooth speech amplitude envelope. The recommended rate is 4 Hz. Modulation depth should be at least 50%, but not so great as to introduce distortion.
6.3.3 Pseudo-Random Modulation. Pseudo-random modulation may be used to produce a relatively speech-like amplitude envelope. The modulation spectrum should cover from approximately 1 to 10 Hz, with the center at approximately 4 Hz. The extremes of the modulation spectrum should be rolled off gradually.
6.4 Deterministic Signals. Deterministic (periodic) signals can always be used to measure the frequency response of linear, time invariant telephones. When modulated, they can be used to measure the response of telephones with many, but not all, speech-processing features. Deterministic signals may be used to measure linearity, some kinds of distortion, and switching times.
6.4.1 Sine Wave. In addition to use in measuring the frequency response of linear, time invariant telephones, sine waves are useful for measurements of harmonic and difference-frequency distortion. Square wave, sine wave and pseudo-random modulation are all suitable.
6.4.2 Pseudo-Random. A pseudo-random signal has a periodic structure in the time domain. In the frequency domain, almost any magnitude and phase spectrum is possible. When used with FFT types of analysis, the period of the pseudo-random signal is to be matched in length and triggered to the analysis period. These signals require a semi-complex test procedure. Square wave, sine wave and pseudo-random modulation are all suitable. If square wave modulation is used, the “on” time must correspond to one complete period of the pseudo-random signal.
6.5 Random Signals.
Random signals can be described by their statistical characteristics, such as the long-term power spectral density and probability density functions. These signals are not periodic, but are stationary as far as these statistical characteristics are concerned. When measuring such signals, a sufficient number of averages must be taken to obtain a given accuracy in estimating the long-term spectrum. In practice, many noise generators produce pseudo-random signals, typically with a very long period. If the period of such signals is very long compared to the analysis period, and if the analysis period is not correlated to the generator period, then these signals can be considered random.
6.5.1 White Noise. White noise has a constant spectral density per Hertz. The amplitude distribution is typically truncated Gaussian, with a crest factor of 12 dB, ± 2 dB. Square wave, sine wave and pseudo-random modulation are all suitable.
6.5.2 Pink Noise. Pink noise has a power spectral density that decreases 3 dB per octave. The amplitude distribution is typically truncated Gaussian, with a crest factor of 12 dB, ± 2 dB. Square wave, sine wave and pseudo-random modulation are all suitable.
6.6 Speech-Like Signals. Speech-like signals include ITU-T Recommendation P.50 [18] artificial voice, ITU-T Recommendation P.59 [23] artificial conversational speech, simulated speech generator (SSG), as well as synthesized and real speech signals. Speech-like signals are ideal for determining linearity and transfer characteristics in the frequency domain. These signals place the hands-free telephone in a well-defined reproducible state, ensure that the transfer function of the unit remains stable, and provide a suitable signal for the specific measurement.
6.6.1 Simulated Speech.
Typical parameters of simulated speech include long-term average spectrum, short-term spectrum, instantaneous amplitude distribution, speech waveform structure, and the syllabic envelope.
6.6.1.1 P.50 Artificial Voice. ITU-T Recommendation P.50 defines the temporal and spectral parameters for a test signal which emulates the characteristics of speech. This artificial voice is a continuous speech signal with a frequency range of 89.1 Hz to 8910 Hz. Pauses may need to be inserted to emulate the on-off characteristics of conversational speech. See section 6.6.1.2 for information on inserting pauses.
6.6.1.2 P.59 Artificial Conversational Speech. Artificial conversational speech is a test signal generated by inserting pauses in the continuous artificial voice signal described by ITU-T Recommendation P.50. The on-off temporal characteristics of conversational speech are defined in ITU-T Recommendation P.59. This test signal is useful for evaluating devices that are sensitive to the on-off nature of conversational speech.
6.6.1.3 Simulated Speech Generator (SSG). To generate a signal approximating the amplitude distribution of speech, a main signal having a Gaussian distribution is modulated by a specially tailored modulating signal, and the resultant signal is shaped to approximate the long-term frequency spectrum of speech. See annex A for details of this signal.
6.6.2 Synthesized Speech. Speech-like signals may be produced using a digital processing technique rather than applying one of the signal sources described above. Conversational speech can be sampled, digitized, processed, and reproduced as synthesized speech. It may also be created from complex multiple tones that simulate the talk-spurts, pauses, and activity factors associated with speech characteristics.
6.6.3 Real Speech.
Speech-like signals are not limited to signal sources or synthesized digital processing, but may also include real speech signals. This is often done by recording conversational speech, preferably in a digital format to avoid signal degradation with use. These real speech recordings are then reproduced using a playback device as the signal source.
6.7 Compound Signals. The signals described above rely on one signal source to place the hands-free telephone in a well-defined reproducible state, ensure that the transfer function of the unit remains stable, and provide a suitable signal for the specific measurement. By applying two signal sources, one can be used specifically for “biasing” the unit into a stable, reproducible state, while the other is the actual test signal required for measurements. These compound signals include those where the two sources are applied in sequence, and those where both sources are applied simultaneously. Compound test signals can provide extra test flexibility and solve problems which are difficult or impossible using simple test signals. The bias signal can be a signal that, by itself, is unsuitable or very inconvenient for the actual measurement. The measurement signal can be a signal that, by itself, is unsuitable as a bias signal. If desired, the measurement signal can be presented so as not to have a substantial effect on the action of the bias signal. This can be done by adjusting the temporal and/or level relationships between the two signals. The bias signal can be changed to put the telephone in different states with minor or even no change in the measurement signal.
6.7.1 Sequential Presentation. This class of test signals is characterized by the separation of the bias and analysis signals in time. The bias signal is presented until the HFT is in a stable state. Once a stable state is reached, the appropriate analysis signal is applied and a measurement is performed. The analysis must be completed while the HFT is still in its stable state. The CSS is one example of this type of signal.
6.7.1.1 Composite Source Signal. The Composite Source Signal (CSS) is a compound signal using a voiced signal to simulate the voice properties, followed by a deterministic signal for measuring the transfer functions, and an inserted pause to provide amplitude modulation. The deterministic signal has either a flat or speech shaped power density spectrum. The CSS has the advantage of short measurement periods and duplex operation where, using an uncorrelated double talk signal, the test signals can be applied from the talking and listening directions at the same time. See annex B for details of this signal.
6.7.2 Simultaneous Presentation. This class of test signals is characterized by presentation of the bias and analysis signals at the same time. Some conditioning of the HFT may be required before beginning the analysis. The bias and analysis signals must be separable by the analysis method. A synchronous analysis method is usually required. The TDS Sweep with P.50 Noise Bursts (section 6.7.2.1) is one example of this type of signal.
6.7.2.1 TDS Sweep with P.50 Noise Bursts. This compound signal has two components, which are presented at the same time, but not synchronized with each other. The bias signal is intended to ensure that the telephone is in a stable, well-defined operating state. The measurement signal is intended to ensure a well-defined, reproducible measurement, which is especially well adapted to simulated free-field techniques. An anechoic room is not necessary when using this signal. See annex C for details of this signal.
6.7.2.2 TDS Sweep with Artificial Voice. Similar to section 6.7.2.1, except the bias is a continuous artificial voice signal defined in ITU-T Recommendation P.50 (see section 6.6.1.1).
6.7.2.3 TDS Sweep with Real Speech. Similar to section 6.7.2.1, except the bias is real speech (see section 6.6.3).
6.7.2.4 TDS Sweep with Random or Pseudorandom Noise. Similar to section 6.7.2.1, except the bias is white or pink random noise as described in 6.5. Pseudorandom noise (6.4.2) with white or pink spectrum is considered equivalent if the pseudorandom period is not correlated with the bias.
6.7.2.5 Pseudorandom Noise with P.50 Noise Bursts. This compound signal has two components, which are presented at the same time, but not synchronized with each other. The bias signal is intended to ensure that the telephone is in a stable, well-defined operating state. The measurement signal is intended to ensure a well-defined, reproducible measurement, which is especially well adapted to simulated free-field techniques. An anechoic room is not necessary when using this signal.
6.7.2.6 Pseudorandom Noise with Artificial Voice. Similar to section 6.7.2.5, except the bias is a continuous artificial voice signal defined in ITU-T Recommendation P.50 (see section 6.6.1.1).
6.7.2.7 Pseudorandom Noise with Real Speech. Similar to section 6.7.2.5, except the bias is real speech (see section 6.6.3).
6.7.2.8 Pseudorandom Noise with Random or Pseudorandom Noise. Similar to section 6.7.2.5, except the bias is white or pink random noise as described in 6.5. Pseudorandom noise (6.4.2) with white or pink spectrum is considered equivalent if the pseudorandom period is not correlated with the bias.
6.7.2.9 Sine Wave with Notched Real Speech. A sine wave signal is the measurement signal and real speech is the bias signal. A notch filter removes a band of the speech signal at the sine wave frequency. (See annex I for details.)
6.8 Test Signal Bandwidth. In general, the test signals and analysis methods in this standard cover a frequency range of 100 Hz to 10 kHz. Some signals, such as SSG (6.6.1.3), are defined only for a smaller bandwidth, and can only be used within their defined range. The minimum bandwidth for this standard is 175 Hz to 4.5 kHz, or the 1/3 octave bands from 200 Hz through 4 kHz.
6.9 Signal Parameter Summary. The table below defines the bandwidth and minimal analysis interval for the various test signals identified in section 6. These parameters would be applied to the calibration and test procedures.

Ref.      Test Signal                              Usable Bandwidth  Minimum Analysis            Alternative Analysis
                                                   (Hz)              Interval                    Interval
6.4.1     Sine Wave                                100-10,000        ISO R40 or 1/12 Oct. steps  ISO R40 or 1/12 Oct. steps
6.4.2     Pseudo-Random                            100-10,000        25 Hz bands                 1/12 Oct. bands
6.5.1     White Noise                              100-10,000        25 Hz bands                 1/12 Oct. bands
6.5.2     Pink Noise                               100-10,000        1/12 Oct. bands             50 Hz bands
6.6.1.1   P.50 Artificial Voice                    89.1-8,910        1/12 Oct. bands             50 Hz bands
6.6.1.2   P.59 Artificial Conversational Speech    89.1-8,910        1/12 Oct. bands             50 Hz bands
6.6.1.3   Simulated Speech Generator               100-5,000         25 Hz bands                 1/12 Oct. bands
6.6.2     Synthesized Speech                       100-10,000        25 Hz bands                 1/12 Oct. bands
6.6.3     Real Speech                              100-10,000        25 Hz bands                 1/12 Oct. bands
6.7.1.1   Composite Source Signal                  100-10,000        25 Hz bands                 1/12 Oct. bands
6.7.2.1   TDS Sweep with Bias                      100-10,000        50 Hz bands                 1/12 Oct. bands
6.7.2.5   Pseudorandom Noise with Bias             100-10,000        50 Hz bands                 1/12 Oct. bands

(Alternates shown in parentheses)
Table 2 - Test Signal Parameters and Analysis Methods
6.10 Signal Comparative Summary. The table below identifies the various test signals described previously. The corresponding test methods and conditions are shown for each signal.
The various method classifications were described in section 5.

Test Signal                 Sine Based  FFT/Cross Spectrum  Real-Time Filter  Free Field  Bias      Signal Gating  HFT
                            Method      Method              Method            Technique   Signal    or Modulate    Stability
6.4 Deterministic
6.4.1 Sine Wave             Yes         No                  No                Yes         Optional  Yes            Uncertain
6.4.2 Pseudo-Random         No          Yes                 Yes               No          Optional  Yes            Uncertain
6.5 Random
6.5.1 White Noise           No          Yes                 Yes               No          Optional  Yes            Uncertain
6.5.2 Pink Noise            No          Yes                 Yes               No          Optional  Yes            Uncertain
6.6 Speech-Like
6.6.1 Simulated Speech      No          Yes                 Yes               No          No        Maybe          Yes
6.6.2 Synthesized Speech    No          Yes                 Yes               No          No        Maybe          Yes
6.6.3 Real Speech           No          Yes                 Yes               No          No        Maybe          Yes
6.7 Compound
6.7.1 Sequential            No          Yes                 No                No          Yes       Maybe          Yes
6.7.2 Simultaneous          No          Yes                 Yes               Yes         Yes       Maybe          Yes

[This table may be removed. It is somewhat redundant, and suitability of a given test signal is discussed in appropriate sections. There are too many inter-dependencies.]
Table 3 - Test Signals Comparison
Note, a “Yes” entry indicates the test signal is appropriate for the specific test method, while “No” implies it should not be used. The “Optional” and “Maybe” entries may be required depending on test method.
7. Test Equipment, Environment, and Impairments
7.1 General. The test equipment, environmental concerns, and impairments recommended to evaluate hands-free telephone set transmission performance are described in this section. See IEEE Standard 269-1992 for information concerning other test equipment needed for this standard. Different equipment will be required depending on whether analog or digital sets are being tested and which test signals and analysis methods are chosen. For hands-free analog telephone sets, feed circuits and test loops are required. For hands-free digital telephone sets, interfaces and codecs are required.
The test equipment required to generate and process input and output signals for the sinusoidal method includes a variable frequency generator, a level recording device, and a spectrum analyzer, harmonic analyzer, or distortion analyzer. For the continuous spectrum method the required equipment includes a continuous spectrum generator, a spectrum analyzer, and a processor (optional).
7.2 Test Equipment.
7.2.1 Measuring Microphones. A half inch (type M) laboratory standard free-field microphone per ANSI S1.12-1967 (R1986) [5] is required for calibrating the mouth simulator and measuring receive characteristics. The sensitivity of the microphone should be constant within ± 0.5 dB from 100 Hz to 10,000 Hz. A one inch (type L) laboratory standard free-field microphone per ANSI S1.12-1967 (R1986) is required for measuring low sound pressure levels, background noise in the test environment, and receive noise from the HFT under test. The sensitivity of the microphone should be constant within ± 0.5 dB from 100 Hz to 5,000 Hz. In principle, background noise in the test environment should be measured with a low noise, random field microphone. In practice, use of a free-field microphone is acceptable provided six measurements are taken with the microphone aimed in each direction along three mutually perpendicular axes (left/right, front/back, up/down). The maximum of the six measurements shall be used.
7.2.2 Mouth Simulator. The mouth simulator shall comply with the specifications given in ITU-T Recommendation P.51.
7.2.3 Head And Torso Simulator (HATS). A head and torso simulator for measurements of hands-free telephones should conform to ITU-T Recommendation P.58. The HATS should be equipped with both an ear simulator and a mouth simulator.
These transducers should conform to ITU-T Recommendation P.57 [21]. Electro-acoustic measurements on hands-free telephones using HATS will differ from measurements performed using a separate microphone and mouth simulator. This use of HATS should therefore be reported. The HATS configuration, neck position (if variable), the use of absorptive clothing, and any other relevant parameters should also be noted.
7.2.4 Standard Circuits for Transmission and Voice Switching Measurements. The standard circuits for hands-free telephone transmission and voice switching measurements are the same as those used in IEEE Standard 269-1992 for telephones. For interfacing to wireless systems, see appropriate wireless standards.
7.2.5 Standard Circuits for Acoustic Echo Canceller Measurements. A test hybrid is required to perform acoustic echo return loss measurements on analog HFTs. The test hybrid is used to cancel or remove the HFT’s hybrid reflection, making a two-wire analog HFT behave as a virtual four-wire phone. This means that only the acoustic portion of the echo path will remain, and thus can be measured. There are a number of ways to implement a test hybrid: the use of an echo canceller with freezable coefficients, measurement post processing, or use of a freezable adaptive analog filter bank. The requirements for the hybrid are:
• High impedance bridge mode for tapping off lines without affecting impedance.
• DC blocking on the input.
• Known one way delay and round trip echo path (in/out delay). Any delay in the digital hybrid or acquisition card will impact both the echo path model delay and the synchronization of data acquisition. The fixed echo path delay inserted by the units must be quantified.
• Noise and distortion performance 10 dB below echo detection thresholds. A full scale SNR of 80 dB is preferred.
• High echo return loss. G.167 calls for a digital hybrid with an echo return loss of 60 dB.
• Ability to freeze the adaptive hybrid to ensure the canceller does not enhance return loss.
The test hybrid is to be trained with white noise. Fulfillment of the above requirements should be tested at a nominal level of -16 dBV at RETP with a 600 Ω load on the hybrid. Measure the long-term average level. This can be done using ITU-T Recommendation P.56 (SV6) on the input and output. The difference should be 60 dB, and must be at least 50 dB. An amplifier with high gain (at least 40 dB) will be required on the echo path to bring any residual echo up to a level that will trigger the P.56 algorithm's activity detector. Available commercial hybrids may not meet the above criteria. However, one possible post-processing implementation of this technique, which does meet the above specifications, is given in annex F.
7.3 Test Environment. Electroacoustic measurements on hands-free telephones should be conducted in a test environment that will not affect the results beyond the intended influence of the test table and the measurement transducers. The test environment must have a low background noise level. In addition, either the test environment or the test method should minimize or eliminate the influence of reflections. [Need to address simulating a “real room” environment for some measurements.]
7.3.1 Background Noise Level. The background noise level in the test environment should not exceed the limits shown in table 4. The overall A-Weighted noise level must not exceed 29 dBSPL [12].

Octave Band Center Frequency (Hz)    Octave Band Level (dBSPL)
63                                   49
125                                  34
250                                  29
500                                  29
1000                                 29
2000                                 29
4000                                 29
8000                                 29

Table 4 - Test Room Noise Levels
7.3.2 Anechoic Chamber. An anechoic chamber should be large enough to comfortably accommodate the test table, transducers, and HFT.
Free field conditions should exist throughout the frequency range of interest. Errors due to deviations from the inverse square law or due to the influence of reflections shall not exceed ± 1.5 dB below 800 Hz. Errors above 800 Hz shall not exceed ± 1.0 dB. Figure 1 shows the position of transducers relative to the location of the test table, but free field verification is done without the test table in the chamber. A verification of free field conditions can be conducted as follows: Using a ½ inch free field microphone positioned along the seven axes shown in figure 1, measure at distances of 315, 400, 500, 630, 800 and 1000 mm from the mouth simulator. The level should decrease 6 dB for each doubling of distance. Deviations represent departure from ideal free field conditions [11], [12].
[Figure 1 - Axes for Determining Free Field Conditions. Recoverable labels: Top View; Lip Ring; Virtual Point Corresponding to Centerpoint of Table (Point B); Point C; Upper Table Surface; axes 1 through 7. Dimensions in cm.]
Note: Axes 1, 2, 3, and 4 are in the horizontal plane normally occupied by the table surface. Measurements of free field sound pressure are made in the absence of the table.
The signal used for this test should have the same frequency format as the subsequent hands-free response measurements. The tolerances specified above must be met over the entire measurement range in that format. This verification should also be performed if a simulated free field technique is employed (see section 7.3.3).
7.3.3 Simulated Free-Field.
Several methods exist which enable reflection-free measurements in a room with ordinary, untreated boundaries. Tone burst gating, FFT analysis of short noise bursts, Time Delay Spectrometry (TDS), Maximum-Length Sequence (MLS), and post-process windowing of an impulse response are among the possibilities which can together be called “simulated free-field methods.” All these methods create a time window, which is adjusted to be slightly shorter than the interval between the first arriving signal and the first reflection. Although each method creates the time window differently, the effective length of the time window is the essential parameter. The lowest frequency that can be measured (fmin) is the inverse of the length of the time window (T). For simulated free-field results, the time window can be set as long as desired, provided no reflections are included. In general, the larger the room, the longer the time window can be. When the following conditions are met, simulated free-field methods can be used as a substitute for an anechoic room:
(1) The signal has a suitable bias effect on the hands-free telephone, so the unit is measured in the desired state. The simulated free-field signal may be the only test signal used, or it may be combined with a suitable bias signal.
(2) The effective time window (T) is long enough (> 5.8 ms) to result in a lower frequency limit (fmin) of 175 Hz or lower. This is slightly below the lowest frequency of the 200 Hz 1/3 octave band.
Example: The measurement microphone is 50 cm from the lip ring of the mouth simulator. This direct path corresponds to a signal delay of about 1.5 ms. The first reflection path from the mouth to the nearest boundary is 124 cm, and another 124 cm back to the microphone, for a total of 248 cm. This corresponds to a delay of about 7.3 ms. The difference between the direct and first reflection path is about 5.8 ms.
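The arithmetic in this example can be reproduced with a short sketch. The speed of sound (343 m/s, roughly air at 20 °C) and all function names are assumptions introduced for illustration; the standard itself only quotes the rounded delays.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def delay_ms(path_m):
    """Propagation delay in milliseconds over an acoustic path of path_m meters."""
    return path_m / SPEED_OF_SOUND_M_S * 1000.0


def time_window_ms(direct_path_m, reflected_path_m):
    """Interval between arrival of the direct sound and of the first reflection."""
    return delay_ms(reflected_path_m) - delay_ms(direct_path_m)


def lower_frequency_limit_hz(window_ms):
    """fmin is the inverse of the effective time window T."""
    return 1000.0 / window_ms


# Path lengths taken from the example: 50 cm direct, 124 cm + 124 cm reflected.
direct_delay = delay_ms(0.50)
reflected_delay = delay_ms(2.48)
window = time_window_ms(0.50, 2.48)
fmin = lower_frequency_limit_hz(window)
```

With these assumed values the sketch gives delays close to the quoted 1.5 ms and 7.3 ms, and the difference between the two arrivals comes out at roughly 5.8 ms.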
If the effective time window is set to this amount, the lower frequency limit (fmin) is 175 Hz. If the test table is positioned midway between floor and ceiling, the above example could be set up in a room of about 2.5 meters (m) ceiling height, and smallest wall-to-wall dimensions slightly larger. If the smallest wall-to-wall dimensions are on the order of 5 m, with a ceiling height of 2.5 m, the test table should be located on the floor. In such a case, the time window can be approximately doubled, and (fmin) halved. The lower frequency limit (fmin) is also the frequency resolution of the measurement. In the above example, there is effectively a data point every 175 Hz, although the curve will usually be smooth. This can result in some smearing of detail at the lower end of the measurement range. However, the true response of many hands-free telephones is already rather smooth at low frequencies, so this may not always be serious. A larger untreated room can be used for better resolution.
7.3.4 Test Table. The test table should be a hard, bare table with a surface area of at least 1 square meter. No horizontal dimension of the table shall be less than 0.8 m. The table should be flat, rigid, horizontal and thick enough to provide a sound reflecting surface on which the HFT rests. The tabletop should be composed of a high density, hard surfaced material such as hardwood, polished marine plywood or smooth plastic laminated high-density particleboard. The table shall be used for all measurements of desktop HFTs. It may also be used for measurements of non-desktop HFTs where appropriate.
7.3.5 Test Room Characteristics. The reverberation time (RT60) of the test room should meet ITU-T Recommendation G.167 [16].
When averaged over the transmission bandwidth, RT60 shall be approximately 500 ms. The reverberation time in the lowest octave shall be no more than twice this average value, and the reverberation time in the highest octave shall be not less than half this value. The volume of a typical test room shall be of the order of 50 m³ (1500 ft³).
7.4 Impairments. It is important to understand how a hands-free telephone performs under less than ideal conditions. For this reason, acoustic and network impairments may be introduced into the measurement so the effect on the performance of the HFT may be evaluated. It is beyond the scope of this standard to list all impairments, but some are listed in the following sections.
7.4.1 Network Impairments. Network impairments are those that can occur in the path of a telephone connection.
7.4.1.1 Test Loops. An analog HFT should be tested with various lengths of cable or simulated cable. Recommended loop lengths for testing North American telephones are 0 km, 2.7 km and 4.6 km (0 feet, 9000 feet and 15,000 feet) of 26 AWG cable.
7.4.1.2 Loop Current. Loop current may be varied to determine if there are any detrimental effects. This is especially important if the HFT is powered from the line rather than from a local power supply.
7.4.1.3 Termination Impedance. Network termination impedance mismatch can result in the send signal being reflected back toward the telephone. This is especially true with analog systems, but reflections can also occur with a digital system if some portion of an end-to-end connection is analog. This reflection can affect HFT echo canceller or switching circuits.
7.4.1.4 Network Noise. Network circuit noise can affect the HFT in various ways. Many designs sense the nominal noise level on the line in order to set the switching threshold. Also, network noise can affect other non-linear processes within an HFT. Network noise shall be white in spectrum, with levels measured in dBm, psophometric weighted (dBmp).
7.4.2 Acoustic Impairments. Acoustic impairments often occur due to the environment in which the telephone is placed. Some examples are given in the following sections.

7.4.2.1 Nearby Reflecting Surfaces. The HFT should be evaluated with nearby reflecting surfaces. In particular, receive signals such as dial tone may cause a half-duplex telephone to become unstable and oscillate between the receive and send states; this is commonly referred to as self-interruption.

7.4.2.2 Hoth Room Noise. Hoth noise is random acoustic noise that has a power density spectrum corresponding to that published by Hoth. This spectrum is designed to simulate typical ambient room noise. See annex D for details.

7.4.2.3 Room Reverberation. [John will write a contribution for this section.]

7.5 Post Processing. If post processing is used for acoustic echo testing, the main post processing functions will include:
• Calculation of segmental ERLs using 32 ms frames and 320 ms average.
• Calculation of the ERLt algorithm.
• Calculation of rms level with 8 ms sliding average.
• Calculation of rms level with 4 ms sliding average.
• Calculation of impulse response given source and output data files.
• Notch and bandpass filters (see section F.4).
• Calibration of sound card in D/A and A/D.
• A filter replicating the Fletcher-Munson response at 30 phons. This can be accurately approximated (to within a dB or two below 2500 Hz) by a first order high pass filter with a -3 dB point of 800 Hz.
• File scaling and measurement of rms level.
• Activity detection as called for in the various tests.

8. Test Calibration

8.1 Measurement Bandwidth and Resolution. The same bandwidth shall be used for calibration and measurement. The actual bandwidth or frequency interval used shall be stated.
The calibration and measurement shall be performed using the same measurement resolution. The measurement resolution shall be stated.

8.2 Send.

8.2.1 Acoustic Test Spectrum. The acoustic test spectrum is measured at the Mouth Reference Point (MRP). For sinusoidal test signals, the spectrum shall be flat within ± 1 dB over the actual measurement bandwidth. The electrical input to the mouth simulator may be equalized to meet this requirement. For other test signals, the acoustic spectrum shall meet the target spectrum and spectrum tolerance for the type of signal used, as defined in section 6. The default tolerance is ± 3 dB from 175 to 4500 Hz (or the 1/3 octave bands from 200 to 4000 Hz), and +3/-5 dB elsewhere. The electrical input to the mouth simulator may be equalized to meet this requirement. The continuous spectrum default tolerance shall also apply to the bias and measurement parts of the compound, parallel test signals defined in section 6.7.2.

8.2.2 Acoustic Test Level. The standard test level for send, Nominal PMRP, is -4.7 dBPa, at the MRP. Total harmonic distortion should be less than 5% for this test condition. For sinusoidal test signals, the level at the MRP shall be held constant at all frequencies of test, within the tolerance specified in section 8.2.1. For other continuous spectrum test signals, levels shall be measured over the entire spectrum. Out-of-band signals from 40 to 20,000 Hz shall add no more than 0.5 dB to this level.

8.2.3 Mouth Simulator Calibration Procedure. Mouth simulator calibration requires measurements at both the MRP and the HFRP in a free field, without the test table. To calibrate at the MRP, use a ½” free-field microphone oriented at 0 degrees to the mouth axis, with the center of the protection grid at the MRP (see figure 2).
Subtract 0.6 dB from the measurement to give the actual sound pressure at the MRP (this compensates for the fact that the acoustic center of the microphone is slightly in front of the protection grid). This method is valid over the entire frequency range covered in this standard [footnote 6].

[Figure 2 - Calibration of Mouth Simulator at MRP. Labels: free-field microphone, lip ring of mouth simulator; dimension 25 mm.]

To calibrate at the HFRP, use a ½” free-field microphone oriented at 0 degrees to the mouth axis, as shown in figure 3.

Footnote 6: Use of a ½” pressure microphone oriented at 90 degrees to the mouth axis, with the center of the protection grid at the MRP, is essentially equivalent up to 5000 Hz.

[Figure 3 - Calibration of Mouth Simulator at HFRP. Labels: lip ring of mouth simulator, free-field microphone; dimension 50 cm.]

To calibrate the mouth, first measure GMRP(f), the spectrum at the MRP. Adjust the mouth equalization to meet the target spectrum for the signal being used at a total sound pressure of -4.7 dBPa. Next, measure LHFRP, the total sound pressure level at the HFRP. Adjust the drive to the mouth simulator, without changing the spectrum shape, so that LHFRP = -28.7 dBPa. Finally, remeasure GMRP(f), the calibrated spectrum at the MRP. This spectrum is used to calculate the send frequency response. LMRP, the total calibrated sound pressure level at the MRP, should also be remeasured for reference. In general, LMRP will be close to, but not exactly, -4.7 dBPa.

LMRP - LHFRP = Corr, a correction factor relating to the acoustic radiation characteristics of the mouth. For a mouth simulator exactly meeting the specifications of ITU-T Recommendation P.51, Corr = 24 dB. The Corr value for an actual mouth simulator will differ slightly, depending on the type of mouth and the bandwidth of the measurement.
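As an illustration only, the arithmetic of the correction factor can be sketched in a few lines of Python. The function names and the example readings are hypothetical and are not part of this standard; the second function applies the relation between nominal and actual MRP levels stated in this section (actual = nominal + Corr - 24).

```python
def mouth_correction(l_mrp_dbpa: float, l_hfrp_dbpa: float) -> float:
    """Corr = LMRP - LHFRP, in dB, from the two calibrated total levels."""
    return l_mrp_dbpa - l_hfrp_dbpa


def actual_mrp_level(nominal_dbpa: float, corr_db: float) -> float:
    """Actual level at the MRP for a nominal send level: nominal + Corr - 24."""
    return nominal_dbpa + corr_db - 24.0


# Hypothetical calibration readings (assumed values, not measured data).
l_hfrp = -28.7   # dBPa at the HFRP after adjusting the drive
l_mrp = -4.9     # dBPa remeasured at the MRP

corr = mouth_correction(l_mrp, l_hfrp)
print(f"Corr = {corr:.1f} dB")
print(f"Actual MRP level for nominal -4.7 dBPa: {actual_mrp_level(-4.7, corr):.1f} dBPa")
```

With an ideal P.51 mouth (Corr = 24 dB) the nominal and actual levels coincide; any other Corr shifts the actual MRP level by the same amount.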
Send stimulus levels specified in this standard are nominal levels, which differ from actual levels at the MRP as follows:

LMRP = nominal LMRP + Corr - 24 (in dBPa)

8.2.4 HATS Calibration Procedure. Calibration of the HATS mouth simulator is performed using the procedure described in section 8.2.3 with the following exceptions:
1. The spectrum at the MRP may be measured by an acoustically equivalent method if provided for by the manufacturer [footnote 7].
2. LHFRP is measured with the HATS neck straight as shown in figure 4. The neck position may be different for measurements.

Footnote 7: A 1/4 inch microphone at 90° to the mouth axis.

[Figure 4 - HATS Mouth Simulator Calibration at the HFRP. Labels: equivalent lip plane, free-field microphone; dimension 50 cm.]

The ear simulator(s) is calibrated according to the manufacturer’s recommendation.

8.3 Receive and Echo. Electrical test signals in the receive direction shall be provided for a 900 ohm resistive source impedance. Calibration is performed across a 900 ohm resistive calibration load. As a result, electrical test signals are specified under nominally loaded conditions (this is equivalent to one-half the open-circuit voltage). After calibration, the 900 ohm calibration load is removed. The source is then connected to the HFT without further adjustment.

8.3.1 Electrical Test Spectrum. For sinusoidal test signals, the spectrum shall be flat within ± 1 dB over the actual measurement bandwidth. The electrical input may be equalized to meet this requirement. For other test signals, the spectrum shall meet the target spectrum and spectrum tolerance for the type of signal used, as defined in section 6. The default tolerance is ± 3 dB from 175 to 4500 Hz (or the 1/3 octave bands from 200 to 4000 Hz), and +3/-5 dB elsewhere.
The electrical input may be equalized to meet this requirement. The continuous spectrum default tolerance shall also apply to the bias and measurement parts of the compound, parallel test signals defined in section 6.7.2.

8.3.2 Electrical Test Level. The standard test level for receive is -18 dBV. This test level is recommended for measurements at minimum volume, and -28 dBV at the 900 ohm calibration load is recommended at maximum volume. Total harmonic distortion should be less than 5% for these test conditions. For sinusoidal test signals, the level shall be held constant at all frequencies of test, within the tolerance specified in section 8.3.1. For other continuous spectrum test signals, levels shall be measured over the entire spectrum. Out-of-band signals from 40 to 20,000 Hz shall add no more than 0.5 dB to this level.

9. Transmission Measurements

9.1 General. Procedures are given in the following sections for measuring parameters affecting the send and receive performance of hands-free telephone sets. These parameters include frequency response, noise, input-output linearity, distortion, equalization, and directivity. In addition, procedures are given for measuring electrical characteristics. Because hands-free telephone set characteristics are affected by loop impedances, terminations, loop currents, and operating levels, the measurements should be made under test loop and signal level conditions representative of those conditions the hands-free telephone set is expected to encounter in use. Records should be kept of parameters used in all measurements. The sensitivities measured should be presented as dB Pa/V or dB V/Pa. Input levels should be reported.
All equipment should be calibrated in accordance with the manufacturer's recommendations prior to making measurements. The location of the mouth simulator and the characteristics of the room should be such that the level of any extraneous sound is at least 20 dB below that of the lowest signal to be measured. Proper positioning or sound treatment should be provided to minimize reflected waves and ambient noise. In general, anechoic chambers are required to perform the tests in this standard. Test conditions for all transmission measurements are described in section 7 unless otherwise stated. A range of volume control settings should be used where appropriate, but most measurements should use the reference volume control setting.

9.1.1 Measurement Bandwidth and Resolution. The same bandwidth shall be used for calibration and measurement. The actual bandwidth used shall be stated. The calibration and measurement shall be performed using the same measurement resolution. The measurement resolution shall be stated.

9.1.2 Recommended Test Methods. The table below identifies the recommended test method(s) for each measurement parameter. Multiple methods should be conducted to ensure the HFT is characterized at optimum performance. HFTs with compression or VOX circuitry should be characterized with multiple methods to ensure a fully operational state.

Test Methods                  Frequency  Noise  Input-Output  Distortion  Loudness  Mid-Band Average  Directivity
                              Response          Linearity                 Rating    Sensitivity
5.2 FFT/Cross Spectrum
  5.2.1 Dual-Channel FFT      Yes        No     Yes           No          Yes       Yes               Yes
  5.2.2 Single-Channel FFT    Yes        Yes    Yes           No          Yes       Yes               Yes
  5.2.3 Max. Length Seq.      Yes        No     Yes           No          Yes       Yes               Yes
5.3 Real-Time Filter
  5.3.1 Dual-Channel RTA      Yes        No     Yes           No          Yes       Yes               Yes
  5.3.2 Single-Channel RTA    Yes        Yes    Yes           No          Yes       Yes               Yes
5.4 Sine-Based
  5.4.1 Discrete Tone         Yes        No     Yes           Yes         Yes       Yes               Yes
  5.4.2 Swept Sine            Yes        No     Yes           Yes         Yes       Yes               Yes
  5.4.3 Time Delay Spectr.    Yes        No     Yes           Yes         Yes       Yes               Yes
5.5 Free Field Tech.
  MLS & TDS                   Yes        No     Yes           Maybe       Yes       Yes               Yes

Table 5 - Test Methods Comparison

Note, a “Yes” entry indicates the test method is appropriate for the specific measurement parameter, while “No” implies it should not be used. The “Maybe” entry depends on the free field technique used.

Test Signals                  Frequency  Noise  Input-Output  Distortion  Loudness  Mid-Band Average  Directivity
                              Response          Linearity                 Rating    Sensitivity
6.4 Deterministic
  6.4.1 Sine Wave             Yes        Yes    Yes           Yes         Yes       Yes               Yes
  6.4.2 Pseudo-Random         Yes        Yes    Yes           No          Yes       Yes               Yes
6.5 Random
  6.5.1 White Noise           Yes        Yes    Yes           No          Yes       Yes               Yes
  6.5.2 Pink Noise            Yes        Yes    Yes           No          Yes       Yes               Yes
6.6 Speech-Like
  6.6.1 Simulated Speech      Yes        Yes    Yes           No          Yes       Yes               Yes
  6.6.2 Synthesized Speech    Yes        Yes    Yes           No          Yes       Yes               Yes
  6.6.3 Real Speech           Yes        Yes    Yes           No          Yes       Yes               Yes
6.7 Compound
  6.7.1 Sequential            Yes        Yes    Yes           Yes (1)     Yes       Yes               Yes
  6.7.2 Simultaneous          Yes        Yes    Yes           Yes (1)     Yes       Yes               Yes

Table 6 - Test Signals Comparison

Note, a “Yes” entry indicates the test signal is appropriate for the specific measurement parameter, while “No” implies it should not be used. Also, any signal may be utilized to place the HFT in the proper test state for noise measurements, and reference (1) under distortion measurements must be a sine wave analysis signal.

9.2 Positioning of HFT and Test Transducers.

9.2.1 Desktop Hands-Free. The hands-free telephone should be positioned 40 cm back from the front edge of the test table as shown in figures 5A and 5B. Figure 5A shows the mouth simulator positioned for send measurements.
The lip ring is positioned 50 cm from the front center of the hands-free telephone and 30 cm above the table. The axis of the mouth simulator is aligned with the 50 cm imaginary line. Receive measurements are made with the artificial mouth replaced by the measuring microphone, with the center of the microphone grid at point C and the microphone axis aligned with the 50 cm imaginary line. Hands-free telephones with separate microphone and loudspeaker housings should be positioned as shown in figure 6.

Note, this physical test arrangement should be used for all measurements including frequency response. Effects due to the presence of the table are considered a part of the measured response.

[Figure 5A - Standard Test Position for Hands-free Measurements - Side View (Mouth Simulator shown for Send Measurements). Labels: mouth simulator (or measuring microphone), 50TP (center of lip ring); dimensions 50 cm, 30 cm, 40 cm.]

[Figure 5B - Standard Test Position for Hands-free Measurements - Top View (Microphone shown for Receive Measurements). Labels: center line of table, measuring microphone (or mouth simulator); dimensions approximately one meter, 40 cm.]

[Figure 6 - Standard Test Position for Hands-free with Detached Microphone and Speaker. Labels: hands-free microphone, hands-free loudspeaker, center line of table, measuring microphone (or mouth simulator); dimensions approximately one meter, 40 cm, 60 cm, 40 cm; angles 48.6° and 48.6°.]

9.2.2 Open Listening. In open listening mode the telephone transmits via the handset microphone and receives through both the handset receiver and the loudspeaker.
The send gain is usually much less than normal hands-free since the send microphone is at the mouth rather than at a nominal 50 cm distance. This reduction in send gain is usually not enough to eliminate the need for gain switching or echo canceling to prevent acoustic echo. Send measurements in open listening mode are made with the handset mounted on a test head, using the standard handset transmission test methods outlined in IEEE Standard 269-1992, although different test signals may be required if the telephone employs switching or acoustic echo canceling in the open listening mode. Open listening receive loudspeaker measurements are made using the same test position and test methods employed in hands-free measurements, except that the handset is taken off the cradle and placed out of the way during measurement. Echo path loss measurements during open listening are the same as for hands-free except for the positioning of the handset. Figure 7 shows a recommended test setup for making echo path loss measurements in the open listening mode.

[Figure 7 - Standard Test Position for Open Listening Echo Path Loss Measurements. Labels: 50TP (center of handset receiver); dimensions 30 cm, 50 cm, 40 cm.]

9.2.3 Non Desk-Top Hands-Free. HFTs designed for other than traditional table top or desktop positioning should be tested with the appropriate user positioning in mind. This position shall be defined as the “recommended test position” (RTP). The RTP should be obtained from the manufacturer, and should be based upon the product’s intended use. For testing purposes, this will dictate the distance and position geometry relationship between the HFT and the mouth simulator and microphone. Measurements performed at other distances or positions shall be noted, and in the absence of an RTP, 50TP is recommended.
9.2.4 HATS Positioning. When using HATS, position the HFT as shown in figure 8. The head may be bent at the neck to face directly at the HFTP if this facility is available. The intersection of the HATS lip plane and the mouth axis should be positioned at 50TP.

[Figure 8 - Position of HATS, Hands-Free Telephone, and Test Table. Labels: 50TP (center of lip plane); dimensions 50 cm, 30 cm, 40 cm.]

Note: This physical test arrangement should be used for all measurements including frequency response. Diffraction effects due to the presence of the table are considered a part of the measured response. The use of HATS can cause a dip to appear in the measured frequency response due to reflections from the chest. These effects are also considered a part of the measurement.

Due to the symmetry of the HATS, the measurement can be performed using only one ear simulator. The test object, however, may not have a symmetric or centrally located receiver. Therefore, the ear used for the measurement (right or left) should be reported.

Note that the ERP does not occupy the same position as the MRP. While this is more realistic than a simple microphone, there will be level and response differences compared to measurements performed using a separate microphone and mouth simulator. The increased distance to the ERP will decrease the measured response level at low frequencies. The “obstacle effect” of the head will be part of the measured receive response. The obstacle effect causes an increase in the measured response of approximately 8 to 12 dB around 3 kHz at the ERP.

9.3 Send.

9.3.1 General. Send measurements consist of connecting the hands-free telephone to the appropriate interface as covered in section 7.5.
Select the desired test signal to be applied as an acoustic stimulus from the calibrated mouth simulator, and measure the resultant signal at the Send Electrical Test Point (SETP). Measure the sound pressure level of the mouth simulator as prescribed in section 8.2.2. Adjust the acoustic output spectrum as defined in section 8.2.1 for sinusoidal and other signals as defined in section 6. Due to interaction with the receive path, send measurements should be made at reference and maximum receive volume control settings.

9.3.2 Special Instructions For Microphones. Multiple microphones, directional microphones, beam steering microphones, and microphones with gating and/or AGC present a unique set of measurement challenges. For testing purposes, it may ease the task if there is prior knowledge of the number and types of microphones incorporated in the HFT. Users who are unfamiliar with microphone types and designs may wish to refer to Annex K.

9.3.3 Frequency Response. The send frequency response HS(f) is given by the equation below:

(Eq. 1a)
$$H_S(f) = 20 \log \frac{G_{SETP}(f)}{G_{MRP}(f)} + \mathrm{Corr} - 24 \quad \text{in dBV/Pa}$$

where:
GSETP(f) is the RMS power spectrum at the Send Electrical Test Point (SETP)
GMRP(f) is the RMS power spectrum at the MRP.

Note, if the cross-spectrum method is used, the send frequency response becomes:

(Eq. 1b)
$$H_S(f) = 20 \log \frac{G_{(MRP)(SETP)}(f)}{G_{MRP}(f)} + \mathrm{Corr} - 24 \quad \text{in dBV/Pa}$$

where:
G(MRP)(SETP)(f) is the cross spectrum.

9.3.4 Noise. The hands-free telephone's microphone should be isolated from sound input and mechanical disturbances that would cause significant error. Measure the electrical output signal at the SETP, averaging over a minimum period of 5 seconds using the noise meter described in IEEE Standard 269-1992 section 5.15.1. Single-channel FFT or real-time analysis may also be implemented for measuring noise spectra. However, the overall weighted average may require some post-processing to comply with the noise meter defined in IEEE 269-1992. Steps should be taken to ensure the hands-free telephone is fully operating in the send mode.

9.3.5 Input-Output Linearity. Measure the hands-free telephone set as described in section 9.3.3 using any of the recommended test methods prescribed in that section. Apply acoustic input levels representing the total range that the hands-free telephone is expected to encounter in use. For a linear characteristic, the output level should follow an input level change dB for dB.

9.3.6 Distortion. Distortion tests for HFTs are derived from standard harmonic distortion measurement techniques; however, a continuous sine wave signal is frequently not a suitable signal for testing HFTs. Alternative signals are specified below, as well as a test to determine which signal is suitable for the HFT under test.

In principle, these methods can be extended to difference-frequency distortion measurements. The stimulus consists of two sine waves (or two narrow-band pseudo-random noise signals) of equal amplitude, with the stimulus level calculated on a power basis. Analysis is with either a notched weighting filter or by bandpass filters (or equivalent algorithm). Difference-frequency distortion tests may be the best way to evaluate an HFT above about 1000 Hz, where the harmonics of a single tone (or narrow-band pseudo-random noise signal) lie above the cutoff frequency of the HFT.

9.3.6.1 Test Signal. Three types of distortion test signals are recommended. These include continuous sine waves (6.4.1), modulated sine waves and narrow-band pseudo-random noise. Sine wave signals can be modulated by a square wave (6.3.1), a sine wave (6.3.2), or a pseudo-random modulation (6.3.3). The narrow-band pseudo-random noise (6.4.2) may be used as the default test signal for all distortion measurements.
It should have an effective bandwidth of 25 to 50 Hz, and out-of-band signals should add no more than 0.5 dB to the overall level of the test signal. A period of 250 ms is recommended for this signal, since this will provide some modulation at a 4 Hz rate. The crest factor should be 9 ± 3 dB. When a narrow-band pseudo-random test signal is not suitable, modulation may be applied in a similar manner to modulating a sine wave.

Send distortion should be measured at the standard stimulus level of -4.7 dBPa, and at other levels in the range from -30 dBPa to +6 dBPa. Measurements should also be made over a range of frequencies within the telephone band, such as the ISO R-10 preferred frequencies from 315 Hz to ½ the upper frequency limit of the HFT under test.

Note: Use of Acoustic Reference Level (ARL) is not recommended for send distortion measurements. The rationale is that send sensitivity is normally not user-adjustable, and the level of the human talker lies in a reasonably controlled range. The HFT should have low distortion regardless of the level actually sent to the far end.

9.3.6.2 Suitability Test. To test the suitability of a particular distortion test signal, send frequency response should first be measured at the standard test level (9.3.3). The proposed distortion test signal should then be applied at each distortion test frequency, at the standard level, and the send frequency response measured at those frequencies. If the result is within 2 dB of the comparable values previously obtained in the complete send frequency response, then the proposed distortion test signal is suitable. Distortion does not have to be measured using the same test signal as send response.
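The suitability comparison can be sketched as a simple per-frequency check. The following Python fragment is illustrative only: the function name, the dictionary layout (frequency in Hz mapped to send response in dBV/Pa) and all numeric values are assumptions made for the example, not data from this standard.

```python
def is_suitable(reference_db, candidate_db, tolerance_db=2.0):
    """Return True if every response measured with the proposed distortion
    test signal lies within tolerance_db of the reference send response
    at the same frequency."""
    return all(
        abs(level - reference_db[freq]) <= tolerance_db
        for freq, level in candidate_db.items()
    )


# Hypothetical send responses in dBV/Pa, keyed by test frequency in Hz.
reference = {315: -42.0, 500: -40.5, 1000: -39.0, 2000: -38.0}
narrow_band_noise = {315: -43.1, 500: -41.0, 1000: -39.4, 2000: -37.2}
continuous_sine = {315: -55.0, 500: -52.0, 1000: -50.5, 2000: -49.0}

print(is_suitable(reference, narrow_band_noise))
print(is_suitable(reference, continuous_sine))
```

In this made-up case the narrow-band noise tracks the reference closely, while the continuous sine wave reads more than 10 dB low, as might happen if the HFT treats a steady tone as noise and attenuates it.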
9.3.6.3 Distortion Measurement. The output fundamental is measured at the SETP, with a bandpass filter or equivalent algorithm. The distortion is measured by use of a psophometric filter according to ITU-T Recommendation O.41, but with a notch added to eliminate the test signal. The output of the notched filter includes harmonics as well as noise. The notch filter output is divided by the fundamental and expressed in percent. The result is signal-to-total-distortion-and-noise.

The notch must attenuate the test signal by at least 50 dB. This will result in a distortion floor of 0.3%, permitting measurements of distortion from 1% and above with 6% or better accuracy. The filter shall be compensated for the notch on a power basis. A constant shall be added to each point of the notched filter frequency response, so that the power sum of all points, on a logarithmic frequency scale, is equal to the power sum of all frequency response points of the original psophometric filter.

Harmonic analysis using bandpass filters, or an equivalent algorithm, is recommended for further diagnostic information. At each harmonic frequency, the bandpass filter output is divided by the fundamental and expressed in percent. The result is 2nd, 3rd, etc. harmonic distortion at each test frequency.

9.3.7 Loudness Rating Applications. The send frequency response defined in section 9.3.3 can be used directly in calculating TOLR according to IEEE Standard 661-1992 [9]. ISO R10 format data (1/3 octave) is required for calculating SLR according to ITU-T Recommendation P.79 [25]. Measured data can be converted using the procedures in section 9.3.8.

9.3.8 Conversion of Measured Data to ISO R 10 Format. Measurements may be performed in various frequency formats, depending upon the analysis method employed. Response measurements of hands-free telephones can contain numerous peaks and dips. This conversion, therefore, should be performed using “band-averaging”.
The measured points within a particular 1/3 octave band are “power averaged” according to equation 2. At each ISO R10 preferred frequency:

(Eq. 2)
$$H'(f) = 10 \log_{10} \left( \frac{1}{N} \sum_{i=1}^{N} 10^{H_i/10} \right)$$

where
H′(f) = response at the new preferred ISO R10 frequency
f = preferred ISO R10 frequency
N = number of response values within the 1/3 octave band centered at f
i = index for each response value within the 1/3 octave band
Hi = measured response value (in dB)

For the lowest frequency within the band, i = 1. For the highest included frequency, i = N. The 1/3 octave passband limit frequencies can be calculated as:

(Eq. 3)
$$f = 10^{(n/10) \pm 0.05}$$

where n is the Band Number. Example: For the 100 Hz band, the Band Number = 20; for the 125 Hz band, the Band Number = 21, etc. Table L1 in Annex L contains a list of the band numbers, 1/3 octave center frequencies, and corresponding pass band limits.

For measured data at frequencies coinciding with a band-edge frequency (i = 1 and/or i = N), reduce the value by 3 dB, and use that data point in both the upper and lower frequency band calculations. For constant percentage bandwidth measurements, there will always be the same number of points for each converted band (4 or 8, for 1/12 or 1/24 octaves, respectively). For constant bandwidth data (e.g., FFT) on a log frequency axis, the measurement data will appear undersampled at low frequencies and oversampled at higher frequencies.

9.3.9 Mid-Band Average Send Sensitivity. Unlike a handset or headset, a hands-free telephone is not closely coupled to the mouth and ear during use. Therefore, a single-number sensitivity calculation more general than loudness rating may be appropriate for some applications.
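Since the single-number calculations rely on data in ISO R10 format, the band-averaging conversion of section 9.3.8 (equations 2 and 3) can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names and the sample data are hypothetical, and the 3 dB reduction for points lying exactly on a band edge is deliberately omitted for brevity.

```python
import math


def band_limits(n):
    """Lower and upper passband limits in Hz of 1/3 octave band number n
    (equation 3): f = 10 ** (n / 10 -/+ 0.05)."""
    return 10 ** (n / 10 - 0.05), 10 ** (n / 10 + 0.05)


def band_average(freqs_hz, responses_db, n):
    """Power-average the measured responses (dB) that fall inside band n
    (equation 2). Returns None if no measured point lies in the band.
    Assumption: the band-edge 3 dB adjustment is not applied here."""
    lower, upper = band_limits(n)
    in_band = [h for f, h in zip(freqs_hz, responses_db) if lower <= f <= upper]
    if not in_band:
        return None
    mean_power = sum(10 ** (h / 10) for h in in_band) / len(in_band)
    return 10 * math.log10(mean_power)


# Hypothetical measured points around the 1000 Hz band (band number 30).
freqs = [900.0, 1000.0, 1100.0, 1200.0]
resp = [-40.0, -30.0, -40.0, -35.0]
print(round(band_average(freqs, resp, 30), 2))
```

Because the average is taken on a power basis, a single peak within a band dominates the converted value more than an arithmetic mean of the dB values would suggest.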
The mid-band average sensitivity is useful for estimating the electroacoustic transducer sensitivity and/or output level. The mid-band average send sensitivity is similar to a microphone sensitivity figure. The mid-band average send sensitivity is calculated using equation 4:

(Eq. 4)
$$S_S = \frac{1}{8} \sum_{f=500}^{2500} H_S(f) \quad [\text{dB V/Pa}]$$

where HS(f) are the send response values (in dB V/Pa) at the eight ISO R10 preferred 1/3 octave frequencies from 500 Hz to 2500 Hz. For responses in other formats, the data should be converted using the method described in section 9.3.8.

9.3.10 Send Directionality. The directional characteristics of the HFT are useful in determining how well the device performs when the user is not directly in front of the telephone, such as in a conference situation. The overall directional characteristics are reported as Send Loudness Rating Directionality (SLRD). SLRD is send loudness rating versus angle around the HFT, normalized to the loudness rating at 50TP. Due to variations in the frequency response around the HFT, it is also advisable to investigate each response at the various measurement points. For example, polar plots at various frequencies can be derived from these responses.

9.3.10.1 Measurement Procedure. The HFT and mouth simulator are positioned as described in section 9.2, using 50TP or RTP as appropriate. This position shall correspond to the 0 degree angle. The send frequency response of the HFT is measured at the 0 degree angle and the loudness rating calculated from the data. The mouth is then moved counter-clockwise (or the HFT rotated about its physical center) to the second measurement angle. Take care to keep the center of the lip ring of the mouth simulator at a constant distance from the physical center of the HFT.
A second frequency response measurement is made and the loudness rating calculated. This process is continued for the remaining positions of interest. Twelve measurements, at 30-degree intervals, are suggested as a minimum.

9.3.10.2 Data Presentation. Once the send loudness ratings are calculated, the data may be presented in the form of a table of loudness rating versus angle or may be presented graphically as shown in figure 9, below. To plot the data graphically the following method is suggested:
1) Take the Loudness Rating at the 0 degree angle as a 0 dB reference.
2) Calculate the delta between loudness ratings at each of the data points and the 0 dB reference. The more positive the delta, the quieter it is.
3) Plot the deltas for each angle around the HFT.

Table 7 and figure 9 illustrate two example data sets. Example 1 is characteristic of a hands-free telephone with an omnidirectional microphone pattern. Example 2 is more directional, with the most sensitive position being at the front of the telephone.

ANGLE          Example 1                Example 2                COMMENT
(degrees CCW)  SLR (dB)   Delta (dB)    SLR (dB)   Delta (dB)
0              15.0       0.0           16.6       0.0           Reference Position
30             16.9       +1.4          18.9       +2.3
60             16.6       +1.6          18.8       +2.2
90             17.1       +2.1          25.9       +9.3
120            15.0       0.0           27.5       +10.9
150            17.2       +2.2          24.4       +7.8
180            17.1       +2.1          26.0       +9.4
210            18.2       +3.2          28.5       +11.9         Example 2 Send is 11.9 dB quieter at 210 degree position.
240            16.2       +1.2          26.5       +9.9
270            17.8       +2.8          23.1       +6.5
300            17.1       +2.1          19.7       +3.1
330            16.5       +1.5          18.7       +2.1

SLR = Send Loudness Rating; Delta = delta relative to SLR at 0 degrees.
Note: The more positive the Loudness Rating, the quieter the response.

Table 7 - Send Loudness Rating Directionality Examples
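The three plotting steps amount to subtracting the 0 degree loudness rating from the rating at every angle. A short Python sketch follows, using the Example 2 loudness ratings from Table 7 with the angles taken at 30-degree intervals; the function and variable names are hypothetical.

```python
def slrd_deltas(slr_by_angle, reference_angle=0):
    """Delta (dB) of each send loudness rating relative to the rating at the
    reference angle. A positive delta means a quieter send level."""
    ref = slr_by_angle[reference_angle]
    return {
        angle: round(slr - ref, 1)
        for angle, slr in sorted(slr_by_angle.items())
    }


# Example 2 send loudness ratings (dB), angle in degrees CCW.
example_2 = {
    0: 16.6, 30: 18.9, 60: 18.8, 90: 25.9, 120: 27.5, 150: 24.4,
    180: 26.0, 210: 28.5, 240: 26.5, 270: 23.1, 300: 19.7, 330: 18.7,
}

deltas = slrd_deltas(example_2)
quietest = max(deltas, key=deltas.get)
print(deltas)
print(f"Quietest position: {quietest} degrees ({deltas[quietest]:+.1f} dB)")
```

The resulting deltas are the values to plot on the radial axis of a polar chart such as figure 9.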
[Figure 9: polar plot of dB delta (loss) relative to SLR at 0 degrees versus angle around the HFT, marked every 30° with 0° at the front of the telephone and a radial scale of 0, 3, 6, 9 and 12 dB. Two curves are shown, Example Plot 1 and Example Plot 2. Note: The HFT is loudest at 0 degrees in this example.]

Figure 9 - Send Loudness Rating Directionality with Two Sample Plots

9.4 Receive.

9.4.1 General.

Receive measurements are made by connecting the hands-free telephone to the appropriate interface as covered in section 7.5. Select the desired test signal to be applied as an electrical stimulus at the Receive Electrical Test Point (RETP), and measure the resultant signal at the 50 cm Test Point (50TP). Measure the electrical level across a 900 ohm source as prescribed in section 8.3.2. Adjust the electric output spectrum as defined in section 8.3.1 for sinusoidal and the other signals. These signals are defined in section 6.

9.4.2 Frequency Response.

The receive frequency response H_R(f) is given by the equation below:

(Eq. 5a)   H_R(f) = 20 \log \frac{G_{50TP}(f)}{G_{RETP}(f)}   in dB Pa/V

where:
G_{50TP}(f) is the RMS power spectrum at the 50TP
G_{RETP}(f) is the power spectrum at the Receive Electrical Test Point (RETP)

Note: If the cross-spectrum method is used, the receive frequency response becomes:

(Eq. 5b)   H_R(f) = 20 \log \frac{G_{(RETP)(50TP)}(f)}{G_{(RETP)}(f)}   in dB Pa/V

where:
G_{(RETP)(50TP)}(f) is the cross spectrum.

These power spectra can be obtained by using discrete measurements of the power at each point as outlined in sinusoidal methods, or by continuous spectrum methods. Any of the test methods described in section 5 may be used to measure frequency response. The sine-based methods are the simplest procedures to implement, but require an anechoic chamber.
Free field techniques incorporate the most complex procedures; however, they may be conducted in an ordinary room. Results should be reported as dB Pa/V.

9.4.3 HATS DRP to ERP Correction.

Due to the construction of the ear simulator in the HATS, resulting receive data will be referred to the ear drum (DRP). In order to refer the receive response to the ERP, a correction must be applied to the measured receive response. Depending upon the format of the measured data, the correction values in annex X should be added to the measured response. Table X.1 is from ITU-T Recommendation P.58. Table X.2 is the same correction at the ISO R40 preferred frequencies. It may be necessary to convert this table to other formats. The receive response should always be presented referred to the ERP.

9.4.4 Noise.

The test signal chosen for the receive frequency response measurement (9.4.2) is to be applied for at least 10 seconds prior to the noise measurement. Measure the acoustical output signal at 50TP, averaging over 30 seconds, beginning 500 ms after the receive test signal is removed. The measurement may be performed directly with an A-weighting filter, or the A-weighted result may be calculated from a spectrum measurement. The result is receive idle noise, measured in dB re 20 µPa, expressed as L50TP(A).

To test the validity of this measurement, de-activate the HFT and repeat the measurement for 5 seconds, expressed as LROOM(A). If LROOM(A) is at least 10 dB less than L50TP(A), the measured L50TP(A) is considered valid. If not, the receive noise can only be said to be below LROOM(A).

If the above test fails, first be sure the test room and measuring microphone meet the requirements of this standard. If very low noise levels must be measured and the noise in the test room cannot be further reduced, an alternate procedure can be used. Reposition the test microphone at half the standard distance from the HFT.
(If 50TP is the appropriate test position, the microphone should be placed 25 cm from the HFT. Otherwise, use ½ RTP.) Repeat the measurement procedure. If the validity test passes, subtract 6 dB from the measured result and report that value. This alternate procedure is allowable only for HFTs with one loudspeaker, and will give approximate results. Noise measurements closer than ½ the standard test distance are not recommended.

9.4.5 Input-Output Linearity.

Measure the hands-free telephone set as described in section 9.4.2 using any of the recommended test methods prescribed in that section. Apply electrical input levels representing the total range that the hands-free telephone is expected to encounter in use. For a linear characteristic, the output level should follow an input level change dB for dB.

9.4.6 Distortion.

Distortion tests for HFTs are derived from standard harmonic distortion measurement techniques; however, a continuous sine wave signal is frequently not a suitable signal for testing HFTs. Alternative signals are specified below, as well as a test to determine which signal is suitable for the HFT under test.

In principle, these methods can be extended to difference-frequency distortion measurements. The stimulus consists of two sine waves (or two narrow-band pseudo-random noise signals) of equal amplitude, with the stimulus level calculated on a power basis. Analysis is with either a notched weighting filter or by bandpass filters (or equivalent algorithm). Difference-frequency distortion tests may be the best way to evaluate an HFT above about 1000 Hz, where the harmonics of a single tone (or narrow-band pseudo-random noise signal) lie above the cutoff frequency of the HFT.

9.4.6.1 Test Signal

Three types of distortion test signals are recommended.
These include continuous sine waves (6.4.1), modulated sine waves and narrow-band pseudo-random noise. A square wave (6.3.1), sine wave (6.3.2), or a pseudo-random modulation (6.3.3) can modulate sine wave signals.

The narrow-band pseudo-random noise (6.4.2) may be used as the default test signal for all distortion measurements. It should have an effective bandwidth of 25 to 50 Hz, and out-of-band signals should add no more than 0.5 dB to the overall level of the test signal. A period of 250 ms is recommended for this signal, since this will provide some modulation at a 4 Hz rate. The crest factor should be 9 ± 3 dB. When a narrow-band pseudo-random test signal is not suitable, modulation may be applied in a similar manner to modulating a sine wave.

Receive distortion should be measured at the standard stimulus level of -16 dBV, as well as other levels in the range from -30 dBV to 0 dBV. Measurements should also be made over a range of frequencies within the telephone band, such as the ISO R10 preferred frequencies from 315 Hz to 3150 Hz. Test frequencies over ½ the upper frequency limit of the HFT under test may be useful in evaluating the loudspeaker system. Distortion measurements should be made over a range of volume control settings, including minimum, reference, and maximum.

Note: It may be advisable to adjust the test signal level to produce a specified sound pressure output at 50TP, similar to the method in TIA-470B for handsets. A good case could be made for this approach, since users tend to adjust volume controls based on absolute level of the received signal.

9.4.6.2 Suitability Test

To test the suitability of a particular distortion test signal, receive frequency response should first be measured at the standard test level (9.4.2). The proposed distortion test signal should then be applied at each distortion test frequency, at the standard level, and the receive frequency response measured at those frequencies.
If the result is within 2 dB of the comparable values previously obtained in the complete receive frequency response, then the proposed distortion test signal is suitable. Distortion does not have to be measured using the same test signal as receive response.

9.4.6.3 Distortion Measurement

The output fundamental is measured at 50TP, with a bandpass filter or equivalent algorithm. The distortion is measured by use of an A-weighting filter according to ANSI S1.4, but with a notch added to eliminate the test signal. The output of the notched filter includes harmonics as well as noise. The distortion is divided by the fundamental and expressed in percent. The result is signal-to-total-distortion-and-noise.

The notch must attenuate the test signal by at least 50 dB. This will result in a distortion floor of 0.3%, permitting measurements of distortion from 1% and above with 6% or better accuracy. The filter shall be compensated for the notch on a power basis. A constant shall be added to each point of the notched filter frequency response, so that the power sum of all points, on a logarithmic frequency scale, is equal to the power sum of all frequency response points of the original A-weighting filter.

Harmonic analysis using bandpass filters, or an equivalent algorithm, is recommended for further diagnostic information. At each harmonic frequency, the bandpass filter output is divided by the fundamental and expressed in percent. The result is 2nd, 3rd, etc. harmonic distortion at each test frequency.

Distortion measurements should be made over a range of volume control settings, including minimum, midway, and maximum volume.

9.4.7 Loudness Rating Applications.
The receive frequency response defined in section 9.4.2 can be used directly in calculating ROLR according to IEEE Standard 661-1992. ISO R10 format data (1/3 octave) is required for calculating RLR according to ITU-T Recommendation P.79. Measured data can be converted using the procedures in section 9.3.8.

9.4.7.1 Corrected Receive Loudness Rating Using a Free-Field Microphone.

Receive loudness ratings calculated according to IEEE Standard 661-1992 and ITU-T Recommendation P.79 assume listening with one ear, using a handset. Both ears are used to listen to a hands-free telephone, resulting in a perceived loudness advantage compared to handset listening. Hands-free receive loudness rating calculations must therefore be corrected to enable subjectively relevant comparison between handset and hands-free loudness ratings.

The receive loudness rating is first calculated according to the relevant standard. Then 14 dB is subtracted from the result, giving the hands-free receive loudness rating. The effect of the correction is to make the corrected hands-free rating appear 14 dB louder than the uncorrected rating. The 14 dB correction is provisional, and is under continued study.

9.4.7.2 Corrected Receive Loudness Rating When Using HATS

When using HATS, the correction is 12 dB. Most of this correction is due to binaural versus monaural listening. Other factors include head diffraction, pinna effects, and the position of real ears relative to 50TP.

9.4.8 Mid-Band Average Receive Sensitivity.

Unlike a handset or headset, a hands-free telephone is not closely coupled to the mouth and ear during use. Therefore, a single-number sensitivity calculation more general than loudness rating may be appropriate for some applications. The mid-band average sensitivity is useful for estimating the electroacoustic transducer sensitivity and/or output level.
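As a rough illustration of this single-number calculation, the sketch below takes the arithmetic mean of the dB response values at the eight ISO R10 1/3 octave frequencies from 500 Hz to 2500 Hz, the same arithmetic used for send in equation 4. The function name `mid_band_average` and the response values are hypothetical and are not measured data.

```python
# ISO R10 1/3 octave frequencies from 500 Hz to 2500 Hz (eight bands).
MID_BAND_HZ = (500, 630, 800, 1000, 1250, 1600, 2000, 2500)


def mid_band_average(response_db):
    """Mean of the dB response values at the eight mid-band frequencies.

    `response_db` maps frequency in Hz to response in dB; values outside
    the 500 Hz to 2500 Hz band are ignored.
    """
    missing = [f for f in MID_BAND_HZ if f not in response_db]
    if missing:
        raise ValueError(f"missing 1/3 octave values at {missing} Hz")
    return sum(response_db[f] for f in MID_BAND_HZ) / len(MID_BAND_HZ)


# Hypothetical receive response in dB Pa/V (illustrative numbers only).
example_response = {400: -12.0, 500: -10.0, 630: -8.0, 800: -6.0,
                    1000: -5.0, 1250: -5.0, 1600: -6.0, 2000: -8.0,
                    2500: -10.0, 3150: -13.0}

s_r = mid_band_average(example_response)
print(f"mid-band average: {s_r:.2f} dB Pa/V")
```

Because the average is taken over dB values rather than linear magnitudes, a response in another format would first need converting to 1/3 octave data as described in section 9.3.8.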
The mid-band average receive sensitivity is calculated using equation 6:

(Eq. 6)   S_R = \frac{1}{8} \sum_{f = 500}^{2500} H_R(f)   [dB Pa/V]

where H_R(f) are the receive response values (in dB Pa/V) at the ISO R10 preferred 1/3 octave frequencies from 500 Hz to 2500 Hz. For responses in other formats, the data should be converted using the method described in section 9.3.8.

The normalized receive output sensitivity for a measurement at 50TP is:

(Eq. 7)   S_{RO} = S_R + 88   [dB SPL at 1 m for 1 Volt input]

The normalized receive output sensitivity is similar to a sensitivity specification for a loudspeaker.

9.4.9 Receive Directionality.

As with the send condition, the receive characteristics of the HFT can vary with position around the telephone. The overall receive directional characteristics are reported as Receive Loudness Rating Directionality (RLRD). RLRD is a measure of the Receive Loudness Rating versus angle around the HFT. Measurement methods and data presentation are the same as used for Send Directionality, Section 9.3.10, with the exception that receive measurements are made with the measurement microphone at the 50TP or RTP. Due to variations in the receive frequency response around the HFT, it is advisable to also investigate each response at various measurement points. For example, polar plots at various frequencies can be derived from these responses.

9.5 Digital Only.

9.5.1 Echo Path Response.

The echo path frequency response is the RMS power spectrum at the send electrical test point (SETP) divided by the spectrum at the receive electrical test point (RETP).

(Eq. 8a)   H_{EP}(f) = 20 \log \frac{G_{SETP}(f)}{G_{RETP}(f)}   in dB V/V

Note: If the cross-spectrum method is used, the echo path frequency response becomes:

(Eq. 8b)   H_{EP}(f) = 20 \log \frac{G_{(RETP)(SETP)}(f)}{G_{(RETP)}(f)}   in dB V/V

where:
G_{(RETP)(SETP)}(f) is the cross spectrum
G_{(RETP)}(f) is the input autospectrum

These power spectra can be obtained by using discrete measurements of the power at each point as outlined in sinusoidal methods, or by continuous spectrum methods. Results should be reported as dB V/V.

10. Voice Switching Measurements

10.1 General.

Most hands-free telephones contain voice-switched circuits whose main function is to avoid singing through acoustic feedback. In various ways, such circuits insert a loss in either the sending or receiving direction relative to full send or receive. Switching from one direction to the other occurs when a signal above a certain threshold is applied from the opposite direction, or when the control circuit, taking into account the relative levels and the nature of the signals in both directions, allows the switching. The amount of switch loss a set employs determines the type of set as described below.

Some designs employ acoustic and/or line echo cancellation, and can closely approximate full duplex performance; however, many current echo canceller designs still need some switching to meet necessary requirements for echo return loss (ERL). In the case of acoustic echo cancelling hands-free telephones, voice switching implementation is similar to voice-switch only telephones, but with shallower switched loss depth.

The following sections define the switching parameters and describe methods for testing them.

10.2 Classification.

Hands-free sets are classified into three types on the basis of duplex capability, and this classification is determined by the attenuation range (aH) as defined below:

Type 1 has less than 3 dB of switched loss depth and can be considered “Full Duplex”.
Type 2 has between 3 and 20 dB of switched loss depth and can be considered “Partial Duplex”.
Type 3 has more than 20 dB of switched loss depth and has no duplex capability.

Measurements of voice switching characteristics may be divided into two categories:

A: Characteristics for alternate conversation, in which two parties communicate by alternating speech (single talk) spurts without interrupting each other. In this case, it may be assumed that the voice switch circuit returns to an idle state before being activated by an input signal in either direction.

B: Characteristics for simultaneous conversation, in which both parties may interrupt each other by simultaneous talk (double-talk), or where speech at one end of a connection breaks through acoustic or network noise that is present at the other end.

The first case is of fundamental importance, as its characteristics also affect simultaneous conversation characteristics, and hands-free telephones should therefore always be checked in that respect. The second case is a difficult environment for typical switched-loss hands-free telephones, as well as hands-free telephones employing acoustic echo cancellation.

10.3 Switching Parameters.

There are six fundamental voice switching parameters: threshold level (ITH), build-up time (TR), hang-over time (TH), switching time (TS), take-over time (TT) and attenuation range (aH). A suitable choice of switching parameter values can minimize the degradation of speech quality introduced by voice switching. Improper choice of parameter values, particularly switching times, may lead to serious clipping effects and loss of initial or final consonants in speech.
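The classification in 10.2 reduces to two comparisons on the attenuation range aH. The sketch below is illustrative only: the function name `classify_hft` is hypothetical, and the treatment of the boundary values (exactly 3 dB and exactly 20 dB fall in Type 2) is an assumption based on the wording "less than 3 dB" and "more than 20 dB".

```python
def classify_hft(attenuation_range_db):
    """Map attenuation range aH (dB of switched loss depth) to an HFT type.

    Assumption: exactly 3 dB and exactly 20 dB are treated as Type 2.
    """
    if attenuation_range_db < 0:
        raise ValueError("attenuation range must not be negative")
    if attenuation_range_db < 3:
        return "Type 1 (full duplex)"
    if attenuation_range_db <= 20:
        return "Type 2 (partial duplex)"
    return "Type 3 (no duplex capability)"


for a_h in (1.5, 3.0, 12.0, 20.0, 24.0):
    print(f"aH = {a_h:4.1f} dB -> {classify_hft(a_h)}")
```

Since the attenuation range can depend on the volume control setting, a value used for classification would normally be the one measured at the reference setting.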
Threshold levels should be chosen so that switching is not interrupted by random environmental noise sources at either end of the call, and must also allow the user to move about close to the HFT. In addition, ambient room/network noise effects on threshold should not impair performance. Ambient noise levels can be used to improve threshold performance, as talkers tend to speak louder in a noisy environment than in a quiet one.

Build-up time should be short enough so that the initial transient components of speech are not lost, but not so short that insertion loss removal would be noisy.

Hang-over time should be long enough to cover average pauses in speech so that intermittent unwanted switching does not occur before the initial talker is finished, but short enough to allow for reasonable break-in from the second talker.

Switching time from one active state to the other should be balanced to best simulate full duplex operation. Switching time is also dependent on both build-up time and hang-over time.

Attenuation range can be measured during the measurement of switching time from one active state to another. The attenuation range is obtained from the difference between the maximum level at full activation and the minimum level obtained immediately after transmission reversal. This represents the maximum electrical return loss introduced by switched loss only in the HFT (it does not include the loss due to an acoustic echo canceller). At the reference volume control setting, this value can be used to classify the HFT as Type 1, Type 2 or Type 3, but be aware that the HFT can, in some cases, have different attenuation ranges depending upon the volume control setting.

Many of these parameters will be somewhat different in the case of a full or partial duplex hands-free system (Type 1 & 2) that uses acoustic echo cancellation.
If voice switching is employed along with acoustic echo cancellation, the switching depth may be reduced, thus making measurements more difficult.

10.4 Test Conditions.

Unless stated otherwise in the following procedures, all test equipment, the test environment, impairments, calibration and positioning are the same as for the transmission tests. Refer to Sections 7, 8, and 9.2.2 for all HFTs. Refer to Sec. 9.3.2 (HFT microphones) and 9.4.3 (HATS) when applicable. Unless stated otherwise, a test hybrid as described in Section 7.2.5 is necessary for applying the following tests to an analog HFT.

It is important for the terminal to function properly in the face of impairments such as acoustic noise, network noise and echo path changes, both acoustic and network in nature. It is recommended that testing be repeated along with the application of combinations of impairment. Refer to section 7.4.

10.4.1 Signal Levels.

The send levels should be variable from -50 to 0 dBPa and the receive levels should be variable from -70 to 0 dBV.

10.4.2 Loop Lengths.

It is advisable to test analog hands-free telephone switching parameters with a variety of loop lengths as defined in section 7.4.2.1, or whatever the intended deployment recommends. In all cases, state the conditions.

10.4.3 Noise Levels.

Network noise environments of -90 to -50 dBmp, as measured at the SETP, should be employed as recommended for switching parameter tests above. Network noise is defined in section 7.4.1.4. A range of 35 to 65 dBA Hoth acoustic noise should be used as recommended for the tests above. ITU recommends that 50 dBA be used as a Hoth noise level. Hoth room noise is defined in section 7.4.2.2 and can be generated as described in annex D.
For typical measurement conditions, vary the network noise from -90 dBmp to -50 dBmp in 10 dB increments. Similarly, vary the acoustic Hoth noise [see annex D] from 35 dBA to 65 dBA in 10 dB increments. If the test chamber permits, also perform measurements in as quiet an ambient room noise condition as possible. For measurements that require both electrical and acoustic noise, a matrix combination of the above network and acoustic noise levels should be performed.

The above matrix can be varied according to need depending on the type of switching parameter. It is valuable to test threshold levels under a wider variety of conditions as compared to switching times.

10.5 Test Parameters.

10.5.1 Threshold Level.

Threshold level (ITH) is the minimum signal level necessary for removing insertion loss. Due to the unstable hysteresis-type nature of the voice switch around threshold, the “full on” threshold is defined as the minimum level for which the voice switch is in the stable state where the insertion loss can be completely removed.

The envelope of the 1004 Hz periodic tone burst input signal is shown in figure 10. Adjust amplitude I1 to determine ITH. Adjust amplitude I2 to be zero, i.e. a silent period. Adjust the on time T1 to 100 ms. Adjust the off time T2 to be greater than the hang-over time. For Type 2 and 3 sets, send and receive thresholds can be obtained by increasing the amplitude from a low level until “full on” switching occurs. The send threshold level is measured in dBPa with respect to the MRP and the receive threshold is measured at the RETP. The reference volume control setting should be used for these tests, but other volume control settings should also be used to characterize its effect on threshold.

For both transmission directions, measure the threshold level under quiet conditions, noisy network conditions and noisy room conditions. See section 7.4 for basic environment conditions.

10.5.2 Build-Up Time.
Build-up time (TR) is the time from the input signal going above the threshold level up to 50% of the complete removal of the insertion loss. Note that 50% refers to a representation of the output amplitude on a linear scale.

The envelope of the 1004 Hz periodic tone burst input signal is shown in figure 10. For receive, it is recommended that amplitude I1 be adjusted to -30 dBV. For send, it is recommended that amplitude I1 be adjusted to -4.7 dBPa. Amplitude I2 may be set to zero. Adjust the on time T1 to 100 ms. Adjust the off time T2 to be greater than the hang-over time.

For both transmission directions, measure the build-up time at the reference volume control setting with no network or acoustic impairments.

10.5.3 Hang-Over Time.

Hang-over time (TH) is the time from the input signal going below the threshold level up to 50% of the complete insertion of the switched loss. Note that 50% refers to a representation of the output amplitude on a linear scale.

The envelope of the 1004 Hz periodic tone burst input signal is shown in figure 10. For receive, it is recommended that amplitude I1 be adjusted to -30 dBV. For send, it is recommended that amplitude I1 be adjusted to -4.7 dBPa. Adjust amplitude I2 to a level just below ITH or set I2 so that the set will switch to an idle mode. Adjust the on time T1 to 100 ms. Adjust the off time T2 to be greater than the hang-over time.

For both transmission directions, measure the hang-over time under typical line and room conditions with the volume control in the nominal position. See section 7.6 for basic environment conditions.
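The two timing definitions above can be applied to a sampled output envelope by locating the point at which the envelope crosses half of a reference amplitude on a linear scale. The sketch below is one possible reading, not a prescribed algorithm: the function names are hypothetical, the envelope is assumed to have been extracted from the tone burst already, and the half-amplitude reference is assumed to be the full-on output level for build-up and the output level held just before the switched loss is inserted for hang-over.

```python
def first_crossing(times, envelope, start_time, level, rising=True):
    """Time of the first crossing of `level` in a segment ending at or after
    `start_time`, found by linear interpolation. Returns None if no crossing."""
    for i in range(1, len(times)):
        t0, t1 = times[i - 1], times[i]
        if t1 < start_time:
            continue
        y0, y1 = envelope[i - 1], envelope[i]
        crossed = (y0 < level <= y1) if rising else (y0 > level >= y1)
        if crossed:
            return t0 + (level - y0) / (y1 - y0) * (t1 - t0)
    return None


def build_up_time(times, envelope, onset_time, full_level):
    """Time from input onset to the output reaching 50% of `full_level`."""
    t_half = first_crossing(times, envelope, onset_time, 0.5 * full_level, True)
    return None if t_half is None else t_half - onset_time


def hang_over_time(times, envelope, removal_time, hold_level):
    """Time from input removal to the output falling to 50% of `hold_level`."""
    t_half = first_crossing(times, envelope, removal_time, 0.5 * hold_level, False)
    return None if t_half is None else t_half - removal_time


# Hypothetical envelopes: time in ms, linear output amplitude (arbitrary units).
rise_t, rise_env = [0, 10, 30, 40], [0.0, 0.0, 1.0, 1.0]
t_r = build_up_time(rise_t, rise_env, onset_time=10, full_level=1.0)

fall_t = [0, 100, 101, 250, 270, 300]
fall_env = [1.0, 1.0, 0.2, 0.2, 0.0, 0.0]
t_h = hang_over_time(fall_t, fall_env, removal_time=100, hold_level=0.2)

print(f"build-up time: {t_r} ms, hang-over time: {t_h} ms")
```

In the second example the output first drops with the input and then holds at 0.2 until the switched loss is inserted, so the half-amplitude point is taken relative to that held level rather than to the full-on level.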
[Figure 10a: timing diagram with two traces against time. Send Input Pressure: tone burst envelope with amplitudes I1(S) and I2(S), threshold ITH(S), on time T1 and off time T2. Send Output Voltage: output levels O1(S), O2(S) and O3(S) with half-amplitude (1/2) markers, and the intervals TR(S) and TH(S).]
TR(S) = measured build-up time (send)
TH(S) = measured hangover time (send)

Figure 10a - Send Build-up and Hangover Time

[Figure 10b: timing diagram with two traces against time. Receive Input Voltage: tone burst envelope with amplitudes I1(R) and I2(R), threshold ITH(R), on time T1 and off time T2. Receive Output Pressure: output levels O1(R), O2(R) and O3(R) with half-amplitude (1/2) markers, and the intervals TR(R) and TH(R).]
TR(R) = measured build-up time (receive)
TH(R) = measured hangover time (receive)

Figure 10b - Receive Build-up and Hangover Time

10.5.4 Switching Time and Thresholds Between Two Active States.

Switching time (TS) is the time taken to switch from one active state to the other, i.e. from full send to full receive or from full receive to full send in alternating conversation. The signal in the first direction is removed 10 to 30 ms after application of the signal in the opposite direction. TS is measured from the removal of signal in the first direction to 50% removal of loss in the opposite direction. 50% refers to a representation of the output amplitude on a linear scale. Active-to-active thresholds are measured under the same conditions, and are determined when full transmission direction reversal no longer occurs.

The following method typically applies to both Type 2 and Type 3 sets. This test method recommends that the receiving test microphone be placed halfway between the set and mouth simulator to keep relative levels of send and receive even, facilitating measurements. Mounting the microphone at the 50TP reference position is recommended for break-through threshold level measurements.
For receive to send, the hands-free telephone is subjected to a periodic 1000 Hz tone burst in the receive direction. Within a delta (∆) of 10 to 30 ms before the receive burst is completed, a send burst is sent to the set to switch the transmission direction (see figure 11a).

For send to receive, the hands-free telephone is subjected to a periodic 1000 Hz tone burst from the mouth simulator. Within a delta (∆) of 10 to 30 ms before the send burst is completed, a receive burst is sent to the set to switch the transmission direction (see figure 11b).

The overlapped time interval before the primary tone burst is completed and the secondary tone burst is sent should be within the range of 10 to 30 ms. Send and receive tone bursts should have adjustable amplitudes to facilitate proper switching before the timing measurement is made.

For attenuation range (aH), upon capture of the active-to-active state reversal of the transmission direction, record the amount of loss removal for both transmit to receive, and receive to transmit, and obtain the difference between the maximum level at full loss removal and the minimum level obtained after transmission reversal. If the transmit to receive is not the same as receive to transmit, then the larger of the two is the attenuation range.

The send levels should be varied from -20 to 6 dBPa and the receive levels should be varied from -55 to -10 dBV to measure break-through switching thresholds where full transmission direction change occurs. This should be done for typical noise environments of -90 to -50 dBmp network noise levels and acoustic Hoth noise levels of 35, 45, 55, and 65 dBA (see section 7.6).
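The attenuation range calculation described above is a level difference taken in each direction, followed by a maximum. The sketch below illustrates that arithmetic only; the function names and the level values are hypothetical, and the levels are assumed to be expressed in dB so that the difference is a loss in dB.

```python
def direction_loss_db(max_level_db, min_level_db):
    """Switched loss in one direction: the maximum level at full loss removal
    minus the minimum level obtained after transmission reversal."""
    return max_level_db - min_level_db


def attenuation_range_db(send_to_receive_db, receive_to_send_db):
    """Attenuation range aH: the larger of the two directional losses."""
    return max(send_to_receive_db, receive_to_send_db)


# Hypothetical captured levels in dB (illustrative numbers only).
loss_rs = direction_loss_db(max_level_db=-10.0, min_level_db=-34.0)  # receive to send
loss_sr = direction_loss_db(max_level_db=-12.0, min_level_db=-30.0)  # send to receive

a_h = attenuation_range_db(loss_sr, loss_rs)
print(f"receive to send: {loss_rs} dB, send to receive: {loss_sr} dB, aH = {a_h} dB")
```

With these example numbers the two directions differ by 6 dB, and the larger value is the one reported as aH.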
[Figure 11a: timing diagram with four traces against time. Send Input Pressure: threshold, I1(S), I0(S). Receive Input Voltage: I0(R), with T1 and the overlap ∆T. Send Output Voltage: O1(S) with a half-amplitude (1/2) marker, T0 and TS(R-S). Receive Output Pressure: O0(R).]
TS(R-S) = measured switching time (receive to send)

Figure 11a - Send Switching Time

[Figure 11b: timing diagram with four traces against time. Receive Input Voltage: threshold, I1(R), I0(R). Send Input Pressure: I0(S), with T1 and the overlap ∆T. Receive Output Pressure: O1(R) with a half-amplitude (1/2) marker, T0 and TS(S-R). Send Output Voltage: O0(S).]
TS(S-R) = measured switching time (send to receive)

Figure 11b - Receive Switching Time

10.5.5 Take-Over (Break-Through) Time.

Take-over time (TT) is the time taken to switch from one transmission direction to another, in double talk conversation. It is measured from the start of an interrupting signal to the time of 50% removal of the insertion loss in the interrupting direction. The signal in the first direction is applied continuously during the test.

Take-over time is measured as shown in figures 12a and 12b. The test microphone is placed at the near field test point (NFTP). The test results may be influenced by T0, the conditioning time between the start of the first signal and the application of the interrupting signal. 200 ms is the standard value for T0.

Take-over time can vary greatly depending on the relative levels in each direction. Testing over a range of levels is recommended. Test levels in each direction and T0 should be reported along with TT. TT(R-S) denotes take-over time from receive to send, TT(S-R) the reverse.

11. Acoustic Echo Canceller Measurements

An Acoustic Echo Canceller (AEC) is a device or function that aids in improving the duplex performance of an HFT by reducing the echo perceived by the far end. Even a superior acoustic design will suffer from room reverberation and acoustic coupling due to the sensitivity of the microphone and the close proximity of the microphone and loudspeaker. Unlike switched-loss echo suppression, an AEC attempts to remove the echo signal by estimating the returned echo and subtracting the echo from the transmitted signal.

An AEC can reduce or eliminate the amount of switched-loss necessary to maintain stability of the acoustic system, resulting in improved HFT performance. Ideally, the AEC reduces the amount of switched-loss necessary, and results in a Type 1 HFT which allows full-duplex communication. Conversely, a poor AEC implementation will impair HFT performance.

It is important that the HFT performs reasonably well in terms of loudness ratings, frequency response and noise for both transmission directions. By reducing transmit sensitivity to below nominal, it is possible to enhance echo control and full duplex operation. This design will result, however, in degraded performance of the HFT. Send and receive transmission measurements (Section 9) should be made before proceeding with the tests in this section.

Type 1 and Type 2 HFTs may employ switched-loss while the AEC adapts to the environment and speech signals. The characteristics of the switched-loss operation can be evaluated using the techniques described in Section 10. However, care must be taken to ensure the AEC does not affect the switched-loss characteristics as they are evaluated. This can be accomplished by allowing the AEC to fully train in both transmission directions.

The perceived quality of an AEC is a function of both the physical characteristics of the echo canceling function and the subjective nature of human hearing.
The following test method provides means to evaluate how the AEC will be perceived as well as how the AEC operates. The use of post processing of the test results, temporal weighting, specific test signals and a real-world environment allows the method to produce objective measures which correlate to subjective perception of performance.

This section contains procedures for evaluating the performance of an Acoustic Echo Canceller (AEC) as implemented in a full duplex (Type 1) or partial duplex (Type 2) HFT as defined in section 3. The measurable quality parameters included are: echo path delay (EPD), echo return loss (ERL), convergence time (TC), attenuation as a function of time (A), and front end clipping time (TF). Procedures are given for both single-talk (ST) and double-talk (DT) conditions.

Measurement of the above parameters requires specialized techniques and careful control of environment and test signals to obtain accurate and repeatable results. Unlike Type 3 HFTs employing only voice switching, an AEC device typically employs non-linear processing, which makes it not only a time varying device but also an adaptive device. That is, the HFTs to which this section applies will respond differently depending on the environment, the type and level of test signals applied, as well as the order and timing of the measurement steps.

An additional difference between AEC and voice switched HFTs is that the AEC device is likely to introduce some delay between the applied signal and the transmitted echo. This delay complicates the ERL measurements and requires some form of delay compensation to allow for accurate measurements.

11.2 Test Signals.
Real speech (6.6.3) and a sine wave embedded in real speech (6.7.2.9) are recommended for most of the tests in this section, and could be used for conformance once a standardized speech sample set is agreed upon. The Composite Source Signal (CSS) (6.7.1.1) has been proposed as an alternative, though it can provide an overly optimistic ERL if the AEC has a non-linear processor (NLP). The artificial voice according to ITU-T Recommendation P.50 (6.6.1.1) has not yet been fully investigated for use in these tests. Random or pseudorandom noise (6.4.2 and 6.5) is recommended for EPD, while CSS is recommended for TC. For a fuller discussion of test signals see section 6.

Unless otherwise noted, receive test signals should be presented at -18 dBV at RETP, and send test signals at -4.7 dBPa at MRP. Since HFT performance may change with the signal levels presented to it, it is advisable to investigate the HFT using different test signal levels. For example, a range of -26 to -6 dBV is suggested for receive, and -6 to +6 dBPa for send.

11.3 Test Conditions.

Unless stated otherwise in the following procedures, all test equipment, the test environment, impairments, calibration and positioning are the same as for the transmission tests. Refer to Sections 7, 8, and 9.2.2 for all HFTs. Refer to Sections 9.3.2 (HFT microphones) and 9.4.3 (HATS) when applicable. Unless stated otherwise, a test hybrid as described in Section 7.2.5 is necessary for applying the following tests to an analog HFT.

It is important for the terminal to function properly in the face of impairments such as acoustic noise, network noise and echo path changes, both acoustic and network in nature. It is recommended that testing be repeated with combinations of impairments applied. Refer to section 7.4.

11.4 Round Trip Echo Path Delay (EPD).

Echo audibility is dependent upon the round trip delay in the echo path. This delay is measured by determining the impulse response of the HFT.
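As an illustration only, the following Python sketch shows one common way an impulse response between two test points can be estimated from a white noise stimulus: by normalised cross-correlation. It is not a procedure taken from this standard. The function names, the synthetic echo path and the 8 kHz sample rate are assumptions made for the example, and the delay reported is simply that of the strongest tap.

```python
import random


def estimate_impulse_response(stimulus, response, max_lag):
    """Cross-correlate stimulus and response for lags 0..max_lag.

    For a white stimulus, the normalised cross-correlation approximates
    the impulse response between the two measurement points.
    """
    energy = sum(x * x for x in stimulus)
    taps = []
    for lag in range(max_lag + 1):
        acc = 0.0
        for i in range(len(stimulus) - lag):
            acc += stimulus[i] * response[i + lag]
        taps.append(acc / energy)
    return taps


def strongest_tap_delay_ms(taps, sample_rate_hz):
    """Delay of the largest-magnitude tap, in milliseconds."""
    lag = max(range(len(taps)), key=lambda k: abs(taps[k]))
    return 1000.0 * lag / sample_rate_hz


# Synthetic check (hypothetical echo path): direct coupling after
# 40 samples (5 ms at 8 kHz) plus a weaker reflection 25 samples later.
random.seed(1)
SAMPLE_RATE_HZ = 8000
retp = [random.gauss(0.0, 1.0) for _ in range(4000)]
setp = [0.0] * len(retp)
for i in range(40, len(retp)):
    setp[i] = 0.1 * retp[i - 40]
    if i >= 65:
        setp[i] += 0.03 * retp[i - 65]

taps = estimate_impulse_response(retp, setp, max_lag=100)
epd_ms = strongest_tap_delay_ms(taps, SAMPLE_RATE_HZ)
```

In a real measurement the two lists would be the time-aligned recordings at RETP and SETP; a pink noise stimulus would need whitening first, which this sketch does not attempt.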
Continuous white or pink noise, random or pseudorandom, is the recommended test signal (6.4.2 and 6.5). The compound signal of Section 6.7.2.4 or 6.7.2.8 may also be used. The HFT can be placed in nearly any reverberant or non-reverberant environment, as the first acoustic echo will be due to direct coupling in or near the HFT.

The test signal is applied at RETP at -18 dBV for 10 seconds so that the acoustic echo canceller reaches full convergence. No signal other than the acoustic return from the loudspeaker(s) is applied to the microphone(s). After 10 seconds, echo is measured at SETP, and the impulse response is then calculated between RETP and SETP. Echo path delay is the delay of the primary peak in the impulse response. The first peak is taken as the start of the impulse unless a subsequent peak is at least 10 dB greater.

11.5 Echo Return Loss (ERLST) – Single Talk.

Echo return loss from the network interface perspective is measured. Three methodologies are described, in descending order of preference. Only one method need be conducted, with the first strongly recommended. Real speech (6.6.3) is the recommended test signal for all three methods.

11.5.1 Echo Return Loss, Temporally Weighted – Single Talk (ERLTST).

The test signal is applied at RETP for 30 seconds so that the different functional units (in particular the acoustic echo canceller) reach their steady states. No signal other than the acoustic return from the loudspeaker(s) is applied to the microphone(s). Record the electrical signals at RETP and SETP for the next 1 minute. Align the RETP and SETP recordings in time by adding delay equal to EPD to the RETP signal. The time dependent value ERLTST is the difference (in dB) between the signal level at RETP and SETP calculated using the algorithm in annex H.
ERLTST is named ALERLt in Annex H. Background for the algorithm is in annex G.

11.5.2 Echo Return Loss, Segmental – Single Talk (ERLSST).

Segmental ERL, ERLs, is described in the Freetel methodology (footnotes 8 and 9). Echo and source powers are broken into 32 ms segment averages, then echo return loss is calculated for each segment. An averaging window ten segments long slides every 32 ms to smooth the results. It is more a measure of canceller behavior than user reaction.

ERLs should not be calculated for segments in which the stimulus at RETP is below a threshold 10 dB less than the long term rms level of the signal at RETP, so that source inactivity is not falsely interpreted as high echo return loss. Instead, the ERLs of such a segment is made equal to that of the previous segment, providing smoothing. With continuous speech, this will statistically occur less than 3% of the time (ITU-T Recommendation P.50, figure 4).

The test method is the same as 11.5.1, except that the time-dependent value ERLSST is the difference (in dB) between the signal level at RETP and SETP calculated as shown in annex H, except using 32 ms power averaged segments, averaged over 10 segments. Background for the algorithm is in annex G.

Footnote 8: Enhancements of hands-free telecommunications, Esprit Consortium, Annals of telecommunications, 49, no. 7-8, 1994.
Footnote 9: Methodology of Evaluation and Standards, Deliverable 1.2, Freetel, July 29 1993.

11.5.3 Weighted Terminal Coupling Loss – Single Talk (TCLWST).

TCLw is defined in ITU-T G.167 and ITU-T G.122; however, TCLw is not recommended, as it does not adequately account for audible temporal fluctuations in TCL. ITU-T SGXII reports show that segmental TCL can vary by 10 dB or more over time.

The test method is the same as 11.5.1, except that no time alignment is needed.
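For readers who want to experiment with the segmental calculation of 11.5.2, the following Python sketch shows one possible reading of it. It is not the annex H algorithm. The function names are hypothetical, the sliding window is assumed to average the per-segment dB values (the text could also be read as averaging powers), and inactive segments before the first valid one are simply skipped.

```python
import math
import random


def segment_powers(signal, seg_len):
    """Mean power of each complete, non-overlapping segment."""
    return [
        sum(x * x for x in signal[i:i + seg_len]) / seg_len
        for i in range(0, len(signal) - seg_len + 1, seg_len)
    ]


def segmental_erl_db(retp, setp, sample_rate_hz, seg_ms=32.0,
                     window=10, threshold_db=10.0):
    """Smoothed segmental ERL in dB, one value per 32 ms segment."""
    seg_len = int(round(sample_rate_hz * seg_ms / 1000.0))
    source = segment_powers(retp, seg_len)
    echo = segment_powers(setp, seg_len)
    long_term_power = sum(x * x for x in retp) / len(retp)
    activity_floor = long_term_power / (10.0 ** (threshold_db / 10.0))

    per_segment = []
    previous = None
    for p_source, p_echo in zip(source, echo):
        if p_source >= activity_floor and p_echo > 0.0:
            previous = 10.0 * math.log10(p_source / p_echo)
        # Inactive source: hold the previous segment's value.
        if previous is not None:
            per_segment.append(previous)

    smoothed = []
    for i in range(len(per_segment)):
        recent = per_segment[max(0, i - window + 1):i + 1]
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Synthetic check: echo is the source attenuated by 40 dB, with a
# silent gap in the source covering segments 10 to 12.
random.seed(2)
retp = [random.gauss(0.0, 1.0) for _ in range(8000)]
for i in range(2560, 3328):
    retp[i] = 0.0
setp = [0.01 * x for x in retp]
erl = segmental_erl_db(retp, setp, sample_rate_hz=8000)
```

The recordings passed in are assumed to be already aligned in time using the measured EPD.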
The time independent value TCLWST is the difference (in dB) between the 1-minute averages of the signals at RETP and SETP, calculated as shown in G.122, trapezoidal method.

11.6 Convergence Time (Tc).

The test procedure is an extension of 11.5.2 (ERLSST), except that CSS is recommended as the test signal. Convergence measurement is an extension of the segmental ERL measurement, with further post processing.

The test signal is applied at RETP for 90 seconds so that the different functional units (in particular the acoustic echo canceller) reach their steady states. No signal other than the acoustic return from the loudspeaker(s) is applied to the microphone(s). Record the electrical signals at RETP and SETP for the entire 90 seconds. Align the RETP and SETP recordings in time by adding delay equal to EPD to the RETP signal.

Calculate the long term ERLSST over the final 60 seconds, using the method of section 11.5.2, and averaging all 320 ms segments. For the first 30 seconds, calculate the time dependent value ERLSST as the difference in dB between the signal level at RETP and SETP, calculated as in section 11.5.2 using 32 ms power averaged segments, averaged over 10 segments. However, in this case ERLSST should not be calculated for segments in which the stimulus at RETP is below a threshold 25 dB less than the long term rms level of the signal at RETP.

Determine the time required for the time dependent value ERLSST to reach within 3 dB of the long-term average ERLSST. This is the convergence time, Tc.

11.7 Echo Return Loss, Temporally Weighted – Double Talk (ERLTDT).

Real speech (6.6.3) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied to RETP. The “talker initiating double talk” mask is applied at the mouth simulator. Carry out the test described in Annex I, section I.5.
Record the electrical signals at RETP and SETP during the 20-second tone application. Align the RETP and SETP recordings in time by adding delay equal to EPD to the RETP signal. The time dependent value ERLTDT is the difference (in dB) between the test signal level at RETP and the echo signal level at SETP, calculated using the algorithm in annex H. ERLTST is named “ALERLt” in Annex H.

11.8 Send Speech Attenuation During Double Talk (ADT_S).

Two methods are given. The first measures the attenuation vs time after entering doubletalk for a specific frequency. The result of this method may depend greatly on the exact nature of the speech signal used, particularly as doubletalk is begun. There may also be a dependence on the frequency of the measurement tone, which is a sine wave embedded in the real speech creating doubletalk.

The second method measures a long-term conversational average. The result may depend greatly on the nature of the speech signals used, especially their temporal characteristics and degree of correlation. The result is the send response during doubletalk, which can be compared to the single-talk response measured in Section 9.

11.8.1 Send Speech Attenuation During Double Talk vs Time (ASTDT).

A sine wave embedded in real speech (6.7.2.9) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied at MRP. The “talker initiating double talk” mask is applied at RETP. Carry out the test described in Annex I, section I.3.
Using the 8 ms sliding averaging window on the sine signal measured at SETP, the time dependent value ASTDT is the difference (in dB) between the first 8 ms average before double talk and each 8 ms average after double talk.

11.8.2 Send Speech Attenuation During Double Talk, Conversational Average (ASADT).

When applying this test to an analog HFT, a test hybrid is not strictly necessary. However, use of a test hybrid will improve the signal-to-noise ratio of the measurement, resulting in lower averaging time.

TDS sweep or pseudorandom noise with real speech as the bias (6.7.2.3 or 6.7.2.7) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied at MRP. The “talker initiating double talk” mask is applied at RETP.

Immediately after the training period is completed, uncorrelated but similar real speech bias signals are applied continuously in both directions at standard levels, until the measurement is completed with sufficient averaging of the TDS sweep for a clean measurement. The TDS measurement sweep is applied in the send direction. Send speech attenuation is the send response obtained during doubletalk divided by the standard send response measured in Section 9 under the same conditions, except in single-talk mode. Send attenuation may also be expressed as the increase in SLR (loudness reduction) between the single-talk and double-talk cases.

11.9 Receive Speech Attenuation During Double Talk (ARDT)

Two methods are given. The first measures the attenuation vs time after entering doubletalk for a specific frequency. The result of this method may depend greatly on the exact nature of the speech signal used, particularly as doubletalk is begun. There may also be a dependence on the frequency of the measurement tone, which is a sine wave embedded in the real speech creating doubletalk.
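The attenuation-versus-time calculation used by the first method (an 8 ms sliding averaging window on the measurement tone) can be sketched in Python as below. This is a hedged illustration, not the Annex I procedure: it assumes the tone has already been extracted from the recorded signal, takes the reference as the 8 ms window immediately before the onset of double talk, and uses hypothetical function names and a synthetic 1 kHz tone.

```python
import math


def window_level_db(samples):
    """Mean-square level of a window in dB (relative to unit power)."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def attenuation_vs_time(tone, onset_index, sample_rate_hz, window_ms=8.0):
    """List of (seconds after onset, attenuation in dB) pairs.

    The reference is the last full window before double talk begins;
    each later window starts at or after the onset sample.
    """
    win = int(round(sample_rate_hz * window_ms / 1000.0))
    reference_db = window_level_db(tone[onset_index - win:onset_index])
    curve = []
    for start in range(onset_index, len(tone) - win + 1):
        level_db = window_level_db(tone[start:start + win])
        curve.append(((start - onset_index) / sample_rate_hz,
                      reference_db - level_db))
    return curve


# Synthetic check: a 1 kHz tone whose amplitude halves at the onset
# of double talk, i.e. about 6.02 dB of attenuation.
SAMPLE_RATE_HZ = 8000
ONSET = 800
tone = [
    (1.0 if i < ONSET else 0.5) * math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE_HZ)
    for i in range(2400)
]
curve = attenuation_vs_time(tone, ONSET, SAMPLE_RATE_HZ)
```

The same calculation applies to either transmission direction; only the test point at which the tone is observed changes.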
The second method measures a long-term conversational average. The result may depend greatly on the nature of the speech signals used, especially their temporal characteristics and degree of correlation. The result is the receive response during doubletalk, which can be compared to the single-talk response measured in Section 9.

11.9.1 Receive Speech Attenuation During Double Talk vs Time (ARTDT)

A sine wave embedded in real speech (6.7.2.9) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied at RETP. The “talker initiating double talk” mask is applied at MRP. Carry out the test described in Annex I, section I.3, substituting receive for send and vice-versa. Therefore, receive and send signals are swapped and results monitored at 50TP.

Using the 8 ms sliding averaging window on the sine signal measured at 50TP, the time dependent value ARTDT is the difference (in dB) between the first 8 ms average before double talk and each 8 ms average after double talk.

11.9.2 Receive Speech Attenuation During Double Talk, Conversational Average (ARADT)

When applying this test to an analog HFT, a test hybrid is not strictly necessary. However, use of a test hybrid will improve the signal-to-noise ratio of the measurement, resulting in lower averaging time.

TDS sweep with real speech or pseudorandom noise as the bias (6.7.2.3 or 6.7.2.7) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied at RETP. The “talker initiating double talk” mask is applied at MRP.
Immediately after the training period is completed, uncorrelated but similar real speech bias signals are applied continuously in both directions at standard levels, until the measurement is completed with sufficient averaging of the TDS sweep for a clean measurement. The TDS measurement sweep is applied in the receive direction. Receive speech attenuation is the receive response obtained during doubletalk divided by the standard receive response measured in Section 9 under the same conditions, except in single-talk mode. Receive attenuation may also be expressed as the increase in RLR (loudness reduction) between the single-talk and double-talk cases.

11.10 Send Speech Front End Clipping Time During Double Talk (TSFDT).

A sine wave embedded in real speech (6.7.2.9) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied at RETP. The “talker initiating double talk” mask is applied at MRP. Carry out the test described in Annex I, section I.4.

Using the 4 ms sliding averaging window on SETP, measure the rms level of the tone. Determine the time at which the tone level rises to within 3 dB of its average level at 200 ms into double talk. Determine the difference in time between this point and the cessation of the tone as seen at SETP. The difference between 200 ms and this length of time is the send speech front end clipping time during double talk (TSFDT).

If you have managed to read this far, you have won a vacation for two in Maui, compliments of Kruger and Associates Inc. 37 Somerset Dr., Commack, N.Y. 11725-1636, phone (516) 543-5392. This is subject to completion of a skill testing question administered by Kruger and Associates Inc.

11.11 Receive Speech Front End Clipping Time During Double Talk (TRFDT)

A sine wave embedded in real speech (6.7.2.9) is the recommended test signal. See further comments in Annex I, section I.2. The canceller is trained as described in Annex I, section I.1, with the activity mask “talker active just before onset of double talk” applied at MRP. The “talker initiating double talk” mask is applied at RETP. Carry out the test described in Annex I, section I.4, substituting receive for send and vice-versa. Therefore, receive and send signals are swapped and results monitored at receive output.

Using the 4 ms sliding averaging window on 50TP, measure the rms level of the tone. Determine the time at which the tone level rises to within 3 dB of its average level at 200 ms into double talk. Determine the difference in time between this point and the cessation of the tone as seen at SETP. The difference between 200 ms and this length of time is the receive speech front end clipping time during double talk (TRFDT).

Annex A
Simulated Speech Generator

Main Signal

The main signal consists of eight 1024-point pseudo-random noise segments. Each segment has the same magnitude spectrum but a different phase spectrum, with the phase randomized within and between the segments uniformly from 0 to 360 degrees, in order to randomize the interaction between the intermodulation products of the harmonically related spectral components. The duration of each segment is 80 ms and they are merged with each other through a raised cosine window with an additional 80 ms merging segment between them.
The simultaneous fade-out of the previous segment and fade-in of the following segment eliminate the transients which would otherwise occur at the segment boundaries. The complete main signal thus consists of eight pseudo-random segments interleaved with eight merging segments, each of 80 ms duration, for a total length of 1.28 seconds. A simple filter at the output provides the desired frequency shaping to approximate an average speech spectrum.

Modulating Signal

Measurements show that a Gamma distribution with parameter m = 0.545 provides a good approximation to the instantaneous amplitude distribution of continuous speech. The syllabic characteristics can be represented by a low pass response that is practically flat up to about 4 Hz (the -3 dB point) followed by -6 dB per octave roll-off. The final wave shape of the modulating signal was derived empirically from the Gamma distribution. Varying the period of this pulse in a pseudo-random manner and adjusting its rise and fall time ratio results in a satisfactory approximation to the spectrum of the modulation envelope of real speech.

Combined Signal

In order to extend the repetition time of the final signal and to spread the maxima of the modulating signal more evenly over the repeated sequence of the Gaussian signal, the ratio between the sampling clock frequencies of the two signals was chosen to be 4/255. Thus the clock frequency of the main signal is 12,800 Hz, and the clock frequency of the modulating signal is about 200.8 Hz. The repetition times are: 1.28 seconds for the Gaussian signal, 10.2 seconds for the modulating signal and 326.4 seconds for the final modulated signal.

[Figure A - Block Diagram of Simulated Speech Simulator. Blocks: Main Signal Source (Gaussian), Modulating Signal Source (Gamma), Shaping Filter, Output.]
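The clock and repetition figures quoted for the combined signal can be cross-checked with exact rational arithmetic. The Python sketch below is only a consistency check: it assumes the main signal is 16 segments of 1024 points, that the modulating signal is 2048 samples long (the sample count given for the Gamma function in this annex), and that the combined signal repeats at the least common multiple of the two periods. The helper name is hypothetical.

```python
from fractions import Fraction
from math import gcd


def lcm_of_fractions(a, b):
    """Least common multiple of two positive rational numbers."""
    # Bring both to the common denominator a.den * b.den, then take the
    # integer LCM of the scaled numerators.
    n1 = a.numerator * b.denominator
    n2 = b.numerator * a.denominator
    return Fraction(n1 * n2 // gcd(n1, n2), a.denominator * b.denominator)


main_clock_hz = Fraction(12800)
modulating_clock_hz = main_clock_hz * Fraction(4, 255)   # about 200.8 Hz

main_period_s = Fraction(16 * 1024) / main_clock_hz      # 16 segments of 1024 points
modulating_period_s = Fraction(2048) / modulating_clock_hz
combined_period_s = lcm_of_fractions(main_period_s, modulating_period_s)

print(float(modulating_clock_hz))   # about 200.78
print(float(main_period_s))         # 1.28
print(float(modulating_period_s))   # 10.2
print(float(combined_period_s))     # 326.4
```

Put another way, the combined signal repeats after 255 passes of the main signal, which coincide with 32 passes of the modulating signal.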
Gaussian Signal Generator

The Gaussian signal is made up of sixteen segments. The odd-numbered segments are generated by filling a 2 by n array with zeros and then filling in the desired real and imaginary spectrum components using equations A1 and A2. The first entry is zero, i.e. there is no DC component, and there are no components above 5500 Hz.

$$X_r(\omega) = \cos(2\pi\alpha - \pi) \qquad \text{(Eq. A1)}$$

$$X_i(\omega) = \sin(2\pi\alpha - \pi) \qquad \text{(Eq. A2)}$$

where α is a random number with uniform distribution, $0 \le \alpha \le 1$.

The inverse FFT is then taken to transfer the signal to the time domain:

$$x_i(n) \Leftrightarrow X_r(\omega),\ X_i(\omega)$$

The even-numbered segments S(n) are:

$$S_i(n) = S_i(n-1)\cdot 0.5\left(1 + \cos\frac{\pi(i-0.5)}{1024}\right) + S_i(n+1)\cdot 0.5\left(1 - \cos\frac{\pi(i-0.5)}{1024}\right) \qquad \text{(Eq. A3)}$$

with i = 1 to 1024 and n = 2, 4, ..., 16; for n+1 > 16 use n+1-16.

Gamma Function

For the Gamma function the 2048 samples are divided into 21 random length pulse periods (number of samples). The periods are 167, 43, 63, 119, 48, 57, 78, 88, 93, 107, 51, 71, 259, 60, 67, 207, 143, 54, 130, 45, 98. Each period is divided into a rise time of one third and a fall time of two thirds, i.e. rise and fall times are in a 1:2 ratio. The cubic interpolating spline function is used to model the rising and falling section of each segment.

First calculate (see footnote 10) the coefficients B(I), C(I), D(I) for I = 1 to 60 for a cubic interpolating spline. The number of points (knots) is 60. The abscissas of the knots, in increasing order, range in value from 0.05648176 to 0.983219. Y is the ordinate of the knots: Y(I) = I - 0.5.

For the rising time period:

$$s(i) = \text{spline value at abscissa } \left(-\frac{0.5}{n} + \frac{i}{n}\right)$$

For the falling time period:

$$s(i) = \text{spline value at abscissa } \left(-\frac{0.5}{n} + \frac{n+1-i}{n}\right)$$

where:
n = number of samples in the rising (or falling) section
s(i) is the value of the ith data point in the period

Footnote 10: Refer to G.E. Forsythe, M.A. Malcolm, C.B. Moler, “Computer Methods for Mathematical Computations”, Prentice Hall, Inc. 1977 for additional information.

Annex B
Composite Source Signal

The composite source signal consists of the following components:

ITU-T Recommendation P.50 Artificial Voice Component

The voiced signal part of the CSS is the conditioning signal intended to activate possible speech detectors in hands-free telephones. The reason a voiced signal has been chosen is that presumably all devices designed for speech transmission will quickly respond to a voiced sound. The signal is to activate the hands-free telephone for the direction of transmission to be measured. As the duration, beginning and end of the voiced signal are known exactly, this signal can also be used to measure the switching time. By means of the signal shape in the time domain, the switching time and delay time of the entire system can be determined according to ITU-T Recommendation P.34 [17]. The duration of the signal amounts to 50 ms.

Pseudo Noise Signal Component

The measurement signal is the pseudo noise signal presented after the voiced artificial speech sound. The signal has certain noise-like features. The magnitude of its Fourier transform is constant with frequency while the phase is changing. For measurements, only the magnitude of the transfer function is of interest. The phase is not that important but can be determined as well. The signal is a complex spectrum produced in the frequency domain according to the following equation:

$$H(k) = W(k)\, e^{j\, i_k \pi} \qquad \text{(Eq. B1)}$$

where:
$k = -\frac{M}{2}, \ldots, \frac{M}{2}$, without 0
$i_k \in \{+1, 0\}$, $i_k = -i_{-k}$, random

The index M is adjusted to the chosen FFT size (i.e. 2048 points).
The equation shows that the magnitude of the produced complex spectrum is constant for all frequencies if W(k) is chosen equal to 1 for all frequencies, whereas the phase may be + or - for each frequency, corresponding to a random sequence. However, to produce a different weighting in the frequency domain, W(k) can easily be adjusted in order to produce different spectra for the duration of the PN-sequence. Then, the spectrum is transformed into the time domain by means of the inverse Fourier transform, producing the following signal:

$$S(n) = \frac{1}{M} \sum_{\substack{k=-M/2 \\ k \neq 0}}^{M/2} H(k)\, e^{j 2\pi n k / M} \qquad \text{(Eq. B2)}$$

where:
$n = -\frac{M}{2}, \ldots, \frac{M}{2}$
$i_k \in \{1, 0\}$, random

Thus, a signal is produced which is limited in time (corresponding to the chosen length of the Fourier transform) and which is adjusted to the chosen FFT size correctly. If a longer time sequence is wanted, the signal can be cycled. This method permits time sequences of any length. The duration of this measurement signal may amount to about 200 ms by appropriate choice of M and the sampling rate.

Pause Component

The pause has two purposes. An initial pause before applying any measurement signal is necessary to put hands-free telephones with time-variant transfer functions into a defined initial state. To this end, the pause should be as long as possible (>1 s). If, however, the unit is to be put into a constantly activated state (running speech-like), the intermediate pauses should be shorter (about 100 ms) to provide suitable amplitude modulation to the composite signal. The pause of the composite source sequence is 150 ms.
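As a rough illustration of Eq. B1 and Eq. B2, the Python sketch below builds a flat-magnitude spectrum with a random phase of 0 or π per frequency and evaluates the inverse transform sum directly. It is a sketch under stated assumptions, not a reference generator: M is set to 64 instead of 2048 to keep the direct summation fast, W(k) defaults to 1, negative-frequency terms are set to the complex conjugate of the positive ones so that the time signal is real, one period of M samples is generated, and the function names are hypothetical.

```python
import cmath
import math
import random


def pn_spectrum(m, weight=lambda k: 1.0, rng=random):
    """Spectrum H(k) for k = -m/2..m/2 without 0, as a dict keyed by k."""
    spectrum = {}
    for k in range(1, m // 2 + 1):
        i_k = rng.choice((0, 1))                  # random phase index
        h = weight(k) * cmath.exp(1j * math.pi * i_k)
        spectrum[k] = h
        spectrum[-k] = h.conjugate()              # keeps the time signal real
    return spectrum


def pn_time_signal(spectrum, m):
    """Evaluate S(n) = (1/m) * sum_k H(k) exp(j 2 pi n k / m) for one period."""
    signal = []
    for n in range(-(m // 2), m // 2):
        acc = sum(h * cmath.exp(2j * math.pi * n * k / m)
                  for k, h in spectrum.items())
        signal.append((acc / m).real)
    return signal


random.seed(3)
M = 64
spectrum = pn_spectrum(M)
signal = pn_time_signal(spectrum, M)
```

Because the k = 0 term is excluded, the generated period has no DC component; repeating the list end to end gives the cycled sequence mentioned above. For M = 2048 an FFT routine would replace the direct sum.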
Use of Composite Source Signal for Double Talk Measurements

By choosing the voiced part of the CSS with a different pitch frequency than the single talk signal, and by using random noise rather than pseudo-noise, a double talk signal can be created that has a low correlation to the single talk signal. Refer to ITU-T Recommendation P.501 [27] for additional information.

Annex C
ITU-T Recommendation P.50 Noise Bursts Over TDS Sweep

The bias signal consists of random noise with a spectrum and spectrum tolerance conforming to ITU-T Recommendation P.50. For send measurements, it is presented in bursts at a 4 Hz rate and 50% duty cycle (125 ms "ON", 125 ms "OFF"). The bias is presented at the standard test level during the "ON" bursts.

For receive measurements, the bias may be presented either continuously or in the burst pattern. Continuous presentation may be the most appropriate bias for a telephone with a simple AGC function, but burst presentation may be better for telephones with more complex functions. Ideally, both ways should be measured to determine which gives the most typical results. The telephone will be measured in its average state during the entire measurement.

The measurement signal is a series of sine sweeps from 100 to 10,000 Hz, at any rate suitable for Time Delay Spectrometry (TDS) measurements. The sweeps are not synchronized with the bias pulses. The sweep spectrum is the P.50 spectrum times a 3 dB-per-octave rising characteristic. (This is equivalent to passing a flat sweep through a filter designed to shape pink noise into the P.50 spectrum.) At 315 Hz, the level of the measurement signal is 15 dB below the overall level of the bias signal. The measurement is performed by TDS.
The sweep length and number of averages are adjusted to obtain a satisfactory signal-to-noise ratio in the measurement. Typically, a measurement time (sweep length times number of averages) in the range of 16 to 128 seconds gives good results. The true frequency resolution of the TDS measurement will be determined by the time window chosen, not by the frequency interval in the analyzer. The minimum effective time window is 5.7 ms, which corresponds to a frequency resolution (lowest measurable frequency) of 175 Hz (see section ??? ).

In principle, this method can be used with any desired bias signal, including any of the speech-like signals described in section 6.6.

Annex D
Hoth Room Noise

Hoth noise can be described as acoustic random noise that has a power density spectrum corresponding to that published by Hoth (see footnote 1). The spectrum of Hoth noise is designed to simulate typical ambient room noise over time.

[Figure D1 - Hoth Noise Test Setup. Labels: test table, loudspeaker, 1 m, 50 cm.]

Hoth noise can be reproduced using two non-correlated white noise generators and two equalizers in order to produce the required spectrum through four loudspeakers positioned radially 50 cm above the table, 1 meter away from HFRP and 45 degrees apart (see figure D1). Each of the two uncorrelated noise signals is delivered to two loudspeakers in alternating fashion. Using a free field microphone placed at the HFRP in the absence of the test table, the 1/3 octave spectrum can be calibrated. Once the spectrum is within ± 2 dB of the Hoth specification in each band, return the table and hands-free telephone to the correct position.
The overall A weighted level can now be set with a probe microphone located close to the microphone on the telephone, with the probe microphone configured to measure dBSPL with A weighting (dBA).

Table D [CCITT Series P 1989, Supplement No. 13] below gives the spectrum density adjusted in level to produce a reading of 50 dBA on an IEC recommended sound level meter. Figure D2 shows a plot of this spectrum. The shape of the spectrum is independent of level: a change in overall level shifts every 1/3 octave band equally.

Frequency   Spectrum Density   Bandwidth        Total power in each        Tolerance
(Hz)        (dB SPL/Hz)        10 log ∆f (dB)   1/3 Octave Band (dBSPL)    (dB)
100         32.4               13.5             45.9                       ±3
125         30.9               14.7             45.5                       ±3
160         29.1               15.7             44.9                       ±3
200         27.6               16.5             44.1                       ±3
250         26.0               17.6             43.6                       ±3
315         24.4               18.7             43.1                       ±3
400         22.7               19.7             42.3                       ±3
500         21.1               20.6             41.7                       ±3
630         19.5               21.7             41.2                       ±3
800         17.8               22.7             40.4                       ±3
1000        16.2               23.5             39.7                       ±3
1250        14.6               24.7             39.3                       ±3
1600        12.9               25.7             38.7                       ±3
2000        11.3               26.5             37.8                       ±3
2500        9.6                27.6             37.2                       ±3
3150        7.8                28.7             36.5                       ±3
4000        5.4                29.7             34.8                       ±3
5000        2.6                30.6             33.2                       ±3
6300        -1.3               31.7             30.4                       ±3
8000        -6.6               32.7             26.0                       ±3

Table D - Hoth Noise Parameters

Footnote 1: HOTH (D. F.): Room noise spectra at subscribers' telephone locations, J.A.S.A., Vol. 12, pp. 499-504, April 1941.

[Figure D2 - Hoth Noise Spectrum. Plot of spectrum density versus frequency (Hz), 100 Hz to 10,000 Hz.]

Typical Hoth noise levels range from 35 dBA to 65 dBA, and switching parameter and speech detection tests should be performed at these levels in 10 dB increments.
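Because the spectrum shifts equally in every band, 1/3 octave band targets for any test level follow from Table D by adding the difference from 50 dBA. The Python sketch below tabulates this for the recommended 35 to 65 dBA levels and also power-sums the bands to give an unweighted overall level. The function names are hypothetical and the data are simply the Table D values; note that the tabulated band totals agree with density plus bandwidth only to within about 0.3 dB.

```python
import math

# (frequency Hz, spectrum density dB SPL/Hz, 10 log bandwidth dB,
#  total power in 1/3 octave band dBSPL), for a 50 dBA reading.
HOTH_TABLE_50_DBA = [
    (100, 32.4, 13.5, 45.9), (125, 30.9, 14.7, 45.5), (160, 29.1, 15.7, 44.9),
    (200, 27.6, 16.5, 44.1), (250, 26.0, 17.6, 43.6), (315, 24.4, 18.7, 43.1),
    (400, 22.7, 19.7, 42.3), (500, 21.1, 20.6, 41.7), (630, 19.5, 21.7, 41.2),
    (800, 17.8, 22.7, 40.4), (1000, 16.2, 23.5, 39.7), (1250, 14.6, 24.7, 39.3),
    (1600, 12.9, 25.7, 38.7), (2000, 11.3, 26.5, 37.8), (2500, 9.6, 27.6, 37.2),
    (3150, 7.8, 28.7, 36.5), (4000, 5.4, 29.7, 34.8), (5000, 2.6, 30.6, 33.2),
    (6300, -1.3, 31.7, 30.4), (8000, -6.6, 32.7, 26.0),
]


def band_levels_for(target_dba):
    """1/3 octave band levels (dBSPL) for a given overall A weighted level."""
    shift_db = target_dba - 50.0
    return {freq: total + shift_db for freq, _, _, total in HOTH_TABLE_50_DBA}


def overall_unweighted_db(band_levels):
    """Power sum of band levels, without any frequency weighting."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0)
                                 for level in band_levels.values()))


for target in (35, 45, 55, 65):
    levels = band_levels_for(target)
    print(target, "dBA:", round(levels[1000], 1), "dBSPL in the 1 kHz band,",
          round(overall_unweighted_db(levels), 1), "dBSPL unweighted overall")
```

The unweighted overall level comes out higher than the dBA figure because A weighting discounts the low frequency bands, where Hoth noise carries most of its energy.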
Note that at low frequencies, sound levels are somewhat difficult to control, owing both to the size of the test chamber (which attenuates low frequencies poorly) and to the introduction of external noise (air conditioning, heating, etc.). The test chamber should be designed to minimize undesirable low frequency sound levels.

For optimum ambient noise simulation in the test chamber, it is best to have a diffuse source for Hoth noise. This can best be achieved by having somewhat reflective walls and multiple sound sources. A compromise can be made with either the room or the number of sound sources.

Annex E Useful Conversion Procedures

E.1 Conversions for dBV to dBm for 600 and 900 Ω

0 dBm is accepted as 1 mW, typically using a circuit impedance of 600 ohms or 900 ohms.

0 dBm = 10 log (1 mW / 1 mW)
dBV = 10 log V^2 = 20 log V

For R = 600 ohms: P = V^2/R, therefore
dBm = 10 log ((V^2/R) * 1000) = 10 log ((V^2/600) * 1000) = 10 log (V^2/0.600)
So, at 0 dBm, V = 774.6 mV, or 0 dBm = -2.22 dBV

For R = 900 ohms: P = V^2/R, therefore
dBm = 10 log ((V^2/R) * 1000) = 10 log ((V^2/900) * 1000) = 10 log (V^2/0.900)
So, at 0 dBm, V = 948.7 mV, or 0 dBm = -0.46 dBV

To change from 600 ohms to 900 ohms or vice versa, for a constant voltage, the magnitude of the correction is:
Correction (dB) = -10 log (0.600/0.900) = 10 log (0.900/0.600) = 1.76 dB

In general, Correction (dB) = 10 log (|Z1| / |Z2|), i.e., 10 times the log of the ratio of the magnitudes of the impedances, when converting from impedance Z1 to Z2. If converting from "Z1 = 600 ohms" to "Z2 = 900 ohms", the correction factor is -1.76 dB, thus we subtract 1.76 dB from the measurement.

Depending on the impedance being used, conversion factors can be applied dB for dB to the measured or calculated result.
Example 1: To convert a 600 ohm -20 dBm signal to dBV, simply subtract 2.22 to get -22.22 dBV.

Example 2: -20 dBm is measured across 600 ohms. To find the level across 900 ohms, add a correction of -1.76 dB to get -21.76 dBm (since the larger load dissipates less power).

E.2 Conversions for dBmp to dBrnC for Electrical Noise Measurements

Two weighted noise measurement units have typically been used in telephony, dBmp and dBrnC. The main differences between these two measurement units are the shape of the weighting filter and the reference unit. The weighting filter for dBrnC is described in IEEE 743. The differences in the weighting functions are so slight as to be insignificant; thus the conversion between the two units can be expressed as:

dBrnC = dBmp + 90

E.3 Loudness Rating Conversions

Conversion from IEEE 661 to ITU-T P.79 as specified by EIA/TIA-579A is as follows:

SLR (P.79) = TOLR (IEEE 661) + 57
RLR (P.79) = ROLR (IEEE 661) - 51
STMR (P.79) = SOLR (IEEE 661) + 9

The above conversions should be used as an approximation only. These conversions are based upon approximated frequency response curves as specified in TIA-579A. Where frequency responses deviate significantly from the norm, proper conversion may depend upon actual measurements being made with each measurement standard.

E.4 Acoustic Sound Pressure Conventions

dBPa (dB Pascals)
dBSPL (dB Sound Pressure Level)

Where 0 dBPa = 94 dBSPL, 0 dBSPL = 20 microPascals, and 1 Pa = 1 N/m^2.

An A weighted sound pressure level in dB (dBSPL, A weighted) is often abbreviated to "dBA".
Annex F Recommended Test Bed

F.1 Example Implemented Test Hybrid

The test hybrid can be constructed from a passive 2-4 wire hybrid, mated to a digital echo canceller as described in figure F1 below. The digital echo canceller is described below.

Figure F1 (block labels: HFT; Tip/Ring interface & DC feed; passive two-four wire hybrid; sound card with programmable DSP, line in and line out; post processing on PC for NLMS echo canceller)

The passive 2-4-wire hybrid is a transformer setup that can be balanced to provide a simulated 4-wire signal path that is then interfaced to the digital echo canceller. Figure F3 below describes the passive hybrid circuit.

The digital echo canceller can be implemented as a post processing or a real time application. In either case, simultaneous (full duplex) record-playback capability is needed. The algorithm is a "normalized least mean square" (NLMS) algorithm that has 249 regular coefficients and 1 DC coefficient, which cancels any possible DC component in the incoming signal.

F.2 Example Adaptive Echo Canceller

An NLMS (normalized least mean square) adaptive echo canceller is used to model the actual echo path formed by the 4 to 2 wire hybrid and the HFT. The block diagram is shown in figure F2.

Figure F2 - Block diagram of 2-to-4 wire conversion (block labels: x(n), D/A, sound card, DUT, A/D, y(n), delay z^-D, echo canceller, e(n))

The adaptive echo canceller has 249 regular coefficients and 1 DC coefficient, which cancels any possible DC component in y(n). The delay of D samples inserted in the y(n) path, named as delay in the program, is:

1. a delay of 20 samples to assure that the adaptive filter can correctly model the echo path, minus
2. the measured delay jointly caused by the sound card, the 4-to-2 wire hybrid, and the HFT. This part is about 15 samples with the current implementation, resulting in D = 5.
The convolution and subtraction are performed as per the following equations:

$$\mathrm{sum}(n) = \sum_{j=0}^{N_{LMS}-2} h_{24}(n,j)\, x(n-j) + h_{24}(n, N_{LMS}-1)$$

$$e(n) = y(n) - \mathrm{sum}(n)$$

where N_LMS = 250 is the total number of tap coefficients and h24 is an array containing the coefficients. The adaptation step size µ is determined by

$$E_x(n) = \left(1 - \frac{1}{N_{LMS}}\right) E_x(n-1) + x^2(n)$$

$$\mu = \frac{\alpha}{E_x(n) + \varepsilon}$$

In steady state, Ex(n) is an estimate of the energy in x(n), with a scaling factor N_LMS. α = 0.05 is the normalized step size, and ε = 0.1 is a small constant that puts an upper limit on µ. The coefficients are then updated according to

$$\beta = \mu \cdot e(n)$$

$$h_{24}(n+1, j) = h_{24}(n, j) + \beta \cdot x(n-j), \quad j = 0, 1, \ldots, N_{LMS}-2$$

$$h_{24}(n+1, N_{LMS}-1) = h_{24}(n, N_{LMS}-1) + \frac{\alpha \cdot e(n)}{N_{LMS} + \varepsilon}$$

The training signal x(n) is white, with a probability distribution function

$$p(x) = \begin{cases} \dfrac{1}{2M}, & |x| \le M \\ 0, & \text{otherwise} \end{cases}$$

where M is the maximum magnitude. Such a uniformly distributed x(n) is obviously zero-mean, with a peak-to-RMS ratio of

$$\frac{\mathrm{Peak}}{\mathrm{RMS}} = \frac{M}{\sqrt{\int_{-\infty}^{\infty} x^2 p(x)\,dx}} = \frac{M}{\sqrt{\frac{1}{2M}\int_{-M}^{M} x^2\,dx}} = \frac{M}{\sqrt{\frac{1}{3M}\, x^3 \Big|_{x=0}^{x=M}}} = \sqrt{3}$$

In order to estimate the performance of the 2-to-4 wire conversion, we need to calculate its echo return loss (ERL). The formula used is

$$\mathrm{ERL} = 10 \cdot \log_{10} \frac{\sum_n x^2(n)}{\sum_n e^2(n) - \sum_n \nu^2(n)}$$

where ν(n) is the noise present in e(n) and is caused mainly by the HFT and also by the 4-to-2 wire hybrid. By using the above formula, we assume the two components of e(n), the residual echo and the noise, to be uncorrelated.
The timing for the application of the training signal x(n), for updating coefficients, for estimation of echo plus noise e(n), and for estimation of noise ν(n), is shown in figure F3.

Figure F3 - Timing diagram (rows: training signal x(n) on/off; coefficients updated/frozen; calculating energy in x(n) and e(n) on/off; calculating noise energy on/off; time marks at 0, 9, 11 and 13 s)

The adaptive filter converges in the first 9 seconds, after which it is frozen. The training signal then stays on for two more seconds in order to get an estimate of the energy in x(n) and e(n) between 9 and 11 seconds. During the period of 11 to 13 seconds, the training signal is absent so that an estimate of the energy in the noise ν(n) can be obtained. These three energy estimates are then used to calculate the ERL as per the above formula.

Cross-correlation calculation:

For this implementation of the echo canceller, it is recommended to compensate for any built-in delay that may be present in the hardware being used. A cross-correlation calculation may be used to identify this delay.

The cross-correlation calculation is performed between x(n) and y(n-Dsc), where Dsc is the base value of the sound card delay, which has been obtained in the sound card setup stage. Eleven cross-correlation values are calculated as follows:

$$R_{xy}(m) = \sum_{n=0}^{3999} x(n+m) \cdot y(n - D_{sc}), \quad m = 0, \pm 1, \pm 2, \ldots, \pm 5$$

The value of m for which the magnitude of Rxy(m) is larger than that of every other cross-correlation value is deemed to be the variation of the sound card delay.
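The search over the eleven cross-correlation values can be sketched as follows. The function and macro names are hypothetical. The sketch assumes the caller passes a pointer to sample n = 0 of x inside a larger buffer, so that indices from -5 to 4004 are valid, and a y buffer that has already been shifted by the base delay Dsc.

```c
#include <math.h>

#define XCORR_LEN     4000  /* number of products summed per lag */
#define XCORR_MAX_LAG 5     /* lags m = -5 ... +5 */

/* Illustrative delay-variation search (name is an assumption, not defined by
   the Standard).
   x         : points at x(0); x[-XCORR_MAX_LAG] .. x[XCORR_LEN-1+XCORR_MAX_LAG]
               must all be valid samples of the same buffer
   y_delayed : y_delayed[n] = y(n - Dsc) for n = 0 .. XCORR_LEN-1
   Returns the lag m whose |Rxy(m)| is largest. */
int find_delay_variation(const double *x, const double *y_delayed)
{
    int best_m = 0;
    double best_mag = -1.0;

    for (int m = -XCORR_MAX_LAG; m <= XCORR_MAX_LAG; m++) {
        double r = 0.0;
        for (int n = 0; n < XCORR_LEN; n++)
            r += x[n + m] * y_delayed[n];
        if (fabs(r) > best_mag) {
            best_mag = fabs(r);
            best_m = m;
        }
    }
    return best_m;
}
```

A white training signal, as specified above, gives a single sharp correlation peak, which is what makes the largest-magnitude rule reliable.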
Sample C source code for NLMS Adaptive Echo Canceller:

    /* Heading */
    #define N_LMS   250
    #define alfa    0.05
    #define epsilon 0.1

    /* Variable declaration */
    int    j;
    double lambda, beta, mu, mu_for_DC, ex, sum;
    double y, e, h24[N_LMS], x[N_LMS];

    /* Initialization (executed once on start-up only) */
    ex = 32768.0*N_LMS;
    lambda = 1.0 - 1.0/(double)N_LMS;   /* forgetting factor for Ex(n) */
    for (j = 0; j < N_LMS; j++) {
        h24[j] = 0;
        x[j] = 0;
    }
    mu_for_DC = alfa/((double)N_LMS + epsilon);

    /* Convolution (executed once every sampling interval before and */
    /* after coefficients are frozen) */
    /* Acquire input data */
    for (j = N_LMS-2; j > 0; j--)       /* shift delay line so that x[j] = x(n-j) */
        x[j] = x[j-1];
    y = ...;                            /* Get y(n), already delayed by D samples */
    x[0] = ...;                         /* Get x(n) */
    /* Do convolution to obtain sum(n) */
    sum = h24[N_LMS-1];                 /* DC coefficient */
    for (j = 0; j <= N_LMS-2; j++)
        sum += h24[j]*x[j];
    /* Convolution complete */
    /* Derive error e(n) */
    e = y - sum;
    /* Output result e(n) */
    ... = e;

    /* Coefficient updating (executed once every sampling interval before */
    /* coefficients are frozen) */
    /* Determine step size mu */
    ex = lambda*ex + x[0]*x[0];
    mu = alfa/(ex + epsilon);
    /* Derive beta */
    beta = mu*e;
    /* Update DC coefficient and coefficients h24[0:N_LMS-2] */
    h24[N_LMS-1] += mu_for_DC*e;
    for (j = 0; j <= N_LMS-2; j++)
        h24[j] += beta*x[j];

Figure F4 - Test Bed Block Diagram (block labels: test room as per Section 8.6.4.2 with room noise if required; test table; hands-free terminal; artificial mouth or measurement microphone (if required); head and torso simulator or equivalent; electrical equivalents of acoustic Rout and Sin; sources; digital set or analog set; 2 wire/4 wire digital hybrid; ISDN/reference codec; D.C. feed source; data acquisition; Rin; Sout)
Annex G Detailed Test Methodology For Temporally Weighted ERL

G.1 Echo Return Loss Algorithm

The temporally weighted echo return loss (ERLt) measurement method is described here. This method requires that the echo and the source signal be recorded over the duration of the measurement, and that post processing be used. Real-time measurement techniques are possible, but are not described in this Standard.

Freezing the canceller is not recommended for ERL tests. Freetel results¹,² with non-stationary signals have shown that convergence times and subsequent converged ERL when "thawed" depend upon the point in time at which the canceller was frozen.

G.1.1 Echo Return Loss, Temporally Weighted (ERLt)

Temporally weighted ERL, ERLt, is intended to:

• Provide a measure of time dependent echo return loss with peaky behavior, psycho-acoustically weighted; the ERLt.
• Provide an estimate of the number of potentially objectionable echo bursts, and the psycho-acoustically weighted echo return loss during the bursts.

The echo signal is first filtered to model the frequency selectivity of human hearing at loudness levels of 30 Phons, as described in section G.1.2. This weights the echo power in the way that human hearing would.

Noise reduction may then be applied and the echo and stimulus files synchronized. In noise reduction, the noise is measured and subtracted from the echo plus noise to arrive at a better estimate of the echo alone. Such a measurement should last for at least two seconds after all stimulus activity has stopped.

Echo and source are converted into 4 ms power averaged frames, allowing adequate resolution and immunity to synchronization errors. If the stimulus is inactive, the algorithm simply skips that frame, and moves on to the next echo and stimulus frames.
If the stimulus is declared active, the echo frame is compared with a threshold to determine if an echo event occurs. The period of echo activity between inactive echo states is termed an echo "event". These events are then weighted using psycho-acoustic modeling. By using a threshold of -65 dB (5 dB above the µ-law noise floor), ERLt can be determined. The actual test algorithm in pseudo code is detailed in annex H.

G.1.2 Modeling Echo Audibility [Instantaneous Loudness according to Zwicker?]

In modeling echo audibility, the algorithm accounts for 3 fundamental aspects of human hearing behavior:

(1) The frequency selectivity of human hearing at a loudness level of 30 Phons ("Fletcher-Munson" response equivalent to 30 dB at 1 kHz)¹¹.

¹¹ Hearing, Gulick, Gescheider, Frisina, Oxford University Press, 1989.

Thirty Phons was chosen as it represents echo levels that result from terminals that just fail handset terminal coupling loss specifications (determined using loss planning analysis). Varying the level from 20 to 50 Phons provides essentially the same weighting within the telephony band. An A weighted filter is used.

Note that the use of this exact weighting characteristic assumes headphone/handset type listening, or "mean audible pressure" (MAP) response. Free-field listening, such as over a hands-free set, would require the Robinson and Dadson "mean audible field" (MAF) weighting, but the difference is slight. MAP weighting will be used to better reflect the more common use of a handset.

The average loss of the filter with white noise is 1.3 dB when measured using ERLs or ERLt. With non-stationary signals, the loss will be time dependent.

(2) The ear's tendency to combine the loudness of sequential signals even though they may be discrete in time ("temporal combination").
This typically occurs when the two signals are separated by a silent period that is less than 20 ms¹²,¹³,¹⁴.

If two bursts of echo are separated by a period of inactivity of less than 20 ms, they are considered as one longer echo event as far as loudness is concerned. This continues until the gap between events is at least 20 ms, at which time the echo event is declared over. This can be thought of as a 20 ms hangover for the current echo event. During this hangover period, echo and stimulus powers are not included as part of the event. An example of temporal combination follows in figure G1 below:

Figure G1 (vertical axis: Echo Amplitude (Power), with activity threshold; horizontal axis: Time (20 ms/div), 20 to 120; annotations: new echo, temporal combination, duration of echo = 100 ms)

(3) The duration of the total echo event after temporal combination is measured based on the ear's natural temporal integration behavior.

The total duration includes any gap(s) between events that are captured by temporal combination, but not the final 20 ms hangover. If the duration is less than 750 ms, the level of the event is reduced to account for the temporal integration behavior of human hearing. An equation describing the relationship was derived based upon audition studies with noise¹⁵:

Temporal integration weighting = -23 + 8 log10(t) in dB

where t = total duration of echo event (ms), t ≤ 750 ms

¹² The LEDE Concept, D. Davis, C. Davis, JAES, 1985.
¹³ The Detection of Reflections, S. Olive, JAES, 1987.
¹⁴ Modification of Timbre by Resonance, S. Olive, JAES, 1988.
¹⁵ Auditory Demonstrations, ASA, Philips CD set 1126-061.

Note that tones result in a slightly different relationship, but it was felt that noise was a much closer approximation to the true nature of the echo than a sine.
A graphical representation of temporal weighting is shown below in figure G2.

Figure G2 (broadband noise; vertical axis: Relative Loudness Level (dB), 0 to -20; horizontal axis: Duration (ms), 10 to 1000, logarithmic)

If the duration is longer than 750 ms, the level of the total event is left unweighted. Test results have shown echo bursts shorter than 750 ms to be common occurrences from cancellers.

G.1.3 Expressing ERLt Results

Traditional ERL methods refer the echo power during the duration of measurement to the source power during the duration of measurement to arrive at the echo return loss.

In this method, the final weighted power of echo during each event is referred to the power of the source signal during the same event, to arrive at the "Active ERLt", AERLt, of each event. The echo is referred to the source signal during the event only, as this is the way in which our ear would compare the echo.

A long term average of the weighted active echo return loss is found by summing the power of all weighted echo during active events, and comparing it to the power of the source as seen during all events only. The result is the "Active Long Term ERLt".

For comparison with traditional ERL methods, the power of all weighted echo during events is summed, then referred to the total source power as measured for the entire duration of the measurement. The result is the "Long Term ERLt".

Note that the terminology for ERLt results was chosen to be consistent with P.56 nomenclature.

Other statistics compiled include minimum and maximum AERLt, standard deviation ("sigma") of AERLt, the mean of AERLt, the total number of echo events (combined events due to the "Haas" effect are
considered one total event), the number of echo events per minute, the percentage of echo event free speech, number of events < 750 ms, the average length of an event and the duration of source inactivity.

Annex H ERLt Test Algorithm

ERLt is a newly proposed method for evaluating the echo return loss of a terminal using psycho-acoustic modeling and for predicting the occurrences of potentially objectionable echoes. It incorporates 3 fundamental aspects of human audition:

• frequency selectivity of human hearing ("Fletcher Munson" response);
• temporal addition of level for events within 20 ms of each other ("Haas" effect);
• temporal integration for stimuli below 750 ms.

Audition details leading to the algorithm are available in annex G. The implementation details of the algorithm follow.

A source signal as described in section 6 may be used. Speech based stimulus signals are recommended as their results are most representative of real world usage. The system output is always some echo or noise making its way through the system uncancelled. The stimulus and echo should be recorded and made available in digital format. User inputs regarding set type (analog or digital), EPDn and double talk or single talk tests should be available. Calibration parameters should be used to scale echo and stimulus frames to absolute values, and hybrid processing should have removed hybrid echo for 2 wire analog sets.

The stimulus and the echo files will be processed as power values averaged over 4 ms frames. The successive stimulus file frames will be termed xi, and the echo frames will be denoted yi, where i = 1, 2, 3, ... is the actual frame index.
Intermediate frames conforming to an "echo event" will be noted as xk and yk, where k = 1, 2, 3, ... is the echo event index, which is reset when one event ends and a new one commences.

Statistics compiled during the ERLt measurement include the Active Long Term ERLt (ALERLt), Long Term ERLt (LERLt), minimum and maximum Active ERLt (MINERL, MAXERL), its sigma and mean, the total number of echo events (combined events due to the "Haas" effect are considered one total event) (NEVENTS), the number of echo events per minute (NEVMIN), the percentage of echo event free speech (PER), number of events < 750 ms (N750), the average length of an event (AVGEVENT), and the duration for which the stimulus was inactive (DUR). The terminology for ERLt results was chosen to be consistent with P.56 nomenclature. The duration of stimulus inactivity is not included in the time-based results.

H.1 ERLt Algorithm

Step 1 (Optional but recommended)

Calculate the correlation of the stimulus and echo files to fine tune EPDn. Use the criterion that the present correlation peak occurs at EPDn unless a following correlation peak has a magnitude at least 10 dB greater. This approximate guideline is based upon subjective studies in the JAES on delay detection with multiple impulses.

Step 2

Align the echo and stimulus files in time by removing delay equal to EPDn from the echo file.

Step 3

The individual echo samples are processed through a filter approximating the mean audible pressure equal loudness contour for 30 Phons. This can be accurately approximated (within ± 1 dB from 200 Hz to 2500 Hz) by a first order high pass filter with a -3 dB point of 800 Hz.
Step 4

If it can be assumed that the noise in the echo path is stationary and uncorrelated with the echo, the noise is measured for 2 seconds after the stop of source and echo activity. The noise is then subtracted from the echo plus noise to arrive at a better estimate of the echo alone.

Step 5

Samples are converted to absolute numbers using the calibration data. The stimulus samples are combined into 4 ms power averaged frames denoted as xi. The weighted, noise filtered echo samples are combined into 4 ms power averaged frames denoted as yi.

Step 6

Begin Echo Return Loss Calculations

Initialize variables:

i = 0 (frame counter)
j = 0 (frame counter for inactive signal duration)
nk = 0 (number of frames in current echo event)
NSAMPS = 0 (accumulated number of frames for all events)
HAAS = 0 (counter up to 20 ms)
ei = 0 (running summation of all echo power for all events after weighting, as seen at frame counter i)
pi = 0 (running summation of all stimulus power during the measurement, as seen at frame counter i)
ek = 0 (running summation of echo power during the particular echo event after weighting, as seen at event frame counter k)
sk = 0 (running summation of stimulus power during the particular echo event after weighting, as seen at event frame counter k)
WEIGHT = 0 (temporal based weight of most recent event)
LEVENT = 0 (echo return loss level of most recent event, after weighting)
NEVENT = 0 (total number of echo events)
N750 = 0 (total number of echo events < 750 ms)
MINERL = 75 (minimum echo return loss level of all events)
MAXERL = 0 (maximum echo return loss level of all events)
EVENT[NEVENT] = 0 (initialize array for all event loss levels (in dB) to zero; used to calculate sigma)
TEMPSK = 0 (running sum of stimulus power during all events)
SUM = 0 (used in calculating sigma)
SQ = 0 (used in calculating sigma)

Step 7
Increment frame counter and read in 4 ms averaged echo power yi, and 4 ms averaged stimulus power, xi; if there are no more valid inputs and either measurement file is complete, go to step 8.

1   i = i + 1 (unless last i, then go to step 8)
    {Sum stimulus powers}
    pi = pi + xi
    {Is stimulus loud enough for a valid echo loss calculation? If not, disregard present frame and move to next frame.}
4   If xi < (long term stimulus rms level - 25 dB)
        j = j + 1
        i = i + 1
        Go to 4
    Else
        {Test echo against threshold}
        If yi ≥ -65 dB {5 dB above µ-law noise floor}
            {Increment frame event counter}
            k = k + 1
            {Increment frame event length including any gaps < 20 ms}
            nk = nk + 1 + HAAS
            {Reset "Haas kicker"}
            HAAS = 0
            {Accumulate echo power of event}
            ek = ek + yi
            {Accumulate stimulus power during event}
            sk = sk + xi
            Go to 1
        Else
            {Has there been no event within last 20 ms?}
            If k = 0
                HAAS = 0
                Go to 1
            Else
                {There has been an event within the last 20 ms}
                HAAS = HAAS + 1
                {Has 20 ms without an event elapsed after a recent event?}
                If HAAS*4 < 20
                    Go to 1
                Else
                    {An event is over, add an event to the event counter}
                    NEVENT = NEVENT + 1
                    {Increment the total events duration counter by adding the duration in frames of the most recent event}
                    NSAMPS = NSAMPS + nk
                    {Was the most recent event duration < 750 ms?}
                    If nk*4 < 750
                        {Calculate temporal integration weighting for most recent echo event}
                        WEIGHT = 8*log10(nk*4) - 23
                        {Increment the counter for the number of events that were temporally weighted}
                        N750 = N750 + 1
                    Else
                        {Event is left unweighted}
                        WEIGHT = 0
                    {Calculate weighted echo return loss of the most recent event in dB}
                    LEVENT = 10*log10(sk/ek) - WEIGHT
                    {Store the minimum and maximum echo return losses in dB}
                    If LEVENT < MINERL; MINERL = LEVENT
                    If LEVENT > MAXERL; MAXERL = LEVENT
                    {Store the echo return loss of the most recent event in dB for future sigma calculation}
                    EVENT(NEVENT) = LEVENT
                    {Reconvert the echo return loss of the most recent event into linear; recalculate weighted linear echo power}
                    ek = sk/(10**(LEVENT/10))
                    {Accumulate all the echo event powers for future use in calculating ALERLt and LERLt}
                    ei = ei + ek
                    {Accumulate all the stimulus powers during events for future use in calculating ALERLt}
                    TEMPSK = TEMPSK + sk
                    {Reset echo event variables}
                    k = 0
                    nk = 0
                    WEIGHT = 0
                    HAAS = 0
                    ek = 0
                    sk = 0
                    Go to 1

Step 8

Calculate Active Long Term ERLt (ALERLt), Long Term ERLt (LERLt), the number of echo events per minute (NEVMIN), the percentage of echo event free speech (PER), the average length of an event (AVGEVENT) and duration during which speech was inactive (DUR).

Note: Zero check ei before computing; if ei = 0, set ALERLt and LERLt to 100 dB.
ALERLt = 10*log10(TEMPSK/ei)
LERLt = 10*log10(pi/ei)
NEVMIN = 60*NEVENT/((i-j)*0.004) {number of events per minute}
PER = 100*((i-j) - NSAMPS)/(i-j) {percentage of echo free speech}
AVGEVENT = NSAMPS*4/NEVENT {average length of an event in milliseconds}
DUR = j*0.004 {duration of stimulus inactivity in seconds}

Calculate sigma by analyzing the EVENT array, which contains the echo return loss of each event; each event, regardless of duration, is given equal weighting in the sigma calculation; the suggestion is that it is the transition between discrete events and not their duration that is most objectionable.

Loop j from 1 to NEVENT
    SUM = SUM + EVENT(j)
    SQ = SQ + EVENT(j)**2
ENDLOOP
SIGMA = SQRT(SQ/NEVENT - (SUM/NEVENT)**2)

Calculate mean of the events

MEAN = SUM/NEVENT

Step 9

Output statistics

Print ALERLt, LERLt, MINERL, MAXERL, NEVENT, NEVMIN, PER, N750, AVGEVENT, DUR, SIGMA, MEAN

Annex I Double Talk Testing

I.1 General

Testing in double talk mode is somewhat complex in that the test signal required has to allow for discrimination between both transmission directions. The method also has to do this in such a way as to allow the hands-free phone to operate in as normal a fashion as possible. The following methods detail how this can be done as referenced by sections ???.

I.1 Canceller Training prior to Double Talk

Before double talk testing is entered, all cancellers must be fully converged using training signals for both transmission directions. The test signals should be uncorrelated, and the training period should allow at least 5 seconds of single talk in each direction.
The amplitude mask statistics for the training period are shown below:

Parameter        Rate (%)   Over 60 seconds (s)
Talk Spurt       38.53      23.12 ~ 24
Pause            61.47      36.88 ~ 36
Double Talk      6.59       3.95 ~ 0
Mutual Silence   22.48      13.49 ~ 12

Table I1 - Temporal Parameters Of Conversational Speech

I.1.1 Double Talk Training Activity Masks

The exact amplitude masks will now be specified along with signal amplitude characteristics during double talk. Each type of double talk test has special requirements for signal duration and amplitude during double talk.

For echo return loss, the duration must be long enough to capture any divergence, but not so long as to be a burden on test system memory resources or to result in an unacceptable computation time. Tests have shown a 20 second double talk duration to be acceptable for double talk echo return loss testing. Once double talk has ended (the talker initiating double talk becomes inactive), the echo return loss measurement may continue for 10 seconds (the talker active just before double talk remains active) to measure recovery after double talk. After that time, two seconds of silence should be played. In this way, the noise in the echo path can be measured. If it can be assumed that the noise and echo are uncorrelated, and that the noise is stationary, the noise measured in the two seconds may be subtracted from the echo plus noise during double talk to arrive at a more precise measure of the echo during double talk.

The duration of double talk during double talk attenuation and clipping tests may be much shorter. As all time constants under study should be less than 200 ms, the double talk duration is set at 200 ms. Analysis is continued (the talker active just before double talk remains active) for one second after the end of double talk for the attenuation tests in order to measure any loss removal as single talk is re-entered. There is no need to estimate and correct for noise in the double talk attenuation and clipping tests.
The recommended masks are below.

Figure I1 - Echo Return Loss Test Activity Mask (two panels, horizontal axis: Time (s); upper panel: activity mask of the talker active just before double talk, with double talk, optional 10+2 sec for recovery after double talk, and silence (2 sec); lower panel: activity mask of the talker initiating double talk)

Figure I2 - Attenuation Test and Clipping Test Activity Mask (two panels, horizontal axis: Time (s); upper panel: activity mask of the talker active just before double talk; lower panel: activity mask of the talker initiating double talk, with double talk of 200 ms)

I.1.2 Synchronizing the Double Talk Training Activity Masks

The timing of the masks must be synchronized to avoid premature double talk. This involves accounting for the 1.5 ms air path delay between the artificial mouth and the HFT. The mouth simulator signal should be initiated 1.5 ms before the RETP signal by delaying the stimulus file used on receive by 1.5 ms. The start of double talk is defined as occurring when the microphone location sees valid send activity and RETP sees valid receive activity.

In the activity mask diagrams, the t=0 starting point refers to the beginning of the file applied at RETP. The starting point for the mouth simulator signal can be thought of as t=-1.5 ms, but the terminal will see the two signals synchronized.

At the 60 second mark, double talk is entered and double talk testing begins.

I.1.3 Compensating for Measurement Filters

The double talk test methods require the use of filters injected in the audio path (for details, see I.6).
These filters will have an impact on the time domain resolution and the precise moment at which double talk testing can begin. By using filters of known ringing time, the measurement can be put in a wait state while the filter ringing settles. An example implementation and further explanations can be found in annex F, section F.4.

I.2 Double Talk Source Signals

Double talk testing often imposes conflicting constraints on the type of test signal used. For echo return loss tests, the double talk signal presented to the canceller at RETP (or mouth simulator) should be as similar to the training signal as possible. Cancellers typically freeze adaptation during double talk, so if the double talk signal at RETP differed from the training signal, residual echo would typically be unrealistically high. This constraint indicates that the double talk signal at RETP should be the same as the training signal at RETP for echo return loss measurements.

Unfortunately, the use of the speech files alone is not acceptable during double talk. The correlated parts between the two "talkers" would invalidate some test results: parts of one talker's speech may look like echo of the other talker if the parts are correlated. Another problem is that double talk onset must be very accurately detected for attenuation and clipping tests. This would be very difficult to define over repeated tests using different speech files, but is very easy with tones.

To overcome these issues, both signal types are used: speech as per the training signal, and tones to accurately define the start of double talk. How they are used depends upon the specific test. These concepts are best explained in the following individual test sections, but will be briefly described here. Speech signals used during training are continued during double talk, as required.
A sinusoidal tone is mixed in with the speech or injected on its own to provide an easily measurable reference for attenuation tests or an easily definable start of double talk for clipping tests. By using notch or band-pass filters at SETP (or receive output) at the tone frequency, either just the tone or just the speech can be monitored.

When the tone is mixed in with the speech, the power of the tone must be representative of the long term average power of speech at its frequency, so as to not impact the canceller with any gross deviations in spectral energy from that of the training signal. ITU-T Recommendation P.50 specifies an average spectral relationship (third octave values used). The following tones are recommended. Their power is defined as the number of dB below the average active speech energy in the speech file, when measured as per ITU-T Recommendation P.56 (see section 5.2.2.5 g):

Tone Frequency (Hz)   Relative Tone Level (dB) below Nominal Speech Level
 500                   9
1000                  14
1750                  18
2500                  22

I.3 Double Talk Attenuation Testing

The example shown determines double talk attenuation in the send direction. The concept is easily extended to the receive direction by reversing signals and monitoring at the receive output.

[Figure: echo canceller block diagram showing Rin, Rout, Sin and Sout, the adaptive model of the echo path and the echo, with amplitude versus frequency sketches of the signals.]

The methodology is explained below:

The set is reset, and trained as described in section I.1. The "talker active before double talk" is the mouth simulator. The "talker initiating double talk" is RETP. Notice that the signals at both the mouth simulator and RETP are shown notch filtered at the tone frequency.
This notch filter is not present during the entire training period, but only just before double talk and for the remainder of the measurement. The idea is to mix in a tone at the mouth simulator just before double talk (still in single talk), monitor its rms level, have RETP enter double talk, and continue monitoring the tone level. The double talk attenuation is the difference in tone level before double talk and during double talk. The tone is discriminated by applying a band-pass filter at the tone frequency at SETP. By continuing measurement during double talk, the switching characteristics including rate of insertion and depth can be determined. The rate of attenuation removal can also be determined by making the activity mask for the "talker initiating double talk" low again after the attenuation depth has stabilized.

Characteristics of the notch filter will now be described. The notch is required on RETP to ensure that echo at the tone frequency does not impact the measurement of the tone. The notch filter must show enough attenuation to ensure speech at the tone frequency is adequately suppressed so as not to impact the level of the tone at the mouth simulator. The notch filter bandwidth must be narrow enough to minimize impact on surrounding frequencies so that the signals are not significantly different from the training signals. Example filter types are described in annex F, F.4.

The band-pass filter has similar constraints. Taken with the notch filter, it must have enough out of band attenuation to ensure that speech or echo does not impact the tone level. It must also have a short enough impulse response that the time domain impact is minimized, as the rate of attenuation insertion is also being measured. The exact filter type is described in annex F, F.4. The impulse response of the notch filter does not impact the measurement, as the tone mixed in at the mouth simulator will be large enough in level to swamp any residual ringing of the notch filter.
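The level comparison at the heart of the attenuation test can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed implementation: the signal at SETP is assumed to be already band-pass filtered at the tone frequency, the rms level is taken with a rectangular window slid in fixed steps (defaulting to the 8 ms window and 4 ms step that this clause uses for the tone level), and the single talk reference is assumed to be the mean of the levels measured before double talk. All function names are hypothetical.

```python
import math

def sliding_rms_db(samples, fs, window_s=0.008, step_s=0.004):
    """RMS level in dB for a rectangular window slid along the samples."""
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    levels = []
    for start in range(0, len(samples) - win + 1, step):
        block = samples[start:start + win]
        mean_square = sum(x * x for x in block) / win
        if mean_square > 0.0:
            levels.append(10.0 * math.log10(mean_square))
        else:
            levels.append(float("-inf"))  # digital silence
    return levels

def attenuation_trace_db(reference_levels_db, double_talk_levels_db):
    """Attenuation versus time: single talk reference minus each level.

    The reference is the mean of the single talk levels (an assumption).
    The maximum of the trace is the attenuation depth; its rise over time
    shows the rate of insertion.
    """
    reference = sum(reference_levels_db) / len(reference_levels_db)
    return [reference - level for level in double_talk_levels_db]
```

With an 8 kHz sample rate the defaults give a 64 sample window advanced 32 samples at a time.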
The rms level of the tone is to be measured using an 8 ms sliding rectangular window for smoothing. The window is slid in 4 ms increments for 4 ms of overlap between adjacent points to smooth the results.

The timing of the measurement must be fine-tuned to account for any ringing of the bandpass filter. The tone should be injected at 60 seconds, minus the bandpass filter's ringing time (< 20 ms assumed), minus 8 ms. The tone reference measurement during single talk is taken starting at 60 seconds minus 8 ms, after the filter ringing has ended. Set delay in the direction of measurement leads to partial measurement during the end of the bandpass filter's ringing. As long as set delay is low (provisionally < 5 ms), the amount of ringing effects encountered will be slight and should not impact the measurement.

[Figure: timing of the attenuation measurement, amplitude versus time (s). The tone is injected for the duration of the measurement; filter ringing (< 20 ms) is followed by the 8 ms reference measurement, and double talk starts at 60 s.]

When sub-banding techniques are used in the AEC, or the technique is unknown, it is advisable to repeat the test for each frequency shown in section I.2. In many cases, the attenuation test results at any one frequency may not be indicative of subjective quality. As voice has dominant spectral energy in the lower frequency range, the switched loss is expected to be more audible there. If the depth of attenuation is frequency independent, it is advised to use higher test frequencies, as the required filters will have less of an impact on the overall voice levels.

I.4 Double Talk Front End Syllabic Clipping Testing

The example shown determines double talk front-end syllabic clipping in the send direction. The concept is easily extended to the receive direction by reversing signals and monitoring at the receive output.
[Figure: echo canceller block diagram showing Rin, Rout, Sin and Sout, the adaptive model of the echo path and the echo, with amplitude versus frequency sketches of the signals and the labels "Duration Known" and "Measure Duration".]

The set is reset, and trained as described in section I.1. The "talker active before double talk" is RETP. The "talker initiating double talk" is the mouth simulator. The notch filter is applied at RETP as described in section I.2. Double talk is entered by applying a tone at the mouth simulator. The tone is injected for only 200 ms. The duration of front-end syllabic clipping is 200 ms minus the duration of tone activity at SETP. Using a tone ensures that the precise moment that double talk is entered is known.

The duration of tone activity at SETP is determined by monitoring its level at SETP, after passing the signal through a band-pass filter to remove any residual echo. Activity is declared when the level of the tone is no quieter than 3 dB below the tone level 200 ms after double talk is entered. It is assumed that all double talk attenuation is fully inserted before 200 ms into double talk. The impact of set and bandpass filter ringing times on duration is minimized by setting the threshold at a high 3 dB.

The rms level of the tone is to be measured using a 4 ms sliding window, slid in 2 ms increments for 2 ms of overlap between adjacent points. The use of a 1.75 kHz or 2.5 kHz tone is recommended, at the relative level recommended in section I.2. The frequency is chosen to minimize the impact of the notch filter on the RETP signal.

I.5 Double Talk Echo Return Loss Testing

The example shown determines echo return loss looking towards the terminal from the network. The concept is easily extended to talker echo path loss by reversing signals and monitoring at the receive output.
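For the front-end syllabic clipping test of I.4, the duration calculation can be sketched as follows. This is a simplified illustration, not a definitive implementation: it assumes the band-passed tone level at SETP is available as one dB value per 2 ms step, starting at the onset of double talk and covering the 200 ms burst, and it takes the last value as the tone level 200 ms after double talk is entered. The function and parameter names are hypothetical.

```python
def clipping_duration_ms(levels_db, step_ms=2.0, burst_ms=200.0,
                         threshold_db=3.0):
    """Front-end clipping: burst length minus the time the tone is active.

    levels_db: band-passed tone levels at SETP, one per step_ms, from the
    onset of double talk to the end of the burst (assumed layout).
    A point counts as active when it is no more than threshold_db below
    the level at the end of the burst.
    """
    reference_db = levels_db[-1]  # tone level at the end of the burst
    active_points = sum(
        1 for level in levels_db if level >= reference_db - threshold_db
    )
    return burst_ms - active_points * step_ms
```

As an example, if the first 10 of 100 level points are clipped to a very low level and the remaining 90 sit at the final tone level, the sketch reports 20 ms of front-end clipping.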
[Figure: echo canceller block diagram showing Rin, Rout, Sin and Sout, the adaptive model of the echo path and the echo, with amplitude versus frequency sketches of the signals.]

The set is reset, and trained as described in section I.1. The "talker active before double talk" is RETP. The "talker initiating double talk" is the mouth simulator. The notch filter is applied at RETP as described in section I.2. Double talk is entered by applying a tone at the mouth simulator. The tone is injected for 20 seconds. The echo return loss is found by first notching out the tone and then measuring the residual echo using one of the techniques in section G.1. Using a tone ensures that the precise moment that double talk is entered is known.

Once double talk has ended, the echo return loss measurement is to continue for 10 seconds to measure recovery after double talk. After that time, one second of silence should be played. In this way, the noise in the echo path can be measured. If it can be assumed that the noise and echo are uncorrelated, and that the noise is stationary, the noise measured in the last second may be subtracted from the echo plus noise during double talk to arrive at the echo during double talk.

The timing of the measurement must be fine-tuned knowing the echo path delay. This delay properly aligns the source and echo. For the example shown, this value is determined by the EPDn test of section 11.6.1. For talker echo path loss, the echo path delay off the network interface (for 2-wire terminals) is found by the EPDa test of section 11.6.1.

As per the clipping test, it is recommended that a 1.75 kHz or 2.5 kHz tone be used at the relative level recommended in section I.2.

I.6 Double Talk Measurement Filters

Double talk testing requires the use of notch and bandpass filters at various frequencies.
A recommended implementation is tabulated below.

The terms described are:
o fpl: lower frequency at which the bandpass or bandstop is at -3 dB.
o fpu: upper frequency at which the bandpass or bandstop is at -3 dB.
o fsl: lower frequency at which the bandpass or bandstop is at -atten dB.
o fsu: upper frequency at which the bandpass or bandstop is at -atten dB.
o atten: the specified full attenuation of the filter.
o atten (actual): the actual full attenuation of the filter.
o ripple: ripple of the filter in dB (+/-).
o gain: gain of the bandpass filter (linear) in the pass band.
o order: filter order in taps (8 kHz sample rate) for the bandpass. The bandpass ringing time is order times 125 µs. For the bandstop, order refers to the order of the biquad (elliptical).

Filter     Type           fpl    fpu    fsl    fsu    atten  atten (actual)  ripple  gain  order
500 Hz     FIR Bandpass   495    505    435    570    30     31              1       0.92  160
1 kHz      FIR Bandpass   990    1010   900    1100   30     29.5            1       .9    100
1.75 kHz   FIR Bandpass   1733   1767   1611   1900   30     34              3       .78   80
2.5 kHz    FIR Bandpass   2475   2525   2302   2715   30     34              1       .99   60
500 Hz     IIR Bandstop   400    610    435    570    30     30              1.5           6
1 kHz      IIR Bandstop   800    1250   900    1100   30     40              1.5           6
1.75 kHz   IIR Bandstop   1450   2100   1610   1900   30                     1.5           6
2.5 kHz    IIR Bandstop   2100   2950   2300   2715   30                     1.5           6

The bandpass filter's ringing time will impact the measurement if not accounted for. Measurements must commence only after the filter has stopped ringing due to initial application. This is necessary so that a clean reference measurement can be made for attenuation tests and clipping tests. The longest ringing time is 20 ms for the 500 Hz bandpass filter.
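The ringing time arithmetic can be checked with a short sketch. It assumes the 8 kHz sample rate stated above (125 µs per tap), the 8 ms averaging window used for attenuation testing in I.3, and double talk onset at the 60 second mark of the activity mask; the function names are hypothetical.

```python
SAMPLE_PERIOD_US = 125  # one tap at the 8 kHz sample rate

def ringing_time_ms(order_taps):
    """Bandpass FIR ringing time: filter order times 125 microseconds."""
    return order_taps * SAMPLE_PERIOD_US / 1000.0

def filter_insertion_time_s(order_taps, window_ms=8.0, onset_s=60.0):
    """Latest point in the activity mask at which to insert the bandpass.

    The filter must finish ringing and a full averaging window must then
    fit before double talk begins at onset_s.
    """
    lead_ms = ringing_time_ms(order_taps) + window_ms
    return onset_s - lead_ms / 1000.0

# Bandpass orders for the 500 Hz, 1 kHz, 1.75 kHz and 2.5 kHz filters.
for taps in (160, 100, 80, 60):
    print(taps, ringing_time_ms(taps), filter_insertion_time_s(taps))
```

For the 160 tap filter at 500 Hz this gives a 20 ms ringing time and an insertion point of 59.972 s.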
Since the averaging window for measurement in attenuation testing is 8 ms, the bandpass filter must be inserted at 20 + 8 = 28 ms before the onset of double talk, or 60 - 0.028 = 59.972 s in the activity mask of section I.1.

Annex J Acoustic Echo Path Tutorial

J.1 Acoustic Echo Path

The use of a real room with appropriate acoustic characteristics is recommended. Simulated echo paths are not recommended as they will not capture any non-linearities, which limit cancellation ability in real world use.

The reverberation time (RT60) of the test room should meet ITU-T Recommendation G.167. When averaged over the transmission bandwidth, RT60 shall be approximately 500 ms; the reverberation time in the lowest octave shall be no more than twice this average value; the reverberation time in the highest octave shall be not less than half this value. The volume of a typical test room shall be of the order of 50 m3 / 1500 ft3.

An acoustic canceller works mainly on early room reflections, which show a low density of reflections. Residual echo power is dominated by the reverberant tail of the acoustic system's room impulse response, with a high density of modes and long duration as shown below. The specification should include both the reverberation time as per ITU-T Recommendation G.167 and the early/late ratio of the room. This is an item worthy of study.

[Figure: room impulse response, relative amplitude (-10 to 10) versus time, showing the direct path, the discrete early reflections and the reverberant tail.]

Annex K HFT Microphones

This annex briefly describes a variety of microphone types and applications as they may relate to Hands-Free Telephones (HFTs).
The frequency response can be measured as a function of distance, and as a function of polar position around an imaginary “front” or “frontal axis” of the hands-free telephone. Similarly, it may be desirable to measure the frequency response for a number of both vertical and horizontal polar positions.

Omni-directional microphones have a uniform output regardless of the actual position of the microphone itself. The electrical output of an omni-directional microphone decreases by 6 dB with every doubling of distance from the sound source in a simulated free field or anechoic environment.

Cardioid, hyper cardioid, super cardioid, and other directional microphones generally have both frequency response and polar characteristics which vary as a function of position and distance from the sound source. Directional microphones have two or more input ports. It cannot be assumed that the output of such a microphone will decrease by 6 dB with every doubling of distance from a sound source. For the most part, the frequency characteristics of directional microphones are both polar and distance dependent. The sound source should be placed at a distance and direction that approximates real use conditions or the RTP.

The microphone system can be implemented with electronic (automatic) gain control (AGC). To reduce background noise, some hands-free telephone sets have send circuits with variable sensitivity based on the voice signal's presence. (When voice is not present, the send sensitivity is reduced, and when voice is present, this sensitivity is restored to normal.) It is particularly important that such microphones be tested at normal use input levels to obtain the proper sensitivity.

Some HFTs, frequently known as "teleconferencers," feature multiple microphone elements, typically 2-4, all within a common housing.
This type of assembly may be sensitive to both distance and orientation, and may also feature gating, or the ability to turn the microphone on and off.

Teleconferencers frequently have optional auxiliary microphones intended to extend the coverage area or reach of the teleconferencer. These auxiliary microphones may be either directional or omni-directional, and may therefore be sensitive to both distance and orientation.

If the microphone is a separate device, such as a stand, boundary or a lapel microphone, it should be so tested and so described with the appropriate user position as a reference. This configuration is frequently used in PCs and automobiles.

Some HFTs feature multiple microphones which are combined with electronic processing to produce highly directional fixed or variable polar patterns. Testing and describing such systems may require techniques which differ from traditional measurement methods. For instance, beam steering reacts to sounds above a given level. One purpose for using such a method is to improve the acoustic signal to noise ratio, which may be a useful way to describe the test results.

It is especially advisable to perform send loudness rating directionality (SLRD) on HFTs with directional microphone systems.

Annex L 1/3 Octave Passbands

Band No.   Nominal Center Frequency [Hz]   1/3 Octave Passband [Hz]
20         100                             89.1 - 112
21         125                             112 - 141
22         160                             141 - 178
23         200                             178 - 224
24         250                             224 - 282
25         315                             282 - 355
26         400                             355 - 447
27         500                             447 - 562
28         630                             562 - 708
29         800                             708 - 891
30         1000                            891 - 1120
31         1250                            1120 - 1410
32         1600                            1410 - 1780
33         2000                            1780 - 2240
34         2500                            2240 - 2820
35         3150                            2820 - 3550
36         4000                            3550 - 4470
37         5000                            4470 - 5620
38         6300                            5620 - 7080
39         8000                            7080 - 8910
40         10000                           8910 - 11200

Table L1

Annex X DRP TO ERP Corrections For HATS Receive Measurements

Prior to the calculation of Receive LR on hands-free telephones, section 8.3.x provisionally recommends the addition of 14 dB to the measured receive response to account for the binaural effect of two ear listening. Because a portion of this effect is due to the obstacle effect of the head (which is included as a part of the HATS measurement), a provisional correction of {12 dB ?} to HATS receive measurements is recommended. This is to be added to HATS receive measurements prior to any single figure calculations (i.e., LR, Mid-band Average Sensitivity, etc.). This correction should not be applied to the presented frequency response.

Frequency  SDE     Frequency  SDE     Frequency  SDE     Frequency  SDE
(Hz)       (dB)    (Hz)       (dB)    (Hz)       (dB)    (Hz)       (dB)
92          0.1    290        -0.3    917        -1.3    2901       -11.0
97          0.0    307        -0.2    972        -1.4    3073       -10.5
103         0.0    325        -0.2    1029       -1.8    3255       -10.2
109         0.0    345        -0.2    1090       -2.0    3447       -9.1
115         0.0    365        -0.4    1155       -2.3    3652       -8.0
122         0.0    387        -0.5    1223       -2.4    3868       -6.9
130         0.0    410        -0.4    1296       -2.6    4097       -5.8
137         0.0    434        -0.6    1372       -3.1    4340       -5.0
145         0.0    460        -0.3    1454       -3.3    4597       -4.2
154         0.0    487        -0.7    1540       -3.9    4870       -3.3
163         0.0    516        -0.6    1631       -4.4    5158       -2.7
173        -0.1    546        -0.6    1728       -4.8    5464       -2.4
183        -0.1    579        -0.6    1830       -5.3    5788       -2.4
193         0.0    613        -0.6    1939       -6.0    6131       -2.5
205         0.1    649        -0.8    2054       -6.9    6494       -3.3
218         0.0    688        -0.8    2175       -7.5    6879       -4.5
230        -0.1    729        -1.0    2304       -8.1    7286       -5.9
244        -0.2    772        -1.1    2441       -9.1    7718       -9.0
259        -0.3    818        -1.1    2585       -9.5    8175       -14.2
274        -0.3    866        -1.2    2738       -10.4   8659       -20.7

Table X1 - 1/12 Octave Filter Center Frequencies

Frequency  SDE     Frequency  SDE     Frequency  SDE     Frequency  SDE
(Hz)       (dB)    (Hz)       (dB)    (Hz)       (dB)    (Hz)       (dB)
100         0.0    335        -0.2    1120       -2.1    3750       -7.5
106         0.0    355        -0.3    1180       -2.3    4000       -6.3
112         0.0    375        -0.4    1250       -2.5    4250       -5.3
118         0.0    400        -0.4    1320       -2.8    4500       -4.5
125         0.0    425        -0.5    1400       -3.2    4750       -3.7
132         0.0    450        -0.4    1500       -3.6    5000       -3.0
140         0.0    475        -0.5    1600       -4.2    5300       -2.6
150         0.0    500        -0.7    1700       -4.7    5600       -2.4
160         0.0    530        -0.6    1800       -5.2    6000       -2.5
170        -0.1    560        -0.6    1900       -5.8    6300       -2.9
180        -0.1    600        -0.6    2000       -6.5    6700       -4.0
190         0.0    630        -0.7    2120       -7.2    7100       -5.3
200         0.1    670        -0.8    2240       -7.8    7500       -7.5
212         0.0    710        -0.9    2360       -8.5    8000       -12.2
224        -0.1    750        -1.1    2500       -9.3    8500       -18.6
236        -0.1    800        -1.1    2650       -9.9    9000       *
250        -0.2    850        -1.2    2800       -10.6   9500       *
265        -0.3    900        -1.3    3000       -10.7   10000      *
280        -0.3    950        -1.4    3150       -10.4
300        -0.2    1000       -1.6    3350       -9.6
315        -0.2    1060       -1.9    3550       -8.5

Table X2 - ISO R40 Preferred Frequencies