IPC Revision WG – Definition Project
United States Patent and Trademark Office
Rapporteur Proposal
Project: F004
Subclass: G10L – Speech Analysis or Synthesis; Speech Recognition; Audio Analysis or Processing
Date: 30 June 2011

Title – G10L

Definition statement
This subclass covers:
Determination or detection of speech or voice characteristics
Speech synthesis; Text to speech systems
Speech recognition
Speaker identification or verification
Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g., in vocoders; Coding or decoding of speech or audio signals, e.g. for compression or expansion, source filter models or psychoacoustic analysis
Processing of the speech or voice signal to produce another audible or non-audible signal, e.g., visual, tactile, in order to modify its quality or its intelligibility

Relationship between large subject matter areas
“Audio analysis or processing” specifically covers audio signal analysis-synthesis techniques for redundancy reduction, or audio signal coding or decoding. Classification should generally be directed to the appropriate places, e.g. G10K, G10H, H04R, H04S, when audio production or general audio analysis or processing is of relevance. Devices for the storage of speech or audio signals are covered in subclasses G11B and G11C.
References relevant to classification in this subclass
This subclass does not cover:
Digital data processing methods or equipment specially adapted for handling natural language data    G06F 17/20
Teaching or communicating with the blind, deaf, or mute    G09B 21/00

Examples of places where the subject matter of this class is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this subclass is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:
Input/output arrangements for on-board computers    G01C 21/36
Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic, or infrasonic waves    G01S 3/80
Systems using the reflection or reradiation of acoustic waves    G01S 15/00
Compilation or interpretation of high level programme languages    G06F 9/45
Individual entry or exit registers    G07C 9/00
Teaching speaking    G09B 19/04
Electro-acoustic amplifiers    H03F 13/00
Coding, decoding, or code conversion; Compression    H03M 7/30
Transmission    H04B
Means associated with receiver for limiting or suppressing noise or interference    H04B 1/10
Details of transmission systems, not characterized by the medium used for transmission, for reducing bandwidth of signals    H04B 1/66
Reducing echo effects or singing; Opening or closing transmitting path; Conditioning for transmission in one direction or the other    H04B 3/20
Transmission systems using ultrasonic, sonic, or infrasonic waves    H04B 11/00
Transmission systems not characterized by the medium used for transmission characterized by the use of pulse modulation    H04B 14/02
Broadcast distribution systems    H04H
Time-division multiplex systems in which the transmission channel allotted to a first user may be taken away and re-allotted to a second user if the first user becomes inactive    H04J 3/17
Arrangements of transmitters, receivers, or complete sets to prevent eavesdropping, to attenuate local noise or to prevent undesired transmission; Special mouthpieces or receivers therefor    H04M 1/19
Devices for calling a subscriber whereby a plurality of signals may be stored simultaneously    H04M 1/27
Substation equipment, e.g. for use by subscribers including speech amplifiers    H04M 1/60
Simultaneous speech and telegraphic or other data transmission over the same conductors    H04M 11/06
Systems for transmission of a pulse code modulated video signal with one or more other pulse code modulated signals, e.g. an audio signal, a synchronizing signal    H04N 7/52
Public address systems    H04R 27/00

Special rules of classification within this subclass
G10L 17/00 takes precedence over G10L 15/00.
G10L 19/00 takes precedence over G10L 21/00.

Glossary of terms
In this subclass, the following terms (or expressions) are used with the meaning indicated:
Speech    Definite vocal sounds that form words to express thoughts and ideas.
Voice    Sounds generated by vocal cords or synthetic versions thereof.
Synonyms and Keywords
In patent documents the following abbreviations are often used:
ADPCM    Adaptive Differential Pulse Code Modulation
AMR    Adaptive Multirate
AR    Autoregressive
ASR    Automatic Speech Recognition
BLP    Backward Linear Prediction
BP    Back Propagation
CELP    Code Excited Linear Prediction
DCT    Discrete Cosine Transform
DFT    Discrete Fourier Transform
DPCM    Differential Pulse Code Modulation
FIR    Finite Duration Impulse Response
FLP    Forward Linear Prediction
HVXC    Harmonic Vector Excitation Coding
LMS    Least Mean Square
LPC    Linear Predictive Coding
MBE    Multi-Band Excitation
MDCT    Modified Discrete Cosine Transform
MELP    Mixed Excitation Linear Prediction
MSE    Mean Square Error
NB – WB    Narrowband – Wideband
PARCOR    Partial Correlation
PWI    Prototype Waveform Interpolation
RELP    Residual Excited Linear Prediction
TDNN    Time Delay Neural Network
TTS    Text-to-Speech
VoIP    Voice over Internet Protocol
VQ    Vector Quantization
VSELP    Vector Sum Excited Linear Prediction
V/UV    Voiced/Unvoiced
VXML or VoiceXML    W3C’s standard XML format

Title – G10L 13/00

Definition statement
This group covers:
Methods for producing synthetic speech; Speech synthesizers
Elementary speech units used in speech synthesizers; Concatenation rules
Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation, stress or intonation determination

Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Processing or translating of natural language    G06F 17/28
Electrophonic musical instruments    G10H

Examples of places where the subject matter of this group is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this group is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:
Sound producing toys    A63H 5/00
Information retrieval; Database structures therefor    G06F 17/30
Electrically operated educational appliances with audible presentation of the material to be studied    G09B 5/04
Aids for music    G10G

Special rules of classification within this group
NONE.

Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:

Synonyms and Keywords
In patent documents the following abbreviations are often used:

Title – G10L 15/00

Definition statement
This group covers:
Assessment or evaluation of speech recognition systems
Feature extraction for speech recognition; Selection of recognition unit
Segmentation or word limit detection
Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker’s voice
Speech classification or search
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
Procedures used during a speech recognition process, e.g. man-machine dialog
Speech recognition using non-acoustical features, e.g. position of the lips
Speech to text systems
Constructional details of speech recognition systems

Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Pattern recognition    G06K 9/00

Examples of places where the subject matter of this group is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this group is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:
Processing or translating of natural language    G06F 17/28
Transmission of digital information, e.g. telegraphic communication    H04L
Wireless communication networks    H04W

Special rules of classification within this group
G10L 15/08 takes precedence over G10L 15/26.
G10L 15/14 takes precedence over G10L 15/06.
G10L 15/18 takes precedence over G10L 15/14.
G10L 21/02 takes precedence over G10L 15/20.

Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:

Synonyms and Keywords
In patent documents the following abbreviations are often used:

Title – G10L 17/00

Definition statement
This group covers:
Preprocessing operations, e.g. segment selection; Pattern representation or modeling, e.g. based on linear discriminant analysis (LDA), principal components; Feature selection or extraction
Training, model building, enrollment
Decision making techniques, pattern matching strategies
Hidden Markov Models
Artificial neural networks, connectionist approaches
Pattern transformations and operations aimed at increasing system robustness, e.g. against channel noise, different working conditions
Interactive procedures, man-machine interface, e.g. user prompted to utter a password or predefined text
Recognition of special voice characteristics, e.g. for use in a lie detector; recognition of animal voices
Systems using speaker recognisers

Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:

Examples of places where the subject matter of this group is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this group is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:
Digital computers in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines    G06F 15/18
Interactive information services, e.g. directory enquiries    H04M 3/493
Centralized arrangements for answering calls; Centralized arrangements for recording messages for absent or busy subscribers    H04M 3/50

Special rules of classification within this group
NONE.

Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:

Synonyms and Keywords
In patent documents the following abbreviations are often used:

Title – G10L 19/00

Definition statement
This group covers:
Dynamic bit allocation
Correction of errors induced by the transmission channel, if related to the coding algorithm
Multichannel audio signal coding and decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
Comfort noise, silence coding
Systems using vocoders
Audio watermarking, i.e. embedding inaudible data in the audio signal
Using spectral analysis, e.g. transform vocoders, subband vocoders
Using predictive techniques
Gain coding, post filtering design, vocoder structure

Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Electronic musical instruments    G10H
Signal processing not specific to the method of recording or reproducing; Circuits therefor    G11B 20/00
Error correction in communication networks    H04L
Spatial sound recording    H04R 5/00
Spatial sound reproduction    H04S

Examples of places where the subject matter of this group is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this group is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:

Special rules of classification within this group
G10L 25/90 takes precedence over G10L 19/083.

Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:

Synonyms and Keywords
In patent documents the following abbreviations are often used:

Title – G10L 21/00

Definition statement
This group covers:
Changing voice quality (pitch and formant)
Speech enhancement, e.g. noise reduction, echo cancellation
Time compression or expansion
Transformation of speech into a non-audible representation, e.g. speech visualization, speech processing for tactile aids

Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Devices or methods enabling ear patients to replace direct auditory perception by another kind of perception    A61F 11/04

Examples of places where the subject matter of this group is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this group is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:
Animation effects    G06T 15/70
Reducing echo effects in line transmission systems    H04B 3/20
Echo suppression in hands-free telephones    H04M 9/08
Hearing aids    H04R 25/00

Special rules of classification within this group
G10L 15/26 takes precedence over G10L 21/06.

Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:

Synonyms and Keywords
In patent documents the following abbreviations are often used:

Title – G10L 25/00

Definition statement
This group covers:
Extracted parameters, e.g. technique for evaluating correlation coefficients, zero crossing, prediction coefficients, formant information
Analysis technique, e.g. neural network, fuzzy, chaos, genetic algorithm, coding technique
Analysis window (window function)
Application of voice analysis technique, e.g. being used for comparison and discrimination, relating to evaluation of synthetic sound, relating to transmitting result of analysis
Modeling of vocal tract parameter
Detection of presence or absence of speech signals
Pitch determination of speech signals
Discriminating between voiced and unvoiced parts of speech signals

Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Measuring for diagnostic purposes    A61B 5/00

Examples of places where the subject matter of this group is covered when specially adapted, used for a particular purpose, or incorporated in a larger system:

Places in relation to which this group is residual:

Informative references
Attention is drawn to the following places, which may be of interest for search:

Special rules of classification within this group
G10L 25/90 takes precedence over G10L 25/93.

Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:

Synonyms and Keywords
In patent documents the following abbreviations are often used:

Title – G10L 99/00
Subject matter not provided for in other groups of this subclass