15 - WIPO

IPC Revision WG – Definition Project
United States Patent and Trademark Office
Rapporteur Proposal
Project: F004
Subclass: G10L – Speech Analysis or Synthesis; Speech Recognition; Audio Analysis or Processing
Date: 30 June 2011
Title – G10L
Definition statement
This subclass covers:
Determination or detection of speech or voice characteristics
Speech synthesis; Text-to-speech systems
Speech recognition
Speaker identification or verification
Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, e.g. for compression or expansion, source filter models or psychoacoustic analysis
Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
Relationship between large subject matter areas
“Audio analysis or processing” specifically covers audio signal analysis-synthesis techniques
for redundancy reduction, or audio signal coding or decoding. Classification should be
generally directed to appropriate groups, e.g. G10K, G10H, H04R, H04S when audio
productions or general audio analysis or processing are of relevance.
Devices for the storage of speech or audio signals are covered in subclasses G11B and G11C.
References relevant to classification in this subclass
This subclass does not cover:
Digital data processing methods or equipment specially adapted for handling natural language data
G06F 17/20
Teaching or communicating with the blind, deaf, or mute
G09B 21/00
Examples of places where the subject matter of this class is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this subclass is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Input/output arrangements for on-board computers
G01C 21/36
Direction-finders for determining the direction from which
infrasonic, sonic, ultrasonic, or electromagnetic waves, or
particle emission, not having a directional significance, are being
received using ultrasonic, sonic, or infrasonic waves
G01S 3/80
Systems using the reflection or reradiation of acoustic waves
G01S 15/00
Compilation or interpretation of high level programme languages
G06F 9/45
Individual entry or exit registers
G07C 9/00
Teaching speaking
G09B 19/04
Electro-acoustic amplifiers
H03F 13/00
Coding, decoding, or code conversion; Compression
H03M 7/30
Transmission
H04B
Means associated with receiver for limiting or suppressing noise or
interference
H04B 1/10
Details of transmission systems, not characterized by the medium
used for transmission, for reducing bandwidth of signals
H04B 1/66
Reducing echo effects or singing; Opening or closing transmitting
path; Conditioning for transmission in one direction or the other
H04B 3/20
Transmission systems using ultrasonic, sonic, or infrasonic
waves
H04B 11/00
Transmission systems not characterized by the medium used for
transmission characterized by the use of pulse modulation
H04B 14/02
Broadcast distribution systems
H04H
Time-division multiplex systems in which the transmission
channel allotted to a first user may be taken away and re-allotted
to a second user if the first user becomes inactive
H04J 3/17
Arrangements of transmitters, receivers, or complete sets to
prevent eavesdropping, to attenuate local noise or to prevent
undesired transmission; Special mouthpieces or receivers
therefor
H04M 1/19
Devices for calling a subscriber whereby a plurality of signals
may be stored simultaneously
H04M 1/27
Substation equipment, e.g. for use by subscribers including
speech amplifiers
H04M 1/60
Simultaneous speech and telegraphic or other data transmission
over the same conductors
H04M 11/06
Systems for transmission of a pulse code modulated video signal
with one or more other pulse code modulated signals, e.g. an
audio signal, a synchronizing signal
H04N 7/52
Public address systems
H04R 27/00
Special rules of classification within this subclass
G10L 17/00 takes precedence over G10L 15/00.
G10L 19/00 takes precedence over G10L 21/00.
Glossary of terms
In this subclass, the following terms (or expressions) are used with the meaning indicated:
Speech
Definite vocal sounds that form words to express
thoughts and ideas.
Voice
Sounds generated by vocal cords or synthetic versions
thereof.
Synonyms and Keywords
In patent documents the following abbreviations are often used:
ADPCM
Adaptive Differential Pulse Code Modulation
AMR
Adaptive Multirate
AR
Autoregressive
ASR
Automatic Speech Recognition
BLP
Backward Linear Prediction
BP
Back Propagation
CELP
Code Excited Linear Prediction
DCT
Discrete Cosine Transform
DFT
Discrete Fourier Transform
DPCM
Differential Pulse Code Modulation
FIR
Finite Duration Impulse Response
FLP
Forward Linear Prediction
HVXC
Harmonic Vector Excitation Coding
LMS
Least Mean Square
LPC
Linear Predictive Coding
MBE
Multi-Band Excitation
MDCT
Modified Discrete Cosine Transform
MELP
Mixed Excitation Linear Prediction
MSE
Mean Square Error
NB – WB
Narrowband – Wideband
PARCOR
Partial Correlation
PWI
Prototype Waveform Interpolation
RELP
Residual Excited Linear Prediction
TDNN
Time Delay Neural Network
TTS
Text-to-Speech
VoIP
Voice over Internet Protocol
VQ
Vector Quantization
VSELP
Vector Sum Excited Linear Prediction
V/UV
Voiced/Unvoiced
VXML or VoiceXML
W3C’s standard XML format for specifying interactive voice dialogues
Title – G10L 13/00
Definition statement
This group covers:
Methods for producing synthetic speech; Speech synthesizers
Elementary speech units used in speech synthesizers; Concatenation rules
Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation, stress or intonation determination
Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Processing or translating of natural language
G06F 17/28
Electrophonic musical instruments
G10H
Examples of places where the subject matter of this group is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this group is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Sound producing toys
A63H 5/00
Information retrieval; Database structures therefor
G06F 17/30
Electrically operated educational appliances with audible
presentation of the material to be studied
G09B 5/04
Aids for music
G10G
Special rules of classification within this group
NONE.
Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:
Synonyms and Keywords
In patent documents the following abbreviations are often used:
Title – G10L 15/00
Definition statement
This group covers:
Assessment or evaluation of speech recognition systems
Feature extraction for speech recognition; Selection of recognition unit
Segmentation or word limit detection
Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker’s voice
Speech classification or search
Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress-induced speech
Procedures used during a speech recognition process, e.g. man-machine dialog
Speech recognition using non-acoustical features, e.g. position of the lips
Speech-to-text systems
Constructional details of speech recognition systems
Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Pattern recognition
G06K 9/00
Examples of places where the subject matter of this group is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this group is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Processing or translating of natural language
G06F 17/28
Transmission of digital information, e.g. telegraphic
communication
H04L
Wireless communication networks
H04W
Special rules of classification within this group
G10L 15/08 takes precedence over G10L 15/26.
G10L 15/14 takes precedence over G10L 15/06.
G10L 15/18 takes precedence over G10L 15/14.
G10L 21/02 takes precedence over G10L 15/20.
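The four rules above can be read as a small filter over candidate classification symbols. The sketch below is illustrative only and is not part of the IPC. It assumes that "A takes precedence over B" means B is dropped whenever A and B both apply, and that only the rules as literally stated are applied, with no transitive closure. The name `apply_precedence` is hypothetical.

```python
# Precedence rules quoted from the text above, as (winner, loser) pairs.
RULES = [
    ("G10L 15/08", "G10L 15/26"),
    ("G10L 15/14", "G10L 15/06"),
    ("G10L 15/18", "G10L 15/14"),
    ("G10L 21/02", "G10L 15/20"),
]


def apply_precedence(candidates, rules=RULES):
    """Drop every candidate that loses to another candidate under a stated rule.

    Assumption: a rule fires only when both its symbols are among the
    candidates; rules are not chained beyond what is literally stated.
    """
    present = set(candidates)
    overridden = {
        loser
        for winner, loser in rules
        if winner in present and loser in present
    }
    # Preserve the caller's ordering of the surviving symbols.
    return [symbol for symbol in candidates if symbol not in overridden]


if __name__ == "__main__":
    print(apply_precedence(["G10L 15/06", "G10L 15/14", "G10L 15/18"]))
```

Under these assumptions, a document that fits G10L 15/06, 15/14 and 15/18 is left with G10L 15/18 only, while one that fits just G10L 15/06 and 15/18 keeps both, because no rule relates those two groups directly.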
Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:
Synonyms and Keywords
In patent documents the following abbreviations are often used:
Title – G10L 17/00
Definition statement
This group covers:
Preprocessing operations, e.g. segment selection; Pattern representation or modeling, e.g. based on linear discriminant analysis (LDA) or principal components; Feature selection or extraction
Training, model building, enrollment
Decision making techniques, pattern matching strategies
Hidden Markov Models
Artificial neural networks, connectionist approaches
Pattern transformations and operations aimed at increasing system robustness, e.g. against channel noise, different working conditions
Interactive procedures, man-machine interface, e.g. user prompted to utter a password or predefined text
Recognition of special voice characteristics, e.g. for use in a lie detector; recognition of animal voices
Systems using speaker recognisers
Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Examples of places where the subject matter of this group is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this group is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Digital computers in which a programme is changed according to
experience gained by the computer itself during a complete run;
Learning machines
G06F 15/18
Interactive information services, e.g. directory enquiries
H04M 3/493
Centralized arrangements for answering calls; Centralized
arrangements for recording messages for absent or busy
subscribers
H04M 3/50
Special rules of classification within this group
NONE.
Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:
Synonyms and Keywords
In patent documents the following abbreviations are often used:
Title – G10L 19/00
Definition statement
This group covers:
Dynamic bit allocation
Correction of errors induced by the transmission channel, if related to the coding algorithm
Multichannel audio signal coding and decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
Comfort noise, silence coding
Systems using vocoders
Audio watermarking, i.e. embedding inaudible data in the audio signal
Using spectral analysis, e.g. transform vocoders, subband vocoders
Using predictive techniques
Gain coding, post filtering design, vocoder structure
Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Electronic musical instruments
G10H
Signal processing not specific to the method of recording or
reproducing; Circuits therefor
G11B 20/00
Error correction in communication networks
H04L
Spatial sound recording
H04R 5/00
Spatial sound reproduction
H04S
Examples of places where the subject matter of this group is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this group is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Special rules of classification within this group
G10L 25/90 takes precedence over G10L 19/083.
Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:
Synonyms and Keywords
In patent documents the following abbreviations are often used:
Title – G10L 21/00
Definition statement
This group covers:
Changing voice quality (pitch and formant)
Speech enhancement, e.g. noise reduction, echo cancellation
Time compression or expansion
Transformation of speech into a non-audible representation, e.g. speech visualization, speech processing for tactile aids
Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Devices or methods enabling ear patients to replace direct
auditory perception by another kind of perception
A61F 11/04
Examples of places where the subject matter of this group is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this group is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Animation effects
G06T 15/70
Reducing echo effects in line transmission systems
H04B 3/20
Echo suppression in hands-free telephones
H04M 9/08
Hearing aids
H04R 25/00
Special rules of classification within this group
G10L 15/26 takes precedence over G10L 21/06.
Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:
Synonyms and Keywords
In patent documents the following abbreviations are often used:
Title – G10L 25/00
Definition statement
This group covers:
Extracted parameters, e.g. technique for evaluating correlation coefficients, zero crossing, prediction coefficients, formant information
Analysis technique, e.g. neural network, fuzzy, chaos, genetic algorithm, coding technique
Analysis window (window function)
Application of voice analysis technique, e.g. being used for comparison and discrimination, relating to evaluation of synthetic sound, relating to transmitting result of analysis
Modeling of vocal tract parameter
Detection of presence or absence of speech signals
Pitch determination of speech signals
Discriminating between voiced and unvoiced parts of speech signals
Relationship between large subject matter areas
NONE.
References relevant to classification in this group
This group does not cover:
Measuring for diagnostic purposes
A61B 5/00
Examples of places where the subject matter of this group is covered when specially adapted,
used for a particular purpose, or incorporated in a larger system:
Places in relation to which this group is residual:
Informative references
Attention is drawn to the following places, which may be of interest for search:
Special rules of classification within this group
G10L 25/90 takes precedence over G10L 25/93.
Glossary of terms
In this group, the following terms (or expressions) are used with the meaning indicated:
Synonyms and Keywords
In patent documents the following abbreviations are often used:
Title – G10L 99/00
Subject matter not provided for in other groups of this subclass