Veton Këpuska
Florida Institute of Technology
Digital Signal Processing: From Theory to Practical Audio and Video Applications
Table of Contents

2. Digital Signal Representations
   2.1 Introduction
   2.2 Numbers and Numerals
       2.2.1 Number Systems
             The Babylonian Systems
             The Egyptian System
             Maya Indians
             The Greek System - Abacus
             Roman System
             Hindu-Arabic Numerals
       2.2.2 Types of Numbers
             Whole Numbers
             Integer Numbers
             Fractions or Rational Numbers
             Irrational Numbers
             Real Numbers & Complex Numbers
       2.2.3 Positional Number Systems
   2.3 Sampling and Reconstruction of Signals
   2.4 Scalar Quantization
       2.4.1 Quantization Noise
             Granular Distortion
             Overload Distortion
             Analysis of Quantization Noise
       2.4.2 Signal-to-Noise Ratio
       2.4.3 Transmission Rate
       2.4.4 Nonuniform Quantizer
       2.4.5 Companding
   2.5 Data Representations
   2.6 Fixed-Point Number Representations
       2.6.1 Sign-Magnitude Format
       2.6.2 One's-Complement Format
       2.6.3 Two's-Complement Format
   2.7 Fixed-Point DSP's
   2.8 Fixed-Point Representations Based on Radix-Point
       2.8.1 Dynamic Range
       2.8.2 Precision
3. Implementation Considerations
   3.1 Assembly
   3.2 C – Language Support for Fractional Data Types
   3.3 C++ – Language Support for Fractional Data Types
   3.4 C vs. C++ Important Distinctions
Chapter 2

Digital Signal Representations
To bridge the gap from theory to practice one has to master the conventions used to represent the data in a DSP processor.

The details of the digital representations of discrete-time signals are presented in this chapter, bridging the gap between the abstract discrete-time signal notation presented earlier in the book, x[n], and its representation in a digital processor, and more specifically in a digital signal processor (DSP).

2.1 Introduction
Continuous signals (referred to in the literature also as analog signals; both terms are used here interchangeably unless stated otherwise) are necessarily sampled at discrete time intervals, as well as approximated by a finite number of discrete magnitude values, in order to be represented digitally.

Because digital processing devices operate at discrete time steps, continuous signals must be sampled at discrete time intervals. It turns out that continuous signals can be sampled at discrete time intervals, producing discrete-time signals, without any loss or degradation compared to the original signal: a continuous signal reconstructed from its discrete-time representation is identical to the original provided certain conditions are met. Those conditions are stated by the Sampling Theorem presented in Chapter 1 [1][2].

An additional limitation of digital processing devices, the degree of which is dictated by their architecture, is the restriction that data must be represented by a finite number of digits, or more specifically by a finite number of bits. Typically, digital processors are designed to store and process data with a fixed minimal and maximal number of bits allocated for each representation. These restrictions force all representations to have finite precision.
The process of representing a continuous actual value by a discrete representation is known as Quantization. When finite precision is used to represent actual values, the following considerations are necessary in order to assess the quantization effects on the output [1][3][4]:

1. Quantize in time and magnitude the continuous input signal x(t) to obtain the discrete-time sequence x[n];
2. Quantize the actual values of the coefficients {Ak, k=0,…,N} representing a DSP system (e.g., a filter) with a finite-precision representation {ak, k=0,…,N}; and
3. Consider the effects of arithmetic operations using finite-precision representations on the output, and modify the implementation as necessary to obtain an optimal result.
The effects of quantization of the continuous signal and of finite-precision operations are well studied and understood [1][2][3][4]. Consequently, it is possible to convert continuous signals to digital form, process them, and reconstruct them back to a continuous representation with the desired quality. Reconstructed signals typically fulfill quality criteria that are superior to those of their analog counterparts.

In the following sections all three enumerated issues regarding the representation of data with finite precision are discussed. However, it is also important to understand the development of the abstract concept of numbers and the historical roots of their representation. The discussion of numbers and number systems is therefore introduced from a historical perspective, which, it is believed, sheds light on the fundamental concepts that shaped the current understanding of numbers and how they are represented.
2.2 Numbers and Numerals
The development of human civilization is closely followed by the development of representations of numbers [5]. Numbers are represented by numerals (Webster's Dictionary defines a numeral as "a conventional symbol that represents a number"). In the past there were several kinds of numeral notations and symbols.

In the early days one pile of items was considered equivalent to another pile of a different number of items of a different kind. This value system was used for trading goods. Further development came with the standardization of "value": a fixed number of items of one kind (e.g., 5) placed in a special corresponding place was considered equivalent to one item of a special kind placed in another place. This correspondence led to the earliest ways of representing numbers in written form. Since those early days, the way we do arithmetic has been intimately related to the way we represent numbers [5][6].
2.2.1 Number Systems

Early number systems, named after the cultures/civilizations that used them [5], are listed below:

• Babylonian
• Egyptian
• Maya
• Greek
• Roman
• Hindu-Arabic
The Babylonian Systems
The earliest recorded numerals are on Sumerian clay tablets dating from the first half of the third millennium B.C. The Sumerian system was later taken over by the Babylonians. The everyday system for relatively small numbers was based on grouping by tens, hundreds, etc., inherited from Mesopotamian civilizations. Large numbers were seldom used. More difficult mathematical problems were handled using sexagesimal (radix-60) positional notation. Sexagesimal notation was highly developed as early as 1750 B.C. This notation was unique in that it was actually a floating-point form of representation with the exponents omitted; the proper scale factor, or power of sixty, was to be supplied by the context. The Babylonian cuneiform script (from the Latin cuneus, wedge) was formed by impressing wedge-shaped marks in clay tablets.

It is because the ancients made astronomical calculations in base 60 that we still use this system for measuring time: one hour comprises 60 minutes, and one minute 60 seconds. A circle comprises 360 degrees (°) because the earth circles the sun in roughly 360 days. Due to the Babylonians, each degree is divided into 60 minutes (′), each minute into 60 seconds (″), and each second into 60 thirds (‴) [5]. Babylonian notation was positional (i.e., place-value notation): the same symbol may mean 1, 60, 60², etc., according to its position. Since they had no concept of zero, this notation could be confusing because of its ambiguity.
The Egyptian System
The Egyptian system used | for 1, ||||| for 5, ∩ for 10, ∩∩∩∩∩ for 50, etc. Because a different symbol was used for ones, tens, hundreds, thousands, etc., the range of numbers that could be represented was limited. Note that the Romans later adopted this system to represent their numbers.
Maya Indians
Of the ancient civilizations, only the Maya Indians used the concept of "zero" as a quantity, sometime around 200 A.D. They also introduced fixed-point notation as early as the 1st century A.D. Their number system was a radix-20 system.
The Greek System - Abacus
Greek numerals from about the 5th century B.C. used alphabetic characters (24 characters) to represent numbers. Since 27 symbols were needed, three letters of Semitic origin were adopted. The Greek abacus originates at about the 2nd century B.C.: rows and columns of pebbles organized in a matrix correspond to our decimal system. The written form, however, did not follow the positional notation of the decimal system. On the other hand, Greek astronomers made use of a sexagesimal positional notation for fractions, adapted from the Babylonians.
Roman System
Because Roman numerals were in use in Europe for over a thousand years, we are still familiar with them and use them in certain instances (clock faces, enumerated lists in written documents, monuments, etc.). The Roman number system was based on the Etruscan letter notations I, V, X, L, C, D, and M for 1, 5, 10, 50, 100, 500, and 1000. The subtractive principle, whereby 9 and 40 are written as IX and XL, became popular during medieval times; it was hardly used by the Romans themselves. It is interesting to note that the original symbol for M (1000) was CIƆ; the symbol ∞ is a corruption of it. In 1655 John Wallis proposed that this symbol be used for "infinity" [5].
Hindu-Arabic Numerals
The numeration we use now: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, is often referred to as Arabic notation, but it is of Hindu origin; it was transmitted to Europe by Arab scholars. The value of a digit depends on its position in the number (its place in the number determines its value). Consequently, zero is needed to be able to represent numbers unambiguously; compare, for example, 704 with 74. In fact, it was in this way that the concept of zero forced itself onto Indian mathematicians. In theory, zero is also needed occasionally in the Babylonian system, but as the base is much larger, the context would usually supply the missing information. Consequently, the Babylonians struggled on without zero for over a thousand years.
Such early notations were inconvenient for performing arithmetic operations except in the simplest cases. Analysis of those early number systems also reveals two distinct approaches: sign-value notation (e.g., the Roman numeral system) and positional notation, or place-value notation, which is commonly used today. Furthermore, the abstract concept of a number and the objects being counted were not separable for a long time, as exemplified by many languages. In those languages there are many names for a number of particular objects but not for the idea of number itself. For example, Fiji Islanders use "bolo" for ten boats but "koro" for ten coconuts. In English, a couple refers to two people, a century to 100 years, etc.
2.2.2 Types of Numbers
In order to understand how numbers are represented in modern digital computing systems, it is important to know what kinds of numbers are in use.
Whole Numbers
Whole numbers are 1, 2, 3, 4, …, defining the set ℕ, also called the counting or natural numbers. Zero is sometimes included among the whole numbers, but there seems to be no general agreement. Some authors also interpret "whole number" to mean "a number having a fractional part of zero," making the whole numbers equivalent to the integers.
Integer Numbers
The advancement of mathematics brought by the discipline of algebra forced the recognition of negative numbers (e.g., obtaining the solution of the equation 2x+9=3 requires the introduction of negative numbers). The set of whole numbers, when extended with zero and the negative whole numbers, defines the set ℤ of integers: …, -4, -3, -2, -1, 0, 1, 2, 3, 4, …
Fractions or Rational Numbers
A fraction, or a rational number, is defined as the ratio of two whole numbers p, q:

$$\frac{p}{q}$$

The set of all rational numbers is denoted by ℚ, derived from the German word Quotient, which can be translated as ratio.

Most of the early systems used and named only a few obvious common fractions. In the famous Rhind papyrus (found in the memorial, or mortuary, temple of Pharaoh Ramesses II), a famous document from the Egyptian Middle Kingdom that dates to 1650 B.C., only simple names for the unit fractions 1/2, 1/3, 1/4, 1/5, …, and for 2/3, were used. Other fractions, when required, were obtained by adding these simple fractions. For example:

$$\frac{5}{7} = \frac{1}{2} + \frac{1}{7} + \frac{1}{14}$$
Irrational Numbers
The discovery of irrational numbers is attributed to Pythagoras, who found that the diagonal of a square is not a rational multiple of its side (the diagonal of a square with sides equal to 1 is √2). In other words, the ratio of the diagonal to the side cannot be expressed by whole numbers. Irrational numbers have decimal expansions that neither terminate nor become periodic. Examples of irrational numbers are √2, √3, π, and e.
Real Numbers & Complex Numbers
The collection of the rational and irrational numbers defines the set ℝ of real numbers. Real numbers can be extended to complex numbers with the addition of the imaginary unit i = √(-1). A complex number z is expressed as:

$$z = x + iy$$

where x, y are real numbers and i is the imaginary unit.
2.2.3 Positional Number Systems
In positional notation, the value of a number depends on the numeral as well as on its position within the number. Typically, the value of a position is a power of ten. For example, the number represented by the numeral 1957 is equal to 7 ones, 5 tens, 9 hundreds, and 1 thousand. This concept leads to the following generalization of the value represented by a numeral:

Explicit Position of Radix Point

$$x = \pm\left(d_{n-1}B^{n-1} + \cdots + d_2B^2 + d_1B^1 + d_0B^0 \,.\, d_{-1}B^{-1} + d_{-2}B^{-2} + \cdots + d_{-m}B^{-m}\right)$$

where ± is the sign of the number, the d_i ∈ {0, 1, 2, …, B-1} are the numerals, "." is the decimal, or in general radix, point, and B is the base of the number system.

Note that the portion to the left of the radix point, called the integral part, denotes the integer part of the number, represented by n numerals. The portion to the right of the radix point, called the fractional part, represents a fractional number less than 1, represented by m numerals. With this notation, the set of real numbers ℝ can be represented [7].

Computers can only use a finite subset of the numbers, due to the finite resources available to represent a number. Consequently, only a finite and limited set of numbers can be represented. This set is defined by the total number of elements that it can represent as well as by the range of values that it covers. The native representation of a numeral in a computer is in the binary system, base B = 2. The numerical value, in our accustomed reference base-10 number system, of a base-2 (binary) number is given by the following expression:

$$x = \pm\left(b_{n-1}2^{n-1} + \cdots + b_2 2^2 + b_1 2^1 + b_0 2^0 \,.\, b_{-1}2^{-1} + b_{-2}2^{-2} + \cdots + b_{-m}2^{-m}\right)$$

where ± is the sign of the number, b_i ∈ {0, 1} takes values from the set of binary numerals, and "." is the binary point. The range of values and their precision are defined by n, the number of bits used to represent the integer portion of the number, and m, the number of bits used to represent the fractional part.
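To make the positional formula concrete, the following C sketch evaluates the base-10 value of an unsigned binary word with n integer bits and m fractional bits. The function name and the example bit pattern are illustrative, not part of any particular DSP library:

    #include <stdio.h>

    /* Value of an unsigned binary word with n integer bits and m fractional
     * bits: x = sum over i of b_i * 2^(i - m), for i = 0 .. n+m-1. */
    double binary_value(unsigned bits, int n, int m)
    {
        double x = 0.0;
        for (int i = 0; i < n + m; i++)
            if (bits & (1u << i))
                x += (double)(1u << i) / (double)(1u << m);
        return x;
    }

    int main(void)
    {
        /* 1011.01 binary: n = 4 integer bits, m = 2 fractional bits */
        printf("%f\n", binary_value(0x2D, 4, 2)); /* 0x2D = 101101; prints 11.250000 */
        return 0;
    }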
2.3 Sampling and Reconstruction of Signals
A typical DSP system interfaces with the continuous world via Analog-to-Digital (ADC) and Digital-to-Analog (DAC) converters, as depicted in Figure 2.1.
[Figure: an analog front end (Sensor, Analog Signal Conditioning) feeding a digital chain (ADC, DSP, DAC, Digital Signal Conditioning).]

Figure 2.1 DSP interfacing with a continuous signal that is optionally conditioned by the sensor conditioner.
In order to satisfy the Sampling Theorem requirements, the continuous input signal must be guaranteed to be band-limited; thus, the ADC is preceded by a low-pass filter [1]. This pre-filtering is a critical step in any digital processing system: it ensures that the effects of aliasing are minimized to levels that are not perceptible by the intended audience. The filter is implemented as an analog low-pass filter. The band-limited signal is then sampled at a fixed sample rate, or equivalently sampling frequency, fs. The sampling is performed by a sample-and-hold device. The signal is then quantized and represented in digital form as a sequence of binary digits (bits) taking the values 1 and 0. The quantized representation of the data is then converted to the desired digital representation of the DSP to facilitate further processing. The conversion process is depicted in Figure 2.2.
[Figure: x(t) → Analog Low-pass Filter → Sample and Hold → Analog-to-Digital Converter → DSP, with panels a), b), c).]

Figure 2.2 Analog-to-Digital conversion. a) Continuous signal x(t). b) Sampled signal xa(nT), with the sampling period T satisfying the Nyquist rate as specified by the Sampling Theorem. c) Digital sequence x[n] obtained after sampling and quantization.
Example 2.1
Assume that the input continuous-time signal is a pure periodic signal represented by the following expression:

$$x(t) = A\sin(\Omega_0 t + \phi) = A\sin(2\pi f_0 t + \phi)$$

where A is the amplitude of the signal, Ω₀ is the frequency in radians per second (rad/sec), φ is the phase in radians, and f₀ is the frequency in cycles per second, measured in Hertz (Hz).

Assuming that the continuous-time signal x(t) is sampled every T seconds, or alternatively with the sampling rate fs = 1/T, the discrete-time signal x[n] obtained by setting t = nT is:

$$x[n] = A\sin(\Omega_0 nT + \phi) = A\sin(2\pi f_0 nT + \phi)$$
An alternative representation of x[n]:

$$x[n] = A\sin\!\left(2\pi \frac{f_0}{f_s} n + \phi\right) = A\sin(2\pi F_0 n + \phi) = A\sin(\omega_0 n + \phi)$$

reveals additional properties of the discrete-time signal. F₀ = f₀/fs defines the normalized frequency, and ω₀ the digital frequency, defined as:

$$\omega_0 = 2\pi F_0 = \Omega_0 T, \quad 0 \le \omega_0 \le 2\pi$$
A DSP processor performs a programmed operation, typically a complex algorithm, on the suitably represented input signal. The result is obtained as a sequence of digital values. Those values, after being converted into an appropriate data representation (e.g., 24-bit signed integers), are converted back into the continuous domain via a digital-to-analog converter (DAC). The procedure is depicted in Figure 2.3.
[Figure: DSP → y[n] → Digital-to-Analog Converter → ya(nT) → Analog Low-pass Filter → y(t), with panels a), b), c).]

Figure 2.3 Digital-to-Analog conversion. a) Processed digital signal y[n]. b) Continuous signal representation ya(nT). c) Low-pass-filtered continuous signal y(t).
Quantization in time (via sampling) and in amplitude of the continuous input signal x(t) to obtain the discrete-time signal x[n], as well as quantization of the coefficients of digital signal processing structures, also requires resolving how numbers are represented by a digital signal processor. The next section discusses the issues of quantization, numbers, and their representations.
2.4 Scalar Quantization
The component of the system that transforms an input value x[n] into one of a finite set of prescribed values x̂[n] is called a scalar quantizer. As depicted in Figure 2.2, this function is performed by the ideal sample-and-hold followed by the Analog-to-Digital Converter. The function can be further refined by the representation depicted in Figure 2.4: the ideal C/D converter represents the sampling performed by the sample-and-hold, while the quantizer and coder combined represent the ADC.
[Figure: x(t) → C/D → Quantizer → Coder.]

Figure 2.4 Conceptual representation of the ADC.
This conceptual abstraction allows us to assume that the sequence x[n] is obtained with infinite precision. The values of x[n] are scalar-quantized to a set of finite-precision amplitudes, denoted here by x̂Q[n]. Furthermore, quantization allows this finite-precision set of amplitudes to be represented by a corresponding set of (bit) patterns, or symbols, x̂[n]. Without loss of generality it can be assumed that the input signals cover a finite range of values defined by the minimal and maximal values xmin and xmax, respectively. This assumption in turn implies that the set of symbols representing x̂[n] is finite. The process of mapping a finite set of values to a finite set of symbols is known as encoding; it is performed by the coder, as in Figure 2.4. Thus one can view quantization and coding as a mapping of the infinite-precision value x[n] to a finite-precision representation x̂[n] picked from a finite set of symbols.
Quantization, therefore, is a mapping of a value x[n], with xmin ≤ x[n] ≤ xmax, to x̂[n]. The quantizer operator, denoted by Q(x), is defined by:

$$\hat{x}[n] = \hat{x}_i = Q(x[n]), \quad x_{i-1} < x[n] \le x_i$$

where x̂_i denotes one of L possible quantization levels, 1 ≤ i ≤ L, and x_i represents one of the L+1 decision levels.
The above expression is interpreted as follows: if x_{i-1} < x[n] ≤ x_i, then x[n] is quantized to the quantization level x̂_i, and x̂[n] is considered the quantized sample of x[n]. From the limited range of input values and the finite number of symbols it follows that the quantizer is characterized by its quantization step size Δ_i, defined as the difference between two consecutive decision levels:

$$\Delta_i = x_i - x_{i-1}$$
Example 2.2

Assume there are L = 2⁴ = 16 reconstruction levels, that the input values x[n] fall within the range [xmin = -1, xmax = 1], and that each value in this range is equally likely. The decision and reconstruction levels are then equally spaced, with Δ = Δ_i = (xmax - xmin)/L = 1/8:

Decision Levels:

$$\left\{-\tfrac{15\Delta}{2}, -\tfrac{13\Delta}{2}, -\tfrac{11\Delta}{2}, -\tfrac{9\Delta}{2}, -\tfrac{7\Delta}{2}, -\tfrac{5\Delta}{2}, -\tfrac{3\Delta}{2}, -\tfrac{\Delta}{2}, \tfrac{\Delta}{2}, \tfrac{3\Delta}{2}, \tfrac{5\Delta}{2}, \tfrac{7\Delta}{2}, \tfrac{9\Delta}{2}, \tfrac{11\Delta}{2}, \tfrac{13\Delta}{2}, \tfrac{15\Delta}{2}\right\}$$

Reconstruction Levels:

$$\left\{-8\Delta, -7\Delta, -6\Delta, -5\Delta, -4\Delta, -3\Delta, -2\Delta, -\Delta, \Delta, 2\Delta, 3\Delta, 4\Delta, 5\Delta, 6\Delta, 7\Delta, 8\Delta\right\}$$

with x̂ = Q(x).
Figure 2.5 Example of uniform quantization with L=16 levels. As discussed in the following sections, L=16 levels require a 4-bit codeword to represent each level. Because the distribution of the input values is uniform, the decision and reconstruction levels are uniformly spaced.
In Example 2.2 a uniform quantizer was described. Formally, a uniform quantizer is defined as one whose decision and reconstruction levels are uniformly spaced. Specifically:

$$\Delta = \Delta_i = x_i - x_{i-1}, \quad 1 \le i \le L$$

$$\hat{x}_i = \frac{x_i + x_{i-1}}{2}, \quad 1 \le i \le L$$

Thus Δ, the step size, is equal to the spacing between any two consecutive decision levels and is likewise the constant spacing between any two consecutive reconstruction levels of a uniform quantizer.

Each reconstruction level is assigned a symbol, or codeword. Binary numbers are typically used to represent the quantized samples. The term codebook refers to the collection of all codewords or symbols. In general, with a B-bit binary codebook there are 2^B different quantization (or reconstruction) levels. This representational issue is detailed in the following sections.
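The following C sketch implements a uniform quantizer of the kind just described, returning the midpoint of the decision cell the input falls into. The function and parameter names are illustrative:

    #include <math.h>
    #include <stdio.h>

    /* Uniform quantizer over [xmin, xmax] with L = 2^B levels.
     * Values outside the range are clipped (overload distortion). */
    double quantize_uniform(double x, double xmin, double xmax, int B)
    {
        int L = 1 << B;                      /* number of reconstruction levels */
        double delta = (xmax - xmin) / L;    /* step size */
        int i = (int)floor((x - xmin) / delta);
        if (i < 0)     i = 0;                /* clip below range */
        if (i > L - 1) i = L - 1;            /* clip above range */
        return xmin + (i + 0.5) * delta;     /* midpoint of cell i */
    }

    int main(void)
    {
        /* 4-bit quantizer on [-1, 1): step = 0.125, as in Example 2.2 */
        printf("%f\n", quantize_uniform(0.30, -1.0, 1.0, 4)); /* prints 0.312500 */
        return 0;
    }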
When designing or applying a uniform scalar quantizer, knowledge of the maximum value of the sequence is required. Typically, the range of the input signal (e.g., speech, audio, video) is expressed in terms of the standard deviation, σx, of the probability density function (pdf) of the signal's amplitudes. Specifically, it is often assumed that the range of input values is -4σx ≤ x[n] ≤ 4σx, where σx is the signal's standard deviation.

In addition to quantization, many algorithms depend on accurate yet simple mathematical models describing the statistics of signals. Several studies have been conducted on speech signals under the assumption that speech signal amplitudes are realizations of a random process. More recently, the accuracy of several pdf models was evaluated as a function of the duration of the speech segment used for capturing the speech statistics [9].
The following functions, also depicted in Figure 2.6, are evaluated as models of speech-signal pdfs:

Gamma distribution:

$$f(x) = \left(\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right)^{1/2} \exp\!\left(-\frac{\sqrt{3}\,|x|}{2\sigma_x}\right), \quad -\infty < x < \infty$$

Laplacian distribution:

$$f(x) = \frac{1}{\sqrt{2}\,\sigma_x} \exp\!\left(-\frac{\sqrt{2}\,|x|}{\sigma_x}\right), \quad -\infty < x < \infty$$

Gaussian distribution:

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\!\left(-\frac{x^2}{2\sigma_x^2}\right), \quad -\infty < x < \infty$$

where σx is the standard deviation.

Figure 2.6 Pdf models of speech sample distributions.
For speech signals, under the assumption that speech samples obey the Laplacian pdf, approximately 0.35% of the samples fall outside the range defined by ±4σx.
Example 2.3

Assume a B-bit binary codebook, thus having 2^B codewords or symbols, and let the maximum signal value be set to xmax = 4σx. What is the quantization step size of a uniform quantizer?

$$\Delta = \frac{2x_{max}}{2^B} = 2x_{max}2^{-B} = \frac{8\sigma_x}{2^B}$$
From the discussion presented thus far it is clear that the quality of a representation is related to the step size of the quantizer, Δ, which in turn depends on the number of bits B used to represent a signal value. The quality of quantization is typically expressed as a function of the step size Δ and relates directly to the notion of quantization noise.
2.4.1 Quantization Noise
There are two classes of quantization noise:

• Granular Distortion, and
• Overload Distortion
Granular Distortion
Granular distortion occurs for values of the unquantized signal x[n] that fall within the range of the quantizer, [xmin, xmax]. The quantization noise, e[n], is the error that occurs because the infinite-precision value x[n] is approximated by the finite-precision quantized representation x̂[n]. Specifically, the quantization error e[n] is defined as the difference between the quantized value x̂[n] and the true value x[n]:

$$e[n] = \hat{x}[n] - x[n]$$

For a given step size Δ, the magnitude of the quantization noise e[n] can be no greater than Δ/2, that is:

$$-\frac{\Delta}{2} \le e[n] \le \frac{\Delta}{2}$$
Example 2.4

For a periodic sine-wave signal, compare 3-bit and 8-bit quantizers. The input periodic signal is given by the following expression:

$$x[n] = \cos(\omega_0 n), \quad \omega_0 = 2\pi F_0 = 0.76 \cdot 2\pi$$

The MATLAB fix function is used to simulate quantization. The following figure depicts the result of the analysis.
Figure 2.7 Plot a) represents the sequence x[n] with infinite precision, b) the quantized version x̂[n], c) the quantization error e[n] for B=3 bits (L=8 quantization levels), and d) the quantization error for B=8 bits (L=256 quantization levels).
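A rough C analogue of the MATLAB fix-based simulation in Example 2.4, quantizing by truncation toward zero after scaling by 2^(B-1). This is a sketch, not the book's actual script:

    #include <math.h>
    #include <stdio.h>

    /* Quantize x in [-1, 1) to B bits by truncation toward zero,
     * mimicking fix(x * 2^(B-1)) / 2^(B-1). */
    double quantize_fix(double x, int B)
    {
        double scale = (double)(1 << (B - 1));
        return trunc(x * scale) / scale;
    }

    int main(void)
    {
        const double PI = 3.14159265358979323846;
        double emax3 = 0.0, emax8 = 0.0;
        for (int n = 0; n < 1000; n++) {
            double x = cos(2.0 * PI * 0.76 * n);
            double e3 = quantize_fix(x, 3) - x;   /* error, B = 3 */
            double e8 = quantize_fix(x, 8) - x;   /* error, B = 8 */
            if (fabs(e3) > emax3) emax3 = fabs(e3);
            if (fabs(e8) > emax8) emax8 = fabs(e8);
        }
        printf("max |e|: B=3 -> %f, B=8 -> %f\n", emax3, emax8);
        return 0;
    }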
Overload Distortion
Overload distortion occurs when samples fall outside the range covered by the quantizer. Those samples are typically clipped, incurring a quantization error in excess of Δ/2. Due to the small number of clipped samples, it is common to neglect these infrequent large errors in theoretical calculations.

Often the goal of signal processing in general, and of audio or image processing specifically, is to keep the bit rate as low as possible while maintaining a required level of quality. Meeting these two criteria requires fulfilling competing requirements.
Analysis of Quantization Noise
The desired approach in analyzing the quantization error in numerous applications is to assume that the quantization error is an ergodic white-noise random process; that is, the quantization error e[n] is uncorrelated with itself. In addition, it is assumed that the quantization noise and the input signal are uncorrelated, i.e., E(x[n]e[n+m]) = 0 for all m. The final assumption is that the pdf of the quantization noise is uniform over the quantization interval:

$$p_e(e) = \begin{cases} \dfrac{1}{\Delta}, & -\dfrac{\Delta}{2} \le e \le \dfrac{\Delta}{2} \\ 0, & \text{otherwise} \end{cases}$$

The stated assumptions are not always valid. Consider a slowly varying input signal x[n]: the quantization error e[n] then also changes gradually, and is thus signal-dependent, as depicted in Figure 2.8. Furthermore, correlated quantization noise can be perceptually annoying (e.g., in image sequences such as TV, or in audio).
Figure 2.8 Example of a slowly varying signal that causes the quantization error to be correlated. Plot a) represents the sequence x[n] with infinite precision, b) the quantized version x̂[n], c) the quantization error e[n] for B=3 bits (L=8 quantization levels), and d) the quantization error for B=8 bits (L=256 quantization levels). Note the reduction in correlation with the increase in the number of quantization levels, which implies a decrease of the step size Δ.
As illustrated in Figure 2.8, when the quantization step Δ is small, the assumptions that the noise is uncorrelated with itself and with the signal are roughly valid, particularly when the signal fluctuates rapidly among all quantization levels. In this case the quantization error approaches a white-noise process with an impulsive autocorrelation and a flat spectrum. The next figure demonstrates quantization effects on a speech signal (taken from the file TEST\DR3\FPKT0\si1538.wav of the TIMIT corpus).
Figure 2.9 Example of a speech signal demonstrating the effect of step size on the degree of correlation of the quantization error. Plot a) represents the sequence x[n] with infinite precision, b) the quantized version x̂[n], c) the quantization error e[n] for B=3 bits (L=8 quantization levels), which is clearly highly correlated with the original signal x[n], and d) the quantization error for B=8 bits (L=256 quantization levels). Note the reduction in correlation with the increase in the number of quantization levels, which implies a decrease of the step size Δ.
Figure 2.10 Histogram of quantization error for a speech signal. Plots depict the distribution of quantization errors with a) L = 2³, b) L = 2⁸, and c) L = 2¹⁶ quantization levels. Note the reduction of error magnitude as well as the increase in uniformity of the distribution as the number of quantization levels increases.

As depicted in Figure 2.10, with an increase in the number of quantization levels L, a decrease of correlation, marked by a flattening of the distribution toward uniform, can be observed.

An additional approach can be used to force e[n] to be white noise uncorrelated with x[n]: adding white noise to x[n] prior to quantization. The effect of this approach is demonstrated in Figure 2.11, obtained by adding an insignificant amount of Gaussian noise with zero mean and variance of 5 to the original signal. A dramatic improvement is clearly visible, particularly for L = 2¹⁶ quantization levels, by comparing the distributions with those in Figure 2.10.
Figure 2.11 Histogram of quantization error for the speech signal after adding Gaussian noise with zero mean and variance of 5. Plots depict the distribution of quantization errors with a) L = 2³, b) L = 2⁸, and c) L = 2¹⁶ quantization levels. Note the increase in uniformity of the distributions compared to the case where no noise was added.
The process of adding white noise is known as Dithering. This de-correlation technique has been shown to be useful in improving the perceptual quality of the quantization noise not only of speech signals but also of image signals.
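A minimal, self-contained sketch of dithered quantization using the same truncating quantizer as in the earlier sketch. The dither amplitude of half a quantization step is an assumption chosen for illustration:

    #include <math.h>
    #include <stdlib.h>
    #include <stdio.h>

    /* B-bit truncating quantizer on [-1, 1), as in the earlier sketch. */
    static double quantize_fix(double x, int B)
    {
        double s = (double)(1 << (B - 1));
        return trunc(x * s) / s;
    }

    /* Add uniform dither of up to +/- half a step before quantizing;
     * this tends to de-correlate the error e[n] from the input x[n]. */
    double quantize_dithered(double x, int B)
    {
        double delta = 1.0 / (double)(1 << (B - 1));
        double dither = ((double)rand() / RAND_MAX - 0.5) * delta;
        return quantize_fix(x + dither, B);
    }

    int main(void)
    {
        printf("%f\n", quantize_dithered(0.3, 8));
        return 0;
    }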
2.4.2 Signal-to-Noise Ratio
A measure that quantifies the severity of the quantization noise is the signal-to-noise ratio (SNR). It relates the strength of the signal to the strength of the quantization noise, and is formally defined as:

$$\mathrm{SNR} = \frac{\sigma_x^2}{\sigma_e^2} = \frac{E\{x^2[n]\}}{E\{e^2[n]\}} \approx \frac{\dfrac{1}{N}\displaystyle\sum_{n=0}^{N-1}x^2[n]}{\dfrac{1}{N}\displaystyle\sum_{n=0}^{N-1}e^2[n]}$$
Given the following assumptions:

• Quantizer range: 2xmax,
• Quantization interval: Δ = 2xmax/2^B, for a B-bit quantizer, and
• Uniform pdf of the quantization error e[n],
it can be shown that (see Example 2.5):

$$\sigma_e^2 = \frac{\Delta^2}{12} = \frac{\left(2x_{max}2^{-B}\right)^2}{12} = \frac{x_{max}^2}{3 \cdot 2^{2B}}$$
Thus SNR can be expressed as:

$$\mathrm{SNR} = \frac{\sigma_x^2}{\sigma_e^2} = \frac{3 \cdot 2^{2B}\,\sigma_x^2}{x_{max}^2} = 3 \cdot 2^{2B}\left(\frac{\sigma_x}{x_{max}}\right)^2$$
or in decibels (dB) as:

$$\mathrm{SNR(dB)} = 10\log_{10}\!\left(\frac{\sigma_x^2}{\sigma_e^2}\right) = 10\log_{10}3 + 20B\log_{10}2 - 20\log_{10}\frac{x_{max}}{\sigma_x} \approx 6B + 4.77 - 20\log_{10}\frac{x_{max}}{\sigma_x}$$

Assuming that the maximal value xmax, obtained from the pdf of the distribution of x[n], is set to xmax = 4σx, the SNR(dB) expression becomes:

$$\mathrm{SNR(dB)} \approx 6B - 7.2$$
Example 2.5

For a uniform quantizer with quantization interval Δ, derive the variance of the error signal. Consider that the error signal is random, with a uniform probability distribution over the interval defined by Δ, as shown in the figure below.

The mean and variance of p(e) are its first two moments, defined as expected values of powers m of the random variable e:

$$E\{e^m\} = \int_{-\infty}^{\infty} e^m\, p(e)\, de$$

Thus the mean and variance of p(e) are:
Figure 2.12 Uniform probability density function p(e) of the error signal in the range [-Δ/2, Δ/2].
$$m_e = \int_{-\Delta/2}^{\Delta/2} e\,\frac{1}{\Delta}\,de = 0$$

$$\sigma_e^2 = \int_{-\Delta/2}^{\Delta/2} e^2\,\frac{1}{\Delta}\,de = \frac{2}{\Delta}\int_{0}^{\Delta/2} e^2\,de = \frac{2}{\Delta}\cdot\frac{1}{3}\left(\frac{\Delta}{2}\right)^3 = \frac{\Delta^2}{12}$$
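A quick Monte-Carlo sanity check of the Δ²/12 result; the step size and sample count are arbitrary illustrative choices:

    #include <stdlib.h>
    #include <stdio.h>

    int main(void)
    {
        const double delta = 0.25;   /* arbitrary step size */
        const int N = 1000000;
        double sum2 = 0.0;
        for (int i = 0; i < N; i++) {
            /* e uniform on [-delta/2, delta/2] */
            double e = ((double)rand() / RAND_MAX - 0.5) * delta;
            sum2 += e * e;
        }
        printf("empirical var = %.8f, delta^2/12 = %.8f\n",
               sum2 / N, delta * delta / 12.0);
        return 0;
    }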
2.4.3 Transmission Rate
Another important factor in utilizing DSP processors is the bit rate R, defined as the number of bits per second streamed from the input into the DSP. The bit rate is computed with the following expression, where fs is the sample rate in Hz (samples per second) and B is the number of bits used to represent a sample:

$$R = B \cdot f_s$$

The quantization scheme presented so far is called pulse code modulation (PCM), in which B bits per sample are transmitted as a codeword.
Advantages of this scheme are:

• It is instantaneous (no coding delay), and
• It is independent of the signal content (voice, music, etc.).

Disadvantages:

• It requires a high bit rate for good quality.
Example 2.6

For "toll quality" (equivalent to typical telephone quality), a minimum of 11 bits per sample is required. For a 10,000 Hz sampling rate, the required bit rate is R = (11 bits/sample) × (10,000 samples/sec) = 110,000 bps = 110 kbps. For a CD-quality signal with a sample rate of 20,000 Hz and 16 bits/sample, SNR(dB) = 96 - 7.2 = 88.8 dB and the bit rate is 320 kbps.
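The arithmetic of Example 2.6, expressed as a small C sketch using the example's own rates and bit depths:

    #include <stdio.h>

    int main(void)
    {
        /* R = B * fs, in bits per second */
        int toll_bps = 11 * 10000;  /* 110,000 bps = 110 kbps */
        int cd_bps   = 16 * 20000;  /* 320,000 bps = 320 kbps */
        printf("toll: %d bps, CD: %d bps\n", toll_bps, cd_bps);
        printf("CD SNR ~ %.1f dB\n", 6.0 * 16 - 7.2); /* 6B - 7.2 = 88.8 dB */
        return 0;
    }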
Because the sampling rate is fixed for most applications, lowering the bit rate implies reducing the number of bits per sample. This area is of significant importance for communication systems and is known as Coding [1][5]. However, this coding refers to information-encoding procedures beyond the representation of numerical values being discussed here.

As indicated earlier, uniform quantization is optimal only if the distribution of the input samples x[n] is uniform. Thus, uniform quantization may not be optimal in general: the quantization error is not as small as it could be for a given number of decision and reconstruction levels. Consider, for example, a speech signal, for which x[n] is much more likely to be in one particular region than in another (low values occurring much more often than high values), as exemplified by Figure 2.6. This implies that the decision and reconstruction levels are not being utilized effectively when spaced uniformly over the range up to xmax. Clearly, an optimal solution must account for the distribution of the input samples.
2.4.4 Nonuniform Quantizer
A quantizer that is optimal (in a least-squared-error sense) for a particular pdf is referred to as the Max Quantizer. For a random variable x with a known pdf, the task is to find the set of L quantizer levels that minimizes the quantization error; that is, to find the decision and reconstruction levels x_i and x̂_i, respectively, that minimize the mean-squared-error (MSE) distortion measure:

$$D = E\left\{(x - \hat{x})^2\right\}$$
where E denotes the expected value and x̂ is the quantized version of x. It turns out that the optimal decision levels are given by the following expression:

$$x_k = \frac{\hat{x}_{k+1} + \hat{x}_k}{2}, \quad 1 \le k \le L-1$$

On the other hand, the optimal reconstruction level x̂_k is the centroid of p_x(x) over the interval x_{k-1} ≤ x ≤ x_k, computed by the following expression:

$$\hat{x}_k = \frac{\displaystyle\int_{x_{k-1}}^{x_k} x\,p_x(x)\,dx}{\displaystyle\int_{x_{k-1}}^{x_k} p_x(x)\,dx} = \int_{x_{k-1}}^{x_k} x\,\tilde{p}_x(x)\,dx$$

The above expression is interpreted as the mean value of x over the interval x_{k-1} ≤ x ≤ x_k for the normalized pdf p̃_x(x).
Solving the last two equations for x_k and x̂_k is a nonlinear problem in these two variables. There are iterative solutions, which require obtaining the pdf of x, an accurate estimate of which can be difficult [1][10].
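One classical iterative solution alternates the two optimality conditions above (the Lloyd-Max procedure). The C sketch below applies it to a Gaussian pdf evaluated on a discrete grid; the grid, the pdf, the number of levels, and the iteration count are all illustrative assumptions:

    #include <math.h>
    #include <stdio.h>

    #define L 4        /* quantizer levels */
    #define GRID 4000  /* pdf evaluation points on [-4, 4] */

    int main(void)
    {
        double lo = -4.0, hi = 4.0, dx = (hi - lo) / GRID;
        double xhat[L], xb[L + 1]; /* reconstruction and decision levels */

        /* start from a uniform quantizer */
        for (int i = 0; i < L; i++)
            xhat[i] = lo + (i + 0.5) * (hi - lo) / L;

        for (int it = 0; it < 100; it++) {
            /* decision levels: midpoints between reconstruction levels */
            xb[0] = lo; xb[L] = hi;
            for (int k = 1; k < L; k++)
                xb[k] = 0.5 * (xhat[k - 1] + xhat[k]);

            /* reconstruction levels: centroid of the pdf over each cell */
            for (int k = 0; k < L; k++) {
                double num = 0.0, den = 0.0;
                for (int g = 0; g < GRID; g++) {
                    double x = lo + (g + 0.5) * dx;
                    if (x >= xb[k] && x < xb[k + 1]) {
                        double p = exp(-0.5 * x * x); /* unnormalized Gaussian */
                        num += x * p; den += p;
                    }
                }
                if (den > 0.0) xhat[k] = num / den;
            }
        }
        for (int k = 0; k < L; k++) printf("xhat[%d] = %+.4f\n", k, xhat[k]);
        return 0;
    }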
2.4.5 Companding
The idea behind companding is based on the fact that a uniform quantizer is optimal for a uniform pdf: if a nonlinear transformation T is applied to the unquantized input x[n] to form a new sequence y[n] whose pdf is uniform, a uniform quantizer can then be applied to y[n] to obtain ŷ[n], as depicted in Figure 2.13.

A companding operation compresses the dynamic range of the input samples for encoding and expands the dynamic range upon decoding. Optimal application of the companding procedure requires accurate estimation of the pdf of the input values x[n], from which the nonlinear transformation T can be derived. In practice, however, such transformations are standardized in the CCITT international standard coder at 64 kbps: specifically, A-law and μ-law companding. A-law is used in Europe, while μ-law is used in North America.

The μ-law transformation is given by:

$$T(x[n]) = x_{max}\,\frac{\log\!\left(1 + \mu\,\dfrac{|x[n]|}{x_{max}}\right)}{\log(1 + \mu)}\,\mathrm{sign}(x[n])$$

The μ-law transformation with μ = 255, the North American PCM standard, followed by 8-bit uniform quantization (7 bits for the value and 1 bit for the sign), achieves "toll quality" speech in telephone channels. The achieved toll quality is equivalent to that of straight uniform quantization using 12 bits.

Due to standardization, digital telephone networks and voice modems use standard CODEC (COder-DECoder) chips in which audio is digitized in an 8-bit format.
[Figure: x[n] → Nonlinear Transformation T → y[n] → Uniform Quantizer Q[y] → Encoder → c[n] … c'[n] → Decoder → ŷ'[n] → Nonlinear Transformation T⁻¹ → x'[n].]

Figure 2.13 Block diagram of companding in the transmitting and receiving DSP system. x[n] is the unquantized input sample with a nonuniform pdf of values; y[n] is the value obtained after the nonlinear transformation, with a uniform pdf; ŷ[n] is the quantized sample; c[n] is the encoded binary representation of this sample value. The binary encoded stream is typically transmitted to a receiving system where it is converted back by decoding the input encoded samples c'[n] to ŷ'[n] and applying the inverse of the nonlinear transformation, T⁻¹, obtaining the sequence x'[n]. If c'[n] = c[n], x'[n] differs from x[n] by the amount of introduced quantization noise.
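A minimal C sketch of the μ-law compressor and its inverse with μ = 255. This implements only the transformation T above and its analytic inverse, not a full standardized codec:

    #include <math.h>

    #define MU 255.0

    /* Compress x in [-xmax, xmax] with the mu-law characteristic. */
    double mulaw_compress(double x, double xmax)
    {
        double s = (x < 0.0) ? -1.0 : 1.0;
        return s * xmax * log(1.0 + MU * fabs(x) / xmax) / log(1.0 + MU);
    }

    /* Expand: inverse of the compressor. */
    double mulaw_expand(double y, double xmax)
    {
        double s = (y < 0.0) ? -1.0 : 1.0;
        return s * (xmax / MU) * (pow(1.0 + MU, fabs(y) / xmax) - 1.0);
    }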
2.5 Data Representations
DSPs, similarly to general-purpose computer processors, support a number of data formats. The variety of data formats and computational operations determines DSP capabilities. The most general classification of DSP processors is in terms of their hardware support of data types for various operations (e.g., addition, subtraction, multiplication, and division). DSPs are thus categorized as fixed-point or floating-point devices. Fixed-point data types are computer representations of integer numbers; floating-point data types are computer representations of real numbers.
2.6 Fixed-Point Number Representations
In theory, the range of values that a number can take is unlimited; that is, an integer can take values ranging from -∞ to +∞:

…, -3, -2, -1, 0, 1, 2, 3, …

Due to limitations of the hardware, integer representations in a computer are restricted to a range that depends directly on the number of bits allocated for numbers. For example, if a processor uses 4 bits to represent a number, there are a total of 2⁴ = 16 possible distinct combinations. If those 4 bits are used to represent non-negative integers (the unsigned data type in conventional programming languages like C/C++, Java, Fortran, etc.), the range of values that can be represented is (0, …, 15). If positive as well as negative numbers are needed, half of the combinations are used to represent positive numbers and the remaining half negative numbers. It is necessary, therefore, to use one bit from the allocated set (typically the Most Significant Bit, or MSB; in this case bit number 3) to represent the sign of the number.

There are several binary conventions for representing signed and unsigned numbers. The most notable are:

1. Sign Magnitude
2. One's Complement
3. Two's Complement

An example of 4-bit signed numbers is presented in Table 2.1 below for the three formats listed above:
Decimal Value   Sign Magnitude   One's Complement   Two's Complement
+7              0111             0111               0111
+6              0110             0110               0110
+5              0101             0101               0101
+4              0100             0100               0100
+3              0011             0011               0011
+2              0010             0010               0010
+1              0001             0001               0001
+0              0000             0000               0000
-0              1000             1111               -
-1              1001             1110               1111
-2              1010             1101               1110
-3              1011             1100               1101
-4              1100             1011               1100
-5              1101             1010               1011
-6              1110             1001               1010
-7              1111             1000               1001
-8              -                -                  1000

Table 2.1 Example of 4-bit number representations.
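The three formats of Table 2.1 can be reproduced with a few bit operations in C. The encoding helpers below are illustrative, with results masked to 4 bits:

    #include <stdio.h>

    /* 4-bit encodings of v (v in -7..7, plus -8 for two's complement only),
     * following the three formats of Table 2.1. */
    int main(void)
    {
        int v = -5;
        unsigned mag  = (unsigned)(v < 0 ? -v : v);
        unsigned sm   = ((v < 0) ? 0x8u : 0x0u) | (mag & 0x7u); /* sign-magnitude   */
        unsigned ones = (v < 0) ? (~mag & 0xFu) : mag;          /* one's complement */
        unsigned twos = (unsigned)v & 0xFu;                     /* two's complement */
        printf("%d: sm=0x%X ones=0x%X twos=0x%X\n", v, sm, ones, twos);
        return 0;                     /* -5: sm=0xD ones=0xA twos=0xB */
    }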
2.6.1 Sign-Magnitude Format
As depicted in Table 2.1, signed integers (positive and negative values) in this format use the MSB to represent the sign of the number, and the remaining bits represent its magnitude. A 16-bit sign-magnitude format representation is depicted in Figure 2.14 below.
[Figure: 16-bit signed word; bit 15 is the sign bit, bits 14 … 0 hold the magnitude with position values 2^14 … 2^0, and the radix point lies to the right of bit 0.]

Figure 2.14 Sign-Magnitude Format.
This format thus has two possible representations of 0, one with a positive and one with a negative sign, as depicted in Table 2.1. This poses additional complications in designing the hardware to carry out operations; the issue is discussed further in the following sections.

With 4 bits the range of values covers the interval [-7, +7]. With 16 bits the range is the interval [-32767, +32767]. In general, with n bits in sign-magnitude format only the integers in the range from -(2^(n-1) - 1) to +(2^(n-1) - 1) can be represented.

This format has two drawbacks. The first, already mentioned, is that it has two different representations of 0. The second is that it requires two different rules, one for addition and one for subtraction, along with a way to compare magnitudes to determine their relative values prior to applying subtraction. This in turn requires more complex hardware to carry out those rules.
2.6.2 One’s-Complement Format
As depicted in Table 2.1, negative values -x are obtained by negating, or complementing, each bit of the binary representation of the positive integer x. For an n-bit binary representation of a number x, the following is its one's complement:

$$x \,\hat{=}\, b_{n-1}\cdots b_2 b_1 b_0 \quad (n\text{-bit representation of the number } x)$$

$$\bar{x} \,\hat{=}\, \bar{b}_{n-1}\cdots \bar{b}_2 \bar{b}_1 \bar{b}_0 \quad (n\text{-bit representation of the one's complement of } x)$$

where b̄_i is the complement of bit b_i. Clearly, the following holds:

$$x + \bar{x} = 11\cdots1 = 2^n - 1 \tag{2-1}$$
Similarly to the sign-magnitude format, the MSB is used to represent the sign of a number. A positive number will have MSB value "0", which after the complement operation becomes "1", indicating a negative integer number. The remaining n-1 bits represent the number itself if it is positive; otherwise they hold its one's complement. Applying equation 2-1, the following expression defines the one's-complement representation format:
$$x_{(1)} \,\hat{=}\, \begin{cases} x, & x \ge 0 \\ \bar{x} = 2^n - 1 - |x|, & x < 0 \end{cases} \tag{2-2}$$
Similarly to the sign-magnitude representation, with 4 bits we can represent integers in the range defined by the interval [-7, +7], as depicted in Table 2.1. In general, with n bits the one's-complement format can represent the integers in the range from -(2^(n-1) - 1) to +(2^(n-1) - 1).
The one's-complement format is superior to the sign-magnitude format in that addition and subtraction require only one rule, namely that of addition, since subtraction can be carried out by performing addition on the one's-complemented number, as depicted below by applying equation 2-2:

$$z_{(1)} = x_{(1)} - y_{(1)} = x_{(1)} + \left(-|y_{(1)}|\right) = x_{(1)} + \left(2^n - 1 - y_{(1)}\right)$$
It turns out that addition of one's-complement numbers is somewhat complicated to implement in hardware: an additional carry out of the most significant position must be added back at the least significant bit (2⁰) to manage overflow. This problem is alleviated by the two's-complement representation, discussed next.
2.6.3 Two’s-Complement Format
The n-bit two's complement x̃ of a positive integer x is defined by the following expression:

$$\tilde{x} = \bar{x} + 1 = 2^n - x, \qquad x + \tilde{x} = 2^n \tag{2-3}$$

As depicted in Table 2.1, the disadvantage of having two representations for the number zero is eliminated. As before, the MSB is used to represent the sign of the number. Using equation 2-3, the two's-complement representation format is given by:
$$x_{(2)} \,\hat{=}\, \begin{cases} x, & x \ge 0 \\ \tilde{x} = 2^n - |x|, & x < 0 \end{cases} \tag{2-4}$$
With the two's-complement representation, obtained by incrementing the one's-complement representation by one, the problem of two zeros is alleviated. Consequently, the range of negative numbers is increased by one compared to the previous representations, as depicted in Table 2.1. With 4 bits the range of integer values is defined by the interval [-8, +7], also as depicted in Table 2.1. In general, with n bits the two's-complement format can represent the integers in the range from -2^(n-1) to +(2^(n-1) - 1).

The following lists the advantages of the two's-complement format:

1. It is compatible with the notion of negation; that is, the complement of the complement is the number itself.
2. It unifies the subtraction and addition operations, since subtractions are essentially additions of the two's-complement representation of a number.
3. For summation of more than two numbers, internal overflows do not affect the final result so long as the final result is within range; adding two positive numbers yields a positive result, and adding two negative numbers yields a negative result.

Due to these properties, two's complement is the preferred format for representing negative integer numbers. Consequently, almost all current processors, including DSPs, implement signed arithmetic using this format and provide special functions to support it.
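A small C illustration of how two's complement unifies subtraction with addition: x - y is computed as x + (~y + 1), with all arithmetic masked to a 4-bit word to mimic the hardware width (values taken from Table 2.1):

    #include <stdio.h>

    /* 4-bit two's-complement subtraction implemented as addition:
     * x - y == x + (~y + 1), all arithmetic masked to 4 bits. */
    int main(void)
    {
        unsigned x = 0x3;  /* +3 */
        unsigned y = 0x5;  /* +5 */
        unsigned diff = (x + ((~y + 1u) & 0xFu)) & 0xFu;
        /* 0xE = 1110 is -2 in 4-bit two's complement */
        printf("3 - 5 -> 0x%X (%d)\n", diff,
               (diff & 0x8u) ? (int)diff - 16 : (int)diff);
        return 0;
    }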
2.7 Fixed-Point DSP's
Fixed-point DSP hardware supports only fixed-point data types. Such hardware is thus more restrictive, performing basic operations only on fixed-point data types. With software emulation, fixed-point DSPs can execute floating-point operations; however, floating-point operations then come at the expense of performance, due to the lack of floating-point hardware.

Lower-end fixed-point DSPs are 16-bit architectures; that is, the processor's word length is 16 bits and its basic operations use 16-bit data types. Typically, 16-bit DSPs also support a double-precision, 2×16 = 32-bit, data type. This extended support may come at the expense of processor performance, depending on the hardware architecture and design. The 16-bit signed and unsigned data type formats are shown in Figure 2.15 below.

There are a number of possible fixed-point representations that DSP hardware may support; one example was presented in Table 2.1. For 16-bit representations the ranges of numbers are given in Table 2.2. The Analog Devices BF533 family architecture supports two's-complement integer formats.
Clearly, the range of values (commonly referred to in the literature as the dynamic range) is proportional to the number of bits used to represent a number. If the result of an operation exceeds the precision of the data type, in the worst case the resulting number will overflow and wrap around, generating a large error. At best, if the condition is handled (by hardware, setting the overflow flag in the processor's arithmetic-logic-unit status register, or by software, checking the input values before the operation to detect potential overflow), the result will be saturated to the maximal/minimal value of the corresponding data type, leading to a truncation error. Since the most common DSP operations are multiply-and-accumulate operations, representations in which the magnitude of the number is mapped directly in the processor require special handling to avoid truncation effects. In addition, exceeding the precision provided by the dynamic range of the data type typically introduces nonlinear effects, producing large errors, and sometimes breaks the algorithm.
[Figure: two 16-bit word layouts. Unsigned: bit positions 15 … 0 with position values 2^15 … 2^0 and the radix point to the right of bit 0. Signed: bit 15 is the sign bit with position value -2^15, and bits 14 … 0 have position values 2^14 … 2^0, with the radix point to the right of bit 0.]

Figure 2.15 The 16-bit unsigned and signed data type representations.
Unsigned Fixed-Point Numbers

                   Binary (sign formats not applicable)
16-bit MIN VALUE   0
16-bit MAX VALUE   2^16 - 1 = 65535
32-bit MIN VALUE   0
32-bit MAX VALUE   2^32 - 1 = 4294967295

Signed Fixed-Point Numbers

                   Sign Magnitude               One's Complement             Two's Complement
16-bit MIN VALUE   -(2^15 - 1) = -32767         -(2^15 - 1) = -32767         -2^15 = -32768
16-bit MAX VALUE   +(2^15 - 1) = +32767         +(2^15 - 1) = +32767         +(2^15 - 1) = +32767
32-bit MIN VALUE   -(2^31 - 1) = -2147483647    -(2^31 - 1) = -2147483647    -2^31 = -2147483648
32-bit MAX VALUE   +(2^31 - 1) = +2147483647    +(2^31 - 1) = +2147483647    +(2^31 - 1) = +2147483647

Table 2.2 Range of values represented by a 16-bit DSP (single and double precision).
2.8 Fixed-Point Representations Based on Radix-Point
One way to view the possible fixed-point representations supported by a processor is based on the implied position of the radix point. In the integer fixed-point representations discussed previously, zero bits were used after the radix point. This implies the following representation, depicting the integer format in a DSP:

Implicit Position of Radix Point

$$x = \pm\left(d_n B^n + \cdots + d_2 B^2 + d_1 B^1 + d_0 B^0\,.\right)$$
Example 2.7

A DSP uses 4 bits to represent all input fixed-point integer numbers. Two's-complement format is used for negative numbers. The tables below indicate the resulting number when the operations are carried out using 4 bits.
Operand 1            Operation   Operand 2            Resulting Number     Comment
4-bit Binary (Dec)               4-bit Binary (Dec)   4-bit Binary (Dec)
0011 (+3)            +           0010 (+2)            0101 (+5)            No Overflow
0110 (+6)            +           0101 (+5)            1011 (-5)            Overflow
1101 (-3)            +           0111 (+7)            0100 (+4)            No Overflow
1010 (-6)            +           1000 (-8)            0010 (+2)            Overflow
In the cases of overflow it becomes necessary to handle the result in order to minimize the resulting error. In the table above the overflow errors are: 6+5 = 11 compared to the erroneous result of -5, and -6-8 = -14 vs. +2; both errors have magnitude 16. Overflow can be avoided by doubling the precision of the resulting number, that is, by using 8 bits, as depicted in the table below.
Operand 1            Operation   Operand 2            Resulting Number
4-bit Binary (Dec)               4-bit Binary (Dec)   8-bit Binary (Dec)
0011 (+3)            +           0010 (+2)            0000 0101 (+5)
0110 (+6)            +           0101 (+5)            0000 1011 (+11)
1101 (-3)            +           0111 (+7)            0000 0100 (+4)
1010 (-6)            +           1000 (-8)            1111 0010 (-14)
The next table depicts the results of multiplying two 4-bit numbers into a 4-bit resulting number. The errors due to overflow are large. For example, (+7)×(+6) = (+42), but the resulting number is +2, which introduces an error of 40. Similarly, (-6)×(-5) = (+30), with the resulting number (-2) and an error of 32.

Operand 1            Operation   Operand 2            Resulting Number     Comment
4-bit Binary (Dec)               4-bit Binary (Dec)   4-bit Binary (Dec)
0010 (+2)            x           0011 (+3)            0110 (+6)            No Overflow
0111 (+7)            x           0110 (+6)            0010 (+2)            Overflow
1101 (-3)            x           0010 (+2)            1010 (-6)            No Overflow
1010 (-6)            x           1011 (-5)            1110 (-2)            Overflow
As with addition, overflow can be avoided in multiplication if the precision of the result is doubled compared to the input operands. The table below demonstrates this case.

Operand 1            Operation   Operand 2            Resulting Number
4-bit Binary (Dec)               4-bit Binary (Dec)   8-bit Binary (Dec)
0010 (+2)            x           0011 (+3)            0000 0110 (+6)
0111 (+7)            x           0110 (+6)            0010 1010 (+42)
1101 (-3)            x           0010 (+2)            1111 1010 (-6)
1010 (-6)            x           1011 (-5)            0001 1110 (+30)
In all but trivial algorithms it is not practical to keep widening the precision of the resulting numbers. Such algorithms require a number of iterations involving intermediate data derived from the input data to generate the resulting output. In order to achieve error-free operation, each intermediate output would have to have double the number of bits of its inputs; iterative application of this approach quickly exceeds the hardware capabilities of the DSP (e.g., a 32-bit int data type). However, there are several techniques that enable DSP developers to bound the errors within the margins tolerated by the application. This issue is discussed further in the next chapter.
There are alternative representations that have better properties than the common integer formats presented in the preceding sections. One such representation requires all numbers to be scaled within the interval [-1, 1). This is an all-fractional representation using a fixed-point architecture. Note that this representation is not to be confused with fractional numbers in floating-point representations, which use a different format and different hardware rules to perform floating-point operations.

In the all-fractional representation, the allocated bits are used to cover the fixed dynamic range between -1 and 1. Clearly, the larger the number of bits used for fractional numbers, the finer the representation (finer granularity). This stands in contrast to the previous magnitude representation, where the granularity is fixed and equal to 1, a constant difference between any two consecutive numbers. Imposing a constant range may potentially be considered a drawback of the fractional representation, since it may require keeping track of the scaling factor used to translate the original range of values to the fixed [-1, 1) range. On the other hand, this representation has much better properties in terms of truncation error as well as overflow.

Truncation error and overflow require special consideration in the fixed-point integer representation discussed earlier. The 16-bit fractional representation, on the other hand, does not require overflow handling in multiplication: the product of two fractional numbers with values between -1 and 1 is a number in the same range. The only consideration with the 16-bit fractional representation is underflow, which typically does not require special handling. Underflow incurs an error when the result of an operation is smaller than the granularity of the representation. If this error cannot be tolerated, additional rescaling of the intermediate data is required; otherwise the effect of the error falls below the granularity of the representation, i.e., the smallest representable number.
A fractional fixed-point representation assumes the radix point to be in the left-most position, implying that all bits have positional values less than one. The general notation of this fractional representation is:

Implicit Position of Radix Point

$$x = \pm\left(.\,d_{-1}B^{-1} + d_{-2}B^{-2} + \cdots + d_{-(m+1)}B^{-(m+1)}\right)$$

This representation is also depicted in Figure 2.16.

[Figure: sign bit s followed by the fractional bits, with the radix point immediately after the sign bit.]

Figure 2.16 Fractional Fixed-Point Representation.
The integer and fractional fixed-point representations presented so far depict two possible number representation schemes utilizing the two extreme positions of the implied radix point. Since the radix-point position defines the notation, a formal definition of such a representational scheme is based precisely on it. Let N be the total number of bits used to represent a number. Also, let p denote the number of bits to the left of the radix point, specifying the integer portion of the number, and q the number of bits to the right of the radix point, specifying the fractional portion. The notation Qp.q specifies the format of the representation, the position of the implied radix point, and the precision of the representation. For example, the unsigned 16-bit integer fixed-point format is expressed as Q16.0, since all bits lie to the left of the radix point. Consequently, the signed 16-bit integer fixed-point format is denoted Q15.0, with 1 bit used to represent the sign of the number. The all-fractional representation uses the Q0.16 and Q0.15 formats for unsigned and signed numbers, respectively.
In general, for unsigned numbers the relationship between the total number of bits N and p, q is:

N = p + q    (unsigned)

For signed numbers the following relationship holds:

N = p + q + 1    (signed)
In light of the introduced notation, a number represented in a signed binary Qp.q format has a value that can be computed by the following expression:

$$\mathrm{Num} = \left(-b_{N-1}2^{N-1} + b_{N-2}2^{N-2} + b_{N-3}2^{N-3} + \cdots + b_{0}2^{0}\right)2^{-q} = -b_{N-1}2^{p} + \sum_{k=0}^{N-2} b_k\,2^{k-q}$$

For unsigned numbers the following expression can be used:

$$\mathrm{Num} = \left(b_{N-1}2^{N-1} + b_{N-2}2^{N-2} + b_{N-3}2^{N-3} + \cdots + b_{0}2^{0}\right)2^{-q} = \sum_{k=0}^{N-1} b_k\,2^{k-q}$$
2.8.1 Dynamic Range
Now a formal definition of the dynamic range of a data representation can be stated. The dynamic range, given on a dB scale, is defined as the ratio of the largest number (Max) to the smallest positive number greater than zero (Min) of a data representation. It is computed by the following expression:

$$DR\,[\mathrm{dB}] = 20 \log_{10}\left(\frac{\mathrm{Max}}{\mathrm{Min}}\right)$$
The dynamic range and precision of signed and unsigned integer and fractional 16-bit representations are given in the following table:

| Representation | Range | Dynamic Range [dB] | Precision |
|---|---|---|---|
| Unsigned Integer (16-bit) | [0, 65535] | 20 log10((2^16 - 1)/2^0) ≈ 96 dB | 1 |
| Signed Integer (16-bit) | [-32768, 32767] | 20 log10((2^15 - 1)/2^0) ≈ 90 dB | 1 |
| Unsigned Fractional (16-bit) | [0, 0.9999847412109375] | 20 log10((1 - 2^-16)/2^-16) ≈ 96 dB | 2^-16 |
| Signed Fractional (16-bit) | [-1, 0.999969482421875] | 20 log10((1 - 2^-15)/2^-15) ≈ 90 dB | 2^-15 |

Table 5-1. Dynamic Range and Precision of 16-bit signed and unsigned integer and fractional representations.
2.8.2 Precision
Earlier, we introduced the concept of the granularity of a representation. Here it is formally defined as the precision of a representation: the difference between two consecutive representable numbers. Note that this difference is the smallest step between any two representations.
In Table 5-2 below, the largest and smallest positive and negative values, as well as the corresponding precisions for fractional and integer 16-bit data types, are given.
Unsigned Fractional and Integer 16-bit Representations

| Format | Integer Bits | Fractional Bits | Largest Value | Smallest Value | Precision |
|---|---|---|---|---|---|
| Q0.16 | 0 | 16 | 0.9999847412109375 | 0.0 | 0.0000152587890625 |
| Q16.0 | 16 | 0 | 65535.0 | 0 | 1.0 |

Signed Fractional and Integer 16-bit Representations

| Format | Integer Bits | Fractional Bits | Largest Positive Value | Least Negative Value | Precision |
|---|---|---|---|---|---|
| Q0.15 | 0 | 15 | 0.999969482421875 | -1.0 | 0.000030517578125 |
| Q15.0 | 15 | 0 | 32767.0 | -32768 | 1.0 |

Table 5-2. Maximal, Minimal, and Precision values for Integer and Fractional 16-bit fixed-point representations.
In addition to the 16-bit signed and unsigned representations already discussed, namely the Q16.0 and Q15.0 integer and the Q0.16 and Q0.15 fractional formats, there is a whole range of in-between representations that combine integer and fractional parts.
The different Qp.q formats are depicted in Figure 5-2. The table that follows uses the format definitions in the figure to present the ranges of 16-bit signed numbers that can be represented in a DSP.
[Figure 5-2. Possible Qp.q format representations of 16-bit data length. Each row of the figure shows one signed format (Q0.15, Q1.14, Q2.13, Q3.12, Q4.11, Q5.10, ..., Q14.1, Q15.0) across bit positions 15 down to 0: bit 15 is the sign bit with weight -2^p, the remaining bits carry weights from 2^(p-1) down to 2^(-q), and the implied radix point moves one position to the right with each successive format.]
Table 5-3 below summarizes the properties of all possible 16-bit signed representations in terms of the number of integer bits (p), the number of fractional bits (q), the largest positive value, the least negative value, and the precision for each Q format.
Signed 16-bit Representations

| Format | Integer Bits | Fractional Bits | Largest Positive Value in Decimal (0x7FFF) | Least Negative Value in Decimal (0x8000) | Precision |
|---|---|---|---|---|---|
| Q0.15 | 0 | 15 | 0.999969482421875 | -1.0 | 0.000030517578125 |
| Q1.14 | 1 | 14 | 1.999938964843750 | -2.0 | 0.000061035156250 |
| Q2.13 | 2 | 13 | 3.999877929687500 | -4.0 | 0.000122070312500 |
| Q3.12 | 3 | 12 | 7.999755859375000 | -8.0 | 0.000244140625000 |
| Q4.11 | 4 | 11 | 15.999511718750000 | -16.0 | 0.000488281250000 |
| Q5.10 | 5 | 10 | 31.999023437500000 | -32.0 | 0.000976562500000 |
| Q6.9 | 6 | 9 | 63.998046875000000 | -64.0 | 0.001953125000000 |
| Q7.8 | 7 | 8 | 127.996093750000000 | -128.0 | 0.003906250000000 |
| Q8.7 | 8 | 7 | 255.992187500000000 | -256.0 | 0.007812500000000 |
| Q9.6 | 9 | 6 | 511.984375000000000 | -512.0 | 0.015625000000000 |
| Q10.5 | 10 | 5 | 1023.968750000000000 | -1024.0 | 0.031250000000000 |
| Q11.4 | 11 | 4 | 2047.937500000000000 | -2048.0 | 0.062500000000000 |
| Q12.3 | 12 | 3 | 4095.875000000000000 | -4096.0 | 0.125000000000000 |
| Q13.2 | 13 | 2 | 8191.750000000000000 | -8192.0 | 0.250000000000000 |
| Q14.1 | 14 | 1 | 16383.500000000000000 | -16384.0 | 0.500000000000000 |
| Q15.0 | 15 | 0 | 32767.000000000000000 | -32768.0 | 1.000000000000000 |

Table 5-3. Maximal, minimal, and precision values for all signed 16-bit Qp.q fixed-point representations.
Chapter 3

Implementation Considerations
To properly implement a design on specific DSP hardware, familiarity with the specifics of the related development tools is crucial.
Most DSP processors use two's-complement fractional number representations in different Q formats. The native formats for the Blackfin DSP family are the signed fractional Q1.(N-1) and unsigned fractional Q0.N formats, where N is the number of bits in the data word. Depending on the compiler settings, the considerations described in the following sections apply.
3.1 Assembly
Since the assembler recognizes only integer values, the programmer must keep track of the position of the binary point when manipulating fractional numbers [11]. The following steps can be used to convert a fractional number in Q format into an integer value that the assembler can recognize (a worked example follows the list):
1. Normalize the fractional number to the range determined by the desired Q format.
2. Multiply the normalized fractional number by 2^n, where n is the number of fractional bits.
3. Round the product to the nearest integer.
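For example, to express 0.75 in Q0.15 format for the assembler: the value already lies in [-1, 1), so step 1 leaves it unchanged; step 2 gives 0.75 x 2^15 = 24576; and step 3 requires no rounding, so the integer written in the assembly source is 24576 (0x6000).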
3.2 C – Language Support for Fractional Data Types

The C/C++ run-time environment of the Blackfin DSP processor family uses the intrinsic C/C++ data types and data formats listed in Table 5-4 below.
| Type | Size in Bits | Data Representation | sizeof Return in Bytes |
|---|---|---|---|
| char | 8 bits signed | 8-bit two's complement | 1 |
| unsigned char | 8 bits unsigned | 8-bit unsigned magnitude | 1 |
| short | 16 bits signed | 16-bit two's complement | 2 |
| unsigned short | 16 bits unsigned | 16-bit unsigned magnitude | 2 |
| int | 32 bits signed | 32-bit two's complement | 4 |
| unsigned int | 32 bits unsigned | 32-bit unsigned magnitude | 4 |
| long | 32 bits signed | 32-bit two's complement | 4 |
| unsigned long | 32 bits unsigned | 32-bit unsigned magnitude | 4 |
| long long | 64 bits signed | 64-bit two's complement | 8 |
| unsigned long long | 64 bits unsigned | 64-bit unsigned magnitude | 8 |
| pointer | 32 bits | 32-bit two's complement | 4 |
| function pointer | 32 bits | 32-bit two's complement | 4 |
| float | 32 bits | 32-bit IEEE single-precision | 4 |
| double | 64 bits | 64-bit IEEE double-precision | 8 |
| long double | 64 bits | 64-bit IEEE | 8 |
| fract16 | 16 bits signed | Q1.15 fraction format | 2 |
| fract32 | 32 bits signed | Q1.31 fraction format | 4 |

Table 5-4. Data types supported by the Blackfin DSP processor and the VisualDSP++ integrated development environment.
It is important to note that the floating-point and 64-bit data types are implemented using software emulation; they must therefore be expected to run more slowly than hardware-supported native data types. The emulated data types are float, double, long double, long long, and unsigned long long.
The fract16 and fract32 types are not actually intrinsic data types; they are typedefs to short and long, respectively.
In C, built-in functions must be used to perform basic arithmetic operations (see "Fractional Value Built-In Functions in C++" in [12]). The expression fract16*fract16 in a C program will not produce a correct result. This is a consequence of a limitation of the C programming language, which does not support operator overloading: the "*" operator in fract16*fract16 invokes standard multiplication on short operands (recall that fract16 is not an intrinsic data type but a typedef to short).
Because fractional arithmetic uses slightly different instructions from normal arithmetic, one cannot use the standard C operators on fract data types and get the right result; the built-in functions described here must be used to work with fractional data.
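As a minimal sketch of the distinction (assuming the VisualDSP++ environment and the fract.h built-ins listed in the tables that follow):

```c
#include <fract.h>

fract16 scale_sample(fract16 a, fract16 b)
{
    /* fract16 wrong = a * b;  -- plain "*" multiplies the underlying
       shorts and never shifts the product back into Q0.15 format. */
    return mult_fr1x16(a, b);  /* correct fractional multiply,
                                  truncated to 16 bits */
}
```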
The fract.h header file provides access to the definitions of each of the built-in functions that support fractional values. These functions have names with the following suffixes:
- _fr1x16 for single fract16,
- _fr2x16 for dual fract16, and
- _fr1x32 for single fract32.
All the functions in fract.h are marked as inline, so when compiling with the compiler optimizer enabled, the built-in functions are inlined.
The list of built-in functions for the fractional 16-bit data type, fract16, with brief descriptions, is given in Table 5-5 below:

| Built-in function (fract16 operands) | Description |
|---|---|
| fract16 add_fr1x16(fract16 f1, fract16 f2) | Performs 16-bit addition of the two input parameters (f1+f2). |
| fract16 sub_fr1x16(fract16 f1, fract16 f2) | Performs 16-bit subtraction of the two input parameters (f1-f2). |
| fract16 mult_fr1x16(fract16 f1, fract16 f2) | Performs 16-bit multiplication of the input parameters (f1*f2). The result is truncated to 16 bits. |
| fract16 multr_fr1x16(fract16 f1, fract16 f2) | Performs a 16-bit fractional multiplication (f1*f2) of the two input parameters. The result is rounded to 16 bits. Whether the rounding is biased or unbiased depends on the RND_MOD bit in the ASTAT register. |
| fract32 mult_fr1x32(fract16 f1, fract16 f2) | Performs a fractional multiplication on two 16-bit fractions, returning the 32-bit result. |
| fract16 abs_fr1x16(fract16 f1) | Returns the 16-bit absolute value of the input parameter. When the input is 0x8000 (two's-complement representation of the largest negative number), saturation occurs and 0x7fff is returned. |
| fract16 min_fr1x16(fract16 f1, fract16 f2) | Returns the minimum of the two input parameters. |
| fract16 max_fr1x16(fract16 f1, fract16 f2) | Returns the maximum of the two input parameters. |
| fract16 negate_fr1x16(fract16 f1) | Returns the 16-bit negation of the input parameter (-f1). If the input is 0x8000, saturation occurs and 0x7fff is returned. |
| fract16 shl_fr1x16(fract16 src, short shft) | Arithmetically shifts src left by shft places. The empty bits are zero-filled. If shft is negative, the shift is to the right by abs(shft) places with sign extension. |
| fract16 shl_fr1x16_clip(fract16 src, short shft) | As shl_fr1x16, with shft clipped to 5 bits. |
| fract16 shr_fr1x16(fract16 src, short shft) | Arithmetically shifts src right by shft places with sign extension. If shft is negative, the shift is to the left by abs(shft) places, and the empty bits are zero-filled. |
| fract16 shr_fr1x16_clip(fract16 src, short shft) | As shr_fr1x16, with shft clipped to 5 bits. |
| fract16 shrl_fr1x16(fract16 src, short shft) | Logically shifts a fract16 right by shft places. There is no sign extension and no saturation; the empty bits are zero-filled. |
| fract16 shrl_fr1x16_clip(fract16 src, short shft) | As shrl_fr1x16, with shft clipped to 5 bits. |
| int norm_fr1x16(fract16 f1) | Returns the number of left shifts required to normalize the input so that it lies in the interval 0x4000 to 0x7fff, or 0x8000 to 0xc000. In other words, shl_fr1x16(x, norm_fr1x16(x)) returns a value in one of those ranges. |

Table 5-5. Built-in functions for 16-bit fractional (fract16) operands.
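As a usage sketch built only from the Table 5-5 functions (assuming, per the Blackfin documentation, that add_fr1x16 saturates on overflow rather than wrapping):

```c
#include <fract.h>

/* Q0.15 dot product using only fract16 built-ins: each product is
   truncated to 16 bits by mult_fr1x16 and accumulated with the
   saturating 16-bit addition add_fr1x16. */
fract16 dot_fr1x16(const fract16 *x, const fract16 *y, int n)
{
    fract16 acc = 0;
    int i;
    for (i = 0; i < n; i++)
        acc = add_fr1x16(acc, mult_fr1x16(x[i], y[i]));
    return acc;
}
```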
The list of built-in functions for the fractional 32-bit data type, fract32, with brief descriptions, is given in Table 5-6 below:

| Built-in function (fract32 operands) | Description |
|---|---|
| fract32 add_fr1x32(fract32 f1, fract32 f2) | Performs 32-bit addition of the two input parameters (f1+f2). |
| fract32 sub_fr1x32(fract32 f1, fract32 f2) | Performs 32-bit subtraction of the two input parameters (f1-f2). |
| fract32 mult_fr1x32x32(fract32 f1, fract32 f2) | Performs 32-bit multiplication of the input parameters (f1*f2). The result (calculated internally with an accuracy of 40 bits) is rounded (biased rounding) to 32 bits. |
| fract32 mult_fr1x32x32NS(fract32 f1, fract32 f2) | Performs 32-bit non-saturating multiplication of the input parameters (f1*f2); somewhat faster than mult_fr1x32x32. The result (calculated internally with an accuracy of 40 bits) is rounded (biased rounding) to 32 bits. |
| fract32 abs_fr1x32(fract32 f1) | Returns the 32-bit absolute value of the input parameter. When the input is 0x80000000 (two's-complement representation of the largest negative number), saturation occurs and 0x7fffffff is returned. |
| fract32 min_fr1x32(fract32 f1, fract32 f2) | Returns the minimum of the two input parameters. |
| fract32 max_fr1x32(fract32 f1, fract32 f2) | Returns the maximum of the two input parameters. |
| fract32 negate_fr1x32(fract32 f1) | Returns the 32-bit negation of the input parameter (-f1). If the input is 0x80000000, saturation occurs and 0x7fffffff is returned. |
| fract32 shl_fr1x32(fract32 src, short shft) | Arithmetically shifts src left by shft places. The empty bits are zero-filled. If shft is negative, the shift is to the right by abs(shft) places with sign extension. |
| fract32 shl_fr1x32_clip(fract32 src, short shft) | As shl_fr1x32, with shft clipped to 5 bits. |
| fract32 shr_fr1x32(fract32 src, short shft) | Arithmetically shifts src right by shft places with sign extension. If shft is negative, the shift is to the left by abs(shft) places, and the empty bits are zero-filled. |
| fract32 shr_fr1x32_clip(fract32 src, short shft) | As shr_fr1x32, with shft clipped to 5 bits. |
| fract16 sat_fr1x32(fract32 f1) | If f1 > 0x00007fff (2^15 - 1), returns 0x7fff; if f1 < 0xffff8000 (-2^15), returns 0x8000; otherwise returns the lower 16 bits of f1. |
| fract16 round_fr1x32_clip(fract32 f1) | Rounds the 32-bit fract to a 16-bit fract using biased rounding. |
| int norm_fr1x32(fract32 f1) | Returns the number of left shifts required to normalize the input so that it lies in the interval 0x40000000 to 0x7fffffff (positive), or 0x80000000 to 0xc0000000 (negative). In other words, shl_fr1x32(x, norm_fr1x32(x)) returns a value in one of those ranges. |
| fract16 trunc_fr1x32(fract32 f1) | Returns the top 16 bits of f1; i.e., truncates f1 to 16 bits. |

Table 5-6. Built-in functions for 32-bit fractional (fract32) operands.
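A combined usage sketch (a hypothetical helper; it assumes the built-ins behave exactly as described in Tables 5-5 and 5-6): mult_fr1x32 widens each 16 x 16-bit product to fract32 so no precision is lost per term, add_fr1x32 accumulates, and round_fr1x32_clip rounds the result back to fract16.

```c
#include <fract.h>

/* Energy of a Q0.15 signal computed in 32-bit fractional precision. */
fract16 energy_fr16(const fract16 *x, int n)
{
    fract32 acc = 0;
    int i;
    for (i = 0; i < n; i++)
        acc = add_fr1x32(acc, mult_fr1x32(x[i], x[i]));
    return round_fr1x32_clip(acc);  /* biased rounding back to Q0.15 */
}
```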
VisualDSP++ also provides European Telecommunications Standards Institute (ETSI) support routines. For further information, consult the ADSP-BF53x/BF56x Blackfin Processor Programming Reference manual.
3.3 C++ – Language Support for Fractional Data Types
In C++, the classes "fract" and "shortfract" define the basic arithmetic operators for fractional data. This in turn means that the "*" operation is overloaded and invokes the proper hardware operation on fract operands.
The fract class uses the C-language type fract32 as storage for fractional 32-bit data, while shortfract uses fract16 for fractional 16-bit data.
Instances of the shortfract and fract classes can be initialized using values with the "r" suffix, provided they are within the range [-1,1). The fract class is recognized by the compiler as the internal type fract. For example:
#include <fract>

int main()
{
    fract X = 0.5r;  // "r" suffix marks a fractional constant
    return 0;
}
Instances of the shortfract class can be initialized using “r” values in the same way, but
are not represented as an internal type by the compiler. Instead, the compiler produces
a temporary fract, which is initialized using the “r” value. The value of the fract class is
then copied to the shortfract class using an implicit copy and the fract is destroyed.
The fract and shortfract classes contain routines that allow basic arithmetic operations
and movement of data to and from other data types. The example below shows the use
of the shortfract class with * and + operators.
// C++ initialization of the data with fractional constants.
#include <shortfract>
#include <stdio.h>

#define N 20

shortfract x[N] = {
    .5r, .5r, .5r, .5r, .5r,
    .5r, .5r, .5r, .5r, .5r,
    .5r, .5r, .5r, .5r, .5r,
    .5r, .5r, .5r, .5r, .5r
};

shortfract y[N] = {
    .0r, .1r, .2r, .3r, .4r,
    .5r, .6r, .7r, .8r, .9r,
    .0r, .1r, .2r, .3r, .4r,
    .5r, .6r, .7r, .8r, .9r
};

// Fractional dot product: the overloaded operators of shortfract
// invoke the proper fractional hardware operations.
shortfract fdot(int n, shortfract *x, shortfract *y)
{
    int j;
    shortfract s;

    s = 0;
    for (j = 0; j < n; j++) {
        s += x[j] * y[j];  // overloaded "*" operator
    }
    return s;
}

int main(void)
{
    fdot(N, x, y);
    return 0;
}
3.4 C vs. C++ Important Distinctions
When coding in C mode, fractional constants can be used to initialize fractional variables, bearing in mind that fract16 and fract32 are typedefs of the short int and long int built-in data types. Initialization is accomplished by normalizing the fractional number to the range determined by the Q format, as the following example shows.
Example of Q0.15 conversion from float to fract16 (recall that VisualDSP++ for the Blackfin DSP processor family supports only signed fractional data types, as presented earlier in this chapter):

fract16 x = 0.75 * 32767.0;  // fractional representation of 0.75

// Use of the built-in conversion function float_to_fr16(float)
fract16 y = float_to_fr16(19571107.945);

In the second example the number is saturated to fract16 precision, that is, to 32767. This implies that numbers to be converted must be scaled to fit the corresponding data type range (e.g., 2^15 - 1 = 32767 for 16-bit data or 2^31 - 1 = 2147483647 for 32-bit data).
In C, no special conversion is needed from a 16-bit signed integer to fract16. However, the proper functions that perform operations on fract data must be used, since in C mode there is no operator overloading. In C++, due to overloading of the built-in operators, the proper operations will be invoked as long as the data types are declared properly. To avoid potential problems, it is advisable to always use the fractional functions explicitly, even in C++ mode, when using fract data types.
References
[1] Jayant & Noll, "Digital Coding of Waveforms: Principles and Applications to Speech and Video", Chapter 3, Sampling and Reconstruction of Bandlimited Waveforms, Prentice Hall, 1984.
[2] Oppenheim, Schafer & Buck, "Discrete-Time Signal Processing", Chapter 6, Overview of Finite-Precision Numerical Effects, Prentice Hall, 1999.
[3] Ingle & Proakis, "Digital Signal Processing Using MATLAB", Chapter 9, Finite Word-Length Effects, Thomson, 2007.
[4] Udo Zölzer, "Digital Audio Signal Processing", Chapter 2, Quantization, Wiley, 1998.
[5] J. H. Conway & R. K. Guy, "The Book of Numbers", Springer-Verlag, 1996.
[6] D. Knuth, "The Art of Computer Programming", Volume 2, Seminumerical Algorithms, Third Edition, Addison-Wesley, 1997.
[7] http://en.wikipedia.org/wiki/Real_number
[8] Kondoz, "Digital Speech: Coding for Low Bit Rate Communication Systems", Wiley, 2004.
[9] Jensen, Batina, Hendriks & Heusdens, "A Study of the Distribution of Time-Domain Speech Samples and Discrete Fourier Coefficients", Proceedings of SPS-DARTS, 2005.
[10] Quatieri, "Discrete-Time Speech Signal Processing: Principles and Practice", Prentice Hall, 2002.
[11] Kuo & Gan, "Digital Signal Processors: Architectures, Implementations, and Applications", Chapter 3, Implementation Considerations, Prentice Hall, 2005.
[12] VisualDSP++ C/C++ Compiler and Library Manual for Blackfin Processors, Analog Devices, Inc., Norwood, MA, www.analog.com.