Chapter 13 Sounds and signals
• basics of computer sound
• perception and generation of sound
• synthesizing complex sounds
• sampling sound signals
• simple example of signal processing
Sources used: (a) Chapter 13 of the text (b) Daisy Fan, Cornell University
Role of audio in computers
• audio is an important sensory signal
• crucial component of multimedia data – audio, music
• tools for interacting with digital computers for visually
impaired persons
• music analysis and synthesis
• speech processing and synthesis
Basics of computer sound in Matlab
• Matlab can open a file in .wav format:
>> [x, fs, bits] = wavread('fh.wav');
>> sound(x, fs);   % plays the sound clip
Computing with sound in Matlab requires that we first convert
the .wav data into simple numeric data, which is the job of the
function wavread. (Recent Matlab releases replace wavread with
audioread.)
The variable fs above is the number of samples per second,
and bits is the number of bits used to represent each sample.
Basics of computer sound
>> fs
fs =
44100
>> bits
bits =
16
>> x(100:105)
ans =
-0.0768
0.0174
0.0345
0.0008
-0.0060
0.0026
>> plot(1:length(x), x);
Basics of computer sound
Some basic questions:
• why does the sound waveform range in amplitude from
-1 to 1?
• what role does the sampling frequency play in the
quality of the sound?
• what happens if we play back the sound at a different
sampling frequency?
Computing with sound requires digitization
• Sound is (analog) continuous; capture its essence by sampling
• Digitized sound is a vector of numbers
Sampling rate affects the quality
If sampling is not frequent enough, the discretized sound will
not capture the essence of the continuous sound and the quality
will be poor.
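A minimal sketch of what "not frequent enough" means, using illustrative values that are not from the slides: a 3000 Hz tone sampled at only 4000 Hz produces exactly the same samples as an inverted 1000 Hz tone, so the original pitch cannot be recovered from the samples.

```matlab
f  = 3000;                 % tone frequency in Hz (illustrative choice)
fs = 4000;                 % sampling rate below 2*f, so the tone is undersampled
n  = 0:99;                 % sample indices
x  = sin(2*pi*f*n/fs);     % samples of the 3000 Hz tone

% The same sample values come from an inverted 1000 Hz tone
alias   = -sin(2*pi*1000*n/fs);
maxDiff = max(abs(x - alias));   % differs only by rounding error
disp(maxDiff);
```

Raising fs above twice the tone frequency removes the ambiguity.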
Sampling Rate
• Given human perception, 20,000 samples/second is pretty good (20,000 Hz or 20 kHz).
• 8,000 Hz is required for speech over the telephone
• 44,100 Hz is required for audio CD
• 192,000 Hz is required for HD-DVD audio tracks
Resolution also affects the quality
• Typically, each sampled value is encoded as an 8-bit integer in the .wav file.
• Possible values: -128, -127, …, -1, 0, 1, …, 127
• Loud: -120, 90, 122, etc.
• Quiet: 3, 10, -5
• 16-bit is used when very high quality is required.
• wavread converts the integer values into real numbers in [-1, 1].
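The conversion from integer samples to real numbers can be sketched as follows. The divide-by-2^(bits-1) scaling is an assumption made for illustration; the slides do not state the exact scaling that wavread applies.

```matlab
% Hypothetical 8-bit samples, including the "loud" and "quiet" values listed above
raw  = int8([-128 -120 -5 0 3 90 127]);
bits = 8;

% Assumed scaling: divide by 2^(bits-1) so every value lands in [-1, 1]
x = double(raw) / 2^(bits - 1);
disp(x);

% Smallest difference between two representable values at this resolution
stepSize = 1 / 2^(bits - 1);
disp(stepSize);
```

With 16 bits the same scaling gives a step of 1/32768, which is why higher resolution sounds cleaner.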
Amplitude, frequency and phase
P(t) = A sin( 2*pi*f*t + phi)
f = frequency
phi = phase (shift)
A = amplitude
• a single sine wave models a pure musical tone.
• amplitude determines the loudness.
• the range of human hearing is commonly quoted as about 20 Hz to 20,000 Hz; the upper limit falls with age.
• most sounds (and music) are a complicated mixture of various frequencies.
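As a small illustration of the three parameters in P(t) = A sin(2*pi*f*t + phi), the sketch below builds one tone and checks two properties: the peak value equals A, and a phase of pi/2 turns the sine into a cosine. The sampling rate and parameter values are chosen only for the example.

```matlab
fs  = 8192;                 % sampling rate (assumed for illustration)
t   = 0:1/fs:0.01;          % 10 ms of sample times
A   = 0.5;                  % amplitude
f   = 440;                  % frequency in Hz
phi = pi/2;                 % phase shift in radians

P = A*sin(2*pi*f*t + phi);

peak     = max(abs(P));                      % equals the amplitude A
phaseErr = max(abs(P - A*cos(2*pi*f*t)));    % sine shifted by pi/2 is a cosine
% sound(P, fs)   % uncomment to listen
```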
Calculating frequencies of notes
• The author of the text has recorded a whistle sound in the file named
whistle.wav.
>> [w, fs, bits] = wavread('whistle.wav');
>> length(w)
ans =
66150
>> fs
fs =
11025
Question: What is the duration of the whistle? What are the
various frequencies he has used?
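The duration follows directly from the two numbers printed above, since fs is the number of samples per second:

```matlab
nSamples = 66150;            % length(w) reported above
fs       = 11025;            % samples per second reported above
duration = nSamples / fs;    % duration in seconds
fprintf('Duration: %g seconds\n', duration);
```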
[Figure: waveform of whistle.wav]
[Figure: zoom on one 100-sample region of the waveform]
Question: what is the frequency? Note that the zoomed region is 100
samples long and the waveform repeats about 9 times, i.e. 9 cycles
per 100 samples. This works out to about 1 kHz. Can you see why?
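One way to check the estimate is to convert cycles per sample into cycles per second by multiplying by the sampling rate:

```matlab
fs            = 11025;   % samples per second for whistle.wav
cycles        = 9;       % repetitions counted in the zoomed region
windowSamples = 100;     % length of the zoomed region in samples

fEst = cycles / windowSamples * fs;   % cycles per second (Hz)
fprintf('Estimated frequency: %.2f Hz\n', fEst);
```

Because the cycle count is read off a plot, the result is only approximate.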
Synthesis of sound – combining pure notes
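• A pure tone can be synthesized as follows:
>> fs = 8192;            % sampling rate (the default used by sound)
>> f = 200;              % tone frequency in Hz
>> t = 0: 1/fs: 1;
>> x = sin(2*pi*f*t);
>> sound(x, fs)
What will happen if we try the following?
>> x = [x x];
>> sound(x, fs)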
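A quick way to predict the effect of the concatenation is to compare durations, computed as the number of samples divided by the sampling rate. The sketch assumes an 8192 Hz sampling rate and a 200 Hz tone.

```matlab
fs = 8192;  f = 200;         % assumed sampling rate and tone frequency
t  = 0:1/fs:1;
x  = sin(2*pi*f*t);

d1 = length(x) / fs;         % duration of the original tone in seconds
x2 = [x x];                  % the tone followed by a copy of itself
d2 = length(x2) / fs;        % duration after concatenation
fprintf('%.4f s before, %.4f s after\n', d1, d2);
% sound(x2, fs)   % uncomment to listen: same pitch, twice as long
```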
We can also generate a superposition of two frequencies.
“Adding” Sinusoids
[Figure: the waveform for middle C (262 Hz) plus the waveform for A above middle C (440 Hz) equals the combined waveform]
Insight Through Computing
“Adding” Sinusoids: averaging the sine values
Fs = 32768; tFinal = 1;
t = 0:(1/Fs):tFinal;
C3 = 261.63;
yC3 = sin(2*pi*C3*t);
A4 = 440.00;
yA4 = sin(2*pi*A4*t);
y = (yC3 + yA4)/2;
sound(y,Fs)
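The reason for averaging rather than simply adding can be checked numerically: the plain sum of two unit sine waves can exceed the [-1, 1] range that sound expects (values outside it are clipped), while the average cannot. The sketch reuses the two frequencies from the table of notes.

```matlab
Fs = 32768;
t  = 0:(1/Fs):1;
yC3 = sin(2*pi*261.63*t);
yA4 = sin(2*pi*440.00*t);

peakSum = max(abs(yC3 + yA4));       % can approach 2, outside [-1, 1]
peakAvg = max(abs((yC3 + yA4)/2));   % never exceeds 1
fprintf('peak of sum: %.3f, peak of average: %.3f\n', peakSum, peakAvg);
```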
Creating an audio file containing a note sequence
The simplest example of music synthesis is to write down a sequence of
notes and generate the sine wave corresponding to each note.
An appropriate delay should be introduced so that the notes are played
in sequence.
Equal-Tempered Tuning
 k  Note    Frequency (Hz), one octave per column
 0  A       55.00   110.00   220.00   440.00    880.00   1760.00
 1  A#      58.27   116.54   233.08   466.16    932.33   1864.66
 2  B       61.74   123.47   246.94   493.88    987.77   1975.53
 3  C       65.41   130.81   261.63   523.25   1046.50   2093.01
 4  C#      69.30   138.59   277.18   554.37   1108.73   2217.46
 5  D       73.42   146.83   293.67   587.33   1174.66   2349.32
 6  D#      77.78   155.56   311.13   622.25   1244.51   2489.02
 7  E       82.41   164.81   329.63   659.26   1318.51   2637.02
 8  F       87.31   174.61   349.23   698.46   1396.91   2793.83
 9  F#      92.50   185.00   369.99   739.99   1479.98   2959.95
10  G       98.00   196.00   391.99   783.99   1567.98   3135.96
11  G#     103.83   207.65   415.31   830.61   1661.22   3322.44
12  A      110.00   220.00   440.00   880.00   1760.00   3520.00
Entries are frequencies in Hz. Each column is an octave.
Magic factor = 2^(1/12). C3 = 261.63, A4 = 440.00
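The whole table can be regenerated from the magic factor. A minimal sketch, assuming the lowest A is 55 Hz as in the first column: each row multiplies by another factor of 2^(1/12), and each column doubles the starting frequency.

```matlab
k    = (0:12)';               % semitone index within an octave (one per row)
base = 55 * 2.^(0:5);         % frequency of A at the start of each octave (one per column)
F    = (2.^(k/12)) * base;    % 13-by-6 matrix of frequencies in Hz

fprintf('%9.2f %9.2f %9.2f %9.2f %9.2f %9.2f\n', F');
```

Here F(4,3) is the C listed as 261.63 and F(1,4) is the A at 440.00. Printed values may differ from the table in the last digit because of rounding.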
star.m contains a script to play twinkle twinkle ...
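The contents of star.m are not reproduced here. The following is an independent sketch of how such a note sequence could be built, assuming middle C at 261.63 Hz, notes of equal length, and a simple fade on each note. All variable names are illustrative.

```matlab
fs  = 8192;                      % sampling rate (assumed)
dur = 0.4;                       % length of each note in seconds (assumed)
n   = round(fs*dur);             % samples per note
t   = (0:n-1)/fs;

middleC = 261.63;
steps   = [0 0 7 7 9 9 7];       % semitone offsets for C C G G A A G
env     = sin(pi*t/dur).^2;      % fade in and out so notes do not click

sig = [];
for k = 1:length(steps)
    f   = middleC * 2^(steps(k)/12);        % equal-tempered frequency
    sig = [sig, env .* sin(2*pi*f*t)];      % append this note
end
% sound(sig, fs)   % uncomment to listen
```

Concatenating the notes into one vector avoids having to time separate calls to sound.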
Exercise: Write a program in Matlab that plays a
sequence of music clips in succession.
Possible solution:
playList = {'whistle.wav', 'song.wav'};   % ... plus any further clips
for k = 1:length(playList)
    [y, rate] = wavread(playList{k});
    sound(y, rate)
end
Problem: sound returns immediately, so the song starts playing before
the whistle finishes.
The correct way to solve the problem is to introduce an appropriate
delay after each clip.
pause(x) introduces a delay of x seconds.
Calculate each delay from the sampling rate and the number of samples.
>> [x, fs] = wavread(file1);
>> [y, fs1] = wavread(file2);
>> sound(x, fs); pause( length(x)/fs); sound(y, fs1);
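The same idea extends to a whole playlist. In the sketch below two synthesized stand-in clips replace the .wav files so that it runs without any audio files; the clip contents and rates are made up for illustration.

```matlab
% Stand-in clips; in practice these would come from wavread(playList{k})
clips = {sin(2*pi*440*(0:4095)/8192), sin(2*pi*660*(0:11024)/11025)};
rates = [8192, 11025];

delays = zeros(1, numel(clips));
for k = 1:numel(clips)
    delays(k) = length(clips{k}) / rates(k);   % clip duration in seconds
    % sound(clips{k}, rates(k));               % start playback
    % pause(delays(k));                        % wait until the clip has finished
end
disp(delays);
```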
A simple application of signal analysis
A phone dial pad has a frequency associated with each row and each
column, so two frequencies are associated with each button.
Each button has its own 2-frequency “fingerprint”!
Signal for button 5
fs = 32768;
tFinal = .5;
t = 0:(1/fs):tFinal;
yR = sin(2*pi*770*t);
yC = sin(2*pi*1336*t);
y = (yR + yC)/2;
sound(y, fs)
Received signal should be decoded to determine
the digits.
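The signal for any button can be produced the same way once the row and column frequencies are tabulated. The slides give only the pair for button 5; the remaining values below are the standard touch-tone (DTMF) frequencies, and the lookup table keys is an illustrative helper.

```matlab
fRow = [697 770 852 941];              % row frequencies in Hz (standard DTMF)
fCol = [1209 1336 1477];               % column frequencies in Hz (standard DTMF)
keys = ['123'; '456'; '789'; '*0#'];   % keypad layout

button = '5';
[r, c] = find(keys == button);         % row and column of the button

fs = 32768;
t  = 0:(1/fs):0.5;
y  = (sin(2*pi*fRow(r)*t) + sin(2*pi*fCol(c)*t))/2;
% sound(y, fs)   % uncomment to listen
```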
Signal viewed in the frequency domain
>> phone
[Figure: two panels. Top, “Time Response”: Signal (-1 to 1) versus Time (sec), 0 to 0.05. Bottom, “Spectrum”: Signal Power (logarithmic scale) versus Frequency (Hz), 0 to 2000.]
What does the signal look like for a multi-digit call?
“Perfect” signal: each band matches one of the twelve “fingerprints.”
Buttons pushed at equal time intervals.
One of the most difficult problems is how to segment the multi-button
signal!
“Noisy” signal: each band approximately matches one of the twelve
“fingerprints.” There is noise between the button pushes.
Buttons pushed at unequal time intervals.
Sending and deciphering noisy signals

• Randomly choose a button
• Choose random row and column numbers
• Construct the real signal (MakeShowPlay)
• Add noise to the signal (SendNoisy)
• Compute cosines to decipher the signals (ShowCosines)
See Eg13_2
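Eg13_2 and its helper functions are not reproduced here. As a rough sketch of the deciphering step, one can correlate the noisy signal with a cosine and a sine at each candidate frequency and pick the strongest row and column. The noise level, the use of a sine term alongside the cosine, and all names below are assumptions for illustration.

```matlab
fRow = [697 770 852 941];              % standard DTMF row frequencies (Hz)
fCol = [1209 1336 1477];               % standard DTMF column frequencies (Hz)
keys = ['123'; '456'; '789'; '*0#'];

fs = 32768;
t  = 0:(1/fs):0.25;
% Signal for button 5 plus Gaussian noise (noise level is illustrative)
y  = (sin(2*pi*770*t) + sin(2*pi*1336*t))/2 + 0.5*randn(size(t));

% Strength of frequency f in y: correlate with cosine and sine, combine magnitudes
strength = @(f) hypot(sum(y.*cos(2*pi*f*t)), sum(y.*sin(2*pi*f*t)));
sRow = arrayfun(strength, fRow);
sCol = arrayfun(strength, fCol);

[~, r] = max(sRow);                    % most likely row
[~, c] = max(sCol);                    % most likely column
decoded = keys(r, c);
disp(decoded);
```

Using both correlations makes the measure independent of the phase of the incoming tone.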
Exercise 13.1:
The audio file whistle.wav waveform is an eight-note ascending
scale. Use reversal and concatenation to generate an ascending
and descending scale.
>> [x, fs] = wavread('whistle.wav');
>> y = x(length(x):-1:1);
>> sound(x, fs); pause(length(x)/fs); sound(y, fs);
Exercise 13.2
Find the lowest frequency signal you can hear.
Idea:
Everyone without hearing impairment can hear a sound at freq =
1000 Hz. (What if this is not true?)
So perform a binary search for the lowest frequency that you can hear
by setting low = 1, high = 1000.
Play the mid frequency and ask whether the user can hear it. Continue
the search in the lower or higher half of the range.
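function res = lowsearch
% LOWSEARCH  Binary search for the lowest frequency the listener can hear.
fHigh = 1000; fLow = 0; fs = 22050;      % search range (Hz) and sampling rate
timelength = 1.0; amp = 1.0;             % tone duration (s) and amplitude
nsamps = timelength*fs + 1;
t = linspace(0, timelength, nsamps);
% Sanity check: the listener should be able to hear the upper frequency.
sig = amp*sin(2*pi*fHigh*t);
sound(sig, fs);
response = input('Did you hear that? (y or n) ', 's');
if ~strcmpi(response, 'y')
    error('The equipment is not working.');
end
for k = 1:10
    middle = (fLow + fHigh)/2;
    sig = amp*sin(2*pi*middle*t);
    sig = sig.*(sin(pi*t/timelength)).^2;   % fade in and out to avoid clicks
    sound(sig, fs); disp(middle);
    response = input('Did you hear that? (y or n) ', 's');
    if strcmpi(response, 'n')
        fLow = middle;      % not heard: the threshold lies above middle
    else
        fHigh = middle;     % heard: the threshold lies at or below middle
    end
end
res = (fHigh + fLow)/2;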
Exercise 13.3 For a particular frequency, find the lowest amplitude
that you can hear.
Use the same binary search, this time on a range of amplitudes.
(Keep the frequency unchanged.)
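A sketch of the amplitude search, under the assumption that a tone is audible at amplitude 1 and inaudible at amplitude 0. To keep the example runnable without a listener, the user's answer is replaced by a hypothetical stand-in canHear with a made-up threshold; in a real program that line would play the tone and call input.

```matlab
f = 440; fs = 22050; t = 0:1/fs:1;       % fixed test frequency (illustrative)

trueThreshold = 0.013;                   % made-up value standing in for the listener
canHear = @(amp) amp >= trueThreshold;   % hypothetical replacement for sound + input

lowAmp = 0; highAmp = 1;                 % inaudible and audible bounds
for k = 1:20
    mid = (lowAmp + highAmp)/2;
    sig = mid*sin(2*pi*f*t);             % tone that would be played: sound(sig, fs)
    if canHear(mid)
        highAmp = mid;                   % heard: threshold is at or below mid
    else
        lowAmp = mid;                    % not heard: threshold is above mid
    end
end
res = (lowAmp + highAmp)/2;
disp(res);
```

Each pass halves the interval, so 20 passes narrow it to about one millionth of the starting range.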
Exercise 13.4
Write a program that plays the seven-note musical scale (plus the
octave) starting at a given frequency. Note that the frequencies of
successive semitones are separated by a factor of 2^(1/12); a major
scale steps up by 2, 2, 1, 2, 2, 2, 1 semitones.
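function exer134(basefreq, dur)
% EXER134  Play a major scale starting at basefreq (Hz); each note lasts dur seconds.
if nargin < 2
    dur = 0.5;
end
fs = 22050;
notes = [0,2,4,5,7,9,11,12];      % semitone offsets of the scale, including the octave
n = round(fs*dur);                % samples per note
t = (0:n-1)/fs;
sig = [];
for k = 1:length(notes)
    note = sin(2*pi*t*basefreq*2^(notes(k)/12));
    sig = [sig, note];            % append this note to the signal
end
sound(sig, fs);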
Exercise 13.5 (Homework problem) Your program should be
able to handle a long sequence of digits.
Idea:
0. Create the files 'one.wav', 'two.wav', etc. (You can do this by
recording all the sounds into a single file, then splitting it into
separate files.)
1. Convert the input number to an ASCII string.
2. For each character in the string, look up the relevant sound file
and read it in.
3. Concatenate the signals corresponding to the digits.
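Steps 1 and 2 can be sketched as below. The file names follow the 'one.wav', 'two.wav' pattern from step 0; 'zero.wav' and the remaining names are assumed. The input is kept as a character string so that long digit sequences are not limited by numeric precision. Reading and concatenating the files is shown only as comments because the recordings are not available here.

```matlab
% Word for each digit 0..9; the resulting file names are assumed to exist
names = {'zero','one','two','three','four','five','six','seven','eight','nine'};

digitsStr = '4071';                    % input digits as a string (illustrative)
files = cell(1, length(digitsStr));
for k = 1:length(digitsStr)
    d = digitsStr(k) - '0';            % convert character to the digit 0..9
    files{k} = [names{d + 1} '.wav'];  % file holding the spoken digit
end
disp(files);

% With the recordings present, step 3 could look like this:
% sig = [];
% for k = 1:length(files)
%     [x, fs] = wavread(files{k});     % assumes every clip uses the same fs
%     sig = [sig; x];                  % concatenate the column vectors
% end
% sound(sig, fs)
```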