Chapter 13 Sounds and signals
• basics of computer sound
• perception and generation of sound
• synthesizing complex sounds
• sampling sound signals
• simple example of signal processing
Sources used: (a) Chapter 13 of the text (b) Daisy Fan, Cornell University

Role of audio in computers
• audio is an important sensory signal
• crucial component of multimedia data (audio, music)
• tools for interacting with digital computers for visually impaired persons
• music analysis and synthesis
• speech processing and synthesis

Basics of computer sound in Matlab
• Matlab can open a file in .wav format:
>> [x, fs, bits] = wavread('fh.wav');
>> sound(x, fs); % plays the sound clip
Computing with sound in Matlab requires that we first convert the .wav data into simple numeric data; this is the job of the function wavread. The variable fs above is the number of samples per second, and bits is the number of bits used to represent each sample.

Basics of computer sound
>> fs
fs = 44100
>> bits
bits = 16
>> x(100:105)
ans = -0.0768 0.0174 0.0345 0.0008 -0.0060 0.0026
>> plot(1:length(x), x);

Basics of computer sound
Some basic questions:
• why does the sound waveform range in amplitude from -1 to 1?
• what role does the sampling frequency play in the quality of the sound?
• what happens if we play back the sound at a different sampling frequency?

Computing with sound requires digitization
• Sound is continuous (analog); we capture its essence by sampling
• Digitized sound is a vector of numbers

Sampling rate affects the quality
If sampling is not frequent enough, the discretized sound will not capture the essence of the continuous sound and the quality will be poor.

Sampling Rate
• Given human perception, 20,000 samples/second (20,000 Hz, or 20 kHz) is pretty good.
• 8,000 Hz is used for speech over the telephone
• 44,100 Hz is used for audio CDs
• 192,000 Hz is used for HD-DVD audio tracks

Resolution also affects the quality
• Typically, each sampled value is encoded as an 8-bit integer in the .wav file.
• Possible values: -128, -127, …, -1, 0, 1, …, 127
• Loud: -120, 90, 122, etc.
• Quiet: 3, 10, -5
• 16-bit is used when very high quality is required.
• wavread converts the integer values into real numbers in [-1, 1].

Amplitude, frequency and phase
P(t) = A sin(2*pi*f*t + phi)
f = frequency
phi = phase (shift)
A = amplitude
• a single sine wave models a pure musical tone.
• amplitude determines the loudness.
• human perception of sound is roughly in the range 50 Hz to 15000 Hz.
• most sounds (and music) are a complicated mixture of various frequencies.

Calculating frequencies of notes
• The author of the text has recorded a whistle sound in the file named whistle.wav.
>> [w, fs, bits] = wavread('whistle.wav');
>> length(w)
ans = 66150
>> fs
fs = 11025
Question: What is the duration of the whistle? What are the various frequencies he has used?

[Figure: waveform of whistle.wav]

Zooming in on one region
[Figure: 100-sample excerpt of the whistle waveform]
Question: what is the frequency? Note that the excerpt is 100 samples long and the waveform repeats about 9 times: 9 cycles/100 samples. At fs = 11025 samples/second this works out to about 1 kHz. Can you see why?

Synthesis of sound – combining pure notes
• A pure tone can be synthesized as follows:
>> fs = 8192; f = 200; % sampling rate (samples/second) and tone frequency (Hz)
>> t = 0 : 1/fs : 1;
>> x = sin(2*pi*f*t);
>> sound(x, fs)
What will happen if we try the following?
>> x = [x x];
>> sound(x, fs)
We can also generate a superposition of two frequencies.
“Adding” Sinusoids
[Figure: Middle C (262 Hz) + A above middle C (440 Hz) = their sum]
Insight Through Computing

“Adding” Sinusoids: averaging the sine values
Fs = 32768;
tFinal = 1;
t = 0:(1/Fs):tFinal;
C3 = 261.63;
yC3 = sin(2*pi*C3*t);
A4 = 440.00;
yA4 = sin(2*pi*A4*t);
y = (yC3 + yA4)/2;
sound(y,Fs)

Creating audio file containing note sequence
The simplest example of music synthesis is to write down a sequence of notes and generate the sine wave corresponding to each note. An appropriate delay should be introduced so that the notes are played in sequence.

Equal-Tempered Tuning
 #  Note    Frequency (Hz), one octave per column
 0  A       55.00   110.00   220.00   440.00    880.00   1760.00
 1  A#      58.27   116.54   233.08   466.16    932.33   1864.66
 2  B       61.74   123.47   246.94   493.88    987.77   1975.53
 3  C       65.41   130.81   261.63   523.25   1046.50   2093.01
 4  C#      69.30   138.59   277.18   554.37   1108.73   2217.46
 5  D       73.42   146.83   293.67   587.33   1174.66   2349.32
 6  D#      77.78   155.56   311.13   622.25   1244.51   2489.02
 7  E       82.41   164.81   329.63   659.26   1318.51   2637.02
 8  F       87.31   174.61   349.23   698.46   1396.91   2793.83
 9  F#      92.50   185.00   369.99   739.99   1479.98   2959.95
10  G       98.00   196.00   391.99   783.99   1567.98   3135.96
11  G#     103.83   207.65   415.31   830.61   1661.22   3322.44
12  A      110.00   220.00   440.00   880.00   1760.00   3520.00
Entries are frequencies. Each column is an octave. Magic factor = 2^(1/12): each entry is the one above it multiplied by this factor. C3 = 261.63, A4 = 440.00

star.m contains a script to play twinkle twinkle ...

Exercise: Write a program in Matlab that plays a sequence of music clips in succession.
Possible solution:
playList = {'whistle.wav', 'song.wav', ... , };
for k=1:length(playList)
    [y,rate] = wavread(playList{k});
    sound(y,rate)
end
Problem: sound returns immediately, so the song will start playing before the whistle finishes playing.
One way to solve the problem is to introduce an appropriate delay after each clip. pause(x) introduces a delay of x seconds. Calculate each delay from the sampling rate and the number of samples: a clip with N samples at rate fs lasts N/fs seconds.
>> [x, fs] = wavread(file1);
>> [y, fs1] = wavread(file2);
>> sound(x, fs); pause(length(x)/fs); sound(y, fs1);

A simple application of signal analysis
Each button has its own 2-frequency “fingerprint”!
On a phone dial pad, one frequency is associated with each row and one with each column, so two frequencies are associated with each button.

Signal for button 5
fs = 32768;
tFinal = .5;
t = 0:(1/fs):tFinal;
yR = sin(2*pi*770*t);   % row frequency
yC = sin(2*pi*1336*t);  % column frequency
y = (yR + yC)/2;
sound(y, fs)
The received signal must be decoded to determine the digits.

Signal viewed in frequency domain
>> phone
[Figure: top panel “Time Response”, Signal vs. Time (sec); bottom panel “Spectrum”, Signal Power vs. Frequency (Hz)]

What does the signal look like for a multi-digit call?

“Perfect” signal
• Each band matches one of the twelve “fingerprints”
• Buttons pushed at equal time intervals
One of the most difficult problems is how to segment the multi-button signal!

“Noisy” signal
• Each band approximately matches one of the twelve “fingerprints.”
• There is noise between the button pushes.
• Buttons pushed at unequal time intervals

Sending and deciphering noisy signals
• Randomly choose a button
  – Choose random row and column numbers
• Construct the real signal (MakeShowPlay)
• Add noise to the signal (SendNoisy)
• Compute cosines to decipher the signals (ShowCosines)
See Eg13_2

Exercise 13.1: The waveform in the audio file whistle.wav is an eight-note ascending scale. Use reversal and concatenation to generate an ascending and descending scale.
>> [x, fs] = wavread('whistle.wav');
>> y = x(length(x):-1:1);
>> sound(x, fs); pause(length(x)/fs); sound(y, fs);

Exercise 13.2 Find the lowest frequency signal you can hear.
Idea: Everyone without hearing impairment can hear a sound at freq = 1000 Hz. (What if this is not true?) So perform a binary search for the lowest frequency that you can hear, starting with low = 0 and high = 1000. Play the mid frequency and ask whether the user can hear it. Continue the search in the lower or higher half of the range.

function res = lowsearch
% Binary search for the lowest frequency (in Hz) the listener can hear.
upper = 1000; lower = 0;                    % search range in Hz
fs = 22050; timelength = 1.0; amp = 1.0;
nsamps = timelength.*fs + 1;
t = linspace(0, timelength, nsamps);
f = upper;
sig = amp.*sin(2.*pi*f.*t);
sound(sig, fs);
response = input('Did you hear that? (y or n) ', 's');
if ~strcmp(response, 'y')
    error('The equipment is not working.');
end
for k = 1:10
    middle = (lower + upper)./2;
    sig = amp.*sin(2.*pi*middle.*t);
    sig = sig.*(sin(pi.*t./timelength)).^2;  % fade the tone in and out
    sound(sig, fs);
    disp(middle);
    response = input('Did you hear that? (y or n) ', 's');
    if strcmp(response, 'n')
        lower = middle;                      % not heard: search higher
    else
        upper = middle;                      % heard: search lower
    end
end
res = (upper + lower)./2;

Exercise 13.3 For a particular frequency, find the lowest amplitude that you can hear.
Use the same binary search, this time on a range of amplitudes. (Keep the frequency unchanged.)

Exercise 13.4 Write a program that plays the seven-note musical scale (ending on the octave) starting at a given frequency. Note that the frequencies of successive semitones are separated by a factor of 2^(1/12); the notes of the scale are 0, 2, 4, 5, 7, 9, 11 and 12 semitones above the base frequency.

function exer134(basefreq, dur)
if nargin < 2
    dur = 0.5;                               % default note duration in seconds
end
fs = 22050;
sig = [];
notes = [0,2,4,5,7,9,11,12];                 % semitone offsets from basefreq
t = linspace(0, dur - (1./fs), round(fs.*dur));
for k = 1:length(notes)
    note = sin(2.*pi.*t.*basefreq.*(2.^(notes(k)/12)));
    sig = [sig, note];
end
sound(sig, fs);

Exercise 13.5 (Homework problem) Your program should be able to handle a long sequence of digits.
Idea:
0. Create the files 'one.wav', 'two.wav', etc. (You can do it by recording all the sounds into a single file, then splitting it into separate files.)
1. Convert the input number to an ASCII character string.
2. For each character in the string, look up the relevant sound file. Read in that sound file.
3. Concatenate the signals corresponding to the digits.
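
The steps listed for Exercise 13.5 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the text's solution: the variable names (clips, names, numStr), the 8000 Hz rate and the sample digit string are all made up here, and short stand-in tones replace the recorded clips so that the sketch runs without any .wav files. With real recordings, each stand-in would be replaced by a call to wavread (or audioread in newer Matlab releases, which provide it in place of wavread).

```matlab
% Sketch for Exercise 13.5. All names and values here are assumptions.
fs = 8000;                       % assumed common sampling rate of every clip
names = {'zero','one','two','three','four', ...
         'five','six','seven','eight','nine'};

% Step 0 (stand-in): the exercise records spoken digits into 'one.wav', etc.
% To keep this sketch runnable without those files, each clip is a 0.2 s
% tone whose pitch depends on the digit. With real recordings, the loop
% body would instead be:  clips{d+1} = wavread([names{d+1} '.wav']);
t = (0:round(0.2*fs)-1)' / fs;   % time axis as a column vector
clips = cell(1, 10);
for d = 0:9
    clips{d+1} = sin(2*pi*(300 + 50*d)*t);
end

% Step 1: hold the number as a character string. Taking the input as a
% string (rather than a double) keeps long digit sequences exact.
numStr = '4087';

% Steps 2 and 3: look up the clip for each character and concatenate.
sig = [];
for k = 1:length(numStr)
    d = numStr(k) - '0';         % character '0'..'9' -> integer 0..9
    sig = [sig; clips{d+1}];     % column vectors, so stack vertically
end

% sound(sig, fs);                % uncomment to listen to the result
```

Because the clips are concatenated into one vector before playback, no pause between digits is needed; a single call to sound plays them back to back.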