Mark Antony Elphinstone-Hoadley 04131216
Module U08840
Laboratory 4 – Analysing and Editing Audio Files Using ‘Audition’
1) 8840 lab file_1
This file is exactly 12.244 seconds long and has a sample rate of 22050 Hz, with a sample size of 16 bits. The dominant frequencies shown in the spectral view lie mostly in the 0-2000 Hz region.
On close analysis of this file, there is a lot of high-end and low-end background noise, with a clearly distinguishable vocal sample over it. The speaker stutters through the spoken words.
The first task is to normalise the file, so that the noise to be reduced is easier to identify and the levels sit within a normal range.
Next, the background noise needs to be removed. A profile of the background noise was captured using the ‘capture profile’ option in Audition’s noise reduction facility.
See fig 1-2
Fig 1-2
The noise was removed using a flat noise reduction with a reduction level of 60 dB. This was to ensure that all noise was removed from the file. Finally, all that is left is to edit the clip down so that there are no stutters, by deleting those parts. This is a trivial step, as it improves only the content of the file, not its quality.
2) 8840 lab file_2
This file is exactly 7.5 seconds long and has a sample rate of 22050 Hz, with a sample size of 16 bits. The dominant frequencies shown in the spectral view lie mostly in the 0-4000 Hz region.
This file has a normal vocal sample, with a static sound mixed in at around the same level, which degrades the clarity and quality of the file. According to the spectral view, the vocal sample mainly reaches only 4000 Hz.
First the static needs to be removed from the file. Since the static fills a large frequency range, from 0-11025 Hz, a good noise reduction profile needs to be taken from the sound file to remove it. This profile was captured from 1.211 seconds to 1.579 seconds. Once the profile was captured, we removed the noise using the noise reduction facility, concentrating on the high-end frequencies outside the range of the vocal audio. The reduction was done at around 40 dB. See fig 2-2
Fig 2-2
Now we are left with a slight metallic background sound. Since the vocals mostly range up to 4000 Hz, we can remove all frequency content above this by deleting it in the spectral view, to minimise the amount of background sound.
Finally, a parametric equaliser is used to strengthen the high and mid frequencies in the resulting stripped-down sample.
3) 8840 lab file_3
This file is exactly 5.159 seconds long and has a sample rate of 22050 Hz, with a sample size of 16 bits. The dominant frequencies shown in the spectral view lie mostly in the 0-3000 Hz region, although there are strengths in other regions.
This file has a normal, distinguishable vocal sample with very loud low-frequency noise from a bad connection running through it. This noise lies only between 0-100 Hz and does not extend into the vocal sample range.
To improve the quality of this file we begin by using a high range graphic equaliser to remove the noise coming through. To ensure no noise could be heard when cutting off frequencies, I made sure that the range was set to 60 dB (+/- 30 dB). All frequencies from 0-100 Hz were then cut off, leaving a clear sample with a little high-end background noise. See fig 3-1
Fig 3-1
To fix the issue of the high-end background noise, I then ran a profile capture from 0.334 seconds to 0.697 seconds. A flat noise reduction was then applied at 40 dB.
The file was then trimmed at the beginning and end to reduce the file size, and then normalised to 95%.
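The normalisation step can be illustrated with a minimal sketch. It assumes that normalising means peak normalisation, so that the loudest sample is scaled to 95% of the 16-bit full scale; the function name and the sample values are hypothetical, and Audition's own implementation may differ.

```python
def normalise_peak(samples, target=0.95, full_scale=32767):
    """Scale 16-bit samples so the loudest one reaches target * full_scale.

    Assumption: 'normalise to 95%' is treated as peak normalisation.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent clip: nothing to scale
    gain = target * full_scale / peak
    return [int(round(s * gain)) for s in samples]


# Hypothetical sample values, for illustration only.
clip = [0, 1000, -8000, 4000]
normalised = normalise_peak(clip)
print(max(abs(s) for s in normalised))  # 31129, i.e. about 95% of 32767
```

Every sample is multiplied by the same gain, so the shape of the waveform is unchanged and only the overall level rises.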
4) 8840 lab file_4
This file is exactly 3.105 seconds long and has a sample rate of 22050 Hz, with a sample size of 16 bits. All frequency content sits within 0-1250 Hz, as if all frequencies above this range have been deleted in the spectral view. This may indicate that the original recording was converted to a 2500 Hz sample rate file, which by Nyquist’s theorem can hold frequencies only up to 1250 Hz, and then saved as a 22050 Hz sample rate file.
The speech in this file is very difficult to make out, since most of the vocal frequency range has been removed. You can only just make out what is being said.
The vocals in this file are booming, so the lower frequencies in the 0-500 Hz range need to be reduced to make the speech more distinguishable (using a range of 36 dB (+/- 18 dB)). See fig 4-1
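The relationship between the highest frequency present and the sample rate can be checked with a short sketch. The function names are my own, and the sketch uses only the standard Nyquist relation (sample rate = 2 x highest frequency).

```python
def min_sample_rate(max_frequency_hz):
    """Lowest sample rate that can represent the given frequency (Nyquist)."""
    return 2 * max_frequency_hz


def max_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent (Nyquist)."""
    return sample_rate_hz / 2


print(min_sample_rate(1250))  # 2500
print(max_frequency(22050))   # 11025.0
```

A 22050 Hz file could hold content up to 11025 Hz, so content that stops at 1250 Hz is consistent with an earlier conversion to a 2500 Hz sample rate.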
Finally, I ran a parametric equaliser on the entire file to boost frequencies from around 400 Hz upwards, as seen below.
5) 8840 lab file_5
This file is exactly 2.4 seconds long and has a sample rate of 22050 Hz, with a sample size of 16 bits. The sound file is clear and so all frequencies fill a normal range in the spectral view.
This file contains a vocal sample with added echo already applied. There is a slight detectable noise level.
First the background noise profile was captured from 2.111 to 2.400 seconds. A flat noise reduction of 60 dB was then applied to the entire sample.
The echo then needed to be removed to give the vocals a natural sound rather than a distracting echo effect. This was done by reducing the volume where the echo occurs, using the amplify/fade utility. The echo was reduced by 30 dB to bring it close to silence.
See fig 5-2
Fig 5-2
Just before each point where the dB cut was made, I faded out the words to make them sound more natural. Empty space at the end was then deleted to save file space.
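To show what a 30 dB reduction means for the sample values, here is a minimal sketch of the standard decibel-to-amplitude conversion. The function names and the sample values are hypothetical; this is not how Audition's amplify/fade utility is implemented.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def amplify(samples, db):
    """Apply a dB level change to a list of integer samples."""
    gain = db_to_gain(db)
    return [int(round(s * gain)) for s in samples]


print(round(db_to_gain(-30), 4))       # 0.0316
print(amplify([10000, -5000, 0], -30)) # [316, -158, 0]
```

A 30 dB cut leaves roughly 3% of the original amplitude, so the echo becomes very quiet rather than exactly zero.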
6) 8840 lab file_6
This file is exactly 10 seconds long and has a sample rate of 44100 Hz, with a sample size of 16 bits. In the spectral view the file extends up to a maximum of 17800 Hz, with scattered frequencies and strengths below that.
This file is a short sample of a jazz piece, which cuts in abruptly at the beginning.
According to Nyquist’s theorem, this file’s sample rate could be brought down to 35600 Hz, which is twice the 17800 Hz maximum frequency. This should not audibly affect the audio and will reduce the file size. The number of bytes needed to store this file in its original form is given by the following calculation.
Song length = 10 seconds
Sample rate = 44100 Hz
Sample size = 16 bit
Channel     = Mono
10 * 16 * 44100 * 1 = 7,056,000 bits
7,056,000 / 8 = 882,000 bytes = 861.328 kilobytes
My reduced file (at a 35600 Hz sample rate) would give you:
10 * 16 * 35600 * 1 = 5,696,000 bits
5,696,000 / 8 = 712,000 bytes = 695.312 kilobytes
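The two storage calculations above can be reproduced with a short sketch. The function name is my own, and the sketch assumes uncompressed PCM audio with no file header counted, and 1 kilobyte = 1024 bytes.

```python
def pcm_size_bytes(duration_s, sample_rate_hz, bits_per_sample, channels):
    """Size in bytes of uncompressed PCM audio (header not included)."""
    total_bits = duration_s * sample_rate_hz * bits_per_sample * channels
    return total_bits // 8


original = pcm_size_bytes(10, 44100, 16, 1)
reduced = pcm_size_bytes(10, 35600, 16, 1)

print(original, original / 1024)  # 882000 861.328125
print(reduced, reduced / 1024)    # 712000 695.3125
```

Lowering the sample rate from 44100 Hz to 35600 Hz saves 170,000 bytes on this 10 second mono clip.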