SMCS-14-11

Richard Dobson

Dr Archer Endrich

Composers Desktop Project

CAS Wiltshire Hub

Kingdown School

Warminster

14 November 2012

Plato

The Science of Sound – a Micro-history

We stand on the shoulders of many giants.

“Musical training is a more potent instrument than any other, because rhythm and harmony find their way into the inward places of the soul, on which they mightily fasten, imparting grace, and making the soul of him who is rightly educated graceful.”

Pythagoras

Hermann von Helmholtz

Max Mathews

Guido d’Arezzo

Some topics in SMC

Digital Audio – sampling, synthesis, processing

Music Representation and Analysis

Performance and Interactive Composition

Languages for Music – Algorithmic Composition

Software and Hardware Design

Acoustics and Psychoacoustics

Sonification and Audification

The Shapes of Sound

A sound wave is bipolar.

• A wave comprises alternating displacements from a central “zero” position.

• For a sound wave the zero line corresponds to silence.

• Displacements are both positive and negative, and should sum to zero.

area above = area below

Sampling Sound 1

• The overall process is generally called “digitising”

• Two aspects: we need to digitise both amplitude and time

• Quantisation of amplitude to discrete levels (represented in N-bit words)

• Strictly speaking, “sampling” refers only to the discretisation of time (the sampling rate)

(Hence technical literature will refer to “periodic sampling”, “discrete-time”, etc.)

• Quantisation introduces “quantisation error”, which manifests as “quantisation noise”

• Sampling depends on a very accurate clock. Errors in timing are known as “jitter”, which is not something we need to worry about here.

• Soundcard clocks are based on crystals, just as CPUs are.

• Nothing is perfect; one 44100 Hz clock may not exactly match another. Independent devices will drift out of sync over time.
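To give a feel for the size of this drift, here is a minimal sketch. It assumes two nominally 44100 Hz clocks whose rates differ by 50 parts per million; that figure is a hypothetical value chosen for illustration, not a measurement of any real device, and the function names are also illustrative.

```python
SAMPLE_RATE = 44100  # nominal rate of both devices, in Hz


def drift_seconds(elapsed_s, ppm_error):
    """Offset accumulated between two clocks whose rates differ by ppm_error parts per million."""
    return elapsed_s * ppm_error * 1e-6


def drift_samples(elapsed_s, ppm_error, sr=SAMPLE_RATE):
    """The same offset expressed as a whole number of samples at rate sr."""
    return round(drift_seconds(elapsed_s, ppm_error) * sr)


# Assumed 50 ppm mismatch, one hour of recording.
offset_s = drift_seconds(3600, 50)
offset_samples = drift_samples(3600, 50)
```

Under that assumption the two devices end up about 0.18 seconds (7938 samples) apart after one hour.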

Sound Example 1 – quantisation noise for N = 16, 12, 10, 8, 6, 4, 2, 1

Quantisation – the Challenge

Integers: N bits gives us $2^N$ levels, an even number. Where is the middle?

Standard quantisation is called “mid-tread”: qval = floor(val + 0.5)

• Includes zero valued sample

• Equivalent to two's complement arithmetic

• Asymmetrical: e.g. 16-bit range = -32768 to +32767

• Tiny values quantise to zero, so are lost

• Standard choice for audio codecs

4-bit quantisation

The alternative is “mid-rise” quantisation: qval = floor(val) + 0.5

• No zero value

• Symmetric for all level values

• Bipolar one-bit quantisation possible

• Tiny values quantise to a quasi square wave
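The two rules can be compared directly in a short sketch. The `mid_tread` and `mid_rise` functions follow the formulas given above; the `quantise` wrapper, its scaling of a ±1.0 input by 2^(N-1), and its clamping are assumptions added for illustration.

```python
import math


def mid_tread(val):
    """Round to the nearest integer level; zero is a valid output."""
    return math.floor(val + 0.5)


def mid_rise(val):
    """Round to the nearest half-integer level; zero is never output."""
    return math.floor(val) + 0.5


def quantise(x, n_bits, rise=False):
    """Quantise a normalised sample x (about -1.0 to 1.0) to n_bits.

    The result is rescaled back to the normalised range so that
    different bit depths can be compared directly.
    """
    scale = 2 ** (n_bits - 1)
    if rise:
        q = mid_rise(x * scale)
        q = max(-scale + 0.5, min(scale - 0.5, q))   # symmetric range
    else:
        q = mid_tread(x * scale)
        q = max(-scale, min(scale - 1, q))           # asymmetric range
    return q / scale


tiny = 0.00001
lost = quantise(tiny, 16)                    # mid-tread: tiny value becomes 0.0
plus = quantise(tiny, 1, rise=True)          # mid-rise, one bit: +0.5
minus = quantise(-tiny, 1, rise=True)        # mid-rise, one bit: -0.5
```

With mid-tread, the tiny input vanishes; with one-bit mid-rise, the same input flips between two symmetric levels, which is the quasi square wave mentioned above.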

Sampling Sound 2: Nyquist

The “modified” Nyquist-Shannon sampling theorem: $f_{\text{input}} < \frac{sr}{2}$

“Perfect reconstruction” requires phase independence.

• cosine phase (what the textbooks usually show): amplitude = 1

• sine phase (what the textbooks usually don't show): amplitude = 0

For perfect reconstruction, input frequencies must be below Nyquist.

The Nyquist limit itself (sr/2) defines the onset of frequency aliasing, where the Nyquist frequency aliases with DC.

Put another way: we need more than two samples per cycle.
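The phase dependence at exactly sr/2 is easy to check numerically. The following is a minimal sketch; the function name and the rounding to 12 decimal places (to hide floating-point residue) are choices made for this example.

```python
import math


def sample_at_nyquist(phase, n_samples=8):
    """Sample a sinusoid at exactly sr/2, i.e. two samples per cycle.

    At sr/2 the phase advances by pi per sample, so sample n is
    cos(pi * n + phase).
    """
    return [round(math.cos(math.pi * n + phase), 12) for n in range(n_samples)]


cosine_phase = sample_at_nyquist(0.0)           # samples land on the peaks
sine_phase = sample_at_nyquist(-math.pi / 2)    # samples land on the zero crossings
```

In cosine phase the samples alternate between +1 and -1, so the full amplitude is captured. In sine phase every sample is zero, which cannot be told apart from silence.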

Sampling Sound 3: anti-aliasing

Aliasing is now impossible to demonstrate using consumer hardware.

The crystal

This is a Cirrus Logic 8-channel 192 kHz Sigma-Delta oversampling ADC

Anti-Alias filter is integrated into the device

Even cheap chips do this now

So whatever sample rate you set, the input is correctly filtered

Examples of aliasing

To sample analogue audio without anti-alias filters we can use older types of ADC (e.g. using the method of successive approximation), or industrial data acquisition systems.

Sound Examples 2

These examples were prepared by Dr R.W. Stewart, University of Strathclyde, for his CD-ROM project “DSPedia”.

• On the other hand, aliasing is very easy to demonstrate using digital sound synthesis.

• The dominant sources of aliasing these days are synthesis and processing, not recording.
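When a synthesised sinusoid exceeds the Nyquist limit, the frequency actually heard can be predicted by "folding" it back into the range 0 to sr/2. The helper below is a small sketch of that arithmetic; the function name is illustrative.

```python
def aliased_frequency(f, sr):
    """Frequency heard when a sinusoid of f Hz is generated at sample rate sr
    with no anti-alias filtering."""
    f = f % sr                  # the sampled spectrum repeats every sr Hz
    if f > sr / 2:
        return sr - f           # upper half folds back below Nyquist
    return f


below = aliased_frequency(1000, 44100)     # unchanged: 1000 Hz
folded = aliased_frequency(30000, 44100)   # folds back to 14100 Hz
```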

Aliasing - a synthetic example

• We need a program to generate a plain sine frequency sweep (“chirp signal”) and write it to a sound file.

• Use a low sampling rate: 11025 Hz

• Let the sweep rise to an extreme value: 16000 Hz!

• Listen to it….

• And view it in the frequency domain (Audacity)

Sound Example 3
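A program of the kind described could look like the sketch below, which uses only Python's standard `wave`, `struct` and `math` modules. The 11025 Hz rate and 16000 Hz end point come from the slide; the 20 Hz start frequency, 10 second duration, 0.5 amplitude and the function name are assumptions made for this example, and this is not the program used to produce the original sound example.

```python
import math
import struct
import wave


def write_chirp(path, sr=11025, f_start=20.0, f_end=16000.0, seconds=10.0, amp=0.5):
    """Write a linear sine sweep to a 16-bit mono WAV file, with no anti-aliasing."""
    n_frames = int(sr * seconds)
    phase = 0.0
    frames = bytearray()
    for n in range(n_frames):
        # Instantaneous frequency rises linearly from f_start to f_end.
        freq = f_start + (f_end - f_start) * n / n_frames
        sample = int(amp * 32767 * math.sin(phase))
        frames += struct.pack("<h", sample)       # little-endian 16-bit integer
        phase += 2.0 * math.pi * freq / sr        # accumulate phase sample by sample
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 2 bytes = 16 bits
        wf.setframerate(sr)
        wf.writeframes(bytes(frames))
    return n_frames
```

Calling `write_chirp("chirp.wav")` and opening the file in Audacity's spectrogram view should show the sweep rising to the Nyquist limit (5512.5 Hz at this rate), then folding back down and up again instead of continuing to rise.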

Reconstruction : the Digital to Analogue Converter

• The DAC is strangely absent from most CS curricula

• but it is more important than the ADC:

• We can manage without audio input

• but not without audio output!

• With oversampling (as in the ADC), the final analogue reconstruction filter can be very simple – and cheap

• It restores the required smooth curves of the underlying waveform

Periodic Waves: Time v Distance

The Time Domain

The speed of sound in air is approximately 340 m/s. So we can measure frequency either in terms of distance or in terms of time.

• Wavelength – literally the length of a cycle

• Period – duration of one cycle

• Frequency = speed of sound / wavelength

• Frequency = 1 / period

• Frequency is not a measure of either length or duration. It is therefore best to avoid labelling either wavelength or period directly as “frequency”.
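The relationships above reduce to two one-line conversions. This sketch uses the approximate 340 m/s figure quoted above; the function names are illustrative.

```python
SPEED_OF_SOUND = 340.0  # metres per second in air (approximate)


def wavelength_m(freq_hz):
    """Length of one cycle in metres."""
    return SPEED_OF_SOUND / freq_hz


def period_s(freq_hz):
    """Duration of one cycle in seconds."""
    return 1.0 / freq_hz


def frequency_from_wavelength(length_m):
    return SPEED_OF_SOUND / length_m


def frequency_from_period(duration_s):
    return 1.0 / duration_s


# Concert A (440 Hz): roughly 0.77 m long and 2.27 ms in duration.
a4_length = wavelength_m(440.0)
a4_period = period_s(440.0)
```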

Audio Data Representation – Time Domain

• Two basic forms – data stream, and a file format.

• Two primary number representations:

• Integer (e.g. -32768 to 32767)

• Floating point

• These days, the ± 1.0 floating point normalised representation is the most important.

• We can display amplitude (V scale) either as normalised sample values, or in decibels (dB)
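Converting between the two number representations is a matter of scaling. The sketch below assumes the common convention of dividing 16-bit integers by 32768; some software scales by 32767 instead, so treat the constant as an assumption rather than a standard.

```python
import math

FULL_SCALE = 32768.0  # assumed scaling convention for 16-bit audio


def int16_to_float(sample):
    """Map a 16-bit integer sample (-32768 to 32767) to the +/-1.0 range."""
    return sample / FULL_SCALE


def float_to_int16(x):
    """Map a normalised sample to a 16-bit integer, clipping at the limits."""
    q = math.floor(x * FULL_SCALE + 0.5)      # round to the nearest level
    return max(-32768, min(32767, q))


lowest = int16_to_float(-32768)     # exactly -1.0
highest = int16_to_float(32767)     # just below +1.0
clipped = float_to_int16(1.0)       # +1.0 does not fit, so it clips to 32767
```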

[Figure panels: normalised scale (Audacity); decibel (logarithmic) scale (Adobe Audition)]

To convert an amplitude $a$ to dB: $\text{dBval} = 20 \log_{10}(a)$
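The conversion and its inverse can be written in a couple of lines of Python. The function names are illustrative, and the amplitude must be greater than zero because the logarithm of zero is undefined.

```python
import math


def amp_to_db(a):
    """Convert a normalised amplitude (a > 0) to decibels."""
    return 20.0 * math.log10(a)


def db_to_amp(db):
    """Convert decibels back to a normalised amplitude."""
    return 10.0 ** (db / 20.0)


full_scale = amp_to_db(1.0)     # 0 dB
half = amp_to_db(0.5)           # about -6 dB
tenth = db_to_amp(-20.0)        # amplitude of about 0.1
```

Halving the amplitude lowers the level by about 6 dB, and each factor of ten in amplitude corresponds to 20 dB.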

Amplitude Display – the dB log scale

Using the standard display, most of the signal is invisible.

The ear senses both loudness and frequency on a logarithmic scale. For loudness, the useful amplitude range runs from a maximum of 1.0 (0 dB) to less than 0.0001 (-80 dB).

Where does the sound finish?

Here?

Here!

Representation – Frequency Domain

We have two primary and complementary ways to represent sound.

• Time Domain : amplitude / time

• Frequency Domain : two related forms.

• Spectrum : amplitude / frequency

• Spectrogram (or sonogram) : spectrum / time

• Audacity supports both, with linear/logarithmic options

• Again, the log frequency scale reflects how we hear – e.g. axis marks in octaves – or musical notes.

The figures below display a sine “log frequency sweep” (without aliasing)

[Figure panels: log vertical scale; linear vertical scale]

Digital Sound Synthesis

A basic definition : using algorithms (and some maths) to generate audio data

Two computer-based approaches:

• Real-time, e.g. using “soft” or hardware-based synthesisers

• Offline – writing data to a soundfile for later playback.

• Many possible approaches; most are technically difficult, maths-heavy, and (especially for real-time) computationally demanding – need fast hardware, and compilers able to generate very fast code.

One (relatively) simple but classic approach identifies three fundamental ingredients:

• Sine waves

• Noise

• Time-varying Control functions (“automation”, “breakpoint data”)

Together, these form the basis of additive synthesis. This means, quite literally, arbitrarily or algorithmically adding sound waves together – also known as “mixing”.
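The three ingredients can be combined in a few lines of offline Python. This is a minimal sketch under stated assumptions, not an efficient or complete synthesiser: the function names, the (time, value) breakpoint format, the linear interpolation and the uniform white noise source are all choices made for this example.

```python
import math
import random


def envelope(t, breakpoints):
    """Linear interpolation through (time, value) breakpoints sorted by time;
    the first and last values are held outside the given range."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]


def additive(partials, env, seconds, sr=44100, noise_amp=0.0, seed=0):
    """Mix sine partials, given as (frequency_hz, amplitude) pairs, with
    optional noise, and shape the result with a breakpoint envelope."""
    rng = random.Random(seed)
    out = []
    for n in range(int(seconds * sr)):
        t = n / sr
        s = sum(a * math.sin(2.0 * math.pi * f * t) for f, a in partials)
        s += noise_amp * rng.uniform(-1.0, 1.0)      # noise ingredient
        out.append(envelope(t, env) * s)             # control function
    return out


# Two partials an octave apart, with a fast attack and a slow decay.
tone = additive([(440.0, 0.5), (880.0, 0.25)],
                [(0.0, 0.0), (0.01, 1.0), (0.5, 0.0)],
                seconds=0.5, sr=8000)
```

The returned list holds normalised floating-point samples, which could then be scaled to integers and written to a soundfile.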

Music Synthesis and Algorithmic Composition

Concentrates on the control aspect.

• Most common route is the algorithmic generation of MIDI data, in real time or written as a standard MIDI file.

• Many free domain-specific languages are available. Some support both direct synthesis and algorithmic score generation using a library of freely arranged modules. The (arguably) pre-eminent example is Csound.

Algorithmic composition can be very complex, but can also be very simple, such as loop-based generation of scale and chord patterns.

The auto-arpeggiator built into many synths and home organs is a simple example of a musical automaton.

• MIT Scratch: supports basic soundfile playback and MIDI note generation. Loose timing limits scope to simple patterns.

• Python: many extension libraries available, for both synthesis and MIDI programming. It includes standard modules for basic soundfile i/o.
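As an illustration of how simple loop-based generation can be, the sketch below builds a major scale and an arpeggio as lists of MIDI note numbers and converts a note number to a frequency. It only generates the note data; sending it to a synthesiser or writing a MIDI file would need an extension library. The function names and the particular up-and-down arpeggio pattern are assumptions made for this example.

```python
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]      # semitone offsets from the tonic
ARPEGGIO_PATTERN = [0, 4, 7, 12, 7, 4]    # major triad up to the octave and back


def scale_notes(root, octaves=1):
    """MIDI note numbers of an ascending major scale, ending on the top tonic."""
    notes = [root + 12 * o + step for o in range(octaves) for step in MAJOR_SCALE]
    return notes + [root + 12 * octaves]


def arpeggio(root, repeats=2):
    """A looping arpeggio, as an auto-arpeggiator might produce."""
    return [root + p for _ in range(repeats) for p in ARPEGGIO_PATTERN]


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (note 69 = A 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


c_major = scale_notes(60)        # middle C upwards: [60, 62, 64, 65, 67, 69, 71, 72]
pattern = arpeggio(60, repeats=1)
```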

Sonification and Audification

The rendering of non-audio data as sound in order to reveal patterns and features.

• Audification : source data already has a time dimension.

• e.g. seismic, volcanic, astrophysics, even stock price movements.

• Sonification: applied to any arbitrary numeric data.

• Generally applied to large data sets which are already a challenge to analyse.

For example, we have worked on particle collision data from the Large Hadron Collider (searching for the Higgs boson), as part of the LHCsound outreach project [1].

• However, it can be applied to small data sets and processes too:

• Any algorithms involving lists, iteration and loops

• Shapes of mathematical functions and formulae

• Whether the output is sonification or algorithmic composition depends entirely on your intention and interest – the process itself is the same.

(Examples of simple sonification were presented in Scratch)
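The original examples were in Scratch; a comparable idea can be sketched in Python. The function below maps any list of numbers linearly onto a range of MIDI note numbers, so that larger values sound higher. The function name, the default note range and the choice of a linear mapping are assumptions made for this example.

```python
def sonify(data, low_note=48, high_note=84):
    """Map each value linearly onto a MIDI note number between low_note
    and high_note, so the shape of the data becomes a melodic contour."""
    lo, hi = min(data), max(data)
    span = (hi - lo) or 1.0          # avoid dividing by zero for constant data
    return [low_note + round((x - lo) / span * (high_note - low_note))
            for x in data]


# The shape of a mathematical function: the first five squares.
squares = [n * n for n in range(5)]          # [0, 1, 4, 9, 16]
melody = sonify(squares)                     # [48, 50, 57, 68, 84]
```

The widening steps between successive notes make the accelerating growth of the squares audible.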

1 http://people.bath.ac.uk/masrwd/lhcsoundresources.html and http://www.lhcsound.com
