Transcript SMCS-14-11
Richard Dobson
Dr Archer Endrich
Composers Desktop Project
CAS Wiltshire Hub
Kingdown School
Warminster
14 November 2012
The Science of Sound – a Micro-history
We stand on the shoulders of many giants.
“Musical training is a more potent instrument than any other, because rhythm and harmony
find their way into the inward places of the soul, on which they mightily fasten, imparting
grace, and making the soul of him who is rightly educated graceful.”
Plato
Pythagoras
Guido d’Arezzo
Hermann von Helmholtz
Max Mathews
Some topics in SMC
Digital Audio – sampling, synthesis, processing
Music Representation and Analysis
Performance and Interactive Composition
Languages for Music – Algorithmic Composition
Software and Hardware Design
Acoustics and Psychoacoustics
Sonification and Audification
The Shapes of Sound
A sound wave is bipolar.
• A wave comprises alternating displacements from a central “zero” position.
• For a sound wave the zero line corresponds to silence.
• Displacements are both positive and negative, and should sum to zero.
area above = area below
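A minimal sketch of the "sum to zero" property, assuming samples held as a plain Python list. The helper names `dc_offset` and `remove_dc` are illustrative, not from the talk. The mean of the samples measures how far the wave sits from the zero line.

```python
import math

def dc_offset(samples):
    """Mean sample value: zero when the areas above and below the line match."""
    return sum(samples) / len(samples)

def remove_dc(samples):
    """Shift the wave so that it is centred on the zero line again."""
    offset = dc_offset(samples)
    return [s - offset for s in samples]

# One complete sine cycle: positive and negative displacements cancel.
cycle = [math.sin(2.0 * math.pi * n / 100) for n in range(100)]

# The same cycle pushed off-centre by an assumed offset of 0.25.
shifted = [s + 0.25 for s in cycle]

print(dc_offset(cycle), dc_offset(shifted), dc_offset(remove_dc(shifted)))
```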
Sampling Sound 1
• The overall process is generally called “digitising”
• Two aspects: we need to digitise both amplitude and time
• Quantisation of amplitude to discrete levels (represented in N-bit words)
• Sampling properly refers strictly to discretising of time (sampling rate)
• (Hence technical literature will refer to “periodic sampling”, “discrete-time”, etc)
• Quantisation introduces “quantisation error”, which manifests as “quantisation noise”
• Sampling depends on a very accurate clock. Errors in timing are known as “jitter”; not something we need to worry about.
• Soundcard clocks are based on crystals, just as CPUs are.
• Nothing is perfect; one 44100 Hz clock may not exactly match another. Independent devices will drift out of sync over time.
Sound Example 1 – quantisation noise for N = 16, 12, 10, 8, 6, 4, 2, 1
Quantisation – the Challenge
Integers: N bits gives us 2^N levels - an even number. Where is the middle?
Standard quantisation is called “mid-tread”
qval = floor(val + 0.5)
• Includes zero valued sample
• Corresponds to twos complement arithmetic
• Asymmetrical: e.g. 16-bit range = -32768 to +32767
• Tiny values quantise to zero, so are lost
• Standard choice for audio codecs
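A minimal sketch of mid-tread quantisation, assuming input samples normalised to the range -1.0 to 1.0. The function name `quantise_mid_tread` and the scaling by the largest positive level are assumptions made for illustration, not taken from the talk.

```python
import math

def quantise_mid_tread(val, nbits):
    """Quantise a normalised sample (-1.0..1.0) to a signed N-bit integer.

    Applies qval = floor(val + 0.5) after scaling, then clips to the
    asymmetric twos complement range.
    """
    max_level = 2 ** (nbits - 1) - 1     # e.g. +32767 for 16 bits
    min_level = -(2 ** (nbits - 1))      # e.g. -32768 for 16 bits
    scaled = val * max_level             # assumed scaling convention
    qval = math.floor(scaled + 0.5)      # round to the nearest level
    return max(min_level, min(max_level, qval))

# A tiny value falls onto the zero level and is lost.
print(quantise_mid_tread(0.00001, 16))
```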
4-bit quantisation
The alternative is “mid-rise” quantisation
qval = floor(val) + 0.5
• No zero value
• Symmetric for all level values
• Bipolar one-bit quantisation possible
• Tiny values quantise to quasi square wave
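A comparable sketch for mid-rise quantisation, again assuming normalised input. The name `quantise_mid_rise` and the clipping rule are illustrative assumptions. The output levels are half-integers, so there is no zero level and the range is symmetric.

```python
import math

def quantise_mid_rise(val, nbits):
    """Quantise a normalised sample (-1.0..1.0) to one of 2**nbits half-integer levels.

    Applies qval = floor(val) + 0.5 after scaling, then clips symmetrically.
    """
    half_levels = 2 ** (nbits - 1)       # number of levels on each side of zero
    scaled = val * half_levels           # assumed scaling convention
    qval = math.floor(scaled) + 0.5
    limit = half_levels - 0.5            # outermost level, same on both sides
    return max(-limit, min(limit, qval))

# One-bit case: any input, however small, becomes +0.5 or -0.5,
# so a very quiet signal turns into a quasi square wave.
print([quantise_mid_rise(v, 1) for v in (0.00001, -0.00001, 0.3, -0.3)])
```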
Sampling Sound 2: Nyquist
The “modified” Nyquist-Shannon sampling theorem
f_input < sr / 2
“Perfect reconstruction” requires phase independence.
cosine phase : amplitude = 1 (what the textbooks usually show)
sine phase : amplitude = 0 (what the textbooks usually don't show)
For perfect reconstruction, input frequencies must be below Nyquist.
The Nyquist limit itself (sr/2) defines the onset of frequency aliasing,
where the Nyquist frequency aliases with DC.
Put another way: we need more than two samples per cycle.
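The phase dependence at exactly sr/2 can be checked numerically. This is a small sketch with a hypothetical helper `sample_at_nyquist`: a sinusoid at the Nyquist frequency advances by pi radians per sample, so the sample values depend entirely on the starting phase.

```python
import math

def sample_at_nyquist(phase, num_samples=8):
    """Sample a unit sinusoid at exactly sr/2: cos(pi * n + phase)."""
    return [math.cos(math.pi * n + phase) for n in range(num_samples)]

# Cosine phase: samples land on the peaks and alternate between +1 and -1.
cosine_phase = sample_at_nyquist(0.0)

# Sine phase: samples land on the zero crossings, so the signal vanishes.
sine_phase = sample_at_nyquist(-math.pi / 2)

print([round(abs(s), 6) for s in cosine_phase])
print([round(abs(s), 6) for s in sine_phase])
```

The same input frequency gives full amplitude or none at all, which is why the limit has to be strictly below sr/2.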
Sampling Sound 3: anti-aliasing
Aliasing is now impossible to demonstrate using consumer hardware.
The crystal
This is a Cirrus Logic 8-channel 192 kHz Sigma-Delta oversampling ADC
The anti-alias filter is integrated into the device
Even cheap chips do this now
So whatever sample rate you set, the input is correctly filtered
Examples of aliasing
To sample analogue audio without anti-alias filters we can use older types of
ADC (e.g. using the method of successive approximation), or industrial data
acquisition systems.
Sound Examples 2
These examples were prepared by Dr R.W. Stewart,
University of Strathclyde, for his CDROM project “DSPedia”.
• On the other hand, aliasing is very easy to demonstrate
using digital sound synthesis.
• The dominant sources of aliasing these days are synthesis
and processing, not recording.
Aliasing - a synthetic example
• We need a program to generate a plain sine frequency sweep (“chirp signal”) and write it to a sound file.
• Use a low sampling rate : 11025 Hz
• Let the sweep rise to an extreme value : 16000 Hz!
• Listen to it…
• And view it in the frequency domain (Audacity)
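One possible sketch of such a program, using only the Python standard library. The start frequency, duration, amplitude, output filename and all function names are assumptions for illustration; only the 11025 Hz rate and the 16000 Hz target come from the slide. The helper `alias_frequency` predicts where a given sweep frequency will actually be heard after folding about sr/2.

```python
import math
import struct
import wave

SR = 11025          # low sampling rate from the slide
F_END = 16000.0     # sweep target, far above the Nyquist limit SR / 2

def chirp_samples(f_start=20.0, f_end=F_END, seconds=5.0, sr=SR, amp=0.5):
    """Linear sine sweep built by accumulating phase one sample at a time."""
    n_total = int(seconds * sr)
    phase = 0.0
    out = []
    for n in range(n_total):
        freq = f_start + (f_end - f_start) * n / n_total
        out.append(amp * math.sin(phase))
        phase += 2.0 * math.pi * freq / sr   # no band-limiting, so aliasing occurs
    return out

def alias_frequency(freq, sr):
    """Frequency that results when `freq` is generated at sampling rate `sr`."""
    folded = freq % sr                 # the spectrum repeats every sr
    if folded > sr / 2:
        folded = sr - folded           # reflect about the Nyquist limit
    return folded

def write_wav(path, samples, sr=SR):
    """Write normalised floats as 16-bit mono PCM."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sr)
        wf.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))

if __name__ == "__main__":
    write_wav("chirp.wav", chirp_samples())
    print(alias_frequency(F_END, SR))
```

Opened in Audacity's spectrogram view, the sweep should appear to bounce between 0 Hz and sr/2 rather than rise continuously.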
Sound Example 3
Reconstruction : the Digital to Analogue Converter
• The DAC is strangely absent from most CS curricula
• but it is more important than the ADC:
• We can manage without audio input
• but not without audio output!
• With oversampling (as in the ADC), the final analogue
reconstruction filter can be very simple – and cheap
• It restores the required smooth curves of the underlying waveform
Periodic Waves: Time v Distance
The Time Domain
The speed of sound in air is approximately 340 m/s. So we can measure
frequency either in terms of distance or in terms of time.
• Wavelength – literally the length of a cycle
• Period – duration of one cycle
• Frequency = speed of sound / wavelength
• Frequency = 1 / period
• Frequency is not a measure of either length or duration. It is therefore best to avoid
labelling either wavelength or period directly as “frequency”.
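A short worked example of the two relations, using the approximate 340 m/s figure. The function names are illustrative only.

```python
SPEED_OF_SOUND = 340.0   # metres per second in air, approximate

def wavelength_m(freq_hz):
    """Wavelength in metres: speed of sound / frequency."""
    return SPEED_OF_SOUND / freq_hz

def period_s(freq_hz):
    """Period in seconds: 1 / frequency."""
    return 1.0 / freq_hz

# Concert A (440 Hz): a cycle is well under a metre long
# and lasts a little over two milliseconds.
print(wavelength_m(440.0), period_s(440.0))
```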
Audio Data Representation – Time Domain
• Two basic forms – data stream, and a file format.
• Two primary number representations:
• Integer (e.g. -32768 to 32767)
• Floating point
• These days, the ± 1.0 floating point normalised
representation is the most important.
• We can display amplitude (V scale) either as
normalised sample values, or in decibels (dB)
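A sketch of conversion between the two number representations. Dividing by 32768 is one common convention and is assumed here; other software scales by 32767 instead. The function names are hypothetical.

```python
def int16_to_float(sample):
    """Map a 16-bit integer sample onto the normalised range (assumed scale 32768)."""
    return sample / 32768.0

def float_to_int16(value):
    """Map a normalised float back to 16 bits, clipping anything out of range."""
    clipped = max(-1.0, min(1.0, value))
    return max(-32768, min(32767, int(round(clipped * 32768.0))))

print(int16_to_float(16384), float_to_int16(0.5))
```

The clip to 32767 is needed because the integer range is asymmetrical: +1.0 has no exact 16-bit counterpart under this convention.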
Normalised - Audacity
Decibel (logarithmic) scale – Adobe Audition
To convert an amplitude a to dB:
dBval = 20 * log10(a)
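The conversion can be sketched in a few lines. The helper names `amp_to_db` and `db_to_amp`, and the choice to return minus infinity for silence, are assumptions for illustration.

```python
import math

def amp_to_db(a):
    """dBval = 20 * log10(a), for a normalised amplitude a."""
    if a <= 0.0:
        return float("-inf")     # digital silence has no finite dB value
    return 20.0 * math.log10(a)

def db_to_amp(db):
    """Inverse conversion: amplitude = 10 ** (dB / 20)."""
    return 10.0 ** (db / 20.0)

# Full scale is 0 dB; halving the amplitude costs about 6 dB.
print(amp_to_db(1.0), amp_to_db(0.5), amp_to_db(0.0001))
```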
Amplitude Display – the dB log scale
Using the standard display, most of the signal is invisible.
The ear senses both loudness and frequency on a logarithmic scale,
e.g. amplitudes from a maximum of 1.0 (0 dB) down to less than 0.0001 (-80 dB).
Where does the sound finish?
Here?
Here!
Representation – Frequency Domain
We have two primary and complementary ways to represent sound.
• Time Domain : amplitude / time
• Frequency Domain : two related forms.
• Spectrum : amplitude / frequency
• Spectrogram (or sonogram) : spectrum / time
• Audacity supports both, with linear/logarithmic options
• Again, the log frequency scale reflects how we hear – e.g. axis marks in octaves – or musical notes.
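To show how a spectrum (amplitude against frequency) is obtained from time-domain samples, here is a deliberately naive discrete Fourier transform in pure Python. It is a teaching sketch, not how Audacity computes its displays. The sample rate, block size and test frequency are arbitrary assumptions, chosen so that the tone falls exactly on one analysis bin.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for small N)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc) / n)
    return mags

SR = 8000            # assumed sampling rate
N = 64               # block size: bin spacing is SR / N = 125 Hz
FREQ = 1000.0        # test tone, exactly on bin 8

signal = [math.sin(2.0 * math.pi * FREQ * t / SR) for t in range(N)]
spectrum = magnitude_spectrum(signal)

# The largest bin identifies the frequency of the tone.
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
print(peak_bin * SR / N)
```

Repeating this over successive blocks of a longer signal, and stacking the results against time, gives a spectrogram.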
The figures below display a sine “log frequency sweep” (without aliasing)
Log vertical scale
Linear vertical scale
Digital Sound Synthesis
A basic definition : using algorithms (and some maths) to generate audio data
Two computer-based approaches:
• Real-time, e.g. using “soft” or hardware-based synthesisers
• Offline – writing data to a soundfile for later playback.
• Many possible approaches; most are technically difficult, maths-heavy, and (especially for real-time) computationally demanding – need fast hardware, and compilers able to generate very fast code.
One (relatively) simple but classic approach identifies three fundamental ingredients:
• Sine waves
• Noise
• Time-varying Control functions (“automation”, “breakpoint data”)
Together, these form the basis of additive synthesis. This means, quite literally,
arbitrarily or algorithmically adding sound waves together – also known as “mixing”.
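A minimal offline sketch of additive synthesis, combining two of the three ingredients: sine waves and a breakpoint control function (noise is left out for brevity). The function name, the partial list and the envelope shape are all illustrative assumptions. Breakpoint times are assumed to be strictly increasing.

```python
import math

def additive_tone(partials, envelope, seconds=1.0, sr=44100):
    """Sum sine partials [(freq_hz, amp), ...] and shape them with an envelope.

    envelope is a list of (time_s, level) breakpoints with increasing times.
    """
    def env_at(t):
        # Linear interpolation between neighbouring breakpoints.
        if t <= envelope[0][0]:
            return envelope[0][1]
        for (t0, v0), (t1, v1) in zip(envelope, envelope[1:]):
            if t0 <= t <= t1:
                return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        return envelope[-1][1]

    out = []
    for n in range(int(seconds * sr)):
        t = n / sr
        mix = sum(a * math.sin(2.0 * math.pi * f * t) for f, a in partials)
        out.append(env_at(t) * mix)      # "mixing" is just addition
    return out

# Three harmonics of 220 Hz with a fast attack and a slow decay.
partials = [(220.0, 0.5), (440.0, 0.25), (660.0, 0.125)]
envelope = [(0.0, 0.0), (0.05, 1.0), (1.0, 0.0)]
tone = additive_tone(partials, envelope, seconds=1.0, sr=8000)
print(len(tone), max(abs(s) for s in tone))
```

Keeping the partial amplitudes summing to less than 1.0 guarantees the mix stays inside the normalised range.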
Music Synthesis and Algorithmic Composition
Concentrates on the control aspect.
• Most common route is the algorithmic generation of MIDI data, in
real time or written as a standard MIDI file.
• Many free domain-specific languages are available. Some support
both direct synthesis and algorithmic score generation using a
library of freely arranged modules. The (arguably) pre-eminent
example is Csound.
Algorithmic composition can be very complex, but can also be very simple,
such as loop-based generation of scale and chord patterns.
The auto-arpeggiator built into many synths and home organs is a simple
example of a musical automaton.
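A sketch of the simple end of algorithmic composition: loops that generate a scale, the chords built on it, and an arpeggio pattern, all as MIDI note numbers (60 = middle C). The function names are hypothetical, and nothing here sends or writes actual MIDI data.

```python
MAJOR_SCALE_STEPS = [2, 2, 1, 2, 2, 2, 1]   # semitone steps of a major scale

def major_scale(root=60):
    """MIDI note numbers for one octave of a major scale."""
    notes = [root]
    for step in MAJOR_SCALE_STEPS:
        notes.append(notes[-1] + step)
    return notes

def triads(scale):
    """Build a three-note chord on each of the seven scale degrees."""
    # Extend over two octaves so that upper chord notes are always available.
    extended = scale[:-1] + [n + 12 for n in scale[:-1]]
    return [[extended[i], extended[i + 2], extended[i + 4]] for i in range(7)]

def arpeggiate(chord, repeats=2):
    """Up-and-down note pattern, in the manner of a simple auto-arpeggiator."""
    pattern = chord + chord[-2:0:-1]
    return pattern * repeats

scale = major_scale(60)
chords = triads(scale)
print(scale)
print(arpeggiate(chords[0]))
```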
• MIT Scratch: supports basic soundfile playback and MIDI note
generation. Loose timing limits scope to simple patterns.
• Python: many extension libraries available, for both synthesis and MIDI
programming. It includes standard modules for basic soundfile i/o.
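As an illustration of the standard-library route, the sketch below writes and reads back 16-bit mono data with Python's `wave` and `struct` modules. An in-memory buffer stands in for a file on disk; the helper names and the fixed mono 16-bit format are assumptions.

```python
import io
import struct
import wave

def write_wav(file, samples, sr=44100):
    """Write a list of 16-bit integers as a mono WAV file."""
    with wave.open(file, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)               # 2 bytes = 16 bits
        wf.setframerate(sr)
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))

def read_wav(file):
    """Read a mono 16-bit WAV file; returns (sample rate, list of integers)."""
    with wave.open(file, "rb") as wf:
        sr = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    count = len(raw) // 2
    return sr, list(struct.unpack("<%dh" % count, raw))

# Round trip through an in-memory buffer instead of a real file.
buffer = io.BytesIO()
write_wav(buffer, [0, 1000, -1000, 32767, -32768], sr=8000)
buffer.seek(0)
sr, data = read_wav(buffer)
print(sr, data)
```

Passing a filename string instead of the buffer writes or reads a real soundfile.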
Sonification and Audification
The rendering of non-audio data as sound in order to reveal patterns and features.
• Audification : source data already has a time dimension.
• e.g. seismic, volcanic, astrophysics, even stock price movements.
• Sonification: applied to any arbitrary numeric data.
• Generally applied to large data sets which are already a challenge to analyse.
For example, we have worked on particle collision data from the Large Hadron Collider
(searching for the Higgs boson), as part of the LHCsound outreach project [1].
• However, it can be applied to small data sets and processes too:
• Any algorithms involving lists, iteration and loops
• Shapes of mathematical functions and formulae
• Whether the output is sonification or algorithmic composition depends entirely on
your intention and interest – the process itself is the same.
(Examples of simple sonification were presented in Scratch)
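A comparable sketch in Python, sonifying the shape of a mathematical function. It is not a reconstruction of the Scratch examples: the linear mapping, the note range and the function names are assumptions. Each data value is mapped to a MIDI note number and then to a frequency in equal temperament.

```python
def sonify(values, low_note=48, high_note=84):
    """Map each data value linearly onto a range of MIDI note numbers."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0              # avoid dividing by zero for constant data
    return [round(low_note + (v - lo) / span * (high_note - low_note)) for v in values]

def midi_to_hz(note):
    """Equal temperament, with A above middle C (note 69) at 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

# The shape of y = x^2: pitches rise slowly at first, then in larger leaps.
squares = [n * n for n in range(8)]
notes = sonify(squares)
freqs = [midi_to_hz(n) for n in notes]
print(notes)
```

The note list could equally be played as a melody, which is the point made above: the process is the same, only the intention differs.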
[1] http://people.bath.ac.uk/masrwd/lhcsoundresources.html and http://www.lhcsound.com