Digital Audio Basics
“Any signal can be completely reconstructed from samples.” – Harry Nyquist

Know your prefixes!

- Giga (G): billion; Mega (M): million; kilo (k): thousand
- Bit (b): a binary digit
- Byte (B): 8 bits
- Word: can be several bytes long; its size (wordlength) is usually measured in bits
- Mbps: megabits per second
- kHz: kilohertz
- GB: gigabytes
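
As a quick check that the prefixes behave, here is a back-of-the-envelope calculation in Python (a sketch; it assumes the CD parameters of 44.1 kHz, 16 bits, and two channels discussed later in these notes):

```python
# Data rate of CD-quality audio, expressed with the prefixes above.
sample_rate = 44_100   # samples per second (44.1 kHz)
bit_depth = 16         # bits per sample
channels = 2           # stereo

bits_per_second = sample_rate * bit_depth * channels
print(f"{bits_per_second / 1_000_000} Mbps")                    # 1.4112 Mbps
print(f"{bits_per_second / 8 * 60 / 1_000_000:.1f} MB/minute")  # ~10.6 MB per minute
```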

Converting Voltage to Numbers

- A microphone converts sound pressure into voltage; the voltage is constantly changing over time, just like the sound pressure (hence "analog").
- An Analog-to-Digital Converter (ADC) measures the voltage at set intervals in time (snapshots) and records each measured voltage as a number (numbers = digits).
- This process is called sampling. The number of snapshots the ADC takes every second is called the sampling rate (see the sketch after this list).
- The computer stores these numbers so that they can be recalled in the order in which they happened.
- On playback, these samples are converted back to voltages by the Digital-to-Analog Converter (DAC).
- The DAC sends out a pulse whose amplitude is determined by the value of the sample (Pulse Code Modulation, or PCM).
- All of these pulses added together recreate the original waveform (much as a motion picture recreates movement from still frames).
- This process can take a certain amount of time, which can cause an audible delay in the reproduced audio. This processing delay is called latency.
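
A minimal sketch of that process in Python with NumPy (the 48 kHz rate, the 440 Hz test tone, and the 256-sample buffer are illustrative choices, not fixed by any converter):

```python
import numpy as np

sample_rate = 48_000  # snapshots per second: the sampling rate
duration = 0.01       # capture 10 ms of "audio"

# The instants at which the ADC takes its snapshots
t = np.arange(int(sample_rate * duration)) / sample_rate

# Stand-in for the continuously varying microphone voltage: a 440 Hz tone.
# Each snapshot becomes one stored number (one sample), in order.
samples = np.sin(2 * np.pi * 440 * t)
print(len(samples), "samples for 10 ms")     # 480 samples

# Latency: a device that processes audio in 256-sample chunks delays it by
print(f"{256 / sample_rate * 1000:.2f} ms")  # ~5.33 ms
```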

TYPICAL SIGNAL FLOW IN A DIGITAL AUDIO CHAIN

[Signal-flow diagram] (microphone) → voltage → Preamp → voltage → ADC → numbers (digits) → Digital Recorder → numbers (digits) → DAC → voltage → Power Amp → (speaker)

- Often the computer acts as the ADC, Digital Recorder, and DAC.
- A computer soundcard can be internal or external. Many external soundcards use FireWire or USB. All computer soundcards have ADCs and DACs built into them; some even have built-in preamps.

Who was Nyquist and why you should care

“Any signal can be completely reconstructed from samples. In order to accurately reproduce the original signal, you must sample the signal at more than twice its highest frequency.” – Harry Nyquist

(Strictly speaking, this holds for band-limited signals, which is why the filtering described below matters.)

- Humans hear frequencies between 20 Hz and 20 kHz. To accurately reproduce all frequencies in this bandwidth, you must take samples faster than 40,000 times per second (40 kHz).
- The sampling rate for CD-quality audio is 44.1 kHz.
- You must filter out any frequencies above the Nyquist limit (fs/2).
- If you do not remove frequencies above the limit, aliasing will occur.
- Aliasing: an artifact in which frequencies higher than the Nyquist limit are folded back into the audible spectrum (e.g., with a Nyquist limit of 20 kHz, a 30 kHz tone becomes 10 kHz; see the demonstration after this list).
- Generally, the higher the sampling rate, the more frequencies you are able to reproduce (greater bandwidth).
- Sampling rate corresponds to accuracy in reproducing the frequency component of audio.
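
The fold-back is easy to demonstrate (a sketch with NumPy): sampled at 40 kHz, a 30 kHz tone produces exactly the same list of numbers as a 10 kHz tone, so the two are indistinguishable after conversion.

```python
import numpy as np

fs = 40_000                                # sampling rate; Nyquist limit = fs/2 = 20 kHz
t = np.arange(64) / fs                     # 64 snapshot times

tone_30k = np.cos(2 * np.pi * 30_000 * t)  # above the Nyquist limit
tone_10k = np.cos(2 * np.pi * 10_000 * t)  # its alias in the audible range

print(np.allclose(tone_30k, tone_10k))     # True: identical samples
```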

Importance of bit depth

- Computers store data as binary numbers.
- All of the sampled voltage measurements must be converted to binary numbers.
- A converter has a limited number of fixed values available to represent the measured voltage.
- The math: an 8-bit converter has 2^8 = 256 possible values, a 16-bit converter has 2^16 = 65,536 possible values, and a 24-bit converter has 2^24 = 16,777,216 possible values.
- Measurements that fall between these values are rounded off, affecting the accuracy of the reproduced signal (see the sketch after this list).
- Extra bits can be translated into increased dynamic range, a better signal-to-error ratio, more headroom, and/or better resolution.
- Generally, the higher the bit depth, the more accurate the reproduction of the signal.
- Bit depth corresponds to accuracy in reproducing the amplitude component of audio.
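
A rough sketch of that rounding in NumPy: quantize a test tone to a grid of roughly 2^bits values and measure the error. The exact figures depend on the test signal, but the error shrinks by roughly 6 dB per added bit.

```python
import numpy as np

def quantize(signal, bits):
    """Round samples in the range -1..1 to a grid of roughly 2**bits values."""
    steps = 2 ** (bits - 1)
    return np.round(signal * steps) / steps

t = np.arange(48_000) / 48_000
signal = 0.5 * np.sin(2 * np.pi * 440 * t)   # a half-scale 440 Hz test tone

for bits in (8, 16, 24):
    error = signal - quantize(signal, bits)
    ratio = 10 * np.log10(np.mean(signal ** 2) / np.mean(error ** 2))
    print(f"{bits}-bit: signal-to-error ratio ~{ratio:.0f} dB")
```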

Role of the word clock

- Imagine a world where every clock measured time differently (a minute is 45 seconds here and 65 seconds there).
- All digital devices have internal clocks (e.g., your computer's processor speed is also known as its clock speed).
- When digital devices transfer data, their clocks must be synchronized. In digital audio, this is accomplished using a word clock.
- The sending device sends a word clock signal that overrides the internal clock of the receiving device, ensuring that the two devices are "on the same page".
- If the clocks are not synchronized, there will be "clock errors": audible clicks and pops in the audio (see the drift example after this list).
- In many studios, one master clock controls the clocks in all of the digital devices, ensuring that they all operate at the same sampling rate and that their clocks are all "ticking" at the same time.
- The clock signal can be transmitted with the audio or sent separately (generally via a coaxial cable with a BNC-type connector).
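
To see why this matters, here is a toy drift calculation (an illustration; the 50 ppm figure is an assumed crystal tolerance, not a measurement):

```python
# Two devices nominally at 48 kHz whose internal clocks differ by 50 ppm.
fs_nominal = 48_000
ppm = 50

fs_other = fs_nominal * (1 + ppm / 1_000_000)
slipped_per_minute = (fs_other - fs_nominal) * 60
print(f"~{slipped_per_minute:.0f} samples of drift per minute")  # ~144
```

Without a shared word clock, each of those slipped samples is a candidate click or pop.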

Different formats = Alphabet Soup

- There are many different types of digital audio signals; three are very common.
- S/PDIF (Sony/Philips Digital Interface): mostly uses RCA connectors and carries two channels of digital audio over each connection. Unbalanced, so short cable runs only.
- AES/EBU (Audio Engineering Society/European Broadcasting Union): mostly uses XLR connectors and carries two channels over each connection. Balanced, so it can accommodate long cable runs without loss.
- ADAT (Alesis Digital Audio Tape): a proprietary format from Alesis. Carries 8 channels of digital audio over a single fibre-optic connection at sampling rates up to 48 kHz, or 4 channels at 88.2 kHz and 96 kHz.
- All three of these connection types carry the clock signal embedded in the data stream.

DSP – mixing is math

- DSP: Digital Signal Processing.
- Every change to your digital audio signal, even something as mundane as changing the volume, is a mathematical operation on the stored digital audio samples.
- Mixing two signals together is a simple matter of addition; changing volume is a multiplication problem (see the sketch after this list).
- All kinds of complicated processing are done by using math to change the original sampled data.
- Some DSP is done in "real time", while other processing is file-based, meaning that it actually changes the data in the digital audio file.
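
Those two sentences translate almost literally into code. A sketch with NumPy (the two "tracks" are synthetic stand-ins):

```python
import numpy as np

t = np.arange(48_000) / 48_000
track_a = 0.4 * np.sin(2 * np.pi * 440 * t)  # stand-ins for two recorded tracks
track_b = 0.4 * np.sin(2 * np.pi * 660 * t)

mix = track_a + track_b  # mixing two signals = addition
quieter = 0.5 * mix      # changing volume = multiplication by a gain factor
```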

Digital Zero – yet another dB scale!

- Most analog meters are calibrated in Volume Units (dBVU). 0 dBVU usually corresponds to the voltage of a line-level signal (+4 dBu).
- Digital meters use dBFS (decibels relative to full scale). A zero on this meter means the converter has run out of numbers to represent the waveform.
- If you try to go above this level, you will get digital distortion. Unlike analog distortion, there is never anything pleasant about digital distortion.
- A converter's sensitivity can be adjusted to correspond to different levels. Common calibrations are 0 dBVU = -16 dBFS or 0 dBVU = -18 dBFS.
- In the first case, the converter would not be able to digitally represent a signal greater than +16 dBVU, which is way off the scale of most analog meters.
- On an analog meter, a reading of 0 dB usually means you can still push a bit more level before you seriously distort the signal.
- On a digital meter, 0 dB means you have no values left to represent the signal (see the sketch after this list).
- PROPER GAIN STAGING IS EVEN MORE IMPORTANT IN DIGITAL RECORDING.
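
A sketch of how a digital meter arrives at its reading: levels are measured relative to full scale (the largest value the converter can represent), so readings run from 0 dBFS downward.

```python
import math

FULL_SCALE = 1.0  # the largest value the converter can represent

def dbfs(peak):
    """Peak level in dB relative to full scale (0 dBFS = full scale)."""
    return 20 * math.log10(abs(peak) / FULL_SCALE)

print(round(dbfs(1.0), 1))    #   0.0 dBFS: no values left; anything hotter clips
print(round(dbfs(0.5), 1))    #  -6.0 dBFS
print(round(dbfs(0.125), 1))  # -18.1 dBFS: roughly where 0 dBVU often sits
```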

Digital Audio File Formats

- There are many different audio file formats. They fall into two general categories: compressed and uncompressed.
- Uncompressed formats include Wave (.wav, .bwf), Audio Interchange File Format (AIFF, .aif), and Sound Designer II (SDII). These are PCM (Pulse Code Modulation) audio files; they contain ALL of the samples that make up the digital audio, exactly as they were recorded (see the sketch after this list).
- Compressed formats include MPEG-1 Layer III (.mp3), MPEG-2 AAC, RealAudio (.ra), Windows Media Audio (.wma), and Ogg Vorbis.
- Compressed audio files go through a process known as lossy data compression: the data that make up the file are completely altered and much of the information is discarded.
- THIS IS NOT TO BE CONFUSED WITH USING AN AUDIO COMPRESSOR IN A STUDIO!
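
Because a WAV file is just PCM samples behind a small header, Python's standard library can read one directly. A minimal sketch (the filename is hypothetical):

```python
import wave

with wave.open("example.wav", "rb") as f:           # hypothetical PCM file
    print("channels:     ", f.getnchannels())
    print("sample rate:  ", f.getframerate())       # e.g. 44100
    print("bit depth:    ", f.getsampwidth() * 8)   # bytes per sample * 8
    print("total frames: ", f.getnframes())
    pcm = f.readframes(f.getnframes())  # every sample, exactly as recorded
```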

How data compression works

- Data compression is a process in which a program analyzes a file to see how much of the data can be discarded while still retaining the ability to reconstruct the original data (e.g., a WinZip file).
- When compressing an audio file, there is a target bit rate in mind. An algorithm is called upon to determine how much audio data must be thrown away to reach this bit rate.
- In a perceptual coder (e.g., an MP3 encoder), the algorithm is designed to estimate how much of the audio you will actually perceive, based on knowledge of the frequency response of human hearing. Enough energy at one frequency may impair your ability to hear energy at another frequency. Anything the algorithm thinks you won't miss is discarded.
- The audio is divided into different frequency bins, and certain frequency bands are often discarded entirely (see the toy example after this list). The audio is often distorted by the process, since there is a tradeoff between accuracy in the frequency and time domains.
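
The following toy example illustrates only the bin-discarding idea, not a real perceptual coder (real encoders use psychoacoustic masking models; the -40 dB threshold here is arbitrary):

```python
import numpy as np

fs = 48_000
t = np.arange(2048) / fs
# A loud 1 kHz tone plus a much quieter 7 kHz tone
audio = np.sin(2 * np.pi * 1_000 * t) + 0.001 * np.sin(2 * np.pi * 7_000 * t)

bins = np.fft.rfft(audio)          # divide the audio into frequency bins
magnitude = np.abs(bins)

# Toy rule: discard any bin more than 40 dB below the loudest one,
# betting that the listener "won't miss it"
threshold = magnitude.max() / 100  # 1/100 in amplitude = -40 dB
kept = np.where(magnitude >= threshold, bins, 0)

print(f"kept {np.count_nonzero(kept)} of {len(bins)} bins")
decoded = np.fft.irfft(kept)       # the quiet 7 kHz tone is simply gone
```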

Summary

- Analog voltages are converted to digital values through sampling; your sampling rate must be more than 2x the highest frequency in your signal.
- Bit depth is the number of bits used to encode a single sample; more bits are usually better.
- Latency is delay caused by processing.
- PCM audio files (WAV, AIF) preserve every single sample.
- Lossy compression formats (MP3) throw away much of the audio information that was originally recorded.
- When recording from a DIGITAL source (recorders, external converters), make sure your clock is set correctly!
- Beware of the red light: DIGITAL DISTORTION IS NOT A GOOD THING!