Transcript Lecture 11

Music Hath Charms to Soothe the Savage Beast
Introduction to Sound Processing
4/11/2016
CSE5060 -- Multimedia on the Web -Lecture11
1
Some Sources Used
• Richard E Berg: Physics 102: PHYSICS OF MUSIC, University of
Maryland
• Robert Jourdain (1997). Music, the Brain and Ecstasy. Quill.
• Various bits of Wikipedia
• Dolby Sound
Sound Is Analog
• So there’s infinite variation
• Like a rock thrown into a pond, there are waves:
– Amplitude: how high the waves are -- Loudness
– Frequency: how many waves per second -- Pitch
• Loudness is measured in decibels (dB)
– This is a log scale, so 20 dB is ten times the sound intensity of 10 dB, 30 dB is
ten times that of 20 dB, and so forth (the ear hears each 10 dB step as roughly twice as loud).
– You can distinguish from just over 0 dB to 120 dB
• 37 dB: quiet office (no air-conditioning)
• 59 dB: conversation
• 76 dB: loud factory
• 110 dB: really loud night club or rave
• 140 dB: threshold of pain (well, for some)
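As a rough sketch of what the log scale means (the function name is ours, not from the lecture), the intensity ratio between two decibel readings is 10 raised to one tenth of their difference:

```python
def intensity_ratio(db_a: float, db_b: float) -> float:
    """How many times more intense a sound at db_a is than one at db_b."""
    return 10 ** ((db_a - db_b) / 10)


# Levels taken from the list above: conversation (59 dB) vs quiet office (37 dB).
print(round(intensity_ratio(59, 37)))
```

Note that this is a ratio of physical intensity, not of perceived loudness.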
Sound Is Analog, 2
• Pitch is measured in Hz (Hertz, cycles per second) and kHz.
• You can hear from a few Hz up to 15 - 20 kHz (the upper limit
is age dependent)
– Lowest note on piano: 27 Hz
– Highest note on piano: 4,186 Hz (4.186 kHz)
– Lowest vocal sound: 80 Hz
– Highest vocal sound: 800 Hz
– The A above middle C: 440 Hz (used to be lower!)
• But just a sine wave at these frequencies sounds sterile: it
lacks the overtones, the harmonics, produced by all
natural sources of sound.
This is a sine wave, which may represent a
“pure” (and so artificial) sound
Its frequency (pitch) is how many crests pass per second -- hertz (the closer together the crests, the higher the pitch)
Its amplitude (loudness) is the height of the crests -- decibels
With frequent sampling we can capture both frequency and amplitude in
a single series of numbers
Any sound can be reproduced as a sum of overlaid sine waves
(Fourier analysis)
Two sine waves interacting
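A minimal sketch of overlaying sine waves, assuming a 440 Hz fundamental with one quieter overtone an octave above it (the amplitudes and function names are illustrative choices, not from the lecture):

```python
import math


def sine(freq_hz: float, amplitude: float, t: float) -> float:
    """Value of a pure sine tone at time t (in seconds)."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t)


def tone_with_overtone(t: float, fundamental_hz: float = 440.0) -> float:
    """Fundamental plus a half-strength overtone at double the frequency."""
    return sine(fundamental_hz, 1.0, t) + sine(2 * fundamental_hz, 0.5, t)


# One hundredth of a second of the combined wave at 44,100 samples per second.
sample_rate = 44_100
samples = [tone_with_overtone(n / sample_rate) for n in range(sample_rate // 100)]
```

Adding more terms with different frequencies and amplitudes is how richer, less "sterile" tones are built up.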
Notice How Complicated the Vibrations Are
Sound Is Analog, 3
• An instrument vibrating produces lots of sounds above the
fundamental tone.
• Many of these are various octaves above the fundamental
– Octave = double the frequency
– To get realistic sound we have to pick up at least the 4th
harmonic, 4x the frequency of the fundamental.
– So we have to pick up frequencies to about 16 kHz for, say, a realistic flute sound
(where the highest fundamental is just under 4 kHz)
– More is better until, say, 20 kHz, where a 5-year-old’s hearing cuts
out.
• So how frequently do we need to sample to get “realistic”
sounds?
Poor Sampling Rate: Graphic
Nyquist-Shannon Sampling Theorem
• You have to sample at at least 2x the highest
frequency you want to catch.
• Remember, we are sampling the volume
(loudness) of a sound consisting of lots of
superimposed fundamental and harmonic
frequencies.
• So on a CD there are 44,100 samples per second, each
a 2-byte (16-bit) number between 0 and 64k
• Sampling Demo Program
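A small sketch of what goes wrong when the rule is broken: a pure tone above half the sampling rate shows up as a lower "alias" frequency. The function name is ours, and the formula assumes an ideal sampler with no filtering in front of it:

```python
def apparent_frequency(true_hz: float, sample_rate_hz: float) -> float:
    """Frequency a pure tone appears to have after sampling (aliasing)."""
    folded = true_hz % sample_rate_hz
    # Anything above half the sample rate folds back below it.
    return min(folded, sample_rate_hz - folded)


print(apparent_frequency(15_000, 44_100))  # below half the rate: unchanged
print(apparent_frequency(30_000, 44_100))  # above half the rate: folds down
```

With 44,100 samples per second, anything up to 22,050 Hz survives; a 30,000 Hz tone would be recorded as if it were a 14,100 Hz tone.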
MP3
• MPEG-1, Layer 3 sound compression -- intended for
movies on CD and DVD
– 90+% compression possible
– A typical song (50 MB) goes to 5 MB
– It is a lossy compression, so the quality goes down
– No encryption in any way
– No “watermark” (watermark = a secret pattern of bits somewhere
which indicates the source of the copy)
– Much music publisher panic with the sudden popularity of the
format.
– Much more music publisher panic with the iPod and friends
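A back-of-the-envelope check of the 50 MB to 5 MB figure, assuming a five-minute song stored as CD-quality stereo (the song length is our assumption; the lecture only gives the file sizes):

```python
CD_BYTES_PER_SECOND = 44_100 * 2 * 2  # samples/s x 2 bytes x 2 channels


def uncompressed_bytes(seconds: float) -> float:
    """Size of CD-quality stereo audio of the given length."""
    return seconds * CD_BYTES_PER_SECOND


def saving(original_size: float, compressed_size: float) -> float:
    """Fraction of the original size removed by compression."""
    return 1 - compressed_size / original_size


song = uncompressed_bytes(5 * 60)  # roughly 50 MB
print(song, saving(50, 5))
```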
MP3 Sound Isn’t Very Good…
• Having failed at my attempt to demonstrate
how bad MP3 is in a tute
• I will now (fail again?) demonstrate the loss of
quality in MP3 yet again…
• Roll it, monks…..
Sound Into Bits (ADC)
• Something that always confuses me:
– The 16 bits used (0 - 64k) record the amplitude (loudness)
– The differences between successive 16 bit samples contain the
frequency (pitch)
– Remember, a high wave also has a trough between peaks.
• So how often do we have to sample to get enough
samples?
– 2x the maximum frequency we want to catch.
– And we want to catch frequencies up to 20 kHz, so we have to
sample at least 40,000 times a second.
– So your CDs contain music sampled at just over 44,000 times a
second.
– So the digital signal bandwidth must be 16 x 44,000 = 704,000 bits
per second, or 88 KB/s (kilobytes per second). With stereo sound, we have to have two
such sample streams, so a 1x CD-ROM bus goes at 176 KB/s, which we
already knew!
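The same arithmetic as a short sketch, using the exact CD figure of 44,100 samples per second rather than the rounded 44,000 (the function name is ours):

```python
SAMPLE_RATE = 44_100     # samples per second per channel
BITS_PER_SAMPLE = 16


def pcm_bits_per_second(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Raw (uncompressed) bit rate of sampled audio."""
    return sample_rate * bits_per_sample * channels


mono = pcm_bits_per_second(SAMPLE_RATE, BITS_PER_SAMPLE, channels=1)
stereo = pcm_bits_per_second(SAMPLE_RATE, BITS_PER_SAMPLE, channels=2)

# Divide by 8 to turn bits into bytes.
print(mono // 8, stereo // 8)  # bytes per second: 88200 176400
```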
Bits Into Sound (DAC)
• Amplifiers and speakers are analog devices, so
• The CD player does DAC and passes the results as an
analog signal to your stereo system.
• It does the same if you listen to music off your CD-ROM
drive.
• But where does your sound card/chip do the conversion?
• Hummmm…. Later is (far, far) better, because there’s lots
of electrical interference inside your PC. Digital isn’t
affected by this, but analog is.
• So the perfect system would be all digital inside the
computer and have its DAC inside the speakers
So We Have DAC for both Sound and Video
• At the same time
• By two independent sets of hardware and
software
• Working on two independent files
• How can we guarantee synchronisation???
• This is a problem with Flash!
MIDI & General MIDI
• MIDI = Musical Instrument Digital Interface
• MIDI is to sampling exactly what vector is to raster
graphics
– A language for describing sounds
• The notes
• The instruments, each of which has a number
– 128 instruments
– Plus drum kit
• The note characteristics
– attack
– sustain
– decay
– release
• 2+ ways of making those notes
– FM synthesis
– Wavetable
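To make the "language for describing sounds" idea concrete, here is a sketch of what MIDI data looks like on the wire: a note is a three-byte message, not a stream of samples. The helper names are ours; the byte layout (status byte 0x90 for Note On and 0xC0 for Program Change, with the channel in the low four bits) and the convention that note 69 is the 440 Hz A are standard MIDI.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte Note On message (channel 0-15, note and velocity 0-127)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel, note or velocity out of range")
    return bytes([0x90 | channel, note, velocity])


def program_change(channel: int, instrument: int) -> bytes:
    """Select one of the 128 numbered instruments (0-127) on a channel."""
    if not (0 <= channel <= 15 and 0 <= instrument <= 127):
        raise ValueError("channel or instrument out of range")
    return bytes([0xC0 | channel, instrument])


def note_to_hz(note: int) -> float:
    """Equal-tempered pitch of a MIDI note number (69 = the A above middle C)."""
    return 440.0 * 2 ** ((note - 69) / 12)


print(note_on(0, 69, 100).hex(), note_to_hz(69))
```

How those three bytes actually sound is left entirely to the synthesiser that receives them.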
A MIDI Studio
An Audioacoustic Editing Lab
The Parts of a MIDI Note
• From the MIDI Manufacturers Homepage
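The figure itself is not reproduced in this transcript. As a hedged sketch, the four note characteristics listed earlier in the lecture are commonly combined into one loudness envelope applied in the order attack, decay, sustain, release. The timings and the straight-line shape below are illustrative assumptions, not values from the slide:

```python
def adsr_level(t: float, attack: float = 0.05, decay: float = 0.1,
               sustain_level: float = 0.7, note_length: float = 0.5,
               release: float = 0.2) -> float:
    """Envelope gain (0..1) at t seconds after the note starts.

    Assumes the note is held longer than attack + decay.
    """
    if t < 0:
        return 0.0
    if t < attack:                      # attack: ramp up to full level
        return t / attack
    if t < attack + decay:              # decay: fall to the sustain level
        frac = (t - attack) / decay
        return 1.0 - frac * (1.0 - sustain_level)
    if t < note_length:                 # sustain: hold while the key is down
        return sustain_level
    if t < note_length + release:       # release: fade out after key up
        return sustain_level * (1.0 - (t - note_length) / release)
    return 0.0
```

Multiplying each sample of a tone by `adsr_level(t)` shapes how the note starts, holds and dies away.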
Making MIDI
• FM Synthesis
– Sterile sine waves
– What gave computer music a bad name
• Wavetable Sound Generation
– The music gives the number of the instrument
– Samples of the sound of that instrument are stored in ROM/RAM
on the sound card
– The samples are processed to give a far better illusion of the
sound of the instrument
– The more samples, the better, so 64 MB of samples in ROM is
better than 512 KB.
– Wavetables may also be loaded from CD-ROMs
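A minimal sketch of the wavetable idea, assuming a single stored cycle that is read back at different speeds to play different pitches. The table here is a made-up stand-in (a sine plus one overtone); a real sound card stores recorded instrument samples and does far more processing:

```python
import math

TABLE_SIZE = 256
# Stand-in for one recorded cycle of an instrument (hypothetical data).
WAVETABLE = [
    math.sin(2 * math.pi * i / TABLE_SIZE) + 0.3 * math.sin(4 * math.pi * i / TABLE_SIZE)
    for i in range(TABLE_SIZE)
]


def wavetable_samples(freq_hz: float, sample_rate: int, count: int) -> list:
    """Play the stored cycle at freq_hz by stepping through the table."""
    step = freq_hz * TABLE_SIZE / sample_rate  # table entries per output sample
    phase = 0.0
    out = []
    for _ in range(count):
        out.append(WAVETABLE[int(phase) % TABLE_SIZE])  # nearest-lower entry, wrap around
        phase += step
    return out


a440 = wavetable_samples(440.0, 44_100, 100)
```

A bigger table, or many tables per instrument, gives the lookup more real data to work with, which is why more sample memory sounds better.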
MIDI Quality
• Well, as always there’s the trade off:
– Much smaller file size
– Always somewhat less quality
– Infinitely cheaper to create -- only one muso necessary
– May require significant CPU processing
Channels, Voices and Streams
• A channel drives a speaker:
– 2 channels for standard stereo
– 4-5 channels for 3D sound (two may be faked)
– 8+ channels for super sound in theatre movies
• A voice is an instrument, etc. on a channel
– MIDI supports a large number of voices: 32, 64. This is polyphony
– The voices are superimposed, in digital or analog form, and then
sent to the speakers
– Again, multiple voices may load down the CPU
• A stream is half voice and half channel
– Lets you record a sound effect, a stream
– When we need it, we superimpose it on top of the sound going to
a channel
– The sound card and/or CPU do the work
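Superimposing voices or streams in digital form amounts to adding their samples together. A rough sketch, assuming signed 16-bit samples and simple clipping when the sum overflows (real mixers usually scale or limit more gracefully):

```python
def mix(voices, limit: int = 32767):
    """Sum equal-length lists of samples, clipping to the signed 16-bit range."""
    mixed = []
    for samples in zip(*voices):
        total = sum(samples)
        mixed.append(max(-limit - 1, min(limit, total)))
    return mixed


music = [1000, 2000, -3000, 4000]
explosion = [500, 31000, -31000, 0]   # hypothetical sound-effect stream
print(mix([music, explosion]))
```

Every extra voice is one more addition per sample, which is where the load on the sound card or CPU comes from.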
Channels, Voices and Streams, 2
• The higher the bandwidth into the sound card/chip
– The more channels, voices and streams we can get at once
– And the more processing work has to be done
– So we either do more on-sound-card/chip processing or bog down
the CPU
– (Sound like the issues related to 3D accelerator cards?!)
• Evolution in sound cards/chips rather slow
• Most systems use sound chip on motherboard
• But if you want to play games….
Games and Computer Sound
• Games are one of several factors driving the evolution of
graphics boards
• Games are almost the only factor driving the evolution of sound
cards
– Who is sneaking up behind me? We need 3D.
– What kind of sound does that alien make when exploded? We need
lots of streams superimposed.
• 3D illusion
– Uses 3 speakers (a woofer + 2 satellites) and an algorithm to fox the ear
by marginally delaying one stereo channel
– Developed by NASA for space flight simulators
– Can work well if you don’t move your head
– With 5 speakers, esp. with 4 channels, can work very well indeed
• Note: deep tones non-directional, so we can use just one woofer
• As the musicians are never behind you, not necessary for
music. Whoops, sorry Berlioz, Allegri, Tchaikovsky, etc.
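A toy sketch of the "marginally delaying one stereo channel" trick described above. The half-millisecond delay is our illustrative choice, and real 3D audio algorithms also adjust level and filtering; this only shows the delay itself:

```python
def delay_in_samples(delay_seconds: float, sample_rate: int = 44_100) -> int:
    """Convert a time delay into a whole number of samples."""
    return round(delay_seconds * sample_rate)


def delay_channel(samples, delay_samples: int):
    """Delay one channel by prepending silence, keeping the original length."""
    if delay_samples <= 0:
        return list(samples)
    return ([0] * delay_samples + list(samples))[:len(samples)]


left = [10, 20, 30, 40, 50]
right = delay_channel(left, 2)   # right ear hears it later: source seems to be on the left
print(right)
```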
Games and Computer Sound, 2
• Competing 3D positional audio standards
– A3D
• From Aureal Semiconductor
• On their widely used Vortex audio chips
– Environmental Audio Extensions (EAX)
• From Creative Labs (who brought us Sound Blaster)
– DirectSound3D
• From Microsoft
• Part of the DirectX set of Windows APIs/extensions, including
Direct3D
• Currently the first of these is the standard, but watch out
for the rest.