Audio - KsuWeb
COMPUTER AUDIO
CGDD 4003
What is Sound?
Compressions of air or other media (such as
water or metal) from something vibrating
Sounds are made up of high-frequency and low-frequency
components
Frequency
Don’t confuse pitch (frequency) with volume!
Volume is measured in decibels (dB)
Frequency in Hertz (Hz) = cycles per second
Humans only hear from about 20 Hz to 20 kHz
Strange fact! Speed of sound (air: 340 m/s; water: about 1,480 m/s; gold: 3,240 m/s)
Spatial Sound
1 Channel – “mono”. Can be split to several
speakers; still no direction
2 Channels – “stereo”. Fades from left to right. Can
determine direction
5.1 Audio – Common for home theaters
3D Sound? – Video games (PC). Still has time to
develop
The Human Side
(20Hz-20KHz)
The Equal Loudness Contour
Killa Hurts
Ouch!
A Note about decibels
A decibel is 1/10th of a bel
Abbreviated dB
A logarithmic measure: the dB value increases linearly as
power increases exponentially, which roughly tracks perceived loudness
Something has twice the power?
10·log10(2) = 3.01 dB
(Sounding twice as loud takes roughly 10 dB)
In gaming, volume usually ranges 0.0f-1.0f
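A minimal sketch of the decibel arithmetic above, in Python. The function names are made up for illustration, and treating a 0.0-1.0 game volume as a linear amplitude gain (so 20·log10 applies) is an assumption; engines differ on this.

```python
import math

def power_ratio_to_db(ratio):
    """Convert a power ratio (2.0 = twice the power) to decibels."""
    return 10.0 * math.log10(ratio)

def gain_to_db(gain):
    """Convert a linear amplitude gain (assumed 0.0-1.0 volume) to decibels."""
    if gain <= 0.0:
        return float("-inf")  # silence has no finite dB value
    return 20.0 * math.log10(gain)

def db_to_gain(db):
    """Convert a decibel change back to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

print(round(power_ratio_to_db(2.0), 2))  # twice the power
print(round(gain_to_db(0.5), 2))         # half volume on a 0.0-1.0 scale
```

Amplitude uses 20·log10 rather than 10·log10 because power goes with the square of amplitude.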
Human Perception
(InterAural Time Difference)
Sound hits both ears
Difference in time
Hits right ear first
Hasn’t gotten to left yet
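A rough sketch of how large that time difference is. It assumes the simplest possible model (two ears a fixed distance apart, sound arriving as a straight wavefront); the 0.2 m ear spacing and the function name are assumptions for illustration, not values from these slides.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s in air
EAR_SPACING = 0.2       # metres; an assumed, rounded head width

def itd_seconds(angle_degrees):
    """Interaural time difference for a source at the given angle.

    0 degrees is straight ahead, 90 degrees is directly to one side.
    """
    extra_path = EAR_SPACING * math.sin(math.radians(angle_degrees))
    return extra_path / SPEED_OF_SOUND

for angle in (0, 30, 90):
    print(angle, "degrees:", round(itd_seconds(angle) * 1000.0, 3), "ms")
```

Even at 90 degrees the difference is well under a millisecond in this model.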
How Computers Perceive Sound
Digitization (DAC and ADC)
Computers “listen” to the amplitude a certain number of
times per second (sample rate)
44.1K (44,100 samples per second) is CD
22K is good
8K is lame
Computers have to approximate what they heard and
assign it a number
4 bits = 16 levels to approximate to
16 bits = 65,536 levels to approximate to
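The two steps above (sample at a fixed rate, then round each sample to one of 2^bits levels) can be sketched as follows. This is an illustration only: the function names are hypothetical, and spreading the levels evenly across -1.0 to 1.0 is an assumption, not how any particular sound card does it.

```python
import math

def sample_sine(freq_hz, sample_rate, num_samples):
    """'Listen' to a sine wave sample_rate times per second."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(num_samples)]

def quantize(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits levels."""
    value = max(-1.0, min(1.0, value))
    levels = 2 ** bits
    step = 2.0 / (levels - 1)          # spacing between adjacent levels
    index = round((value + 1.0) / step)
    return index * step - 1.0

samples = sample_sine(440.0, 8000, 8)
print([round(quantize(s, 2), 3) for s in samples])  # only 4 distinct values
print(2 ** 4, 2 ** 16)                              # levels for 4 and 16 bits
```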
[Figure: original sound; axes: amplitude (in dB), frequency]
[Figure: low sampling rate, and what the computer hears; axis: time]
[Figure: high sampling rate; axis: time]
[Figure: 2 bits per sample = 4 approximations; axis: time]
StairStep effect
Called “quantization errors”
[Figure: 3 bits per sample = 8 approximations; less StairStep; axis: time]
Signal to Noise Ratio (SNR)
Represents the quantization error
8-bit = 128 discrete values (upper-half only)
Sample is rounded up or down
SNR is 256:1
256:1 translates to 48dB (difference in average noise
to max signal)
16-bit = 32K discrete values (upper-half)
SNR = 65,536:1, or 96dB
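The 48 dB and 96 dB figures follow from the ratio of levels. A small sketch, assuming the ratio is treated as an amplitude ratio (hence 20·log10); the function name is made up for illustration.

```python
import math

def quantization_snr_db(bits):
    """SNR in dB for a levels ratio of 2**bits : 1."""
    return 20.0 * math.log10(2 ** bits)

for bits in (8, 16):
    print(bits, "bits:", 2 ** bits, ": 1 =", round(quantization_snr_db(bits), 1), "dB")
```

Each extra bit doubles the ratio, which adds about 6 dB.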
In General
Sampling rate affects range of frequencies you can
capture (Nyquist)
Bits per sample affects noise level as well as volume
range
What about recording:
Rock?
Mozart
(or anything on NPR for that matter)?
Voice/dialog?
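A sketch of the Nyquist point above: a sample rate can only capture frequencies up to half of itself, and anything higher shows up as a lower "alias" frequency. The function names are hypothetical, and the alias formula assumes an ideal sampler with no filtering in front of it.

```python
def nyquist_limit(sample_rate):
    """Highest frequency a given sample rate can capture."""
    return sample_rate / 2.0

def aliased_frequency(freq_hz, sample_rate):
    """Frequency actually recorded when freq_hz is sampled at sample_rate."""
    nearest_multiple = round(freq_hz / sample_rate) * sample_rate
    return abs(freq_hz - nearest_multiple)

for rate in (8000, 22050, 44100):
    print(rate, "Hz sampling captures up to", nyquist_limit(rate), "Hz")

# A 5 kHz tone sampled at 8 kHz is above the 4 kHz limit and folds down.
print(aliased_frequency(5000, 8000))
```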
Capturing Sounds
Usually done with:
a microphone (such as voice)
Line in
CD
Hollywood Edge®
Computer has sound card
Input types (RCA, MIDI, mini, ¼”, XLR)
Card has quality (plays 16-bit sound)
Need some kind of software
SoundForge/Audacity
Windows SoundRecorder (gag)
Typical Pipeline
Permanent Storage
Decoding (from mp3, ogg, etc)
Individual Channel
Memory Buffer
Sound Channel Processing (2D/3D effects)
Hardware mixing and DAC
Sample Playback
Playback
Loaded entirely into memory (called “sample” as well)
Streamed (pre-buffer data using a circular buffer)
Channel properties
Pan – left/right
Pitch – frequency
Volume
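One way the pan and volume properties might turn into per-speaker gains. This sketch assumes "constant-power" panning, which is a common choice but not necessarily what any given engine uses; the function name and the -1.0 to 1.0 pan range are also assumptions.

```python
import math

def channel_gains(volume, pan):
    """Return (left_gain, right_gain) for a mono sound.

    volume: 0.0 (silent) to 1.0 (full)
    pan:   -1.0 (hard left) through 0.0 (center) to 1.0 (hard right)
    """
    pan = max(-1.0, min(1.0, pan))
    angle = (pan + 1.0) * math.pi / 4.0   # 0 .. pi/2
    return volume * math.cos(angle), volume * math.sin(angle)

for pan in (-1.0, 0.0, 1.0):
    left, right = channel_gains(1.0, pan)
    print(pan, round(left, 3), round(right, 3))
```

With this curve the combined power (left squared plus right squared) stays the same as the sound moves across, so it does not dip in the middle.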
Compressed Audio
Requires a codec (compress/decompress)
Lossless (e.g. .zip files)
Lossy
Bit-reduction (ADPCM, reduces bits per sample from 16 to 4)
Simple
Used on Sony PSP, Wii and Nintendo DS
Psychoacoustics (.mp3, .ogg, .wma)
Discard sound we don’t normally hear anyway
Hard to implement
CPU intensive
PS3, Xbox 360, PCs
Note: the mp3 format required licensing fees to Fraunhofer/Thomson (the last patents expired in 2017)!
ADSR Envelopes
Used for defining the volume of a sound over time
(Attack, Decay, Sustain, Release)
[Figure: ADSR envelope; axes: volume, time; labeled stage: sustain]
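A minimal sketch of an ADSR (attack, decay, sustain, release) envelope as a function of time. It assumes straight-line segments and a fixed sustain length; real synthesizers usually hold the sustain until the key is released and may use curved segments. All parameter names are made up for illustration.

```python
def adsr_level(t, attack, decay, sustain_level, sustain_time, release):
    """Return the envelope volume (0.0-1.0) at t seconds after the note starts."""
    if t < 0.0:
        return 0.0
    if t < attack:                        # ramp up from silence to full volume
        return t / attack
    t -= attack
    if t < decay:                         # fall from full volume to sustain level
        return 1.0 - (1.0 - sustain_level) * (t / decay)
    t -= decay
    if t < sustain_time:                  # hold steady
        return sustain_level
    t -= sustain_time
    if t < release:                       # fade out to silence
        return sustain_level * (1.0 - t / release)
    return 0.0

for t in (0.0, 0.05, 0.1, 0.5, 1.3, 2.0):
    print(t, round(adsr_level(t, 0.1, 0.1, 0.5, 1.0, 0.2), 3))
```

Multiplying each audio sample by the envelope value at that moment shapes the volume of the sound.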
3D Sound
Don’t have 5.1?
Panning is one option
Psycho-acoustic options
Head-Related Transfer Function (HRTF)
Tweak the frequencies to match your ears
Sounds have position and velocity
There is a listener component (like a camera)
Relationship between the two
Attenuation (with distance)
Occlusion (low-pass filter)
Doppler (relative velocities)
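A sketch of two of the source/listener relationships above, using textbook models: inverse-distance attenuation and the classic Doppler formula. The function names and the 1 m reference distance are assumptions; audio engines typically offer several rolloff curves and tuning factors.

```python
SPEED_OF_SOUND = 340.0  # m/s in air

def distance_gain(distance, reference_distance=1.0):
    """Volume multiplier that falls off as 1/distance beyond the reference."""
    return reference_distance / max(distance, reference_distance)

def doppler_frequency(source_hz, source_speed_toward,
                      listener_speed_toward=0.0, c=SPEED_OF_SOUND):
    """Frequency heard by the listener.

    Speeds are in m/s along the line between source and listener;
    positive means moving toward the other, negative means moving away.
    """
    return source_hz * (c + listener_speed_toward) / (c - source_speed_toward)

print(distance_gain(10.0))                       # 10 m away
print(round(doppler_frequency(440.0, 34.0), 1))  # source approaching
print(round(doppler_frequency(440.0, -34.0), 1)) # source moving away
```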
3D Sound
Environmental effects
Reverb (depends on materials in room)
Echo (depends on size of room)
Occlusion (a wall blocking part of the sound)
Obstruction (no direct path to the listener)
Competing reverb technologies
I3DL2 (Interactive 3D Audio Rendering Guidelines Level 2)
EAX (Creative Labs)
Almost identical
MIDI
(Musical Instrument Digital Interface)
MIDI – a method for representing sounds
electronically
Became popular in the 80’s
Send 16 different channels (tracks) at one time
Have a total of 128 possible instruments
The Keyboard
The MIDI Keyboard
No audible sounds
Generates a series of 1’s and 0’s (on/off)
Signals represent note, loudness, length,
type of instrument…
Signals come out of the keyboard and usually go
into a sequencer
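A sketch of what those signals look like as bytes. The status values used here (0x90 Note On, 0x80 Note Off, 0xC0 Program Change) are the standard MIDI channel messages; the helper names are made up for illustration. Note length is not sent directly: it is the time between a Note On and the matching Note Off.

```python
def _check(value, limit, name):
    if not 0 <= value <= limit:
        raise ValueError(f"{name} must be in 0..{limit}")

def note_on(channel, note, velocity):
    """Start a note. channel 0-15, note 0-127, velocity (loudness) 0-127."""
    _check(channel, 15, "channel"); _check(note, 127, "note"); _check(velocity, 127, "velocity")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Stop a note on the given channel."""
    _check(channel, 15, "channel"); _check(note, 127, "note")
    return bytes([0x80 | channel, note, 0])

def program_change(channel, program):
    """Select one of 128 instruments (0-127) for a channel."""
    _check(channel, 15, "channel"); _check(program, 127, "program")
    return bytes([0xC0 | channel, program])

message = note_on(0, 60, 100)   # middle C on the first channel
print(" ".join(f"{byte:08b}" for byte in message))
```

The 16 channels and 128 instruments come straight from the bit widths: 4 bits for the channel, 7 bits for each data byte.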
The Sequencer
Can be a PC
Responsible for recording
individual tracks of music
Responsible for playback
Receives input from
keyboard
Sends output to synthesizer
The Synthesizer
Receives 1’s and 0’s from
the sequencer
Interprets the 1’s and 0’s
to produce audible sounds
Piano
Drums…
Saxophone…
Sounds are sent to speakers
Speakers
Like you haven’t
seen these before…
[Figure: MIDI data shown as a stream of bits, 01101101000110]
MIDI vs Digital Recording
MIDI:
Smaller file size (like 10-20K)
Change keys/tempo/looping on the fly!
Song sounds different on every sound card
No singing allowed!
Also a DLS format (DownLoadable Sound)
Digital Recording:
Larger file size (like 5M)
Sound is close approximation to real thing
Sampling
There are two main approaches to synthesis:
Sampling
FM Synthesis
Sampling
A sample is a recording of actual instrument/sound
Samples are taken at certain intervals
Samples are then shifted up or down depending on the
note
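A sketch of shifting a sample up or down to hit a different note. It assumes equal temperament (each semitone multiplies the playback rate by the twelfth root of 2) and uses plain linear interpolation; real samplers use better resampling filters. The function names are hypothetical.

```python
def playback_rate(semitones):
    """Rate multiplier for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)

def resample(samples, rate):
    """Read through samples at the given rate, interpolating between them."""
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += rate
    return out

recorded = [0, 1, 2, 3, 4, 5, 6, 7]
print(resample(recorded, playback_rate(12)))        # one octave up: half as long
print(len(resample(recorded, playback_rate(-12))))  # one octave down: about twice as long
```

Shifting this way changes the length of the sound as well as its pitch, which is one reason samples are recorded at several intervals across an instrument's range.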
Sampling
FM Synthesis
Basic waves:
Sine
Square
Saw
Triangle
Noise
FM Synthesis
Start with basic waveform, and have one wave
modulate the other
Here’s volume modulation (figures not shown):
440 Hz sine wave, control 2 Hz
440 Hz sine wave, control 880 Hz
440 Hz sine wave, control 3 kHz
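A sketch of one wave modulating another. The first function is volume (amplitude) modulation like the examples above; the second is frequency modulation, included for contrast. Rescaling the control wave to 0.0-1.0 and the modulation index of 2.0 are assumptions chosen for illustration.

```python
import math

TWO_PI = 2.0 * math.pi

def am_sample(t, carrier_hz=440.0, control_hz=2.0):
    """Volume modulation: the control wave scales the carrier's amplitude."""
    control = 0.5 * (1.0 + math.sin(TWO_PI * control_hz * t))  # 0.0 .. 1.0
    return control * math.sin(TWO_PI * carrier_hz * t)

def fm_sample(t, carrier_hz=440.0, control_hz=2.0, index=2.0):
    """Frequency modulation: the control wave bends the carrier's phase."""
    return math.sin(TWO_PI * carrier_hz * t
                    + index * math.sin(TWO_PI * control_hz * t))

sample_rate = 44100
one_second = [am_sample(n / sample_rate) for n in range(sample_rate)]
print(round(max(one_second), 3), round(min(one_second), 3))
```

A slow control (2 Hz) is heard as a pulsing volume; once the control frequency reaches the audible range it changes the tone of the sound instead.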
Interactive Music
Music adapts based on current state of game
Music broken into chunks
Called segments (or cues)
Can be played back to back
Can be smoothly cross-faded
Segments are combined into themes
fmod’s Sound Designer can do this
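A sketch of a cross-fade between two segments. This is not fmod's API; it is a plain linear fade over overlapping sample lists, with a made-up function name, just to show the idea.

```python
def crossfade(outgoing, incoming):
    """Blend the end of one segment into the start of the next.

    Both arguments are the overlapping samples (same length expected);
    the result fades linearly from all-outgoing to all-incoming.
    """
    n = min(len(outgoing), len(incoming))
    mixed = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 1.0   # 0.0 at the start, 1.0 at the end
        mixed.append(outgoing[i] * (1.0 - t) + incoming[i] * t)
    return mixed

print(crossfade([1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0, 0.0]))
```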
Themes in fmod
Sound Variations
Sounds can be triggered by events
There’s no reason to play the same sound the same
way
Pick a random sample
Change pitch
Change attenuation
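A sketch of triggering a sound with some variation each time. The file names, the spread values, and the function name are all hypothetical; a real game would hand the result to its audio engine.

```python
import random

def pick_variation(samples, rng, pitch_spread=0.1, volume_spread=0.2):
    """Choose a sample plus a slightly randomized pitch and volume."""
    sample = rng.choice(samples)
    pitch = 1.0 + rng.uniform(-pitch_spread, pitch_spread)   # rate multiplier
    volume = 1.0 - rng.uniform(0.0, volume_spread)           # attenuation
    return sample, pitch, volume

rng = random.Random(4003)  # fixed seed so a run can be repeated
footsteps = ["step_a.wav", "step_b.wav", "step_c.wav"]  # hypothetical files
for _ in range(3):
    print(pick_variation(footsteps, rng))
```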
Other technology
Lip-synch
Use the amplitude of the wave to control mouth
Analyze phonemes of sample (language neutral)
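A sketch of the amplitude approach only (phoneme analysis is much more involved). It assumes the mouth opening is driven by the RMS level of a short window of samples; the function name and the level treated as "fully open" are made up for illustration.

```python
import math

def mouth_openness(window, full_open_rms=0.5):
    """Map a window of samples (-1.0..1.0) to a mouth opening of 0.0..1.0."""
    if not window:
        return 0.0
    rms = math.sqrt(sum(s * s for s in window) / len(window))
    return min(1.0, rms / full_open_rms)

quiet = [0.05 * math.sin(n / 5.0) for n in range(256)]
loud = [0.9 * math.sin(n / 5.0) for n in range(256)]
print(round(mouth_openness(quiet), 2), round(mouth_openness(loud), 2))
```

Calling this once per animation frame on the samples currently playing gives a mouth that follows the loudness of the dialog.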
Common Audio Technology
XAudio (free) – cross-platform
OpenAL (free) – cross-platform
XACT (free) – Xbox/Windows
fmod (commercial) – cross-platform