Audio - KsuWeb


COMPUTER AUDIO
CGDD 4003
What is Sound?


Compressions of air or other media (such as
water or metal) from something vibrating
Sounds are made up of high-frequency and low-frequency
components
Frequency




Don’t confuse pitch (frequency) with volume!
Volume is measured in decibels (dB)
Frequency in Hertz (Hz) = cycles per second
Humans hear only from about 20 Hz to 20 kHz
Strange Fact! The speed of sound depends on the medium (Air: about 340 m/s; Water: about 1,480 m/s; Gold: about 3,240 m/s)
Spatial Sound




1 Channel – “mono”. Can be split to several
speakers; still no direction
2 Channels – “stereo”. Fades from left to right. Can
determine direction
5.1 Audio – Common for home theaters
3D Sound? – Video games (PC). Still has time to
develop
The Human Side
(20Hz-20KHz)
The Equal Loudness Contour
Killa Hurts
Ouch!
A Note about decibels



A decibel is 1/10th of a bel
Abbreviated dB
Decibels are a logarithmic scale: the level increases
linearly as power increases exponentially
• Twice the power? 10·log10(2) = 3.01 dB
• Sounds twice as loud? Roughly a 10 dB increase (ten times the power)

In gaming, volume usually ranges 0.0f-1.0f
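A small Python sketch of the decibel arithmetic above. The function names and the -60 dB floor used to map a 0.0-1.0 game volume onto a decibel curve are assumptions for illustration, not the behavior of any particular engine.

```python
import math


def power_ratio_to_db(ratio: float) -> float:
    """Convert a power ratio to decibels: 10 * log10(ratio)."""
    return 10.0 * math.log10(ratio)


def db_to_amplitude(db: float) -> float:
    """Convert decibels to a linear amplitude multiplier (20 dB per factor of 10)."""
    return 10.0 ** (db / 20.0)


def slider_to_gain(slider: float, min_db: float = -60.0) -> float:
    """Map a 0.0-1.0 volume slider to a linear gain along a dB curve.

    min_db is an assumed floor: the slider position just above 0.0
    corresponds to roughly min_db, and 1.0 corresponds to 0 dB.
    """
    if slider <= 0.0:
        return 0.0  # treat the bottom of the slider as silence
    slider = min(slider, 1.0)
    return db_to_amplitude(min_db * (1.0 - slider))
```

For example, `power_ratio_to_db(2.0)` gives about 3.01, matching the slide, and `slider_to_gain(1.0)` is full volume.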
Human Perception
(Interaural Time Difference)


Sound hits both ears
Difference in time
[Figure: sound arriving from the right hits the right ear first and hasn’t gotten to the left ear yet]
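The interaural time difference can be estimated with a simple path-difference model. This is only a sketch: the 0.2 m ear spacing and the formula ITD = d·sin(angle)/c are simplifying assumptions (a real head also bends sound around it), and the function name is made up for illustration.

```python
import math

SPEED_OF_SOUND_AIR = 340.0  # m/s, the value quoted for air earlier in the deck


def interaural_time_difference(angle_deg: float, ear_spacing_m: float = 0.2) -> float:
    """Seconds by which sound reaches the right ear before the left ear.

    angle_deg is the source direction: 0 is straight ahead, +90 is fully to
    the right. A negative result means the left ear hears it first.
    ear_spacing_m is an assumed distance between the ears.
    """
    path_difference = ear_spacing_m * math.sin(math.radians(angle_deg))
    return path_difference / SPEED_OF_SOUND_AIR
```

With these assumptions a source directly to one side gives a difference of a bit under 0.6 milliseconds.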
How Computers Perceive Sound


Digitization (DAC and ADC)
Computers “listen” to the amplitude a certain number of
times per second (sample rate)




44.1 kHz is CD quality
22 kHz is good
8 kHz is lame
Computers have to approximate what they heard and
assign it a number


4 bits = 16 levels to approximate to
16 bits = 65,536 levels to approximate to
[Figure: Original Sound, labeled with Amplitude (in dB) and Frequency, plotted over time]
[Figure: Low Sampling Rate, followed by what the computer hears, plotted over time]
[Figure: High Sampling Rate, two panels plotted over time]
[Figure: 2 bits per sample = 4 approximations. The stair-step effect is called “quantization error”]
[Figure: 3 bits per sample = 8 approximations, less stair-step]
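The stair-step effect can be reproduced in a few lines of Python. This sketch assumes samples are floats in the range -1.0 to 1.0 and that the levels are evenly spaced across that range; real converters differ in detail, and `quantize` is a name chosen here for illustration.

```python
def quantize(sample: float, bits: int) -> float:
    """Round a sample in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits               # 2 bits -> 4 levels, 3 bits -> 8 levels
    step = 2.0 / (levels - 1)        # spacing between neighboring levels
    clamped = max(-1.0, min(1.0, sample))
    index = round((clamped + 1.0) / step)  # which level is closest
    return index * step - 1.0
```

The difference between `sample` and `quantize(sample, bits)` is the quantization error; adding bits shrinks the step and therefore the error.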
Signal to Noise Ratio (SNR)

Represents the quantization error
• 8 bits = 128 discrete values (upper half only)
• Sample is rounded up or down
• SNR is 256:1
• 256:1 translates to 48 dB (difference between average noise
and max signal)
• 16 bits = 32K discrete values (upper half)
• SNR = 65,536:1, or 96 dB
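The SNR figures above follow from the number of levels. A short Python check, using the amplitude form of the decibel formula (20·log10 of the ratio); the function name is an assumption for illustration:

```python
import math


def quantization_snr_db(bits: int) -> float:
    """SNR in dB for a ratio of 2**bits : 1, using 20 * log10(ratio)."""
    ratio = 2 ** bits  # 8 bits -> 256:1, 16 bits -> 65,536:1
    return 20.0 * math.log10(ratio)
```

This works out to roughly 6 dB per bit, which is where 48 dB for 8 bits and 96 dB for 16 bits come from.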
In General



Sampling rate affects range of frequencies you can
capture (Nyquist)
Bits per sample affects noise level as well as volume
range
What about recording:
• Rock?
• Mozart (or anything on NPR for that matter)?
• Voice/dialog?
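To make the Nyquist point concrete: a sampling rate can only capture frequencies up to half that rate, and anything higher folds back down as a false lower tone. The sketch below assumes a single pure tone and no anti-aliasing filter; both function names are made up for illustration.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2.0


def aliased_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency a pure tone appears to have after sampling (no filtering assumed)."""
    folded = tone_hz % sample_rate_hz
    # Anything above the Nyquist limit mirrors back below it.
    return min(folded, sample_rate_hz - folded)
```

At 8 kHz sampling, a 1 kHz tone survives as 1 kHz, but a 5 kHz tone comes out as 3 kHz, which is one reason low rates suit voice better than music.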
Capturing Sounds

Usually done with:
• A microphone (such as voice)
• Line in
• CD
• Hollywood Edge®
Computer has sound card
• Input types (RCA, MIDI, mini, ¼”, XLR)
• Card has quality (plays 16-bit sound)
Need some kind of software
• SoundForge/Audacity
• Windows SoundRecorder (gag)
Typical Pipeline
Permanent Storage
Decoding (from mp3, ogg, etc)
Individual Channel
Memory Buffer
Sound Channel Processing (2D/3D effects)
Hardware mixing and DAC
Sample Playback

Playback
• Loaded entirely into memory (called “sample” as well)
• Streamed (pre-buffer data using a circular buffer)
Channel properties
• Pan – left/right
• Pitch – frequency
• Volume
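Streaming with a circular buffer can be sketched as follows. This is a minimal single-threaded illustration, not how any specific audio library does it; the class and method names are hypothetical. The decoder writes ahead into the buffer while playback reads from it, and both positions wrap around at the end.

```python
class RingBuffer:
    """Fixed-size circular buffer for pre-buffering decoded samples."""

    def __init__(self, capacity: int):
        self._data = [0.0] * capacity
        self._capacity = capacity
        self._read = 0    # index of the oldest unread sample
        self._count = 0   # how many unread samples are stored

    def free_space(self) -> int:
        return self._capacity - self._count

    def write(self, samples) -> int:
        """Store as many samples as fit; return how many were written."""
        n = min(len(samples), self.free_space())
        write_pos = (self._read + self._count) % self._capacity
        for i in range(n):
            self._data[(write_pos + i) % self._capacity] = samples[i]
        self._count += n
        return n

    def read(self, n: int):
        """Remove and return up to n of the oldest samples."""
        n = min(n, self._count)
        out = [self._data[(self._read + i) % self._capacity] for i in range(n)]
        self._read = (self._read + n) % self._capacity
        self._count -= n
        return out
```

A streaming loop would call `write` whenever `free_space()` is large enough for the next decoded chunk, and the mixer would call `read` each time it needs more samples.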
Compressed Audio



Requires a codec (compress/decompress)
Lossless (e.g. .zip files)
Lossy

Bit-reduction (ADPCM, reduces bits per sample from 16 to 4)
• Simple
• Used on Sony PSP, Wii and Nintendo DS
Psycho-acoustics (.mp3, .ogg, .wma)
• Discard sound we don’t normally hear anyway
• Hard to implement
• CPU intensive
• PS3, Xbox 360, PCs

Note: the mp3 format historically required licensing fees to Fraunhofer/Thomson (the licensing program ended in 2017 as the patents expired)
ADSR Envelopes
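Used for defining the volume of a sound over time (Attack, Decay, Sustain, Release)
[Figure: envelope plotted as Volume against Time, with the Sustain level marked]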
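An ADSR (Attack, Decay, Sustain, Release) envelope can be written as a function of time that returns a volume multiplier. The sketch below uses straight-line segments and assumed default times; real synthesizers often use curved segments, and the names here are for illustration only.

```python
def _held_gain(t: float, attack: float, decay: float, sustain: float) -> float:
    """Gain while the note is held: ramp up, fall to sustain, then hold."""
    if t < attack:
        return t / attack                                   # attack: 0 -> 1
    if t < attack + decay:
        return 1.0 - (1.0 - sustain) * (t - attack) / decay  # decay: 1 -> sustain
    return sustain                                           # sustain: hold level


def adsr_gain(t: float, note_length: float, attack: float = 0.05,
              decay: float = 0.1, sustain: float = 0.7,
              release: float = 0.2) -> float:
    """Volume multiplier at time t (seconds) for a note held for note_length.

    attack, decay and release are durations in seconds (all assumed > 0);
    sustain is a level between 0.0 and 1.0.
    """
    if t < 0.0:
        return 0.0
    if t < note_length:
        return _held_gain(t, attack, decay, sustain)
    elapsed = t - note_length
    if elapsed >= release:
        return 0.0
    # Release: fade from whatever level the note had when it was let go.
    level_at_release = _held_gain(note_length, attack, decay, sustain)
    return level_at_release * (1.0 - elapsed / release)
```

Multiplying each output sample by `adsr_gain(t, note_length)` shapes the volume of the sound over its lifetime.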
3D Sound

Don’t have 5.1?
• Panning is one option
• Psycho-acoustic options
• Head-Related Transfer Function (HRTF)
• Tweak the frequencies to match your ears
Sounds have position and velocity
There is a listener component (like a camera)
Relationship between the two
• Attenuation (with distance)
• Occlusion (low-pass filter)
• Doppler (relative velocities)
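Two of the source/listener relationships above reduce to short formulas. This sketch assumes an inverse-distance rolloff (one common choice among several) and the classic Doppler formula for speeds below the speed of sound; the function names and the 1 m reference distance are assumptions for illustration.

```python
SPEED_OF_SOUND = 340.0  # m/s in air


def distance_gain(distance_m: float, reference_m: float = 1.0) -> float:
    """Inverse-distance attenuation: full volume inside reference_m, then 1/d."""
    return reference_m / max(distance_m, reference_m)


def doppler_pitch(source_speed_toward: float, listener_speed_toward: float,
                  c: float = SPEED_OF_SOUND) -> float:
    """Pitch multiplier from relative motion.

    Each speed is positive when that object moves toward the other and
    negative when it moves away. Assumes the source is slower than c.
    """
    return (c + listener_speed_toward) / (c - source_speed_toward)
```

A source approaching at 34 m/s raises the pitch by about 11 percent; the same source moving away lowers it.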

3D Sound

Environmental effects
• Reverb (depends on materials in room)
• Echo (depends on size of room)
• Occlusion (a wall blocking part of the sound)
• Obstruction (no direct path to the listener)

Competing reverb technologies
• I3DL2 (Interactive 3D Audio Rendering Guidelines Level 2)
• EAX (Creative Labs)
• Almost identical
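To show why echo depends on room size, here is a minimal single-reflection echo in Python. It is a sketch only: it assumes one wall, a fixed decay factor, and the 340 m/s speed of sound, and it is not how I3DL2 or EAX model a room. The function name is made up for illustration.

```python
def add_echo(samples, sample_rate_hz: float, wall_distance_m: float,
             decay: float = 0.5, speed_of_sound: float = 340.0):
    """Mix in one delayed, quieter copy of the sound bouncing off a wall."""
    # The sound travels to the wall and back, so the path is twice the distance.
    delay_s = 2.0 * wall_distance_m / speed_of_sound
    delay_samples = int(round(delay_s * sample_rate_hz))
    out = list(samples) + [0.0] * delay_samples
    for i, s in enumerate(samples):
        out[i + delay_samples] += decay * s
    return out
```

A bigger room means a longer `delay_s`, so the reflection is heard as a distinct echo rather than blending into the original.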
MIDI
(Musical Instrument Digital Interface)




MIDI – a method for representing sounds
electronically
Became popular in the 1980s
Send 16 different channels (tracks) at one time
Have a total of 128 possible instruments
The Keyboard




The MIDI Keyboard
No audible sounds
Generates a series of
1’s and 0’s (on/off)
Signals represent
• Note, loudness
• Length, type of instrument…

Signals come out of the keyboard and usually go
into a sequencer
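As an illustration of what those 1’s and 0’s look like, the sketch below builds three standard MIDI channel messages as raw bytes. The status bytes (0x90 note on, 0x80 note off, 0xC0 program change) come from the MIDI 1.0 specification; the helper names are made up here, and note length is simply the time between a note-on and its note-off.

```python
def _check(value: int, upper: int, name: str) -> None:
    if not 0 <= value <= upper:
        raise ValueError(f"{name} must be between 0 and {upper}")


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Key pressed: which note (0-127) and how hard (velocity 0-127)."""
    _check(channel, 15, "channel")
    _check(note, 127, "note")
    _check(velocity, 127, "velocity")
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int) -> bytes:
    """Key released: ends the note started by the matching note_on."""
    _check(channel, 15, "channel")
    _check(note, 127, "note")
    return bytes([0x80 | channel, note, 0])


def program_change(channel: int, program: int) -> bytes:
    """Select one of the 128 instruments for a channel."""
    _check(channel, 15, "channel")
    _check(program, 127, "program")
    return bytes([0xC0 | channel, program])


def as_bits(message: bytes) -> str:
    """Show a message as the string of 1's and 0's sent down the cable."""
    return "".join(f"{byte:08b}" for byte in message)
```

Channels are numbered 0-15 here, which corresponds to the 16 channels mentioned earlier; middle C played fairly hard on the first channel is `note_on(0, 60, 100)`.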
The Sequencer





Can be a PC
Responsible for recording
individual tracks of music
Responsible for playback
Receives input from
keyboard
Sends output to synthesizer
The Synthesizer


Receives 1’s and 0’s from
the sequencer
Interprets the 1’s and 0’s
to produce audible sounds
• Piano
• Drums…
• Saxophone…

Sounds are sent to speakers
Speakers

Like you haven’t
seen these before…
MIDI
[Figure: MIDI data as a stream of bits, e.g. 01101101000110]
MIDI vs Digital Recording

MIDI:
• Smaller file size (like 10-20K)
• Change keys/tempo/looping on the fly!
• Song sounds different on every sound card
• No singing allowed!
• Also a DLS format (Downloadable Sounds)

Digital Recording:
• Larger file size (like 5M)
• Sound is close approximation to real thing
Sampling

There are two main approaches to synthesis:
• Sampling
• FM synthesis
Sampling
• A sample is a recording of an actual instrument/sound
• Samples are taken at certain intervals
• Samples are then shifted up or down depending on the
note
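Shifting a sample up or down is usually done by playing it back faster or slower. The sketch below assumes equal-tempered tuning (each semitone is a factor of 2^(1/12)) and simple linear interpolation between stored samples; both function names are for illustration only.

```python
def playback_rate(semitones: float) -> float:
    """Speed multiplier that shifts pitch by the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def resample(sample, rate: float):
    """Read through a recorded sample at `rate` source steps per output step.

    rate > 1 plays faster (higher pitch, shorter); rate < 1 plays slower.
    Values between stored samples are linearly interpolated.
    """
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    out = []
    pos = 0.0
    last = len(sample) - 1
    while pos <= last:
        i = int(pos)
        frac = pos - i
        nxt = sample[min(i + 1, last)]
        out.append(sample[i] * (1.0 - frac) + nxt * frac)
        pos += rate
    return out
```

Playing a recorded note one octave up is `resample(recording, playback_rate(12))`, which reads every second sample and so halves the length.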
Sampling
FM Synthesis

Basic waves:
• Sine
• Square
• Sawtooth
• Triangle
• Noise
FM Synthesis


Start with a basic waveform, and have one wave
modulate the other
Here’s volume modulation
• 440 Hz sine wave, control 2 Hz
• 440 Hz sine wave, control 880 Hz
• 440 Hz sine wave, control 3 kHz
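The volume modulation examples can be generated like this. It is a sketch under stated assumptions: the control wave is a sine scaled into the range 0.0-1.0 and multiplied into the carrier, and the function name and parameters are made up for illustration.

```python
import math


def volume_modulated_sine(carrier_hz: float, control_hz: float,
                          sample_rate_hz: int, duration_s: float):
    """Carrier sine whose volume is driven by a second (control) sine."""
    n = int(sample_rate_hz * duration_s)
    out = []
    for i in range(n):
        t = i / sample_rate_hz
        carrier = math.sin(2.0 * math.pi * carrier_hz * t)
        # Shift the control wave from -1..1 into 0..1 so it acts as a volume.
        control = 0.5 * (1.0 + math.sin(2.0 * math.pi * control_hz * t))
        out.append(carrier * control)
    return out
```

With a 2 Hz control the 440 Hz tone audibly pulses (tremolo); with a control in the audio range, such as 880 Hz, the pulsing is too fast to hear as such and changes the timbre instead.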
Interactive Music


Music adapts based on current state of game
Music broken into chunks
• Called segments (or cues)
• Can be played back to back
• Can be smoothly cross-faded


Segments are combined into themes
fmod’s Sound Designer can do this
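A cross-fade between two segments can be sketched with plain lists of samples. This uses a linear fade over an assumed overlap length; it is an illustration only and not how fmod implements transitions. The function name is hypothetical.

```python
def crossfade(outgoing, incoming, overlap: int):
    """Join two segments, blending the last `overlap` samples of the first
    with the first `overlap` samples of the second.

    overlap = 0 simply plays the segments back to back.
    """
    if overlap < 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must fit inside both segments")
    split = len(outgoing) - overlap
    mixed = []
    for i in range(overlap):
        t = (i + 1) / (overlap + 1)  # rises toward 1 across the overlap
        mixed.append(outgoing[split + i] * (1.0 - t) + incoming[i] * t)
    return list(outgoing[:split]) + mixed + list(incoming[overlap:])
```

The result is shorter than the two segments combined by exactly `overlap` samples, since the blended region plays both at once.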
Themes in fmod
Sound Variations


Sounds can be triggered by events
There’s no reason to play the same sound the same
way
• Pick a random sample
• Change pitch
• Change attenuation
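The three variations above can be combined in one small helper. The ranges (pitch within 10 percent, volume between 0.8 and 1.0) and the dictionary layout are assumptions for illustration; a real engine would pass these values to its own play call.

```python
import random


def pick_variation(sample_names, rng: random.Random,
                   pitch_spread: float = 0.1, min_volume: float = 0.8) -> dict:
    """Choose how to play a triggered sound so repeats do not sound identical."""
    return {
        "sample": rng.choice(sample_names),                            # random sample
        "pitch": rng.uniform(1.0 - pitch_spread, 1.0 + pitch_spread),  # pitch multiplier
        "volume": rng.uniform(min_volume, 1.0),                        # attenuation
    }
```

Usage might look like `pick_variation(["step1.wav", "step2.wav"], random.Random())` each time a footstep event fires; the file names here are placeholders.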
Other technology

Lip-synch
• Use the amplitude of the wave to control the mouth
• Analyze phonemes of sample (language neutral)
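The amplitude-driven approach can be sketched by measuring how loud a short window of samples is and mapping that to a mouth-open amount. The RMS measure and the `full_open_rms` threshold are assumptions chosen for illustration; the phoneme approach needs speech analysis and is not shown.

```python
import math


def mouth_openness(window, full_open_rms: float = 0.5) -> float:
    """Return 0.0 (closed) to 1.0 (fully open) for a window of samples in -1..1."""
    if not window:
        return 0.0
    # Root-mean-square gives an average loudness for the window.
    rms = math.sqrt(sum(s * s for s in window) / len(window))
    return min(1.0, rms / full_open_rms)
```

Calling this once per animation frame on the samples played during that frame gives a mouth that opens wider on louder syllables.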
Common Audio Technology




XAudio2 (free) – cross-platform (Windows and Xbox)
OpenAL (free) – cross-platform
XACT (free) – Xbox/Windows
fmod (commercial) – cross-platform