Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm
RESEARCH BY
ELAINE CHEW AND CHING-HUA CHUAN
UNIVERSITY OF SOUTHERN CALIFORNIA
PRESENTATION BY
SEAN SWEENEY
DIGIPEN INSTITUTE OF TECHNOLOGY
CS 582 / APRIL 17, 2011
DR. DIMITRI VOLPER
Presentation Flow
Musical Pitch and Key
Human Perception of Pitch
The Spiral Array Model
Pitches
Chords
Keys
The CEG Algorithm
Algorithm
Visualization
Musical Pitch and Key
Pitch
The perceived value of a tone, “Low” to “High”
Psycho-acoustic (subjective) perception of Frequency
Frequency (Hz) is a physical measurement: cycles per second, the reciprocal of the period
Key (Western music)
Labels the “center” tone in a section of music
Standard smallest interval: Semitone or “half-step”
Standard pattern of semitones around “center”
Ascending (major scale): 2, 2, 1, 2, 2, 2, 1
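The semitone pattern above can be turned into the pitch classes of a key. This is a minimal sketch, not part of the original slides; the names `MAJOR_STEPS`, `NAMES` and `major_scale` are illustrative, and the sharps-only spelling is a simplification.

```python
# Semitone steps of the ascending major scale, as listed on the slide.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]

# Sharps-only names (a simplification: real spelling depends on the key,
# e.g. F major is written with Bb rather than A#).
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def major_scale(tonic):
    """Return the 7 pitch classes (0-11) of the major scale built on `tonic`."""
    scale = [tonic % 12]
    for step in MAJOR_STEPS[:-1]:  # the final step only returns to the tonic
        scale.append((scale[-1] + step) % 12)
    return scale


if __name__ == "__main__":
    print([NAMES[p] for p in major_scale(0)])  # scale on C
    print([NAMES[p] for p in major_scale(7)])  # scale on G
```

The steps sum to 12 semitones, so the pattern closes on the tonic one octave up.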
Human Perception of Pitch
Limited range of perception
Typically 20 Hz – 20,000 Hz
Range tends to decrease with age
Noticeable difference is coarser at low Hz
Less distance (Hz) between lower sounds
Around 1400 perceivable intervals
Certain frequency distances sound relatively close
Thirds, Fifths, Octaves
The Spiral Array Model
Helical Structure
Toroidal across Octaves
Distance in 3D model approximates perceived closeness between pitches
Pitch, chord and key can all map to the same space
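A minimal sketch of the pitch helix, assuming the usual Spiral Array parameterization in which pitches are indexed along the line of fifths and each step is a quarter turn plus a rise of h. The radius r = 1, the choice h = sqrt(2/15) and the index convention (C = 0, G = 1, F = -1) are assumptions made here for illustration; with this h the perfect fifth and major third come out equidistant.

```python
import math

R = 1.0                  # helix radius (assumed)
H = math.sqrt(2 / 15)    # rise per fifth (assumed aspect ratio)


def pitch_position(k, r=R, h=H):
    """3D position of the pitch k steps along the line of fifths (k = 0 is C)."""
    return (r * math.sin(k * math.pi / 2),
            r * math.cos(k * math.pi / 2),
            k * h)


def interval_distance(k):
    """Distance in the model between C and the pitch k fifths away."""
    return math.dist(pitch_position(0), pitch_position(k))


if __name__ == "__main__":
    # Line-of-fifths offsets: fifth = 1, major third = 4, minor third = -3,
    # whole tone = 2, semitone = 7, tritone = 6.
    for name, k in [("fifth", 1), ("major third", 4), ("minor third", -3),
                    ("whole tone", 2), ("semitone", 7), ("tritone", 6)]:
        print(f"{name:12s} {interval_distance(k):.3f}")
```

In this sketch, fifths and thirds land closer together than semitones or tritones, which matches the perceptual claim on the slide.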
Chords in the Spiral Array
Standard chords are based on three supporting tones
Create triangles in 3D relative to the model
Triangles are effectively continuous, as pitch is
Major and minor chords’ centers thus form helixes
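A minimal sketch of chord centers as weighted points inside the triangle of their three tones. On the line of fifths, a major triad on root k uses k, k+1 (fifth) and k+4 (major third); a minor triad uses k, k+1 and k-3 (minor third). The weights in `W` are hypothetical values chosen for illustration, not the ones used by the authors.

```python
import math

H = math.sqrt(2 / 15)   # rise per fifth (assumed aspect ratio, radius 1)
W = (0.5, 0.3, 0.2)     # hypothetical weights: root, fifth, third


def pitch(k):
    """Position of the pitch k steps along the line of fifths."""
    return (math.sin(k * math.pi / 2), math.cos(k * math.pi / 2), k * H)


def combine(points, weights):
    """Weighted average of 3D points (weights are assumed to sum to 1)."""
    return tuple(sum(w * p[i] for p, w in zip(points, weights))
                 for i in range(3))


def major_chord(k, w=W):
    """Center of the major triad whose root is at index k."""
    return combine([pitch(k), pitch(k + 1), pitch(k + 4)], w)


def minor_chord(k, w=W):
    """Center of the minor triad whose root is at index k."""
    return combine([pitch(k), pitch(k + 1), pitch(k - 3)], w)
```

Because every chord is the same weighted triangle moved along the pitch helix, the chord centers themselves trace a helix of smaller radius, one for major chords and one for minor chords.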
Key in the Spiral Array
Simple keys are based on three supporting chords
Creates triangles in 3D, based on supporting chords’ triangular centers
Triangles are effectively continuous, as chords are
Major and minor keys’ centers thus form helixes
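A minimal sketch of a major-key center, assuming the three supporting chords are the tonic (k), dominant (k+1) and subdominant (k-1) major triads. The weights are hypothetical, and minor keys (which mix major and minor supporting chords) are left out to keep the example short.

```python
import math

H = math.sqrt(2 / 15)   # rise per fifth (assumed aspect ratio, radius 1)
W = (0.5, 0.3, 0.2)     # hypothetical weights, reused for chords and keys


def pitch(k):
    return (math.sin(k * math.pi / 2), math.cos(k * math.pi / 2), k * H)


def combine(points, weights):
    """Weighted average of 3D points (weights are assumed to sum to 1)."""
    return tuple(sum(w * p[i] for p, w in zip(points, weights))
                 for i in range(3))


def major_chord(k):
    # root, fifth, major third on the line of fifths
    return combine([pitch(k), pitch(k + 1), pitch(k + 4)], W)


def major_key(k):
    """Center of the major key on index k: tonic, dominant, subdominant chords."""
    return combine([major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)


if __name__ == "__main__":
    for name, k in [("F", -1), ("C", 0), ("G", 1)]:
        print(name, tuple(round(c, 3) for c in major_key(k)))
```

As with chords, moving k by one step rotates and lifts the same shape, so the key centers form their own helix inside the chord helix.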
Center of Effect
Center of Effect (CE)
Relative location of a chord based on its supporting tones
Notes of different strength change the CE location
CEs of complex chords will not line up exactly on the model
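A minimal sketch of a center of effect as the strength-weighted average of pitch positions. The helper names and the (index, strength) input format are assumptions for illustration; the pitch positions use the same assumed helix parameters (radius 1, rise sqrt(2/15) per fifth).

```python
import math

H = math.sqrt(2 / 15)   # rise per fifth (assumed aspect ratio, radius 1)


def pitch(k):
    """Position of the pitch k steps along the line of fifths (k = 0 is C)."""
    return (math.sin(k * math.pi / 2), math.cos(k * math.pi / 2), k * H)


def center_of_effect(notes):
    """Weighted centroid of (line_of_fifths_index, strength) pairs."""
    notes = list(notes)
    total = sum(strength for _, strength in notes)
    if total <= 0:
        raise ValueError("total strength must be positive")
    return tuple(sum(s * pitch(k)[i] for k, s in notes) / total
                 for i in range(3))


if __name__ == "__main__":
    even = center_of_effect([(0, 1), (1, 1), (4, 1)])     # C, G, E equally strong
    strong_c = center_of_effect([(0, 3), (1, 1), (4, 1)])  # C three times stronger
    print(math.dist(even, pitch(0)), math.dist(strong_c, pitch(0)))
```

Raising the strength of one note pulls the CE toward that note's position, which is the effect described on the slide.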
Center of Effect Generator (CEG) Key-Finding
Center of Effect relates the positions of multiple pitches in the model
Spatially closest key is the most likely key
Correlates input music to standard key structure
Helping Visualize the CEG Algorithm
Keys exist as a triangle in 3-space
Keys’ centers-of-effect make up two helixes in the 3D model
In standard intonation, keys are discrete (12 minor, 12 major)
Helping Visualize the CEG Algorithm
From a complex audio signal, weighted values are calculated for bins on each discrete tone
The weighted values approximate the current key’s location on the model
The spatially closest key is the most likely match
CEG Key-Finding Algorithm
Pitch detection
Extract pitch class and strength from signal
Key finding
Nearest Neighbor Search in Spiral Array
Fast Fourier Transform
Efficient algorithm to compute Discrete Fourier Transform
O(n log n) vs. O(n²)
Transforms a function into its frequency-domain representation
Widely used across many fields
Solving Partial Differential Equations
Data Compression
Polynomial Multiplication
Spectral Analysis
Frequency bands
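A minimal sketch contrasting the direct O(n²) DFT with a recursive radix-2 FFT, written in plain Python for readability rather than speed. The function names are illustrative, and this `fft` assumes the input length is a power of two.

```python
import cmath
import math


def dft(x):
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def peak_bin(x):
    """Index of the strongest bin in the lower half of the spectrum."""
    magnitudes = [abs(c) for c in fft(x)[:len(x) // 2]]
    return magnitudes.index(max(magnitudes))


if __name__ == "__main__":
    n = 64
    signal = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
    print(peak_bin(signal))  # bin of the 5-cycle sine
```

Both functions compute the same transform; the FFT reaches it by splitting the input into even and odd samples at each level, which is where the n log n cost comes from.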
Algorithm for Pitch Class/Strength from FFT
For each frequency spectrum in a 0.37-second period:
1. For each frequency band, find the peak value
2. For each pitch class k, its strength at time j, F_jk, is the sum of all peak values for that frequency band (and others related by octaves)
3. Normalize (k = 0, 1, …, 11):
   1. Divide all pitch-strength values by the largest
   2. Divide all pitch-strength values by their sum
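A minimal sketch of steps 2 and 3, assuming the per-band peaks have already been extracted from the spectrum and are given as (frequency, magnitude) pairs. The A4 = 440 Hz reference, the equal-tempered rounding and the function names are assumptions made here, not details taken from the slides.

```python
import math

A4_HZ = 440.0  # assumed tuning reference


def pitch_class(freq_hz):
    """Nearest equal-tempered pitch class (0 = C, ..., 11 = B)."""
    midi = round(69 + 12 * math.log2(freq_hz / A4_HZ))
    return midi % 12


def pitch_class_strengths(peaks):
    """Pitch-class strengths for one window from (frequency_hz, magnitude) peaks."""
    strengths = [0.0] * 12
    for freq, magnitude in peaks:
        # Bands an octave apart fall into the same pitch class and are summed.
        strengths[pitch_class(freq)] += magnitude
    largest = max(strengths)
    if largest == 0:
        return strengths  # silent window: nothing to normalize
    strengths = [s / largest for s in strengths]   # divide by the largest
    total = sum(strengths)
    return [s / total for s in strengths]          # divide by the sum


if __name__ == "__main__":
    # Peaks near C4, C5 and G4 with made-up magnitudes.
    window = [(261.63, 1.0), (523.25, 0.5), (392.0, 0.5)]
    print([round(s, 3) for s in pitch_class_strengths(window)])
```

After both normalization steps the twelve strengths sum to 1, so they can be used directly as weights for a center of effect.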
CEG Algorithm
For pitch class and strength from each 0.37 seconds:
1. Assign pitch names to pitch classes:
   1. Generate CE for previous 5 seconds; and
   2. Assign pitch names to current pitch classes by nearest-neighbor search in Spiral Array space
2. Determine key based on pitch names:
   1. Generate the cumulative CE from beginning to current
   2. Perform nearest-neighbor search to find closest key
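The two steps above can be sketched end to end. This is a simplified illustration under several assumptions, not the authors' implementation: only major keys are searched, the weights `W` are hypothetical, the helix uses radius 1 and rise sqrt(2/15), the first window is spelled relative to C, and a 13-window history stands in for "the previous 5 seconds" at 0.37 s per window.

```python
import math

H = math.sqrt(2 / 15)      # rise per fifth (assumed aspect ratio, radius 1)
W = (0.5, 0.3, 0.2)        # hypothetical chord and key weights
KEY_INDICES = range(-6, 7)  # candidate major keys, Gb (-6) through F# (6)


def pitch(k):
    return (math.sin(k * math.pi / 2), math.cos(k * math.pi / 2), k * H)


def combine(points, weights):
    return tuple(sum(w * p[i] for p, w in zip(points, weights))
                 for i in range(3))


def major_key(k):
    def chord(j):
        return combine([pitch(j), pitch(j + 1), pitch(j + 4)], W)
    return combine([chord(k), chord(k + 1), chord(k - 1)], W)


def center_of_effect(notes):
    """Weighted centroid of (index, strength) pairs, or None if empty."""
    total = sum(s for _, s in notes)
    if total == 0:
        return None
    return tuple(sum(s * pitch(k)[i] for k, s in notes) / total
                 for i in range(3))


def spell(pitch_class, context):
    """Pick the line-of-fifths index for a pitch class nearest to `context`."""
    base = (7 * pitch_class) % 12  # 7 is its own inverse mod 12
    candidates = [base + 12 * m for m in (-2, -1, 0, 1)]
    return min(candidates, key=lambda k: math.dist(pitch(k), context))


def find_keys(frames, window=13):
    """Return one major-key index per frame of 12 pitch-class strengths."""
    history, estimates = [], []
    for frame in frames:
        recent = [n for f in history[-window:] for n in f]
        context = center_of_effect(recent) or pitch(0)  # seed at C (assumption)
        history.append([(spell(pc, context), s)
                        for pc, s in enumerate(frame) if s > 0])
        cumulative = center_of_effect([n for f in history for n in f])
        if cumulative is None:
            estimates.append(None)
            continue
        estimates.append(min(KEY_INDICES,
                             key=lambda k: math.dist(major_key(k), cumulative)))
    return estimates
```

For example, a frame with strengths 0.5, 0.2 and 0.3 on pitch classes 0, 4 and 7 (C, E, G) yields key index 0, which is C major in this indexing.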
Questions?
BIBLIOGRAPHY:
• Chuan, C. and Chew, E. Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm. IEEE International Conference on Multimedia & Expo, 2005.
• Chew, E. Towards a Mathematical Model of Tonality. Doctoral dissertation, MIT, 2000.