Audio to MIDI Conversion


Audio Signal MIDI Transcription
DEVON BRYANT
CS 525 SEMESTER PROJECT
Music Transcription
• Extraction of “onset”, “duration”, and “pitch” information from digital audio signals
• Two-phased approach
  1. Extract Temporal Information (Onset & Duration)
  2. Extract Frequency Information (Pitch)
• Many applications
  ◦ MIDI representation for low bandwidth
  ◦ Sheet music score generation
  ◦ Comparison against DB for copyright or search
Extraction of Temporal Events
• Spectral Flux – change in magnitude spectrum between consecutive frames

Magnitude Spectrum:
\[
A(k) = \sqrt{\left(\frac{1}{N}\sum_{n=0}^{N-1} x(n)\cos\left(\frac{2\pi k n}{N}\right)\right)^{2} + \left(\frac{1}{N}\sum_{n=0}^{N-1} x(n)\sin\left(\frac{2\pi k n}{N}\right)\right)^{2}}
\]

Audio File → Frames → FFT → Mag

• “Onsets” = window start, “Offsets” = window stop
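
A minimal sketch of the onset stage, under assumptions the slides do not state: the flux is half-wave rectified (only increases in magnitude count), a fixed threshold is used, and an onset is a local peak in the flux. The direct DFT sums from the magnitude spectrum formula stand in for the FFT so the example needs only the standard library. The names `frame_size`, `hop`, and `threshold` are illustrative, not taken from the project.

```python
import math

def magnitude_spectrum(frame):
    """A(k) for k = 0..N/2, computed from the direct DFT cosine and sine sums."""
    n_samples = len(frame)
    mags = []
    for k in range(n_samples // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * n / n_samples)
                 for n, x in enumerate(frame)) / n_samples
        im = sum(x * math.sin(2 * math.pi * k * n / n_samples)
                 for n, x in enumerate(frame)) / n_samples
        mags.append(math.hypot(re, im))
    return mags

def spectral_flux(signal, frame_size=64, hop=32):
    """Summed positive change in magnitude between consecutive frames."""
    starts = range(0, len(signal) - frame_size + 1, hop)
    spectra = [magnitude_spectrum(signal[s:s + frame_size]) for s in starts]
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(spectra, spectra[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux

def detect_onsets(flux, threshold):
    """Frame indices where the flux is a local peak above the threshold."""
    return [i for i in range(1, len(flux) - 1)
            if flux[i] > threshold
            and flux[i] >= flux[i - 1]
            and flux[i] > flux[i + 1]]
```

Multiplying a returned frame index by `hop` gives the sample position of the window start. A production version would likely use an FFT routine and an adaptive threshold to reduce false triggers.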
Extraction of Pitch Information
• Process event frames through Fast Fourier Transform (FFT) to bin frequencies
• Use a priori instrument knowledge to find fundamental frequencies f0 in spectrum

\[
f(t) = a_0 \sin(f_0 t) + a_1 \sin(2 f_0 t) + a_2 \sin(3 f_0 t) + \dots
\]

Event Window → Frames → FFT → f0 Estimation → MIDI File
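
One way the f0 estimation step could look, as a sketch rather than the project's actual method: each candidate note in an assumed instrument range is scored by summing the spectrum magnitude at its first few harmonics, and the best-scoring candidate wins. The range `low_note`..`high_note`, the harmonic count, and all function names are hypothetical. Frequencies here are in Hz, so the sinusoids are written as sin(2π f0 t). The MIDI mapping uses the standard convention that A4 = 440 Hz is note 69.

```python
import math

def midi_to_hz(note):
    """Frequency of a MIDI note number in equal temperament (A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def hz_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency in Hz."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def magnitude_at(frame, freq_hz, sample_rate):
    """Magnitude of the frame's correlation with a sinusoid at freq_hz."""
    re = im = 0.0
    for n, x in enumerate(frame):
        phase = 2 * math.pi * freq_hz * n / sample_rate
        re += x * math.cos(phase)
        im += x * math.sin(phase)
    return math.hypot(re, im) / len(frame)

def estimate_note(frame, sample_rate, low_note=48, high_note=72, harmonics=4):
    """Pick the candidate note whose harmonics carry the most energy.

    low_note/high_note encode the a priori instrument range (assumed values).
    """
    best_note, best_score = None, -1.0
    for note in range(low_note, high_note + 1):
        f0 = midi_to_hz(note)
        score = sum(magnitude_at(frame, h * f0, sample_rate)
                    for h in range(1, harmonics + 1)
                    if h * f0 < sample_rate / 2)  # skip harmonics above Nyquist
        if score > best_score:
            best_note, best_score = note, score
    return best_note
```

Restricting the candidates to the instrument's range is what keeps a strong overtone from being mistaken for the fundamental; chords would need several candidates to be kept instead of one.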
Issues Encountered
• Noise artifacts or fluctuations can trigger false onsets
• Frequency resolution on shorter events/notes is poor
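
The resolution issue can be made concrete with a little arithmetic. A frame of duration T seconds has DFT bins spaced 1/T Hz apart, while adjacent semitones near frequency f are about f·(2^(1/12) − 1) Hz apart. The helper names below are illustrative, and the comparison is a rough rule of thumb that assumes an unwindowed frame spanning the whole note.

```python
def bin_width_hz(duration_s):
    """DFT bin spacing for a frame that spans duration_s seconds."""
    return 1.0 / duration_s

def semitone_gap_hz(freq_hz):
    """Distance in Hz from freq_hz to the next semitone up."""
    return freq_hz * (2.0 ** (1.0 / 12.0) - 1.0)

def min_duration_s(freq_hz):
    """Shortest note whose bin spacing is no wider than one semitone."""
    return 1.0 / semitone_gap_hz(freq_hz)
```

For example, a 50 ms note gives bins 20 Hz wide, whereas the semitone above A2 (110 Hz) is only about 6.5 Hz away, so a note that low would need to last roughly 0.15 s before neighboring pitches fall into separate bins.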
Results
• Single Note Scale
  ◦ Original audio
  ◦ Transcribed MIDI
• Chords
  ◦ Original audio
  ◦ Transcribed MIDI
• Short Song
  ◦ Original audio
  ◦ Transcribed MIDI
Questions?