Onset Detection in Audio Music

J.-S. Roger Jang (張智星)
http://mirlab.org/jang
MIR Lab, CSIE Dept., National Taiwan University

What Are Note Onsets?
- The energy profile of a percussive instrument is modeled as ADSR (attack, decay, sustain, release) stages.
- The note onset is the time at which the slope of the energy profile is highest, during the attack stage.
- Soft onsets, such as those of a violin, are much harder to define and detect.

Difficulty in Onset Detection
- Music types
  - Monophonic: easier
  - Polyphonic: harder
- Instrument types
  - Percussive instruments: easier
  - String instruments: harder (soft onsets)

Why Is Onset Detection Useful?
- It is a basic step in music analysis:
  - Music transcription (from wave to MIDI)
  - Music editing (song segmentation)
  - Tempo estimation
  - Beat tracking
  - Musical fingerprinting (the onset trace can serve as a robust ID for fingerprinting)

Onset Detection Function
- An ODF (onset detection function) creates a curve of onset strength, also known as
  - Onset strength curve (OSC)
  - Novelty curve
- Most ODFs are based on a time-frequency representation (spectrogram):
  - Magnitude of STFT (short-time Fourier transform)
  - Phase of STFT
  - Mel-band of STFT
  - Constant-Q transform

ODF: Spectral Flux
- Concept
  - Sum the positive change in each frequency bin between consecutive frames X(n-1, k) and X(n, k).

$$ sf(n) = \sum_{k=1}^{N} h\big( |X(n,k)| - |X(n-1,k)| \big) $$

where n is the time index, k is the frequency index, N is the frame size, and

$$ h(x) = \frac{x + |x|}{2} $$

is the half-wave rectifier.

Flowchart of OSC
- Steps of OSC
  - Spectrogram
  - Mel-band spectrogram
  - Spectral flux
  - Smoothed OSC via Gaussian smoothing
  - Trend of OSC via Gaussian smoothing
  - Trend-subtracted OSC
- Check out wave2osc.m to see these steps.

Example of OSC
- Try “wave2osc.m”

[Figure: four panels plotted against time (sec): spectrogram (frequency bin index), OSC (original and smoothed), smoothed OSC and its trend, and trend-subtracted OSC (amplitude).]

What Can You Do With OSC...
- OSC → onsets
  - Pick peaks to obtain onsets
- OSC → tempo (BPM, beats per minute)
  - Apply ACF (or another PDF) to find the BPM
- OSC → beat tracking
  - Pick equally spaced peaks to obtain beat positions

Beat Tracking
- Demos
  - http://mirlab.org/demo/beatTracking
  - Try “beatTracking.m” in SAP toolbox

Example of Beat Tracking
- beatTracking.m

Performance Indices of Beat Tracking
- Many performance indices exist for beat tracking (BT)
  - Check out the audio beat tracking task of MIREX
- Most commonly adopted ones
  - Precision, recall, F-measure, accuracy
  - Try simSequence.m in SAP toolbox

[Figure: computed beat positions compared with the ground truth (GT); each mark is labeled TP, FP, or FN (3 TP, 3 FP, 2 FN).]

Precision = tp/(tp+fp) = 3/(3+3) = 0.5
Recall = tp/(tp+fn) = 3/(3+2) = 0.6
F-measure = tp/(tp+(fn+fp)/2) = 3/(3+(2+3)/2) = 0.545
Accuracy = tp/(tp+fn+fp) = 3/(3+2+3) = 0.375
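
Sketch: spectral flux. The spectral flux definition from the "ODF: Spectral Flux" slide can be written in a few lines. This is a minimal sketch in plain Python, not the code in wave2osc.m; it assumes the magnitude spectrogram is already available as a list of frames, and the names `half_wave_rectify`, `spectral_flux`, and `toy_mag` are made up for illustration.

```python
def half_wave_rectify(x):
    """h(x) = (x + |x|) / 2: keeps positive values, maps negative ones to 0."""
    return (x + abs(x)) / 2.0


def spectral_flux(mag):
    """Spectral flux of a magnitude spectrogram.

    mag[n][k] is the STFT magnitude at frame n and frequency bin k.
    Returns one value per frame. Frame 0 has no predecessor, so its
    flux is set to 0 (an assumption of this sketch).
    """
    if not mag:
        return []
    flux = [0.0]
    for prev, cur in zip(mag, mag[1:]):
        # Sum only the increases in magnitude across all frequency bins.
        flux.append(sum(half_wave_rectify(c - p) for p, c in zip(prev, cur)))
    return flux


# Toy 3-bin magnitude spectrogram (made-up numbers): energy jumps at frame 2.
toy_mag = [
    [0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1],
    [0.9, 0.6, 0.1],
    [0.5, 0.3, 0.1],
]
flux = spectral_flux(toy_mag)
```

Only frame 2 gets a nonzero flux here, because the decay in frame 3 is removed by the half-wave rectifier.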
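
Sketch: smoothing, trend subtraction, and peak picking. The last three steps in "Flowchart of OSC" and the peak picking in "What Can You Do With OSC..." might look roughly like the following. The Gaussian widths (`smooth_sigma`, `trend_sigma`), the edge handling, and the peak rule are assumptions of this sketch; the slides do not specify them, and wave2osc.m may differ.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian weights, truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Gaussian smoothing; samples beyond the ends repeat the edge value."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), last)  # clamp index at the edges
            acc += w * signal[idx]
        out.append(acc)
    return out


def trend_subtracted_osc(flux, smooth_sigma=1.0, trend_sigma=8.0):
    """Smoothed OSC minus its slowly varying trend (both via Gaussian smoothing)."""
    smoothed = smooth(flux, smooth_sigma)
    trend = smooth(smoothed, trend_sigma)
    return [s - t for s, t in zip(smoothed, trend)]


def pick_peaks(curve, threshold=0.0):
    """Indices of local maxima that lie above the threshold."""
    return [
        i
        for i in range(1, len(curve) - 1)
        if curve[i] > threshold and curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]
    ]


# Toy spectral flux with two isolated bursts (made-up data).
toy_flux = [0.0] * 40
toy_flux[10] = 1.0
toy_flux[30] = 1.0
onset_frames = pick_peaks(trend_subtracted_osc(toy_flux))
```

Subtracting the trend pushes the curve below zero between bursts, so a threshold of 0 is enough to reject the flat regions in this toy case. Real recordings would likely need a tuned threshold.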
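
Sketch: tempo from the ACF. The "OSC tempo" idea of applying the ACF (autocorrelation function) to find the BPM can be illustrated as follows. This is a minimal sketch under stated assumptions: the OSC is sampled at a known `frame_rate` (frames per second), the tempo is searched only within a hypothetical range `bpm_min` to `bpm_max`, and the ACF is left unnormalized.

```python
def autocorrelation(x, lag):
    """Unnormalized autocorrelation of x at the given lag."""
    return sum(a * b for a, b in zip(x, x[lag:]))


def estimate_bpm(osc, frame_rate, bpm_min=60.0, bpm_max=200.0):
    """Pick the lag with the largest ACF value inside the allowed tempo range."""
    lag_min = max(1, int(round(60.0 * frame_rate / bpm_max)))
    lag_max = min(len(osc) - 1, int(round(60.0 * frame_rate / bpm_min)))
    if lag_min > lag_max:
        raise ValueError("OSC is too short for the requested tempo range")
    best_lag = max(range(lag_min, lag_max + 1), key=lambda lag: autocorrelation(osc, lag))
    # A beat period of best_lag frames corresponds to this many beats per minute.
    return 60.0 * frame_rate / best_lag


# Toy OSC: one impulse every 0.5 s at 100 frames/s, i.e. 120 BPM (made-up data).
toy_osc = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
bpm = estimate_bpm(toy_osc, frame_rate=100.0)
```

Restricting the lag range matters: the ACF also peaks at multiples of the beat period, so without the range a half-tempo or double-tempo answer is easy to get.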
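
Sketch: performance indices. The four indices above can be computed directly from the counts tp, fp, and fn. The sketch below also includes a simple way to obtain those counts by matching computed beats to ground-truth beats within a tolerance. The matching rule and the `tolerance` value are assumptions for illustration and are not taken from simSequence.m; the beat times are made up so that the counts equal those in the slide (tp=3, fp=3, fn=2).

```python
def count_matches(computed, ground_truth, tolerance=0.07):
    """Greedy one-to-one matching of beat times (in seconds).

    Returns (tp, fp, fn). A computed beat is a true positive if it lies
    within `tolerance` seconds of a not-yet-matched ground-truth beat.
    """
    computed = sorted(computed)
    ground_truth = sorted(ground_truth)
    i = j = tp = 0
    while i < len(computed) and j < len(ground_truth):
        diff = computed[i] - ground_truth[j]
        if abs(diff) <= tolerance:
            tp += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # computed beat is too early to match: false positive
        else:
            j += 1  # ground-truth beat has no match: false negative
    return tp, len(computed) - tp, len(ground_truth) - tp


def beat_metrics(tp, fp, fn):
    """Precision, recall, F-measure, and accuracy as defined in the slide."""
    def ratio(num, den):
        return num / den if den else 0.0
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f_measure": ratio(tp, tp + (fn + fp) / 2.0),
        "accuracy": ratio(tp, tp + fn + fp),
    }


# Made-up beat times chosen to reproduce the slide's counts.
computed_beats = [1.0, 1.5, 2.0, 3.0, 3.5, 4.5]
ground_truth_beats = [1.02, 2.01, 2.5, 3.03, 4.0]
tp, fp, fn = count_matches(computed_beats, ground_truth_beats)
metrics = beat_metrics(tp, fp, fn)
```

With these counts, `metrics` reproduces the values in the slide: precision 0.5, recall 0.6, F-measure about 0.545, and accuracy 0.375.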