Onset Detection

Onset Detection in Audio Music
J.-S Roger Jang (張智星)
http://mirlab.org/jang
MIR Lab, CSIE Dept.
National Taiwan University
What Are Note Onsets?
• The energy profile of a percussive instrument is modeled as ADSR (attack, decay, sustain, release) stages.
• The note onset is the time during the attack stage at which the slope of the energy profile is highest.
• Soft onsets, such as those produced by a violin, are much harder to define and detect.
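To make the "highest slope" definition concrete, here is a minimal sketch on a synthetic energy envelope. The envelope shape (silence, raised-cosine attack, exponential decay) and the names `toy_envelope` and `max_slope_frame` are assumptions made for illustration; they are not taken from the slides or the SAP toolbox.

```python
import math

def toy_envelope(num_frames, start, attack, decay_rate):
    """Hypothetical energy envelope: silence, raised-cosine attack, exponential decay."""
    env = []
    for n in range(num_frames):
        if n < start:
            env.append(0.0)                      # silence before the note
        elif n < start + attack:
            env.append(0.5 * (1.0 - math.cos(math.pi * (n - start) / attack)))
        else:
            env.append(math.exp(-decay_rate * (n - start - attack)))
    return env

def max_slope_frame(env):
    """Return the frame index n that maximizes env[n] - env[n - 1]."""
    slopes = [env[n] - env[n - 1] for n in range(1, len(env))]
    return 1 + max(range(len(slopes)), key=slopes.__getitem__)

# Note starts at frame 20 with a 9-frame attack (assumed values).
env = toy_envelope(num_frames=100, start=20, attack=9, decay_rate=0.05)
onset = max_slope_frame(env)
print("onset frame:", onset)
```

The detected frame falls in the middle of the attack, not at the first non-zero sample. A slowly rising envelope (a soft onset) has no such sharp slope maximum, which is one way to see why it is harder to detect.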
Difficulty in Onset Detection
• Music types
  ◦ Monophonic → Easier
  ◦ Polyphonic → Harder
• Instrument types
  ◦ Percussive instruments → Easier
  ◦ String instruments → Harder (soft onsets)
Why Is Onset Detection Useful?
• It is a basic step in music analysis
  ◦ Music transcription (from wave to MIDI)
  ◦ Music editing (song segmentation)
  ◦ Tempo estimation
  ◦ Beat tracking
  ◦ Musical fingerprinting (the onset trace can serve as a robust ID for fingerprinting)
Onset Detection Function
• An ODF (onset detection function) creates a curve of onset strength, aka
  ◦ Onset strength curve (OSC)
  ◦ Novelty curve
• Most ODFs are based on a time-frequency representation (spectrogram) of the signal, such as
  ◦ Magnitude of STFT (short-time Fourier transform)
  ◦ Phase of STFT
  ◦ Mel-band of STFT
  ◦ Constant-Q transform
ODF: Spectral Flux
• Concept
  ◦ Sum the positive change in each frequency bin between consecutive frames, |X(n-1, k)| and |X(n, k)|

$$sf(n) = \sum_{k=1}^{N} h\big(|X(n,k)| - |X(n-1,k)|\big)$$

where n is the time index, k is the frequency index, and N is the frame size, and

$$h(x) = \frac{x + |x|}{2}$$

is the half-wave rectifier.
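The spectral flux formula can be sketched in a few lines of plain Python. The input `mag` is assumed to be a precomputed magnitude spectrogram stored as a list of frames, and setting `sf[0]` to 0 for the first frame (which has no predecessor) is a convention chosen here, not one stated in the slides.

```python
def half_wave_rectify(x):
    """h(x) = (x + |x|) / 2: keeps positive values and maps negative ones to 0."""
    return (x + abs(x)) / 2.0

def spectral_flux(mag):
    """mag[n][k] holds |X(n, k)|; returns one flux value per frame."""
    sf = [0.0]  # assumed convention: the first frame has no previous frame
    for n in range(1, len(mag)):
        sf.append(sum(half_wave_rectify(cur - prev)
                      for cur, prev in zip(mag[n], mag[n - 1])))
    return sf

# Tiny hand-made spectrogram: 4 frames, 3 frequency bins (illustrative values).
mag = [
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 0.0],   # energy appears in bins 0 and 1: large flux
    [0.5, 2.5, 0.0],   # bin 0 falls (ignored), bin 1 rises by 0.5
    [0.5, 1.0, 0.0],   # only decreases: zero flux
]
flux = spectral_flux(mag)
print(flux)  # [0.0, 3.0, 0.5, 0.0]
```

Because of the rectifier, energy decreases (note releases) contribute nothing, so the curve responds only to energy arriving in a bin.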
Flowchart of OSC
• Steps of OSC
  ◦ Spectrogram
  ◦ Mel-band spectrogram
  ◦ Spectral flux
  ◦ Smoothed OSC via Gaussian smoothing
  ◦ Trend of OSC via Gaussian smoothing
  ◦ Trend-subtracted OSC
• Check out wave2osc.m to see these steps.
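The last three steps (smoothing, trend estimation, trend subtraction) might look roughly like the sketch below, which starts from an already computed spectral flux curve. This is not the code of wave2osc.m: the function names, the two sigma values (in frames), the truncation at 3 sigma, and the edge handling are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Normalized Gaussian window truncated at +/- 3 sigma (sigma in frames)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(curve, sigma):
    """Convolve `curve` with a Gaussian, replicating the edge values."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(curve) - 1
    out = []
    for n in range(len(curve)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(n + j - radius, 0), last)  # clamp index at the borders
            acc += w * curve[idx]
        out.append(acc)
    return out

def trend_subtracted_osc(flux, sigma_smooth=2.0, sigma_trend=20.0):
    """Narrow Gaussian -> smoothed OSC; wide Gaussian -> trend; return the difference."""
    smoothed = smooth(flux, sigma_smooth)
    trend = smooth(smoothed, sigma_trend)
    return [s - t for s, t in zip(smoothed, trend)]

# Toy flux curve with a single burst at frame 50.
flux = [0.0] * 100
flux[50] = 1.0
osc = trend_subtracted_osc(flux)
print("strongest frame:", max(range(len(osc)), key=osc.__getitem__))
```

The wide Gaussian follows slow changes in overall loudness, so subtracting it leaves peaks that stand out relative to their local context.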
Example of OSC
• Try “wave2osc.m”
[Figure: four panels plotted against time (sec). (1) Spectrogram, y-axis: freq. bin index; (2) OSC (original and smoothed), y-axis: amplitude; (3) Smoothed OSC and its trend, y-axis: amplitude; (4) Trend-subtracted OSC, y-axis: amplitude.]
What Can You Do With OSC...
• OSC → onsets
  ◦ Pick peaks to obtain onsets
• OSC → tempo (BPM, beats per minute)
  ◦ Apply ACF (autocorrelation function) or another PDF (periodicity detection function) to find the BPM
• OSC → beat tracking
  ◦ Pick equally spaced peaks to obtain beat positions
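The three uses listed above can be sketched on a toy OSC as follows. This is a simplified illustration and not the method of beatTracking.m: the function names, the fixed peak threshold, the 60 to 180 BPM search range, and the constant-tempo assumption in `track_beats` are all hypothetical choices.

```python
def pick_peaks(osc, threshold):
    """Onsets: indices of local maxima of `osc` that exceed `threshold`."""
    return [n for n in range(1, len(osc) - 1)
            if osc[n] > threshold and osc[n] > osc[n - 1] and osc[n] >= osc[n + 1]]

def autocorrelation(osc, max_lag):
    """Unnormalized ACF: acf[lag] = sum over n of osc[n] * osc[n + lag]."""
    return [sum(osc[n] * osc[n + lag] for n in range(len(osc) - lag))
            for lag in range(max_lag + 1)]

def estimate_bpm(osc, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Tempo: take the strongest ACF lag within the assumed BPM range."""
    min_lag = int(round(60.0 * frame_rate / max_bpm))
    max_lag = int(round(60.0 * frame_rate / min_bpm))
    acf = autocorrelation(osc, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=acf.__getitem__)
    return 60.0 * frame_rate / best_lag

def track_beats(osc, period):
    """Beats: the equally spaced grid (constant tempo) that collects the most onset strength."""
    best_offset = max(range(period), key=lambda off: sum(osc[off::period]))
    return list(range(best_offset, len(osc), period))

# Toy OSC: one impulse every 50 frames at 100 frames per second, i.e. 120 BPM.
frame_rate = 100.0
osc = [1.0 if n % 50 == 25 else 0.0 for n in range(1000)]

onsets = pick_peaks(osc, threshold=0.5)
bpm = estimate_bpm(osc, frame_rate)
beats = track_beats(osc, period=round(60.0 * frame_rate / bpm))
print(len(onsets), bpm, beats[:3])
```

On real music the ACF usually also has strong peaks at half and double the true tempo, so a practical system needs some rule for resolving that ambiguity; the sketch simply restricts the search range.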
Beat Tracking
• Demos
  ◦ http://mirlab.org/demo/beatTracking
  ◦ Try “beatTracking.m” in SAP toolbox
Example of Beat Tracking
• beatTracking.m
Performance Indices of Beat Tracking
• Many performance indices of BT (beat tracking)
  ◦ Check out the audio beat tracking task of MIREX
• Most commonly adopted ones
  ◦ Precision, recall, F-measure, accuracy
  ◦ Try simSequence.m in SAP toolbox
[Figure: computed beats and ground-truth (GT) beats on a common axis, each labeled TP, FP, or FN; in total 3 TP, 3 FP, and 2 FN.]
Precision = tp/(tp+fp)=3/(3+3) = 0.5
Recall = tp/(tp+fn)=3/(3+2) = 0.6
F-measure = tp/(tp+(fn+fp)/2)=3/(3+(2+3)/2) = 0.545
Accuracy = tp/(tp+fn+fp)=3/(3+2+3) = 0.375
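The four indices above can be computed from two lists of beat times as in the sketch below. This is not simSequence.m: the greedy matching rule, the 0.07 s tolerance, and the beat times are assumptions made for illustration, with the times chosen so that the counts match the example (3 TP, 3 FP, 2 FN).

```python
def match_beats(computed, ground_truth, tolerance):
    """Greedy one-to-one matching of beat times within `tolerance` seconds."""
    computed = sorted(computed)
    ground_truth = sorted(ground_truth)
    i = j = tp = 0
    while i < len(computed) and j < len(ground_truth):
        diff = computed[i] - ground_truth[j]
        if abs(diff) <= tolerance:
            tp += 1          # matched pair: true positive
            i += 1
            j += 1
        elif diff < 0:
            i += 1           # computed beat with no partner: false positive
        else:
            j += 1           # ground-truth beat that was missed: false negative
    return tp, len(computed) - tp, len(ground_truth) - tp

def ratio(num, den):
    """Guarded division: returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def beat_scores(tp, fp, fn):
    """Precision, recall, F-measure, and accuracy as defined above."""
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f_measure": ratio(tp, tp + (fn + fp) / 2),
        "accuracy": ratio(tp, tp + fn + fp),
    }

# Hypothetical beat times in seconds (not read off the figure).
ground_truth = [1.0, 2.0, 3.0, 4.0, 5.0]
computed = [0.5, 1.02, 2.01, 2.5, 3.5, 4.98]

tp, fp, fn = match_beats(computed, ground_truth, tolerance=0.07)
scores = beat_scores(tp, fp, fn)
print(tp, fp, fn, scores)
```

Note that accuracy as defined here has no true-negative term, since "no beat where none was expected" is not a countable event; it is therefore always the lowest of the four values whenever there is any error.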