Identifying “Cover Songs” with Beat-Synchronous Chroma Features

Dan Ellis and Graham Poliner
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
{dpwe,graham}@ee.columbia.edu
http://labrosa.ee.columbia.edu/
1. Cover Songs
2. Chroma Features
3. Beat Tracking
4. Matching Cover Songs
Identifying Cover Songs - Ellis & Poliner
2007-04-20 - 1 /16
Cover Songs
• “Cover Songs” = reinterpretation of a piece
different instrumentation, character
no match with “timbral” features
[Figure: spectrograms (freq / kHz vs. time / sec) of Let It Be / Beatles / verse 1 and Let It Be / Nick Cave / verse 1]
• Need a different representation!
beat-synchronous chroma features
[Figure: two panels of beat-sync chroma features, chroma vs. beats]
Chroma Features
• Chroma features map spectral energy
into one canonical octave
i.e. 12 semitone bins
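As a rough illustration of folding spectral energy into one octave, the sketch below maps each bin of a single magnitude spectrum to its nearest semitone class and accumulates the energy there. The 440 Hz tuning reference, the function names, and the C-first bin order are assumptions made for this example, not details taken from the slides.

```python
import math

A4_HZ = 440.0  # assumed tuning reference


def chroma_bin(freq_hz):
    """Return the semitone class (0 = C ... 9 = A ... 11 = B) nearest to freq_hz."""
    midi = 69.0 + 12.0 * math.log2(freq_hz / A4_HZ)
    return int(round(midi)) % 12


def chroma_from_spectrum(magnitudes, sample_rate, n_fft):
    """Fold one magnitude spectrum (bins 0..n_fft/2) into 12 chroma bins."""
    chroma = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        if k == 0:
            continue  # skip DC, which has no pitch
        freq = k * sample_rate / n_fft
        chroma[chroma_bin(freq)] += mag
    return chroma
```

Energy near 440 Hz and near 880 Hz both lands in the same bin (A), which is the octave folding the slide describes.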
[Figure: IF chroma (chroma vs. time / frames) beside a spectrogram (freq / kHz vs. time / sec)]
• Can resynthesize as “Shepard Tones”
all octaves at once
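A minimal sketch of the Shepard tone idea: one chroma class is rendered as sinusoids at every octave of that pitch class, weighted by a bell-shaped envelope over log-frequency. The lowest frequency (27.5 Hz, so index 0 is A here), the Gaussian envelope width, and the function names are assumptions for illustration only.

```python
import math


def shepard_partials(chroma_index, sample_rate=8000, f_low=27.5, n_octaves=7):
    """List (frequency, weight) pairs, one per octave, for one chroma class."""
    base = f_low * 2.0 ** (chroma_index / 12.0)
    center = math.log2(f_low) + n_octaves / 2.0  # envelope peak in log2(Hz)
    partials = []
    for octave in range(n_octaves):
        freq = base * 2.0 ** octave
        if freq >= sample_rate / 2.0:
            break  # stay below Nyquist
        # Gaussian weight over log-frequency so extreme octaves fade out
        weight = math.exp(-0.5 * ((math.log2(freq) - center) / 1.5) ** 2)
        partials.append((freq, weight))
    return partials


def shepard_tone(chroma_index, duration_s=0.1, sample_rate=8000):
    """Synthesize samples for one chroma class, all octaves at once."""
    partials = shepard_partials(chroma_index, sample_rate)
    norm = sum(w for _, w in partials)
    n = int(duration_s * sample_rate)
    return [sum(w * math.sin(2.0 * math.pi * f * i / sample_rate)
                for f, w in partials) / norm
            for i in range(n)]
```

Driving such a synthesizer frame by frame from a chromagram gives an audible rendering of the chroma content with the octave information removed.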
[Figure: spectrograms (freq / kHz vs. time / sec) of a piano chromatic scale and its Shepard tone resynth]
Calculating Chroma Features
• Method 1: Map every STFT bin
blurs non-tonal energy
• Method 2: Map only STFT peaks
still blurry at low frequencies
• Method 3: Instantaneous Frequency ($\partial\phi/\partial t$)
escapes frequency resolution limit
[Figure: for each method, one STFT frame (fft bin axis), the spectrogram (freq / kHz vs. time / sec), and the resulting chroma (chroma vs. time / frame)]
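To show why instantaneous frequency escapes the bin-width limit, here is a small sketch that estimates the frequency inside one STFT bin from the phase advance between two overlapping frames (the usual phase-vocoder estimate). The Hann window, the hop size, and the function names are assumptions for this example; the slides do not specify how the time derivative of phase is computed.

```python
import cmath
import math


def dft_bin(signal, start, n_fft, k):
    """Hann-windowed DFT of one frame, evaluated at a single bin k."""
    total = 0j
    for n in range(n_fft):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_fft)
        total += window * signal[start + n] * cmath.exp(-2j * math.pi * k * n / n_fft)
    return total


def instantaneous_frequency(signal, sample_rate, n_fft, hop, k, start=0):
    """Estimate the frequency (Hz) in bin k from the phase change over one hop."""
    phase_a = cmath.phase(dft_bin(signal, start, n_fft, k))
    phase_b = cmath.phase(dft_bin(signal, start + hop, n_fft, k))
    expected = 2.0 * math.pi * k * hop / n_fft  # advance of the bin centre
    deviation = phase_b - phase_a - expected
    # wrap the deviation into [-pi, pi]
    deviation -= 2.0 * math.pi * round(deviation / (2.0 * math.pi))
    return (expected + deviation) * sample_rate / (2.0 * math.pi * hop)
```

With a 256-point frame at 8 kHz the bins are 31.25 Hz apart, yet a 443 Hz sinusoid analysed in bin 14 (centre 437.5 Hz) comes back within a fraction of a hertz of its true frequency.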
Beat Tracking (1)
• Goal: One feature vector per ‘beat’ (tatum)
for tempo normalization, efficiency
• “Onset Strength Envelope”
$O(t) = \sum_f \max(0, \Delta_t \log |X(t, f)|)$
[Figure: spectrogram, freq / mel vs. time / sec]
• Autocorr. + window → global tempo estimate
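The two steps on this slide can be sketched as follows: an onset strength envelope from the half-wave-rectified log-magnitude difference summed over frequency, then a tempo pick from the autocorrelation of that envelope multiplied by a weighting window. The window used here (a Gaussian over log-tempo centred on 120 BPM) and all function and parameter names are assumptions; the slides only say "window".

```python
import math


def onset_strength(spectrogram):
    """sum_f max(0, diff_t log|X(t, f)|) for a list of magnitude frames."""
    eps = 1e-10  # avoids log(0)
    logs = [[math.log(m + eps) for m in frame] for frame in spectrogram]
    return [sum(max(0.0, cur - prev) for cur, prev in zip(logs[t], logs[t - 1]))
            for t in range(1, len(logs))]


def estimate_tempo(envelope, frame_period_s, center_bpm=120.0, width_octaves=1.0):
    """Return the BPM whose lag maximizes the windowed autocorrelation."""
    mean = sum(envelope) / len(envelope)
    x = [v - mean for v in envelope]
    best_lag, best_score = 1, -math.inf
    for lag in range(1, len(x) // 2):
        autocorr = sum(a * b for a, b in zip(x, x[lag:]))
        bpm = 60.0 / (lag * frame_period_s)
        # assumed weighting: Gaussian over log2(tempo), centred on center_bpm
        window = math.exp(-0.5 * (math.log2(bpm / center_bpm) / width_octaves) ** 2)
        if autocorr * window > best_score:
            best_lag, best_score = lag, autocorr * window
    return 60.0 / (best_lag * frame_period_s)
```

The window matters because a periodic envelope also correlates at multiples of the beat period; the weighting biases the choice toward a plausible tempo range.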
[Figure: windowed autocorrelation vs. lag / 4 ms samples, labeled 168.5 BPM]
Beat Tracking (2)
• Dynamic Programming finds beat times $\{t_i\}$
optimizes $\sum_i O(t_i) + \sum_i W((t_{i+1} - t_i - \tau_p)/\beta)$
where O(t) is onset strength envelope (local score)
W(t) is a log-Gaussian window (transition cost)
$\tau_p$ is the default beat period per measured tempo
incrementally find best predecessor at every time
backtrace from largest final score to get beats
[Figure: C*(t) and O(t) along time t, with candidate predecessor times τ]
$C^*(t) = \gamma\, O(t) + (1 - \gamma) \max_\tau \{ W((t - \tau - \tau_p)/\beta)\, C^*(\tau) \}$
$P(t) = \arg\max_\tau \{ W((t - \tau - \tau_p)/\beta)\, C^*(\tau) \}$
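A minimal sketch of the dynamic-programming recursion, under stated assumptions: W is taken as a plain Gaussian bump exp(-x²/2) evaluated at x = (t - τ - τ_p)/β, predecessors are searched between half a period and two periods back, and the backtrace starts from the best score within the final beat period. The function name and the default values of `beta` and `gamma` are illustrative, not the authors' settings.

```python
import math


def track_beats(onset_env, period, beta=2.0, gamma=0.5):
    """Return beat times (frame indices) for an onset envelope and beat period."""
    n = len(onset_env)
    score = [0.0] * n   # C*(t): best cumulative score of a beat sequence ending at t
    pred = [-1] * n     # P(t): best predecessor beat, or -1 for none
    for t in range(n):
        best, best_tau = 0.0, -1
        # assumed search range: half a period to two periods before t
        for tau in range(max(0, t - 2 * period), t - period // 2 + 1):
            x = (t - tau - period) / beta
            candidate = math.exp(-0.5 * x * x) * score[tau]
            if candidate > best:
                best, best_tau = candidate, tau
        score[t] = gamma * onset_env[t] + (1.0 - gamma) * best
        pred[t] = best_tau
    # backtrace from the largest score within the final beat period
    t = max(range(max(0, n - period), n), key=lambda i: score[i])
    beats = []
    while t >= 0:
        beats.append(t)
        t = pred[t]
    return beats[::-1]
```

Because every time step inherits a score from its best predecessor, a beat is still placed where an onset is missing, as long as it sits one period after the previous beat.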
Beat Tracking Results
• DP will bridge gaps (non-causal)
there is always a best path ...
[Figure: Alanis Morissette - All I Want - gap + beats (freq / Bark band vs. time / sec)]
• 2nd place in MIREX 2006 Beat Tracking
compared to McKinney & Moelants human data
[Figure: test 2 (Bragg) - McKinney + Moelants Subject data (freq / Bark band and Subject # vs. time / s)]
Beat-Synchronous Chroma Features
• Beat + chroma features / 30ms frames
→ average chroma within each beat
compact; sufficient?
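Averaging chroma within each beat can be sketched in a few lines. Here `chroma_frames` is assumed to be a list of 12-element frames and `beat_frames` a sorted list of frame indices marking beat boundaries; both names are hypothetical.

```python
def beat_sync_chroma(chroma_frames, beat_frames):
    """Average chroma frames between consecutive beat boundaries."""
    synced = []
    for start, end in zip(beat_frames, beat_frames[1:]):
        segment = chroma_frames[start:end]
        if not segment:
            continue  # skip empty intervals (repeated beat indices)
        # zip(*segment) iterates over chroma bins across the frames of this beat
        synced.append([sum(column) / len(segment) for column in zip(*segment)])
    return synced
```

The result has one 12-element vector per beat interval, so two performances at different tempos become comparable column by column.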
Matching (1): Little Fragments
• Cover versions may change song structure
multiple local matches at different alignments
• Match query and target as many small pieces?
how big are the pieces?
how do we combine individual scores?
do we have all day?
[Figure: fragments extracted from the Query (chroma bins vs. beats) are cross-correlated against the Candidate (chroma bins vs. beats)]
Matching (2): Global Correlation
• Cross-correlate entire beat-chroma matrices
... at all possible transpositions
implicit combination of match quality and duration
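A brute-force sketch of the global comparison: the query's beat-chroma matrix is circularly rotated through all 12 transpositions and slid against the target over a range of beat skews, keeping the best raw score. The matrices are assumed to be lists of 12-element beat vectors, and the names `rotate` and `best_match` are made up for this example. No normalization is applied, which is an assumption.

```python
def rotate(frame, semitones):
    """Circularly shift a 12-bin chroma vector up by `semitones`."""
    k = semitones % 12
    return frame[-k:] + frame[:-k] if k else list(frame)


def best_match(query, target, max_skew):
    """Return (score, semitone_shift, beat_skew) of the best alignment."""
    best = (float("-inf"), 0, 0)
    for semis in range(12):
        shifted = [rotate(frame, semis) for frame in query]
        for skew in range(-max_skew, max_skew + 1):
            score = 0.0
            for t, frame in enumerate(shifted):
                u = t + skew
                if 0 <= u < len(target):  # only overlapping beats contribute
                    score += sum(a * b for a, b in zip(frame, target[u]))
            if score > best[0]:
                best = (score, semis, skew)
    return best
```

Since the score is a sum over overlapping beats, a long, good alignment outscores a short one, which is the "match quality and duration" combination noted above. A practical version would compute the same quantity with FFT-based correlation rather than nested loops.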
[Figure: beat-chroma (chroma bins vs. beats @281 BPM) for Elliott Smith - Between the Bars and Glen Phillips - Between the Bars, and their cross-correlation (skew / semitones vs. skew / beats)]
• One good matching fragment is sufficient...?
Filtered Cross-Correlation
• Raw correlation not as important as precise local match
looking for large contrast at ±1 beat skew
i.e. high-pass filter
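One simple way to emphasize sharp peaks along the skew axis is to subtract a local moving average from the cross-correlation, which acts as a crude high-pass filter. This is a stand-in chosen for illustration; the slides do not say which filter was used, and `half_width` is a hypothetical parameter.

```python
def highpass(xcorr, half_width=2):
    """Subtract a local moving average so narrow peaks stand out."""
    filtered = []
    for i in range(len(xcorr)):
        lo = max(0, i - half_width)
        hi = min(len(xcorr), i + half_width + 1)
        local_mean = sum(xcorr[lo:hi]) / (hi - lo)
        filtered.append(xcorr[i] - local_mean)
    return filtered
```

After filtering, a narrow peak that falls away within a beat or two scores higher than a broad plateau of larger raw correlation.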
[Figure: cross-correlation (skew / semitones vs. skew / beats), and raw vs. filtered cross-correlation @ skew = +2 semitones (vs. skew / beats)]
Results (1): Ellis 23 set
• 23 pairs of cover songs from uspop2002 +...
one correct match per query
[Figure: Cover Songs - dpwe23 - 12/23 correct; grid of Query tracks vs. Test tracks]
Query tracks:
Abracadabra/sugar_ray
Addicted_To_Love/tina_turner
All_Along_The_Watchtower/dave_matthews_band
America/simon_and_garfunkel
Before_You_Accuse_Me/eric_clapton
Between_The_Bars/glen_phillips
Blue_Collar_Man/styx
Caroline_No/brian_wilson
Cecilia/simon_and_garfunkel
Claudette/roy_orbison
Cocaine/nazareth
Come_Together/beatles
Day_Tripper/cheap_trick
Enjoy_The_Silence/tori_amos
Faith/limp_bizkit
God_Only_Knows/brian_wilson
Gold_Dust_Woman/sheryl_crow
Grand_Illusion/styx
Hush/milli_vanilli
I_Can_t_Get_No_Satisfaction/rolling_stones
I_Love_You/faith_hill
Let_It_Be/nick_cave
Take_Me_To_The_River/annie_lennox
Results (2): MIREX 06
• Cover song contest
30 songs x 11 versions of each (!)
(data has not been disclosed)
# true covers in top 10
8 systems compared (4 cover song + 4 similarity)
• Found 761/3300 = 23% recall
next best: 11%
guess: 3%
[Figure: MIREX 06 Cover Song Results: # covers retrieved per song per system; rows are song-sets 1-30 (each row is one query song), columns are systems CS DE KL1 KL2 KWL KWT LR TP (cover song systems and similarity systems), color scale 0-8 correct matches retrieved]
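The totals on this slide can be checked with a little arithmetic. The sketch assumes that every one of the 330 tracks is used as a query with 10 other versions to find, and that a random top-10 list is drawn from the other 329 tracks; the second assumption is a guess that happens to reproduce the 3% figure, not something the slide states.

```python
def mirex06_summary(found=761, n_songs=30, n_versions=11, top_n=10):
    """Return (possible matches, recall, chance-level recall)."""
    n_queries = n_songs * n_versions        # 330 query tracks
    covers_per_query = n_versions - 1       # 10 other versions of each song
    possible = n_queries * covers_per_query
    recall = found / possible
    # assumed: a random top-N list is drawn from the other n_queries - 1 tracks
    chance = top_n / (n_queries - 1)
    return possible, recall, chance
```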
Where are the matches?
• Look inside global cross-correlation to find
matching fragments...
xcorr = $\sum_t \sum_f C_1(t, f) \cdot C_2(t, f)$ - view along time
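Viewing the cross-correlation "along time" amounts to keeping the inner sum over chroma bins for each beat instead of adding everything up. A small sketch, assuming the two beat-chroma matrices are already aligned (same transposition and skew) and stored as lists of per-beat vectors; the function name is hypothetical.

```python
def match_along_time(c1, c2):
    """Per-beat inner product sum_f C1(t, f) * C2(t, f) for aligned matrices."""
    return [sum(a * b for a, b in zip(f1, f2)) for f1, f2 in zip(c1, c2)]
```

Summing the returned list gives the total cross-correlation at that alignment, and runs of high values mark the beats where the two versions agree.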
[Figure: chroma vs. time / beats for Let It Be / Beatles (beats 11-441) and Let It Be / Nick Cave (beats 13-443), with the cross-correlation viewed along time / beats]
What are the mistakes?
• False reject - missed true match
cover version is too different, beat tracking wrong ...
• False alarm - invalid match
“Cocaine” (Clapton) vs. “Satisfaction” (Stones)
[Figure: chroma vs. beats for Eric Clapton - Cocaine - beats 17:1027 and Rolling Stones - Satisfaction - beats 1:1011, with a third trace along the same beat axis]
Conclusions and Future Work
• Beat-synchronous chroma features
are successful for matching cover songs
capture melody-harmony, not instruments
• Further uses:
Beat-chroma fragments
as musical building blocks
e.g. VQ over large body of music
find recurrent motifs
artist identification?
• Code available! Google “matlab cover song”