Identifying “Cover Songs” with Beat-Synchronous Chroma Features

Dan Ellis and Graham Poliner
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
{dpwe,graham}@ee.columbia.edu
http://labrosa.ee.columbia.edu/

1. Cover Songs
2. Chroma Features
3. Beat Tracking
4. Matching Cover Songs

Identifying Cover Songs - Ellis & Poliner 2007-04-20 - 1 /16

Cover Songs

• “Cover Songs” = reinterpretation of a piece
  different instrumentation, character
  no match with “timbral” features
[Figure: spectrograms (freq / kHz vs. time / sec) of verse 1 of “Let It Be” by The Beatles and by Nick Cave]
• Need a different representation!
  beat-synchronous chroma features
[Figure: beat-sync chroma features (chroma vs. beats) for the same two excerpts]

Chroma Features

• Chroma features map spectral energy into one canonical octave, i.e.
12 semitone bins
[Figure: spectrogram (freq / kHz vs. time / sec) and IF chroma (chroma vs. time / frames)]
• Can resynthesize as “Shepard Tones”
  all octaves at once
[Figure: piano chromatic scale and its Shepard tone resynthesis (freq / kHz vs. time / sec)]

Calculating Chroma Features

• Method 1: Map every STFT bin
  blurs non-tonal energy
• Method 2: Map only STFT peaks
  still blurry at low frequencies
• Method 3: Instantaneous Frequency ($\partial\phi/\partial t$)
  escapes frequency resolution limit
[Figure: for each method, the FFT-bin-to-chroma mapping (chroma vs. fft bin) and the resulting chromagram (chroma vs. time / frame), next to the spectrogram (freq / kHz vs. time / sec)]

Beat Tracking (1)

• Goal: One feature vector per ‘beat’ (tatum)
  for tempo normalization, efficiency
• “Onset Strength Envelope”
  $O(t) = \sum_f \max(0, \mathrm{diff}_t(\log |X(t, f)|))$
[Figure: mel spectrogram (freq / mel vs. time / sec) with its onset strength envelope]
• Autocorr.
+ window → global tempo estimate
[Figure: autocorrelation of the onset envelope vs. lag / 4 ms samples, tempo peak labelled 168.5 BPM]

Beat Tracking (2)

• Dynamic Programming finds beat times $\{t_i\}$
  optimizes $\sum_i O(t_i) + \sum_i W((t_{i+1} - t_i - \tau_p)/\beta)$
  where O(t) is onset strength envelope (local score)
        W(t) is a log-Gaussian window (transition cost)
        $\tau_p$ is the default beat period per measured tempo
  incrementally find best predecessor at every time
  backtrace from largest final score to get beats
  $C^*(t) = \gamma O(t) + (1 - \gamma) \max_\tau \{ W((t - \tau - \tau_p)/\beta) \, C^*(\tau) \}$
  $P(t) = \arg\max_\tau \{ W((t - \tau - \tau_p)/\beta) \, C^*(\tau) \}$
[Figure: search over predecessor times τ for the best score C*(t), shown against O(t)]

Beat Tracking Results

• DP will bridge gaps (non-causal)
  there is always a best path ...
[Figure: Alanis Morissette - All I Want - gap + beats (freq / Bark band vs. time / sec)]
• 2nd place in MIREX 2006 Beat Tracking
  compared to McKinney & Moelants human data
[Figure: test 2 (Bragg) - McKinney + Moelants subject data (freq / Bark band and Subject # vs. time / s)]

Beat-Synchronous Chroma Features

• Beat + chroma features / 30 ms frames
  → average chroma within each beat
  compact; sufficient?
[Figure: mel spectrogram (freq / Mel) with onset strength, chroma bins per frame (time / sec), and chroma bins per beat (time / beats)]

Matching (1): Little Fragments

• Cover versions may change song structure
  multiple local matches at different alignments
• Match query and target as many small pieces?
  how big are the pieces?
  how do we combine individual scores?
  do we have all day?
[Figure: a fragment is extracted from the query and cross-correlated with the candidate (chroma bins vs. beats)]

Matching (2): Global Correlation

• Cross-correlate entire beat-chroma matrices ...
at all possible transpositions
  implicit combination of match quality and duration
[Figure: beat-chroma matrices (chroma bins vs. beats @281 BPM) for Elliott Smith - Between the Bars and Glen Phillips - Between the Bars, and their cross-correlation (skew / semitones vs. skew / beats)]
• One good matching fragment is sufficient...?

Filtered Cross-Correlation

• Raw correlation not as important as precise local match
  looking for large contrast at ±1 beat skew
  i.e. high-pass filter
[Figure: cross-correlation (skew / semitones vs. skew / beats), and raw vs. filtered cross-correlation @ skew = +2 semitones]

Results (1): Ellis 23 set

• 23 pairs of cover songs from uspop2002 +...
  one correct match per query
[Figure: Cover Songs - dpwe23 - 12/23 correct (query vs. test matrix) over the tracks listed below]
  Take_Me_To_The_River/annie_lennox
  Let_It_Be/nick_cave
  I_Love_You/faith_hill
  I_Can_t_Get_No_Satisfaction/rolling_stones
  Hush/milli_vanilli
  Grand_Illusion/styx
  Gold_Dust_Woman/sheryl_crow
  God_Only_Knows/brian_wilson
  Faith/limp_bizkit
  Enjoy_The_Silence/tori_amos
  Day_Tripper/cheap_trick
  Come_Together/beatles
  Cocaine/nazareth
  Claudette/roy_orbison
  Cecilia/simon_and_garfunkel
  Caroline_No/brian_wilson
  Blue_Collar_Man/styx
  Between_The_Bars/glen_phillips
  Before_You_Accuse_Me/eric_clapton
  America/simon_and_garfunkel
  All_Along_The_Watchtower/dave_matthews_band
  Addicted_To_Love/tina_turner
  Abracadabra/sugar_ray

Results (2): MIREX 06

• Found 761/3300 = 23% recall
  next best: 11%
  guess: 3%
• Cover song contest
  30 songs x 11 versions of each (!)
  (data has not been disclosed)
  8 systems compared (4 cover song + 4 similarity)
[Figure: MIREX 06 Cover Song Results: # covers retrieved per song per system (# true covers in top 10; each row is one query song, songs 1-30) for systems CS, DE, KL1, KL2, KWL, KWT, LR, TP, grouped into cover song systems and similarity systems]

Where are the matches?

• Look inside global cross-correlation to find matching fragments...
  $\mathrm{xcorr} = \sum_t \sum_f C_1(t, f) \cdot C_2(t, f)$ - view along time
[Figure: beat-chroma (chroma vs. time / beats) for Let It Be / Beatles (beats 11-441) and Let It Be / Nick Cave (beats 13-443), with the cross-correlation terms plotted along time]

What are the mistakes?

• False reject - missed true match
  cover version is too different, beat tracking wrong ...
• False alarm - invalid match
  “Cocaine” (Clapton) vs. “Satisfaction” (Stones)
[Figure: beat-chroma for Eric Clapton - Cocaine - beats 17:1027 and Rolling Stones - Satisfaction - beats 1:1011, with the cross-correlation terms plotted along time]

Conclusions and Future Work

• Beat-synchronous chroma features are successful for matching cover songs
  captures melody-harmony, not instruments
• Further uses: Beat-chroma fragments as musical building blocks
  e.g. VQ over large body of music
  find recurrent motifs
  artist identification?
• Code available!
  Google “matlab cover song”
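
Supplementary sketch for Beat Tracking (1). The onset strength envelope sums the positive frame-to-frame changes in log magnitude over frequency, and the tempo is then read from its autocorrelation. The Python below is only an illustration, not the authors' MATLAB code: the function names `onset_strength` and `estimate_period` are hypothetical, the input is assumed to be a precomputed log-magnitude spectrogram, and the tempo-preference window mentioned on the slide is replaced by a plain lag range.

```python
def onset_strength(log_spec):
    """Onset strength O(t) = sum_f max(0, diff_t(log|X(t, f)|)).

    log_spec is a list of frames, each a list of log-magnitudes.
    The first frame has no predecessor, so its strength is set to 0.
    """
    if not log_spec:
        return []
    env = [0.0]
    for prev, cur in zip(log_spec, log_spec[1:]):
        # Keep only increases in log energy (half-wave rectification).
        env.append(sum(max(0.0, c - p) for p, c in zip(prev, cur)))
    return env


def estimate_period(env, min_lag, max_lag):
    """Pick the lag (in frames) with the largest raw autocorrelation.

    Assumption: a hard lag range stands in for the slide's window.
    """
    def autocorr(lag):
        return sum(a * b for a, b in zip(env, env[lag:]))

    return max(range(min_lag, max_lag + 1), key=autocorr)


if __name__ == "__main__":
    frames = [[0.0, 0.0], [1.0, -1.0], [1.0, 2.0]]
    print(onset_strength(frames))
    pulses = [1.0 if t % 10 == 0 else 0.0 for t in range(100)]
    print(estimate_period(pulses, 5, 15))
```

With frames spaced 4 ms apart, as on the autocorrelation plot's lag axis, a period of `p` frames would correspond to a tempo of `60 / (0.004 * p)` BPM.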
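
Supplementary sketch for Beat Tracking (2). The recursion for C*(t) and P(t) can be written as a short dynamic program: score every time against its best predecessor, remember that predecessor, then backtrace from the best score near the end. This is a sketch under stated assumptions rather than the published implementation. The window here is taken to be a Gaussian on the log of the ratio between the inter-beat interval and the period, the predecessor search is limited to lags between half and twice the period, and the values of `gamma` and `beta` are arbitrary; `track_beats` is a hypothetical name.

```python
import math


def track_beats(onset, period, gamma=0.5, beta=0.5):
    """Return beat times (frame indices) for an onset envelope.

    score[t] = gamma * onset[t] + (1 - gamma) * max_tau(W * score[tau])
    pred[t]  = the tau that achieved the maximum (or -1 if none).
    """
    n = len(onset)
    score = [0.0] * n
    pred = [-1] * n
    lo, hi = max(1, period // 2), 2 * period  # assumed search range
    for t in range(n):
        best, best_tau = 0.0, -1
        for lag in range(lo, hi + 1):
            tau = t - lag
            if tau < 0:
                break  # larger lags only reach further before the start
            # Assumed log-Gaussian window: peaks when lag == period.
            w = math.exp(-0.5 * (math.log(lag / period) / beta) ** 2)
            cand = w * score[tau]
            if cand > best:
                best, best_tau = cand, tau
        score[t] = gamma * onset[t] + (1.0 - gamma) * best
        pred[t] = best_tau
    if n == 0:
        return []
    # Backtrace from the largest score within the final beat period.
    start = max(0, n - period)
    t = max(range(start, n), key=lambda i: score[i])
    beats = []
    while t >= 0:
        beats.append(t)
        t = pred[t]
    return beats[::-1]


if __name__ == "__main__":
    pulses = [1.0 if t % 10 == 0 else 0.0 for t in range(100)]
    print(track_beats(pulses, 10))
```

Because every time inherits a score from some predecessor, the backtrace always yields a path, which is the property the Beat Tracking Results slide describes as bridging gaps.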
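
Supplementary sketch for the chroma slides. Method 1 (map every STFT bin) amounts to folding each bin's frequency into one of 12 semitone classes and accumulating its magnitude; beat-synchronous features then average the frame-level chroma between consecutive beat times. The code below assumes an A = 440 Hz reference, bin 0 = A, and beat boundaries given as frame indices; the names `chroma_bin`, `chroma_frame`, and `beat_sync` are hypothetical.

```python
import math

# Bin 0 is A, matching the assumed 440 Hz reference.
NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def chroma_bin(freq_hz, ref_hz=440.0):
    """Fold a frequency into one of 12 semitone classes (octave-invariant)."""
    return round(12 * math.log2(freq_hz / ref_hz)) % 12


def chroma_frame(freqs_hz, magnitudes):
    """Accumulate the magnitude of every spectral bin into its chroma class."""
    chroma = [0.0] * 12
    for f, m in zip(freqs_hz, magnitudes):
        if f > 0:  # skip the DC bin, whose pitch is undefined
            chroma[chroma_bin(f)] += m
    return chroma


def beat_sync(chroma_frames, beat_frames):
    """Average chroma frames between consecutive beat boundaries."""
    out = []
    for start, end in zip(beat_frames, beat_frames[1:]):
        segment = chroma_frames[start:end]
        if segment:
            out.append([sum(col) / len(segment) for col in zip(*segment)])
    return out


if __name__ == "__main__":
    print(NOTE_NAMES[chroma_bin(261.63)])
    print(chroma_frame([220.0, 440.0, 261.63], [1.0, 2.0, 4.0]))
```

Methods 2 and 3 from the Calculating Chroma Features slide would change only which frequencies and magnitudes are passed to `chroma_frame` (spectral peaks, or instantaneous-frequency estimates), not the folding step.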
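
Supplementary sketch for Matching (2) and Filtered Cross-Correlation. Global matching cross-correlates two beat-chroma matrices at every beat skew and every transposition (a circular rotation of the 12 chroma bins), then looks for a sharp peak. In the illustration below the score is the unnormalized sum over beats and bins, and `sharpen` is a simple stand-in for the slide's high-pass filter (each value minus the mean of its two neighbours); both choices, and all function names, are assumptions made for the example.

```python
def rotate(vec, semitones):
    """Transpose a chroma vector upward by the given number of semitones."""
    n = len(vec)
    return [vec[(c - semitones) % n] for c in range(n)]


def xcorr(query, target, semitones, skew):
    """Sum over beats and bins of transposed query times skewed target."""
    total = 0.0
    for i, q in enumerate(query):
        j = i + skew  # query beat i is aligned with target beat i + skew
        if 0 <= j < len(target):
            total += sum(a * b for a, b in zip(rotate(q, semitones), target[j]))
    return total


def best_match(query, target):
    """Search all 12 transpositions and all overlapping beat skews."""
    best = None
    for semitones in range(12):
        for skew in range(-(len(query) - 1), len(target)):
            score = xcorr(query, target, semitones, skew)
            if best is None or score > best[0]:
                best = (score, semitones, skew)
    return best


def sharpen(row):
    """Emphasize contrast at +/-1 beat skew (assumed high-pass form)."""
    padded = [0.0] + list(row) + [0.0]
    return [padded[i] - 0.5 * (padded[i - 1] + padded[i + 1])
            for i in range(1, len(padded) - 1)]


def one_hot(pitch_class):
    return [1.0 if c == pitch_class else 0.0 for c in range(12)]


if __name__ == "__main__":
    melody = [0, 4, 7, 5, 9, 2, 11, 0]
    query = [one_hot(p) for p in melody]
    silence = [0.0] * 12
    # Target: the same melody, 2 semitones higher and 3 beats later.
    target = [silence] * 3 + [one_hot((p + 2) % 12) for p in melody] + [silence] * 2
    print(best_match(query, target))
```

In this toy case the search recovers the transposition and skew used to build the target; real beat-chroma matrices are dense, which is why the slides stress peak sharpness over raw correlation height.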