Eigenrhythms: Drum Track Bases
Dan Ellis and John Arroyo
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
dpwe@ee.columbia.edu   ja2124@columbia.edu

1. Eigenrhythms: Representing drum tracks
2. Basis projections
3. Aligning the data
4. Classification and Synthesis

Eigenrhythms - Ellis & Arroyo 2004-10-14

Drum Track Structure
• To first order, all pop music has the same drum track:
• Can we capture this from examples?
  .. including the variations
• Can we exploit it?
  .. for synthesis
  .. for classification
  .. for insight

Basis Sets
• Dataset reduced to linear combinations of a few basic patterns
  X = W × H   (data = weights × bases)
  bases H: subspace that spans the data
  weights W: dimension-reduced projection of data

Different basis projections
• Principal Component Analysis (PCA)
  optimizes MSE of low-D reconstruction
• Independent Component Analysis (ICA)
  projections are independent (cf decorrelated)
• Linear Discriminant Analysis (LDA)
  given class labels for data, find dimensions to separate them
• Nonnegative Matrix Factorization (NMF)
  each basis function only adds bits in
[Figure: four panels titled PCA basis, ICA basis, LDA basis, NMF basis]

Data
• Drum tracks extracted from MIDI
  100 examples (10 × 10 genre classes)
  fixed mapping of instruments to 3 classes: bass drum, snare, hi-hat
• Pseudo-envelope representation
  40ms half-Gauss window
  sampled at 200 Hz
  [Figure: example pattern with rows HH, SN, BD; x-axis in samples]
• Extract just one pattern from each MIDI
  looking for variety, quantity not a problem

Aligning Data: Tempo
• Need to align patterns prior to PCA...
• First, normalize tempo
  autocorrelation gives BPM candidates
[Sound Examples]

Aligning Data: Downbeat
• Downbeat from best match of tempo-normalized pattern to ‘mean’ template
  try every tempo hypothesis, choose best match
  [Figure labels: Original, pattern scaled]

Aligned Data
• Tempo normalization + downbeat alignment → 100 2-bar excerpts:
• Can now apply basis projection(s)

Eigenrhythms (PCA)
• Need 20+ eigenvectors for good coverage of 100 training patterns (1200 dims)
• Eigenrhythms both add and subtract

Indirhythms (ICA)
[Figure: six panels, Indirhythm 1 to Indirhythm 6, each with rows HH, SN, BD; x-axes in samples and beats]
• 6 of 12 components from FastICA2_1
  picking up fine timing shifts?

Discrirhythms (LDA)
[Figure: six panels, Discrirhythm 1 to Discrirhythm 6, each with rows HH, SN, BD; x-axes in samples and beats]
• Trying to differentiate genre classes...
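
The LDA idea stated earlier (given class labels, find dimensions that separate the classes) can be illustrated with a toy sketch. This is not the authors' code: it assumes only two classes and two-dimensional points, whereas the slides use 10 genre classes and 1200-dimensional patterns. The names `fisher_direction` and `project` and all the data points are hypothetical, made up for illustration.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant in 2-D: w = Sw^-1 (mean_b - mean_a)."""

    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries of the symmetric 2x2 scatter matrix around the class mean.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    mean_a, mean_b = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, mean_a)
    bxx, bxy, byy = scatter(class_b, mean_b)

    # Within-class scatter Sw is the sum of the per-class scatters.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter is singular")

    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    # Apply the closed-form inverse of a symmetric 2x2 matrix.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    return wx, wy


def project(point, w):
    """Scalar projection of a 2-D point onto direction w."""
    return point[0] * w[0] + point[1] * w[1]


# Made-up toy data: both classes stretch along the diagonal and overlap
# on each raw axis, but are offset from one another across the diagonal.
class_a = [(0.0, 0.0), (1.0, 1.2), (2.0, 1.9), (3.0, 3.1)]
class_b = [(0.0, 1.0), (1.0, 2.1), (2.0, 3.0), (3.0, 3.9)]

w = fisher_direction(class_a, class_b)
proj_a = [project(p, w) for p in class_a]
proj_b = [project(p, w) for p in class_b]
print(max(proj_a), min(proj_b))
```

In this toy case the two classes overlap on each raw coordinate but become separable along the single discriminant direction. With many more dimensions than training patterns, as in the slides, the within-class scatter would be singular, so a practical version would need some form of dimension reduction or regularization first; how the authors handled this is not stated.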

Posirhythms (NMF)
[Figure: six panels, Posirhythm 1 to Posirhythm 6, each with rows HH, SN, BD; x-axes in samples and beats]
• Nonnegative: only adds beat-weight
• Capturing some structure

Eigenrhythms for Classification
• Projections in Eigenspace / LDA space
  [Figure: scatter plots of PCA(1,2) projection (16% corr) and LDA(1,2) projection (33% corr); legend: blues, country, disco, hiphop, house, newwave, rock, pop, punk, rnb]
• 10-way Genre classification (nearest nbr):
  PCA3: 20% correct
  LDA4: 36% correct

Eigenrhythm BeatBox
• Resynthesize rhythms from eigen-space

Conclusions & Future Work
• Basis projections capture subspace of drum patterns
• Not genre, but something?
• Future work
  use more data
  extract drum patterns from audio
  examine ‘feel’?
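
The nearest-neighbour genre classification mentioned on the classification slide can be sketched roughly as follows. This is a minimal illustration, assuming each drum pattern has already been reduced to a short vector of projection weights and that Euclidean distance is used; the slides do not state the distance measure or the evaluation protocol. The function name `nearest_neighbor_label` and the coordinates are hypothetical and are not the slides' data.

```python
def nearest_neighbor_label(query, train_points, train_labels):
    """Return the label of the training point closest to `query`."""
    best_label = None
    best_dist = float("inf")
    for point, label in zip(train_points, train_labels):
        # Squared Euclidean distance; the square root is not needed for ranking.
        dist = sum((q - p) ** 2 for q, p in zip(query, point))
        if dist < best_dist:
            best_dist = dist
            best_label = label
    return best_label


# Made-up 2-D projection weights for four labelled training patterns.
train_points = [(-2.0, 0.5), (-1.5, 1.0), (2.0, -0.5), (2.5, 0.0)]
train_labels = ["rock", "rock", "disco", "disco"]

predicted = nearest_neighbor_label((1.9, -0.3), train_points, train_labels)
print(predicted)
```

The same function works unchanged for 3-D or 4-D weight vectors, since the distance is computed over however many coordinates each point has.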