Data Driven Music Understanding
Dan Ellis
Laboratory for Recognition and Organization of Speech and Audio (LabROSA)
Dept. of Electrical Engineering, Columbia University, NY, USA
http://labrosa.ee.columbia.edu/
Data-driven music understanding - Ellis, 2008-10-03

Outline:
1. Motivation: What is Music?
2. Eigenrhythms
3. Melodic-Harmonic Fragments
4. Example Applications

LabROSA Overview
[Diagram: LabROSA research areas — information extraction, recognition, separation, and retrieval for music, speech, and environmental audio, built on machine learning and signal processing]

1. Motivation: What is music?
• What does music evoke in a listener's mind?
• Which are the things that we call "music"?

Oodles of Music
• What can you do with a million tracks?

Re-use in Music
• What are the most popular chord progressions in pop music?
[Figure: scatter of PCA dimensions 3–6 of 12×16 beat-chroma patches]

Potential Applications
• Compression
• Classification
• Manipulation

2. Eigenrhythms: Drum Track Structure (Ellis & Arroyo, ISMIR '04)
• To first order, all pop music has the same beat
• Can we learn this from examples?

Basis Sets
• Combine a few basic patterns to make a larger dataset: data X = weights W × patterns H
[Figure: matrix illustration of the factorization X = W × H]

Drum Pattern Data
• Tempo normalization + downbeat alignment
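The basis-set idea above — expressing each tempo-normalized drum pattern as a nonnegative combination of a few basic patterns, X = W × H — can be sketched with standard multiplicative-update NMF. This is an illustrative numpy sketch, not the paper's code; the toy pattern matrix and the `nmf` helper are assumptions for demonstration:

```python
import numpy as np

def nmf(X, k, n_iter=1000, eps=1e-9, seed=0):
    """Factor nonnegative X (n x m) into W (n x k) @ H (k x m)
    using Lee & Seung multiplicative updates (Frobenius objective)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update patterns
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update weights
    return W, H

# Toy "drum pattern" data (hypothetical): three tempo-normalized
# patterns built from two underlying basis rhythms.
basis = np.array([[1., 0., 1., 0.],
                  [0., 1., 0., 1.]])
weights = np.array([[1., 0.],
                    [0., 1.],
                    [1., 1.]])
X = weights @ basis
W, H = nmf(X, k=2)
print(np.abs(X - W @ H).max())  # reconstruction error should be small
```

Because both factors stay nonnegative, the learned patterns can only add beat-weight — which is what makes the resulting "posirhythms" readable as drum patterns.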
NMF Eigenrhythms
[Figure: six NMF "posirhythms" — hi-hat (HH), snare (SN), and bass drum (BD) activation patterns over a four-beat bar at 120 BPM]
• Nonnegative: only add beat-weight

Eigenrhythm BeatBox
• Resynthesize rhythms from eigen-space

3. Melodic-Harmonic Fragments
• How similar are two pieces?
• Can we find all the pop-music clichés?

MFCC Features
• Used in speech recognition
[Figure: "Let It Be" — log-frequency spectrogram (LIB-1), 12 MFCC coefficients, and noise-excited MFCC resynthesis (LIB-2)]

Chroma Features
• To capture "musical" content
[Figure: "Let It Be" — log-frequency spectrogram (LIB-1), 12-bin chroma features, Shepard-tone resynthesis of chroma (LIB-3), and MFCC-filtered Shepard tones (LIB-4)]

Beat-Synchronous Chroma
• Compact representation of harmonies
[Figure: "Let It Be" — log-frequency spectrogram (LIB-1), onset envelope + beat times, beat-synchronous chroma, and Shepard-tone resynthesis of the beat-synchronous chroma (LIB-6)]

Finding 'Cover Songs' (Ellis & Poliner '07)
• Little similarity in surface audio...
[Figure: spectrograms of verse 1 of "Let It Be" by The Beatles and by Nick Cave]
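The beat-synchronous chroma representation described above can be sketched as a per-beat average of frame-level chroma vectors. The `beat_sync_chroma` helper and the toy chroma/beat data below are hypothetical illustrations, not the original implementation:

```python
import numpy as np

def beat_sync_chroma(chroma, beat_frames):
    """Average 12-bin chroma frames within each beat interval,
    returning one chroma column per beat segment (12 x n_beats)."""
    bounds = np.concatenate(([0], beat_frames, [chroma.shape[1]]))
    segments = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        if end > start:
            segments.append(chroma[:, start:end].mean(axis=1))
    return np.stack(segments, axis=1)

# Hypothetical input: 12 chroma bins x 8 frames, with one detected
# beat boundary at frame 4 where the harmony changes.
chroma = np.tile(np.arange(12, dtype=float)[:, None], (1, 8))
chroma[:, 4:] += 1.0
sync = beat_sync_chroma(chroma, np.array([4]))
print(sync.shape)  # → (12, 2): one chroma column per beat segment
```

Averaging within beats both compresses the representation and normalizes away tempo, which is what makes the beat-chroma matrices of two very different recordings directly comparable.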
• ...but the similarity appears in beat-synchronous chroma
[Figure: beat-sync chroma features for both versions over ~25 beats]

Finding Common Fragments
• Cluster beat-synchronous chroma patches
[Figure: eight frequent clusters of beat-chroma patches, e.g. cluster #32273 with 13 instances]

Clustered Fragments
[Figure: one cluster's instances — Depeche Mode "Ice Machine" (199.5–204.6 s), Roxette "Fireworks" (107.9–114.8 s), Roxette "Waiting For The Rain" (80.1–93.5 s), Tori Amos "Playboy Mommy" (157.8–171.3 s)]
• ... for a dictionary of common themes?

4. Example Applications: Music Discovery (Berenzweig & Ellis '03)
• Connecting listeners to musicians

Playlist Generation (Mandel, Poliner, Ellis '06)
• Incremental learning of listeners' preferences

MajorMiner: Music Tagging (Mandel & Ellis '07, '08)
• Describe music using words

Music Transcription (Poliner & Ellis '05, '06, '07)
Training data and features:
• MIDI, multi-track recordings, playback piano, and resampled audio (less than 28 minutes of training audio)
• Normalized magnitude STFT
Classification:
• N binary SVMs (one for each note)
• Independent frame-level classification on a 10 ms grid
• Distance to the class boundary as posterior
Temporal smoothing:
• Two-state (on/off) independent HMM for each note; parameters learned from training data
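The temporal-smoothing step above — an independent two-state (on/off) HMM per note, decoded with Viterbi over the frame-level posteriors — can be sketched as follows. The transition probability `p_stay` and the toy posteriors are illustrative assumptions, not values learned in the paper:

```python
import numpy as np

def smooth_note(p_on, p_stay=0.9):
    """Viterbi decode of a two-state (off=0 / on=1) HMM from
    frame-level note posteriors p_on; returns a 0/1 sequence."""
    T = len(p_on)
    logA = np.log(np.array([[p_stay, 1 - p_stay],
                            [1 - p_stay, p_stay]]))      # transitions
    eps = 1e-12
    logB = np.log(np.stack([1 - p_on + eps, p_on + eps], axis=1))
    delta = np.full((T, 2), -np.inf)                     # best log-prob
    psi = np.zeros((T, 2), dtype=int)                    # backpointers
    delta[0] = np.log(0.5) + logB[0]                     # uniform prior
    for t in range(1, T):
        for s in (0, 1):
            scores = delta[t - 1] + logA[:, s]
            psi[t, s] = np.argmax(scores)
            delta[t, s] = scores[psi[t, s]] + logB[t, s]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):                       # backtrack
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Noisy frame posteriors for one note: a single weak frame (0.4)
# inside an otherwise confident "on" run.
p = np.array([0.1, 0.2, 0.9, 0.4, 0.95, 0.9, 0.05])
print(smooth_note(p))  # → [0 0 1 1 1 1 0]: the dip at frame 3 stays "on"
```

The sticky self-transitions make a single contradictory frame cheaper to absorb than two state switches, which is exactly the smoothing effect wanted over a 10 ms classification grid.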
• Find the Viterbi state sequence for each note

MEAPsoft
• Music Engineering Art Projects: a collaboration between Electrical Engineering and the Computer Music Center
with Douglas Repetto, Ron Weiss, and the rest of the MEAP team

Conclusions
[Diagram: music audio → low-level features (melody and notes, key and chords, tempo and beat) → classification and similarity (browsing, discovery, production) and music structure discovery (modeling, generation, curiosity)]
• Lots of data + noisy transcription + weak clustering → musical insights?