
Music Information Retrieval with Condor
Scott McCaulay
Joe Rinkovsky
Pervasive Technology Institute
Indiana University
Overview
• PFASC is a suite of applications developed at IU to perform automated similarity analysis of audio files
• Potential applications include organization of digital libraries, recommender systems, playlist generators, and audio processing
• PFASC is a project in the MIR field: an extension and adaptation of traditional Text Information Retrieval techniques to sound files
• Elements of PFASC, specifically the file-by-file similarity calculation, have proven to be a very good fit with Condor

What We'll Cover
• Condor at Indiana University
• Background on Information Retrieval and Music Information Retrieval
• The PFASC project
• PFASC and Condor: experience to date and results
• Summary

Condor at IU
• Initiated in 2003
• Utilizes 2,350 Windows Vista machines from IU's Student Technology Clusters
• Minimum 2 GB memory, 100 Mb network
• Available to students at 42 locations on the Bloomington campus, 24 x 7
• Student use is the top priority; Condor jobs are suspended immediately on use

Costs to Support Condor at IU
• Marginal annual cost to support the Condor pool at IU is < $15K
• Includes system administration, head nodes, and file servers
• Purchase and support of STC machines are funded from Student Technology Fees

Challenges to Making Good Use of Condor Resources at IU
• Windows environment
  – The research computing environment at IU is geared to Linux, or to exotic architectures
• Ephemeral resources
  – Machines are moderately to heavily used at all hours; longer jobs are likely to be preempted
• Availability of other computing resources
  – Local users are far from starved for cycles, so there is limited motivation to port
Examples of Applications Supported on Condor at IU
• Hydra Portal (2003)
  – Job submission portal
  – Suite of bio apps: BLAST, MEME, fastDNAml
• Condor Render Portal (2006)
  – Maya, Blender video rendering
• PFASC (2008)
  – Similarity analysis of audio files
Information Retrieval - Background
• Science of organizing documents for search and retrieval
• Dates back to the 1880s (Hollerith)
• Vannevar Bush, first US presidential science advisor, presages hypertext in "As We May Think" (1945)
• The concept of automated text document analysis, organization and retrieval was met with a good deal of skepticism until the 1990s. Some critics now grudgingly concede that it might work
Calculating Similarity
The Vector Space Model
• Each feature found in a file is assigned a weight based on the frequency of its occurrence in the file and how common that feature is in the collection
• Similarity between files is calculated based on common features and their weights. If two files share features not common to the entire collection, their similarity value will be very high
• This vector space model (Salton) is the basis of many text search engines, and also works well with audio files
• For text files, features are words or character strings. For audio files, features are prominent frequencies within frames of audio, or sequences of frequencies across frames.
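
A minimal sketch of this model in Python, assuming classic tf-idf weighting and cosine similarity in the Salton tradition; PFASC's exact weighting and similarity functions are not spelled out in these slides and may differ:

import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of feature lists (words for text, frame features for audio)."""
    n = len(docs)
    df = Counter()                       # in how many files does each feature occur?
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)                # frequency of each feature in this file
        # idf = log(n/df) is 0 for features found in every file, so only
        # shared-but-uncommon features can drive similarity up.
        vectors.append({f: tf[f] * math.log(n / df[f]) for f in tf})
    return vectors

def cosine(u, v):
    """Similarity in [0.0, 1.0] for non-negative weights; 1.0 = identical."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

Because the idf term zeroes out features present in every file, two files gain nothing from collection-wide features, while shared rare features push the score up, matching the behavior described above.
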
Some Digital Audio History
• Uploaded to Compuserve 10/1985
  – one of the most popular downloads at the time!
• 10 seconds of digital audio
• Time to download (300 baud): 20 minutes
• Time to load: 20 minutes (tape), 2 minutes (disk)
• Storage space: 42K
• From this to Napster in less than 15 years
Explosion of Digital Audio
[Chart: RIAA Sales Figures (millions), physical vs. digital, 1998-2008]
• Digital audio today is similar to text 15 years ago
• Poised for a 2nd phase of the digital audio revolution?
  – Ubiquitous, easy to create, access, share
  – Lack of tools to analyze, search, or organize
How can we organize this enormous and growing volume of digital audio data for discovery and retrieval?
What's done today
• Pandora - Music Genome Project
  – expert manual classification of ~400 attributes
• Allmusic
  – manual artist similarity classification by critics
• last.fm - Audioscrobbler
  – collaborative filtering from user playlists
• iTunes Genius
  – collaborative filtering from user playlists
What's NOT done today
• Any analysis (outside of research) of similarity or classification based on the actual audio content of song files
Possible Hybrid Solution
[Diagram: Automated Analysis, User Behavior, Manual Metadata]
• A classification/retrieval system could use elements of all three methods to improve performance
Music Information Retrieval
• Applying traditional IR techniques for classification, clustering, similarity analysis, pattern matching, etc. to digital audio files
• A recent field of study; it has accelerated with the inception of the ISMIR conference in 2000 and the MIREX evaluation in 2004

Common Basis of an MIR System
• Select a very small segment of audio data, 20-40 ms
• Use a fast Fourier transform (FFT) to convert it to frequency data
• This 'frame' of audio becomes the equivalent of a word in a text file for similarity analysis
• The output of this 'feature extraction' process is input to various analysis or classification processes
• PFASC additionally combines prominent frequencies from adjacent frames to create temporal sequences as features
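
A minimal sketch of that pipeline, assuming mono samples in a NumPy array; the frame length, window, and number of peaks below are illustrative choices, not PFASC's actual parameters:

import numpy as np

def extract_features(samples, rate=44100, frame_ms=25, n_peaks=3):
    """Turn raw audio into 'words': top FFT bins per frame, plus frame pairs."""
    frame_len = int(rate * frame_ms / 1000)           # ~25 ms of audio per frame
    features, prev_peaks = [], None
    for start in range(0, len(samples) - frame_len, frame_len):
        frame = samples[start:start + frame_len]
        # Magnitude spectrum of this short, windowed frame.
        spectrum = np.abs(np.fft.rfft(frame * np.hanning(frame_len)))
        # The most prominent frequency bins act as the frame's 'word'.
        peaks = tuple(int(i) for i in np.argsort(spectrum)[-n_peaks:])
        features.append(peaks)
        # Pair adjacent frames to capture temporal sequences, as PFASC does.
        if prev_peaks is not None:
            features.append(prev_peaks + peaks)
        prev_peaks = peaks
    return features
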

PFASC as an MIR Project
• Parallel Framework for Audio Similarity Clustering
• Initiated at IU in 2008
• Team includes the School of Library and Information Science (SLIS), Cognitive Science, the School of Music, and the Pervasive Technology Institute (PTI)
• Have developed an MPI-based feature extraction algorithm, SVM classification, vector space similarity analysis, and some preliminary visualization
• Wish list includes a graphical workflow, a job submission portal, and use in MIR classes

PFASC Philosophy and Methodology
• Provide an end-to-end framework for MIR, from workflow to visualization
• Recognize temporal context as a critical element of audio and a necessary part of feature extraction
• Simple concept, simple implementation: one highly configurable algorithm for feature extraction
• Dynamic combination and tuning of results from multiple runs, with user-controlled weighting (see the sketch after this list)
• Make good use of available cyberinfrastructure
• Support education in MIR
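
For the combination-and-weighting point above, a minimal sketch, assuming each run produces an n x n similarity matrix and that a plain weighted average is an acceptable combiner; PFASC's actual combination logic is not shown in these slides:

import numpy as np

def combine_runs(sim_matrices, weights):
    """Weighted average of per-run n x n similarity matrices."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                            # normalize user-supplied weights
    stack = np.stack(sim_matrices)             # shape: (runs, n, n)
    return np.einsum('r,rij->ij', w, stack)    # one combined n x n matrix

A user could, for example, weight a run built on temporal-sequence features more heavily than a run built on single-frame features, without re-running feature extraction.
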
PFASC Feature Extraction Example
[Chart: most prominent frequencies across the spectrum for folk, hiphop, and rock files; y-axis 0.0-1.0]
Summary of 450 files classified by genre, showing the most prominent frequencies across the spectrum
PFASC Similarity Matrix Example

             Hip Hop   Folk    Rock
  Hip Hop     0.115    0.049   0.042
  Folk        0.049    0.087   0.024
  Rock        0.042    0.024   0.168

• Audio file summarized as a vector of feature values; similarity calculated between vectors
• Values are between 0.0 and 1.0: 0.0 = no commonality, 1.0 = files are identical
• In the above example, same-genre files had similarity scores 3.352 times higher than different-genre files
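
Genre-level numbers like those above can be derived from the full file-by-file matrix. A sketch of one straightforward aggregation, assuming a NumPy similarity matrix and one genre label per file; the exact aggregation behind the 3.352 figure is not shown in these slides:

import numpy as np

def same_vs_different_ratio(sim, genres):
    """Mean same-genre similarity divided by mean different-genre similarity."""
    g = np.asarray(genres)
    same = g[:, None] == g[None, :]            # True where two files share a genre
    not_self = ~np.eye(len(g), dtype=bool)     # exclude each file vs. itself
    return sim[same & not_self].mean() / sim[~same].mean()
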

Classification vs. Clustering
• Most work in MIR involves classification, e.g. genre classification, an exercise that may be arbitrary and of limited value
• Calculating similarity values among all songs in a library may be more practical for music discovery, playlist generation, and grouping by combinations of selected features
• Calculating similarity is MUCH more computationally intensive than categorization: comparing all songs in a library of 20,000 files requires 20,000 x 19,999 / 2 ≈ 200 million comparisons

Using Condor for Similarity Analysis
• Good fit for IU Condor resources: a very large number of short-duration jobs (sketched below)
• Jobs are independent, and can be restarted and run in any order
• The large number of available machines provides a great wall-clock performance advantage over IU supercomputers
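
A sketch of how that workload can be cut into Condor-sized pieces, assuming a hypothetical chunk size; each chunk of index pairs would become one short, independently restartable job:

from itertools import combinations, islice

def job_chunks(n_files, pairs_per_job=10000):
    """Yield lists of (i, j) file-index pairs; each list is one Condor job."""
    pairs = combinations(range(n_files), 2)    # all n*(n-1)/2 unordered pairs
    while True:
        chunk = list(islice(pairs, pairs_per_job))
        if not chunk:
            break
        yield chunk

# 20,000 files -> 199,990,000 pairs; at 10,000 pairs per job that is
# ~20,000 short jobs that can run, restart, and finish in any order.
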

PFASC Performance and Resources
• A recent run of 450 jobs completed in 16 minutes; the same work run serially on a desktop machine would have taken about 19 hours
• The largest run to date contained 3,245 files, over 5 million song-to-song comparisons, and completed in less than eight hours; it would have taken over 11 days on a desktop
• Queue wait time for 450 processors on IU's Big Red is typically several days; for 3,000+ processors it would be up to a month

Porting to Windows
Visualizing Results
PFASC Contributors
• Scott McCaulay (Project Lead)
• Ray Sheppard (MPI Programming)
• Eric Wernert (Visualization)
• Joe Rinkovsky (Condor)
• Steve Simms (Storage & Workflow)
• Kiduk Yang (Information Retrieval)
• John Walsh (Digital Libraries)
• Eric Isaacson (Music Cognition)

Thank you!