Aspects of Music Information Retrieval
Will Meurer
School of Information
University of Texas

Music Information Retrieval (MIR)
- MIR Overview
- Challenges in MIR
- Current MIR Technology
- Possibilities & Concerns
- Recommendations
- Final Remarks

MIR Overview
- Currently MIR is chiefly bibliographic
- How is music so different?
- Downie’s 7 facets
  - Pitch, Temporal, Harmonic, Timbral, Editorial, Textual*, and Bibliographic*
- Representations
  - Visual (musical scores, manuscripts)
  - Aural (digital music)
  - Text
  - Hybrid (visual representation of an audio file)
* Used in current mainstream MIR systems

MIR Overview: Facets

MIR Overview: Visual Representations
- Common Music Notation
- Tablature

  E---------------0---------3-3------3--1-1-------------3-1-1---------------
  B---1-------------3-2-----2-2------2--3-3-----3-2---2---3-3------------1--
  G---0--0--0h1-------0-----0-0------0--2-2---2---0-0-----2-2------------2--
  D---2--2------------------------------0---0-------------0----0-1-2-3-3---3
  A-3-------------------0-0-----0--0--------------0-------------------------
  E-------------0------------------------------------------------------------

MIR Overview: User Groups
- General music listeners
- Music students, performers, composers, and conductors; music therapists; musicologists; music librarians and library patrons; audio engineers; scholars; researchers; and intellectual property lawyers

Challenges in MIR
- Began in the 1950s, still an “emerging discipline”
- Subjectivity and versioning
- Many levels of music knowledge
- No standardization
  - No standard test collection (HNH Naxos)
  - No standardized sets of performance
  - No standardized evaluation metrics
- Lack of bibliographic control (Downie’s site)
- No communication among interested disciplines

Current MIR Technology: Aural Queries
- Query By Humming (QBH) systems
  - Input: aural melody
  - Matches interval sequences to index terms
- Musart (Bartsch et al., 2003) matches melody, harmony, and rhythm

Current MIR Technology: Indexing for Aural Queries
- Thematic melodies are extracted from the
source (Beginning of Beethoven’s 9th Symphony)
- Translated into text representations of intervals, pitch, and harmony (e.g. EEFGGFEDCCDEEDD)
- Text versions shrink index size; audio indexing is expensive and needs more processing to match queries
- Musart extracts thematic material automatically by finding common passages
- N-gramming

Current MIR Technology: N-gramming
- “Chunks” search terms
- Compares search “chunks” to indexed “chunks”
- Example:
  - Indexed melody is CCGGAAG-FFEEDDC- (Twinkle, Twinkle, Little Star)
  - Searcher hummed CCGGAAG-FFECDDC
  - N-gramming this query would still match on CCGGAAG even though FFECDDC was hummed incorrectly
- Provides fault tolerance

Current MIR Technology: Polyphonic Focus
- Monophonic/polyphonic queries
  - Doraisamy and Rüger (2002)
  - Evaluated monophonic queries against a polyphonic database
  - Results were “promising”
- Polyphonic/polyphonic queries
  - Musart
  - Flattens chord tones into text codes
  - Does not account for timbral aspects
  - Not suitable for large databases, where more matches are made per query: the more fault tolerance, the more results

Current MIR Technology: Fusing the Representations and Formats
- Need to synchronize data in all formats and representations
- Allows one system to serve many different types of users
- Arifi et al. (2003) synchronized score (visual), MIDI (digital), and PCM (digital audio)

Possibilities & Concerns: Another Facet?
- Jan LaRue, “SHMRG”
- G (Growth) is the overall “form” and how the parts of a piece affect its overall effect
- Growth may be useful to index and search

  LaRue      Downie
  Sound      Timbral
  Harmony    Harmonic
  Melody     Pitch
  Rhythm     Temporal
  Growth     ????

- Editorial, Bibliographic, and Textual work within and between LaRue’s S, H, M, and R

Possibilities & Concerns
- Further effects of copyright laws
- Interfaces and usability
  - Current focus is on technology, not usability
  - Dixon, Pampalk & Widmer (2003)
    - Browse multiple views simultaneously
    - Unnatural, awkward interface

Possibilities & Concerns: Navigation?

Recommendations
- Downie & Olson (2003), Chopin Early Editions
  - Content-based search features
  - Symbolic content search
  - Optical Music Recognition (OCR for music)
  - Version distinguishing

Recommendations
- Focus must be on why
- Complex problems? Simple solutions.
  - Base the fault tolerance level on the searcher’s aural query precision in past queries
  - Results should display multiple facets: bibliographic, textual, pitch (what key), etc.
  - Results should offer different formats: score, MP3, MIDI
  - Display all versions from the database within each search result

Final Remarks
- Music is a complicated form of information and requires special retrieval systems
- Demand for MIR will increase, and research and funding will follow
- Copyright and lack of standardization may prevent fast growth of MIR development
- MIR technology is improving, but application is lacking
- Interface design and usability must develop as the technology advances
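
Supplement: N-gramming sketch

The N-gramming slide's example (indexed CCGGAAG-FFEEDDC-, hummed CCGGAAG-FFECDDC) can be illustrated with a small Python sketch. This is not how Musart or any cited system is implemented; it is a minimal illustration under assumptions: melodies are plain pitch-letter strings, "chunks" are overlapping 3-character n-grams, and the score is the fraction of query chunks found in the indexed melody. The names `ngrams`, `overlap_score`, `search`, `INDEX`, and `min_score` are hypothetical.

```python
def ngrams(melody, n=3):
    """Return the set of overlapping n-character chunks of a melody string."""
    return {melody[i:i + n] for i in range(len(melody) - n + 1)}


def overlap_score(query, indexed, n=3):
    """Fraction of the query's chunks that also occur in the indexed melody."""
    query_chunks = ngrams(query, n)
    if not query_chunks:
        return 0.0  # query shorter than n: nothing to compare
    return len(query_chunks & ngrams(indexed, n)) / len(query_chunks)


# Toy index built from the two pitch strings quoted on the slides.
INDEX = {
    "Twinkle, Twinkle, Little Star": "CCGGAAG-FFEEDDC-",
    "Beethoven, Symphony No. 9 theme": "EEFGGFEDCCDEEDD",
}


def search(query, index=INDEX, n=3, min_score=0.5):
    """Rank indexed melodies by chunk overlap; keep those at or above min_score.

    Lowering min_score makes the search more fault tolerant and, as the
    slides note, returns more results.
    """
    scored = [(title, overlap_score(query, melody, n))
              for title, melody in index.items()]
    hits = [(title, score) for title, score in scored if score >= min_score]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # The hummed query has one wrong note (C instead of E).
    for title, score in search("CCGGAAG-FFECDDC"):
        print(f"{title}: {score:.2f}")
```

With 3-character chunks, 10 of the 13 chunks in the hummed query still occur in the indexed melody, so the wrong note lowers the score without preventing the match. A threshold such as `min_score` is one place where the recommendation about basing fault tolerance on a searcher's past query precision could be applied.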