Exploring a million hours of sounds
Richard Ranft, The British Library
27 November 2014
Search Solutions 2014

Outline
• the British Library’s audio collections
• discovery and access
• finding one in a million

www.bl.uk

The British Library’s audio collections
• originated in 1955
• national collection of UK record industry
• selected publications from overseas
• radio broadcasts
• unpublished recordings

Subjects
• music
• spoken word
• environments & nature

Extent
• 6 million tracks
• from 1857 to this morning
• many formats
• 115 years of listening

Obstacles to exploring and access
• copyrights
• analogue or offline digital
• many non-digital tracks
• time-based = time consuming
• limited, text-based search
• no serendipity
• high expectations (cf. iTunes, Spotify)

Online consumer audio services
• ‘opacity’ of audio (no freeze-frames!)

Human-led enrichment
• description
• transcription
• annotation
• category tagging
• rating, recommendation & review

Machine enrichment/search
• Categorisation: music genre, language/dialect detection, mood
• Synchronisation: score following, transcript following
• Identification: speaker/vocalist ID, melody recognition, query by humming/tapping
• Non-text browsing: map browse, timeline browse
• Recommendation & matching: melody matching
• Cross-media linking: speaker/tune matching
• Feature extraction: pitch, tempo, chord, time signature, rhythm
• Segmentation/event detection: music/speech segments, speaker/lead instrument change, laughter, applause, emotion detection
• Transcription: speech-to-text, score generation

Discovery and access
• Sound & Moving Image Catalogue sami.bl.uk
• onsite listening:
  – Appointments service
  – SoundServer (200,000 tracks, 3% of total)
• off site listening:
  – BL Sounds website (50,000 tracks, 1%)
    • streaming
    • downloading

Sound & Moving Image Catalogue
sami.bl.uk

BL Sounds
• Improving access and discovery
• http://sounds.bl.uk/

Visualisation and analysis

Current BL projects
• ‘Metable’ software: acquire / describe UK’s digital music, searching via APIs across open music databases (MusicBrainz, Decibel, Discogs)
• COMMA: cloud-based media analysis project with BBC http://www.bbc.co.uk/rd/projects/comma
• Digital Music Lab: analysing and visualising big music data collections http://dml.city.ac.uk/

Digital Music Lab example
Chord detection using Chordino VAMP Plugin (Queen Mary University of London)

English conversation: At the Tobacconist's (1929)
Linguaphone 78 rpm shellac disc
http://sounds.bl.uk/Arts-literature-and-performance/Early-spoken-word-recordings/024M-1CS0011556XX-0200V0

Thanks for listening!
richard.ranft@bl.uk
http://sounds.bl.uk
@soundarchive
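
Note on the figures
The rounded numbers quoted in the deck ("115 years of listening", "3% of total", "1%") can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and not part of the original talk: it assumes the "million hours" in the title is the total duration, that listening is continuous (24 hours a day), and that percentages are taken against the 6 million tracks quoted. The function and constant names are hypothetical.

```python
# Back-of-envelope check of the collection figures quoted on the slides.
# Assumptions (not stated in the deck): one million hours of audio in total,
# played back continuously, and shares measured against 6 million tracks.

HOURS_PER_YEAR = 24 * 365.25  # continuous, round-the-clock listening
TOTAL_HOURS = 1_000_000       # from the talk title
TOTAL_TRACKS = 6_000_000      # from the "Extent" slide


def hours_to_years(hours: float) -> float:
    """Convert a duration in hours to years of non-stop listening."""
    return hours / HOURS_PER_YEAR


def share_percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100.0 * part / total


if __name__ == "__main__":
    print(f"Years of listening: {hours_to_years(TOTAL_HOURS):.1f}")
    print(f"SoundServer share:  {share_percent(200_000, TOTAL_TRACKS):.1f}%")
    print(f"BL Sounds share:    {share_percent(50_000, TOTAL_TRACKS):.1f}%")
```

Under these assumptions the results come out at roughly 114 years, 3.3% and 0.8%, which is consistent with the rounded values on the slides.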