readme - LDC Catalog

Lincoln Laboratory Handset Database (LLHDB)

Recorded at
MIT Lincoln Laboratory
Speech Systems Technology Group

This corpus is delivered "as is" and no claims are made for specific
suitability. The data may be used for research purposes only and may not
be further distributed or transmitted without the written consent of MIT
Lincoln Laboratory. Use of this data implies agreement with the above
conditions.

Introduction
------------

The LLHDB corpus consists of recordings of people speaking into different
telephone handsets. The aim was to create a corpus for the study of
telephone transducer effects on speech which minimized confounding
factors, such as variable telephone channels and background noise. LLHDB
was created by having volunteers speak prompted and extemporaneous speech
into different transducers in a sound-proof room and directly digitizing
the output from the transducers on a SunSparc A/D at an 8 kHz sampling
rate and 16-bit resolution.

There were three types of speech recorded for each handset. First, the
speaker read the "rainbow passage" [Nolan 83], a ninety-seven word passage
sometimes used in phonetic research. Second, the speaker read 10 sentences
extracted from the TIMIT corpus. (Each speaker was assigned to one of the
TIMIT speakers and was prompted to read each of that TIMIT speaker's ten
sentences.) Finally, the speaker was asked to describe a photograph for
approximately 40 seconds. (A different photograph was used for each
handset.)

LLHDB contains speech from 53 speakers (24 males and 29 females) recruited
from the Laboratory.

Ten transducers were used, as described in the table below. Most of the
telephone handsets are not new (except el2) and were obtained from the
Lincoln Telecom office. Handsets with obvious damage were not used, but in
order to obtain some diversity with a limited number of handsets, handsets
were selected to have variable sound characteristics, transducer designs
or, in the case of electrets, different grill designs.
For example, cb1-cb3 have the same handset manufacturer name (NT G-type)
but the carbon-button transducer is different in each. In addition, cb3
and cb4 were selected because they had particularly poor (although not
pathological) sound characteristics.

Table 1: Transducers used in corpus.

--------------------------------------------------------------------------
Transducer Name | Description
----------------|---------------------------------------------------------
senh            | Sennheiser head-mounted microphone
----------------|---------------------------------------------------------
pt1             | Sony portable (cord-less) telephone
----------------|---------------------------------------------------------
el1             | Northern-Telecom Unity electret (3-line grill)
----------------|---------------------------------------------------------
el2             | Northern-Telecom Unity Noisy-Environment electret
                | (2-line grill)
----------------|---------------------------------------------------------
el3             | Unknown manufacturer electret (64-hole grill)
----------------|---------------------------------------------------------
el4             | Radio Shack Chronophone-255 electret telephone
----------------|---------------------------------------------------------
cb1             | Northern-Telecom G-type carbon-button
                | (center hole membrane transducer)
----------------|---------------------------------------------------------
cb2             | Northern-Telecom G-type carbon-button
                | (6 hole metal transducer)
----------------|---------------------------------------------------------
cb3             | Northern-Telecom G-type carbon-button
                | (6 hole membrane transducer)
----------------|---------------------------------------------------------
cb4             | ITT carbon-button (6 hole membrane/attached transducer)
--------------------------------------------------------------------------

The handsets are the same ones used in the collection of the HTIMIT corpus
(also available through the LDC).
It is thus possible to compare the effects of artificially creating
transducer degradations by playing speech through handsets to people
speaking into handsets.

Data Organization
-----------------

The files are organized in the following hierarchy:

        <Handset1>  <Handset2>  ...  <Handset10>
            ________|___________
           /        |           \
       <spkr1>   <spkr2>  ...  <spkr53>
            ______|___________
           /      |           \
       sa1.sph  sa2.sph  ...  extemp.sph

The following TIMIT-style naming convention is used.

    <HANDSET>/<SEX><SPEAKER_ID>/<SENTENCE_ID>.<FILE_TYPE>

where,

    HANDSET :== cb1 | cb2 | cb3 | cb4 | el1 | el2 | el3 | el4 | pt1 | senh
                (see Table 1 for handset code description)

    SEX :== m | f

    SPEAKER_ID :== <INITIALS><DIGIT>

        where,
            INITIALS :== speaker initials, 3 letters
            DIGIT :== number 1-9 to differentiate speakers with
                      identical initials

    SENTENCE_ID :== <TEXT_TYPE><SENTENCE_NUMBER> | rainbow | extemp

        where,
            TEXT_TYPE :== sa | si | sx
                          (see TIMIT documentation for text type
                          description)
            SENTENCE_NUMBER :== 1 ... 2342

    FILE_TYPE :== sph | txt

        where,
            sph :== Speech waveform file with NIST Sphere header
            txt :== Text of TIMIT sentences (not transcriptions of what
                    was actually said)

Example:

    cb1/mdar/sa1.sph

    (carbon-button 1 handset, male speaker, speaker-ID "dar", sentence
    text "sa1", speech waveform file)

The doc directory contains the following files:

  - rainbow.txt : The text of the rainbow passage spoken by all speakers
                  on all handsets.
  - spkrs.lst   : A list of the speakers' initials, sex and birth year.
  - icassp97.ps : A Postscript version of an ICASSP paper describing the
                  HTIMIT and LLHDB collection procedures.

Updates:

This 2 CD-ROM set is a reprint of Lincoln Laboratory Handset Database
(LLHDB), produced by Linguistic Data Consortium, catalog number
LDC1998S68, ISBN 1-58563-136-1. Relative to the original CD-ROMs produced
in 1998 by the Linguistic Data Consortium, the extension of the audio
files was changed from ".wav" to ".sph".
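
Illustrative example (not part of the original distribution):

The naming convention given under Data Organization can be checked
mechanically. The Python sketch below splits a relative path into its
handset, sex, speaker-ID, sentence-ID and file-type fields. The names
parse_llhdb_path and LLHDBPath are hypothetical. The sketch also assumes
lower-case names, forward-slash separators, and that the trailing digit of
the speaker-ID is optional (the example path cb1/mdar/sa1.sph has none);
none of these assumptions is stated by the corpus documentation itself.

```python
import re
from typing import NamedTuple, Optional

# Handset codes from Table 1.
HANDSETS = ("cb1", "cb2", "cb3", "cb4", "el1", "el2", "el3", "el4", "pt1", "senh")

# <HANDSET>/<SEX><SPEAKER_ID>/<SENTENCE_ID>.<FILE_TYPE>
# Assumption: the disambiguating digit after the 3-letter initials is optional.
_PATH_RE = re.compile(
    r"^(?P<handset>" + "|".join(HANDSETS) + r")/"
    r"(?P<sex>[mf])(?P<speaker_id>[a-z]{3}[1-9]?)/"
    r"(?P<sentence_id>(?:sa|si|sx)[0-9]+|rainbow|extemp)"
    r"\.(?P<file_type>sph|txt)$"
)


class LLHDBPath(NamedTuple):
    """Fields of one corpus path (hypothetical helper type)."""
    handset: str
    sex: str
    speaker_id: str
    sentence_id: str
    file_type: str


def parse_llhdb_path(path: str) -> Optional[LLHDBPath]:
    """Return the parsed fields, or None if the path does not fit the convention."""
    match = _PATH_RE.match(path)
    if match is None:
        return None
    sentence_id = match.group("sentence_id")
    if sentence_id not in ("rainbow", "extemp"):
        # SENTENCE_NUMBER is documented as 1 ... 2342.
        if not 1 <= int(sentence_id[2:]) <= 2342:
            return None
    return LLHDBPath(**match.groupdict())


if __name__ == "__main__":
    print(parse_llhdb_path("cb1/mdar/sa1.sph"))
```

Returning None instead of raising an exception makes it easy to filter a
directory listing down to the paths that follow the convention.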
