Seminar on Endangered Languages

Alan W Black, Bob Frederking, Lori Levin and Laura Tomokiyo

We are pleased to announce a new seminar, being offered now, Fall 2010.

The purpose of this seminar is to allow students to better understand the linguistic, social and political issues that arise when working with language technologies for endangered languages. Often in LTI we concentrate on issues of modeling with small amounts of data, or designing optimal strategies for collecting data, but ignore many of the wider practical issues that appear when working with endangered languages.

This seminar will consist of reading books and papers, and having participants give presentations; a few invited talks (e.g. from field linguists and language advocates) will also be included. It will count for 6 units of LTI course credit. It may be possible for interested students to also carry out a related 6-unit project as a lab.

Weekly meetings will be 1.5 hours, at a time suitable for attendees; our initial suggestion is 3:00-4:30pm on Tuesdays, starting this Tuesday, August 24, in GHC 5510. Grades will be based on presentations and class participation.

We list below the topics to be covered, possible guest speakers, and our initial list of possible readings. Please reply to this email if interested, or if you have any questions.

Topics will include:

What are endangered languages
Linguistics of Endangered Languages
    More variation, less information, more mixed with nearby languages
    Sharing knowledge from other linguistically close languages
Sociolinguistic issues
    Preservation and avoiding change vs. natural language change
    Dealing with rival dialects/close languages
    Creolization, sociology of high/low prestige dialects, register/code switching
Which technologies are practical
    What is feasible to construct
    What is actually useful, what isn’t
    What are the technical/formal tools one uses to gather data and represent it
Orthography of low resource languages
    Use of alphabets from other languages (which more people can use)
    Standardization of Languages
Access to and aid from native experts
    Managing their expectations
    Cultural issues: gender issues, reluctance to criticize, activist militancy
    Ethics: making contributions to their communities
    Managing their time and contributions efficiently
    Using non-LT experts to do LT annotation tasks
Evaluation
    How can you measure technical success for one language
    What does success mean globally across all ELs
Data collection
    How can you work with communities to collect information
    How can we work with legacy data where it exists
Sustainability
    How can collection/development continue

Unconfirmed list of possible guest lecturers:

Delyth Prys on Welsh Language Revitalization
Steven Bird on Language Documentation
David Mortensen on Field Linguistics
Meg Noori on Chippewa/Ojibwe

Initial list of likely readings:

Crystal, D. “Language Death”, Cambridge University Press, 2000.
K. Hale, M. Krauss, L. Watahomigie, A. Yamamoto, C. Craig, L. Masayesva Jeanne, and N. England. Endangered Languages. Language 68(1), pp. 1-42, 1992.
Peter Ladefoged. Another view of endangered languages. Language 68(4), pp. 809-811, 1992.
Davel, M. and Barnard, E. “The Efficient Generation of Pronunciation Dictionaries: Human Factors during Bootstrapping”, Interspeech 2004, Jeju, Korea, 2004.
Haspelmath, M., Dryer, M.S., Gil, D., and Comrie, B. (eds) (2005) World Atlas of Language Structures, Oxford University Press. [As a reference work.]
NeSmith, R.
  Keao, “Tūtū’s Hawaiian and the Emergence of a Neo Hawaiian Language”, in ‘Ōiwi Journal, Volume 3, “Huliau (Time of Change)”. Ku‘ualoha Ho‘omanawanui, editor. ISBN 0-9668220-3-X, 2005.
Payne, T. (1997). Describing Morphosyntax: a guide for field linguists, Cambridge University Press.
Rice, K. and Saxon, L. (2002) “Issues of Standardization and Community in Aboriginal Language Lexicography” in William Frawley, Kenneth C. Hill, and Pamela Munro (eds), Making dictionaries: preserving indigenous languages of the Americas. University of California Press, Chapter 6, pp. 125-154.
Schultz, T., Black, A., Badaskar, S., Hornyak, M. and Kominek, J. “SPICE: Web-based tools for Rapid Language Adaptation in Speech Processing Systems”, Interspeech 2007, Antwerp, Belgium, 2007.
Joshua A. Fishman. Language Maintenance and Language Shift as a Field of Inquiry: a Definition of the Field and Suggestions for its Further Development. Linguistics 2(9), Pages 32–70, ISSN (Online) 1613-396X, ISSN (Print) 0024-3949, DOI: 10.1515/ling.1964.2.9.32, 1964. Published Online: 19/11/2009. http://www.referenceglobal.com/doi/abs/10.1515/ling.1964.2.9.32
Thomason, Sarah G. and Terrence Kaufman (1988). Language contact, creolization, and genetic linguistics. Berkeley: University of California Press. ISBN 0-520-07893-4.
“The Linguists.” PBS documentary film on two linguists trying to save the world’s endangered languages. http://thelinguists.com/
Sherwani CMU PhD thesis: Speech Interfaces for Information Access by Low-Literate Users in the Developing World. http://www.cs.cmu.edu/~jsherwan/JS-thesis.pdf
Font-Llitjos CMU PhD thesis: Interactive and Automatic Refinement of Translation Rules for a Transfer-based MT systems. http://www.cs.cmu.edu/~aria/thesis/FontLlitjosDissertation-2007.pdf
Kominek CMU PhD thesis: TTS from Zero. http://www.lti.cs.cmu.edu/Research/Thesis/john_kominek.pdf

Selections from:

SaltMil conference series. http://ixa2.si.ehu.es/saltmil/index.php/en/activities-mainmenu73/saltmil-workshops-mainmenu-77
Aflat conference series. http://aflat.org/
The 1st International Conference on Language Documentation and Conservation (ICLDC). http://nflrc.hawaii.edu/ICLDC/2009/
Saving Languages: An introduction to language revitalization. Lenore A. Grenoble and Lindsay J. Whaley, Cambridge University Press. 2006.
Language and National Identity in Africa. Andrew Simpson (ed.), Oxford University Press. 2008.
Language and National Identity in Asia. Andrew Simpson (ed.), Oxford University Press. 2007.