Introducing Phon 1.4
Yvan Rose
Memorial University of Newfoundland

Acknowledgements
- Funding: Current development of Phon and PhonBank is supported by the National Institutes of Health. Earlier development of Phon was funded by grants from the National Science Foundation, the Canada Foundation for Innovation, the Social Sciences and Humanities Research Council of Canada, the Petro-Canada Fund for Young Innovators, and the Office of the Vice-President (Research) and the Faculty of Arts at Memorial University of Newfoundland.
- Dictionaries: Built-in dictionaries of pronounced forms were obtained from several organizations (see http://phon.ling.mun.ca/phontrac/ for details).
- Special thanks: We owe special thanks to a wonderful group of early adopters and beta testers. These include researchers and PhD students from:
  - Universidade de Lisboa (Laetitia Almeida, Susana Correia, Teresa da Costa, Maria João Freitas)
  - Universitat Autònoma de Barcelona (Joan Borràs Comes, Ana Estrella, Maria del Mar Vanrell, Pilar Prieto, Jill Thorson)
  - Center for Advanced Research in Theoretical Linguistics (Helene Nordgård Andreassen, Bruce Morén)
  - Universiteit Leiden (Claartje Levelt)
  - Radboud Universiteit Nijmegen (Paula Fikkert, Nicole Altvater-Mackensen)
  - Université Lumière Lyon 2 (Christophe dos Santos, Sophie Kern)
  - Université Paris 3 (Aliyah Morgenstern, Naomi Yamaguchi)
  - Université Paris 10 (Christophe Parisse)
  - UBC (May Bernhardt, Joe Stemberger)
  - Memorial University of Newfoundland (Lindsay Babcock, Christine Champdoizeau, Carla Peddle, Erica Davis, Sarah …

Phon development
- Development teams:
  - Phon team at Memorial University of Newfoundland
  - CHILDES team at Carnegie Mellon University
- Design and implementation criteria:
  - Reliability
  - Simplicity
  - Flexibility/Neutrality (no analytical bias)
  - Compatibility, Extensibility
  - Availability
- Phon can be used for all types of transcription-

Technical overview
- Programmed in Java
- Main computer platforms supported
- Unicode compliant
- XML (TalkBank) data structure
- Working toward
  compatibility with other TalkBank-compliant applications
- Support for IPA characters and diacritics
- Most features integrated within a single interface
- Open-source

Phon’s interface (r)evolution
- Early interfaces posed a number of problems
  - Cluttered in many ways
  - Not very flexible
- Improvement of user experience, mostly based on user feedback
  - Comfort
  - Flexibility
- Additional refinements from tools available in the open-source universe

Look back: Interface in Phon 1.0, 1.1

Look back: Interface in Phon 1.2, 1.3

Phon 1.4: Interface improvements
- Streamlined visuals
- External media player (…table; optional)
- Flexible, user-defined interface
- Waveform visualization

Workflow supported*
- Project management
- Media linkage & segmentation
- Data transcription
- Transcript validation
- Syllabification and alignment
- Corpus query
- Query results visualization & management

Project management
- Project structure: a project contains one or more corpora (Corpus 1, Corpus 2, … Corpus n), and each corpus contains one or more transcripts
- Project management from within the application
- Ability to move/copy transcripts across corpora/projects

Media linkage and segmentation
- For projects based on multimedia data
- Linkage of media file to transcript
- Identification of the time intervals that are relevant for research, for each participant
- Media playback
  - Whole media
  - Segmented portion
  - (Scene playback not yet implemented in 1.4)

Data transcription
- Support for IPA transcriptions
  - Built-in IPA chart
  - Dictionaries of pronounced forms
  - Languages supported: Catalan, German, English, French, Icelandic, Italian, Dutch, and Spanish
- Support for ‘sandhi’ rules
  - English plurals (e.g. cat[s] versus dog[z])
  - French contractions (e.g.
    l’ami)
- Word grouping (for sub-utterance segmentation)
- Ability to export sound/video clips
  - Facilitates access to acoustic measurements
  - (Also useful for presentation purposes)
- Integrated system for multiple-blind transcription

Transcript validation
- Required under the multi-blind protocol
- Method based on comparisons between multiple blind transcriptions
- Integrated within the session editor
- Best performed by a team of transcript validators
  - Simultaneous listening to the media
  - Export of sound clips for acoustic measurement
  - Selection of the most accurate transcription

Syllabification and alignment
- Automatic labeling of segments for syllable information
  - Support for various languages and various theoretical assumptions
- Automatic alignment of transcribed phones in IPA
  - Target-Actual pairs of transcribed words
  - Required for comparison, process identification

Data compilations
- Search ‘plugin’ system
  - Ability to create own search plugin without reprogramming Phon
  - Based on JavaScript (ECMAScript) scripting
  - Support for text, phonological expressions and regular expressions
  - Built-in script editor
- Persistent search results
  - Queries and results saved in a relational database
- Inventories (phones, syllable types, stress patterns)
- Textual or phonological data
  - Character strings, feature sets, syllable positions
- Syllable types
- Aligned phones and groups
- Consonant and vowel harmony
- Consonant metathesis
- Combinations of the above

Data visualization and reporting
- Search results integrated with the session editor
  - Session opens as results are displayed
- Results summaries
  - Clipboard-accessible (for quick usage in other applications)
- Report format
  - Several file formats supported
  - Various result types combined into reports

Let’s see some real action!
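As a rough illustration of what a data compilation over aligned Target-Actual phones involves, here is a minimal Python sketch. It is not Phon's search plugin API or its data format: the record layout, the sample transcriptions, and the function names `compile_inventory` and `substitutions` are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical aligned (target, actual) phone pairs, one list per word.
# None marks a phone with no counterpart (here, a deleted target phone).
aligned_records = [
    [("k", "t"), ("æ", "æ"), ("t", "t")],                # "cat" produced as [tæt]
    [("d", "d"), ("ɔ", "ɔ"), ("g", "d")],                # "dog" produced as [dɔd]
    [("s", None), ("t", "t"), ("ɑ", "ɑ"), ("p", "p")],   # "stop" produced as [tɑp]
]


def compile_inventory(records):
    """Count how often each target phone is aligned with each actual phone."""
    counts = Counter()
    for record in records:
        for target, actual in record:
            counts[(target, actual)] += 1
    return counts


def substitutions(counts):
    """Keep only pairs where both phones exist and differ from each other."""
    return {
        pair: n
        for pair, n in counts.items()
        if pair[0] is not None and pair[1] is not None and pair[0] != pair[1]
    }


inventory = compile_inventory(aligned_records)
for (target, actual), n in sorted(substitutions(inventory).items()):
    print(f"{target} -> {actual}: {n}")
```

The same counting pattern extends to syllable types or stress patterns by swapping the phone pairs for the relevant labels.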
Other useful features
- Multiple undos supported in data tiers
- User-defined tiers
- Data copy to clipboard (tier, record, session)
- Audio/video clip export
- Compatibility with CLAN through data conversion utilities
  - CHAT2XML, XML2Phon
  - (lots accomplished on this front in recent months)

On the horizon: Phon ⇔ Praat
- Integration between Phon and Praat, through TextGrid
  - Acoustic measurements performed within Praat
  - Integration of Praat-generated data in Phon data compilations
  - E.g. get F0, intensity and duration data for all mid vowels in word-initial, stressed syllables
- Fuller vision to be expressed in Paul’s …

Longer Term Goal
- Phonological Data
- Acoustic Data
- CLAN, ELAN, SFS…

Some areas of contribution
- Easier exchange between researchers
- Study and comparisons of corpora
  - Within and across languages, populations, …
- Better understanding of:
  - Linguistic phenomena
  - Acquisition-related patterns
  - Speech impediments
- More efficient educational and clinical interventions

Thanks for your attention!
- Phon and user manual: http://childes.psy.cmu.edu/phon/ and http://phon.ling.mun.ca/phontrac/
- Corpora: http://childes.psy.cmu.edu/data/PhonBank/ and http://childes.psy.cmu.edu/data/PhonBank-Phon/
- Discussion Group: phon@googlegroups.com
- Technical forums: http://phon.ling.mun.ca/phontrac/
- Questions, feedback: yrose@mun.ca