Intro-Phon 1 - CHILDES - Carnegie Mellon University

Introducing Phon 1.4
Yvan Rose
Memorial University of
Newfoundland
Acknowledgements
Funding: Current development of Phon and PhonBank is supported by the
National Institutes of Health. Earlier development of Phon was funded by grants
from the National Science Foundation, the Canada Foundation for Innovation, the Social
Sciences and Humanities Research Council of Canada, the Petro-Canada Fund
for Young Innovators, and the Office of the Vice-President (Research) and the
Faculty of Arts at Memorial University of Newfoundland.
Dictionaries: Built-in dictionaries of pronounced forms were obtained from
several organizations
(see http://phon.ling.mun.ca/phontrac/ for details)
Special thanks: We owe special thanks to a wonderful group of early adopters
and beta testers. These include researchers and PhD students from
Universidade de Lisboa (Laetitia Almeida, Susana Correia, Teresa da Costa,
Maria João Freitas); Universitat Autònoma de Barcelona (Joan Borràs Comes,
Ana Estrella, Maria del Mar Vanrell, Pilar Prieto, Jill Thorson); Center for
Advanced Research in Theoretical Linguistics (Helene Nordgård Andreassen,
Bruce Morén); Universiteit Leiden (Claartje Levelt); Radboud Universiteit
Nijmegen (Paula Fikkert, Nicole Altvater-Mackensen); Université Lumière Lyon
2 (Christophe dos Santos, Sophie Kern); Université Paris 3 (Aliyah
Morgenstern, Naomi Yamaguchi); Université Paris 10 (Christophe Parisse);
UBC (May Bernhardt, Joe Stemberger); Memorial University of Newfoundland
(Lindsay Babcock, Christine Champdoizeau, Carla Peddle, Erica Davis, Sarah
Phon development
Development teams:
Phon team at Memorial University of
Newfoundland
CHILDES team at Carnegie Mellon University
Design and implementation criteria
Reliability
Simplicity
Flexibility/Neutrality (no analytical bias)
Compatibility, Extensibility
Availability
Phon can be used for all types of transcription-
Technical overview
Programmed in Java
Main computer platforms supported
Unicode compliant
XML (TalkBank) data structure
Working toward compatibility with other
TalkBank-compliant applications
Support for IPA characters and diacritics
Most features integrated within a unique
interface
Open-source
Phon’s interface (r)evolution
Early interfaces posed a number of
problems
Cluttered in many ways
Not very flexible
Improvement of user experience, mostly
based on user feedback
Comfort
Flexibility
Additional refinements from tools
available in the open-source universe
Look back: Interface in Phon
1.0, 1.1
Look back: Interface in Phon
1.2, 1.3
Phon 1.4: Interface
improvements
Streamlined visuals
External media player
(…table; optional)
Flexible, user-defined
interface
Waveform visualization
Workflow supported*
Project management
Media linkage & segmentation
Data transcription
Transcript validation
Syllabification and alignment
Corpus query
Query results visualization & management
Project management
Project structure:
Project
  Corpus 1
    Transcript(s)
  Corpus 2
    Transcript(s)
  Corpus n…
    Transcript(s)
Project management from within the
application
Ability to move/copy transcripts across
corpora/projects
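The project, corpus, and transcript hierarchy described above can be pictured as nested containers. The following is a minimal sketch only: the names `Project`, `add_transcript`, and `move_transcript` are hypothetical illustrations and do not reflect Phon's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in: maps corpus name -> {transcript name -> content}."""
    name: str
    corpora: dict = field(default_factory=dict)

    def add_transcript(self, corpus, transcript, content=""):
        # Create the corpus on first use, then store the transcript in it.
        self.corpora.setdefault(corpus, {})[transcript] = content


def move_transcript(src, src_corpus, name, dst, dst_corpus, copy=False):
    """Move (or copy) one transcript between corpora, possibly across projects."""
    if src is dst and src_corpus == dst_corpus:
        return  # same location: nothing to do
    content = src.corpora[src_corpus][name]
    dst.add_transcript(dst_corpus, name, content)
    if not copy:
        del src.corpora[src_corpus][name]


project_a = Project("ProjectA")
project_a.add_transcript("Corpus 1", "session-01", "...")
project_b = Project("ProjectB")
move_transcript(project_a, "Corpus 1", "session-01", project_b, "Corpus 2")
```

After the call, `session-01` lives under `Corpus 2` of `ProjectB` and no longer under `Corpus 1` of `ProjectA`; passing `copy=True` would keep both.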
Media linkage and
segmentation
For projects based on multimedia data
Linkage of media file to transcript
Identification of the time intervals that
are relevant
for research, for each participant
Media playback
Whole media
Segmented portion
(Scene playback not yet implemented
in 1.4)
Data transcription
Support for IPA transcriptions
Built-in IPA chart
Dictionaries of pronounced forms
Languages supported: Catalan,
German, English, French, Icelandic,
Italian, Dutch, and Spanish
Support for ‘sandhi’ rules
English plurals (e.g. cat[s] versus
dog[z])
French contractions (e.g. l’ami)
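To make the English plural example concrete, here is a minimal sketch of the general rule behind cat[s] versus dog[z]: the suffix surfaces as [ɪz] after sibilants, [s] after other voiceless consonants, and [z] elsewhere. The phone sets and function names are illustrative assumptions and say nothing about how Phon itself encodes its rules.

```python
# Illustrative, incomplete phone classes (assumption for this sketch).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_suffix(final_phone):
    """Pick the plural allomorph from the last phone of the stem."""
    if final_phone in SIBILANTS:
        return "ɪz"
    if final_phone in VOICELESS:
        return "s"
    return "z"  # voiced consonants and vowels


def pluralize(phones):
    """Return a new phone list with the plural suffix appended."""
    return phones + [plural_suffix(phones[-1])]


print("".join(pluralize(["k", "æ", "t"])))  # kæts
print("".join(pluralize(["d", "ɔ", "g"])))  # dɔgz
```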
Data transcription
Word grouping (for sub-utterance
segmentation)
Ability to export sound/video clips
Facilitates access to acoustic
measurements
(Also useful for presentation purposes)
Integrated system for multiple-blind transcription
Data transcription
Transcript validation
Required under the multi-blind protocol
Method based on comparisons between
multiple blind transcriptions
Integrated within the session editor
Best performed by a team of transcript
validators
Simultaneous listening to the media
Exporting of sound clip for acoustic
measurement
Selection of the most accurate transcription
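The comparison step can be sketched as follows, assuming each record holds one blind transcription per transcriber. The function names and the record layout are hypothetical; the final choice is left to the human validators, as on the slide.

```python
from collections import Counter


def needs_validation(blind):
    """True when the blind transcribers disagree on this record."""
    return len(set(blind.values())) > 1


def candidates(blind):
    """Distinct transcriptions with the number of transcribers who produced each."""
    return Counter(blind.values()).most_common()


# Hypothetical record: transcriber name -> IPA transcription.
record = {"transcriber_A": "kæt", "transcriber_B": "kæt", "transcriber_C": "tæt"}
if needs_validation(record):
    for ipa, count in candidates(record):
        print(ipa, count)
```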
Syllabification and alignment
Automatic labeling of segments for
syllable information
Support for:
Various languages
Various theoretical assumptions
Automatic alignment of transcribed
phones in IPA Target-Actual pairs of
transcribed words
Required for comparison, process
identification
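To illustrate what aligning IPA Target and IPA Actual phones means, the sketch below uses plain edit-distance alignment with unit costs. This is a generic stand-in technique chosen for brevity, not a description of Phon's own alignment algorithm; the function name `align` and the gap symbol are assumptions.

```python
def align(target, actual, gap="∅"):
    """Align two phone lists; unmatched phones are paired with the gap symbol."""
    n, m = len(target), len(actual)
    # cost[i][j] = minimal edit cost of aligning target[:i] with actual[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (target[i - 1] != actual[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back from the bottom-right cell to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (target[i - 1] != actual[j - 1]):
            pairs.append((target[i - 1], actual[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((target[i - 1], gap))  # target phone deleted
            i -= 1
        else:
            pairs.append((gap, actual[j - 1]))  # phone inserted in actual form
            j -= 1
    return pairs[::-1]


# Target "blu" produced as "bu": the cluster is reduced.
print(align(["b", "l", "u"], ["b", "u"]))
```

Each resulting pair can then be inspected for processes such as deletion (a target phone paired with the gap) or substitution (two different phones paired together).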
Data compilations
Search ‘plugin’ system
Ability to create own search plugin
without reprogramming Phon
Based on JavaScript (ECMAScript) scripting
Support for text, phonological
expressions and regular expressions
Built-in script editor
Persistent search results
Queries and results saved in a
relational database
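The idea of running a regular-expression query and keeping both the query and its results in a relational database can be sketched as below. Everything here is an assumption for illustration: the record layout, the `results` table, and the use of Python with SQLite are not Phon's actual scripting interface or storage schema.

```python
import re
import sqlite3

# Hypothetical records: (record id, IPA transcription).
records = [
    ("rec1", "ˈkæt"),
    ("rec2", "ˈdɔg"),
    ("rec3", "ˈtɹi"),
]


def run_query(pattern, records):
    """Return (record id, matched text) for every regex match in every record."""
    rx = re.compile(pattern)
    return [(rid, m.group(0)) for rid, ipa in records for m in rx.finditer(ipa)]


# An in-memory database keeps the sketch self-contained; a file path would persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (query TEXT, record_id TEXT, match TEXT)")

pattern = r"[kgtd]"  # a few plain stops, for illustration
hits = run_query(pattern, records)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(pattern, rid, match) for rid, match in hits],
)
conn.commit()

saved = conn.execute("SELECT record_id, match FROM results ORDER BY rowid").fetchall()
print(saved)
```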
Data compilations
Inventories (phones, syllable types,
stress patterns)
Textual or phonological data
Character strings, feature sets, syllable
positions
Syllable types
Aligned phones and groups
Consonant and vowel harmony
Consonant metathesis
Combinations of the above
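Two of the compilations listed above, phone inventories and syllable types, can be sketched in a few lines. The vowel set is deliberately small and incomplete, and the function names are hypothetical; this is not how Phon computes its inventories.

```python
from collections import Counter

# Illustrative, incomplete vowel set (assumption for this sketch).
VOWELS = set("aeiouæɔɪʊəɛ")


def phone_inventory(words):
    """Count every phone across a list of transcribed words (lists of phones)."""
    return Counter(phone for word in words for phone in word)


def syllable_type(syllable):
    """Reduce a syllable (list of phones) to its CV shape, e.g. CVC."""
    return "".join("V" if phone in VOWELS else "C" for phone in syllable)


words = [["k", "æ", "t"], ["t", "i"]]
inventory = phone_inventory(words)
shapes = Counter(syllable_type(word) for word in words)  # monosyllabic examples
print(inventory["t"], dict(shapes))
```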
Data visualization and
reporting
Search results integrated with the session
editor
Session opens as results are displayed
Results summaries
Clipboard-accessible (for quick
usage in other applications)
Report format
Several file formats supported
Various result types combined into
reports
Let’s see some
real action!
Other useful features
Multiple undos supported in data tiers
User-defined tiers
Data copy to clipboard (tier, record,
session)
Audio/video clip export
Compatibility with CLAN through data
conversion utilities
CHAT2XML, XML2Phon (lots
accomplished on this front in recent
months)
On the horizon: Phon ⇔
Praat
Integration between Phon and Praat,
through TextGrid
Acoustic measurements performed within
Praat
Integration of Praat-generated data in
Phon data compilations
E.g. Get F0, intensity and duration
data for all mid vowels in word-initial,
stressed syllables
Fuller vision to be expressed in Paul’s
Longer Term Goal
[Diagram: Phonological Data; Acoustic Data; CLAN, ELAN, SFS…]
Some areas of contribution
Easier exchange between researchers
Study and comparisons of corpora
Within and across languages,
populations, …
Better understanding of:
Linguistic phenomena
Acquisition-related patterns
Speech impediments
More efficient educational and clinical
interventions
Thanks for your attention!
Phon and user manual:
http://childes.psy.cmu.edu/phon/
http://phon.ling.mun.ca/phontrac/
Corpora:
http://childes.psy.cmu.edu/data/PhonBank/
http://childes.psy.cmu.edu/data/PhonBank-Phon/
Discussion Group: phon@googlegroups.com
Technical forums: http://phon.ling.mun.ca/phontrac/
Questions, feedback: yrose@mun.ca
Download