Understanding Programming for Phoneticians through Semi-automatic Data Extraction


PTLC2005 Takeshi Ishihara Understanding Programming:1


Takeshi Ishihara University of Edinburgh / Mejiro University

1 Programming in phonetics

After mastering articulatory phonetics and basic speech acoustics, phonetics students usually need to handle a fair amount of acoustic data for quantitative analysis in their projects. In data analysis, it is beneficial for them to carry out semi-automatic data extraction by writing a program, for at least two reasons. One is to save time. Imagine that a student needs to type the durations of target segments and the F0 values of target points from thousands of files into one or more spreadsheets. Doing this by hand is likely to mean a few hours in front of a computer every day for several months, which is laborious. If they write a program instead, their computer can do the job in a minute or less, and they can spend the rest of their time more fruitfully, for example preparing a literature review, writing their thesis or meeting their friends.

The other reason, which is more important, is to reduce the number of errors. Manual work is unavoidably subject to errors. For example, in the case above, a student typing thousands of values into a spreadsheet would be likely to mistype some of them. It would be quite difficult, if not impossible, to find and correct them afterwards.

These errors may or may not be serious. In any case, the fewer errors they make, the better. However, those who have little programming experience usually find it difficult to write a program and, as a consequence, are likely to end up with time-consuming and error-prone manual work.

During the last ten years, along with the ever-lower cost of PCs and the rapid spread of the Internet, several speech processing packages have emerged which are free to end-users and are as good as (or even better than) commercial ones. Among them, Speech Filing System (Huckvale, 1987–2005) and Praat (Boersma and Weenink, 1992–2005) are arguably the best two. They offer a large and diverse set of functions; versions are developed for different platforms, including Windows; and, as already mentioned, they are free of charge (whereas a commercial package could cost a few hundred pounds). Above all, a truly exceptional feature of these two is that they have a built-in scripting language. This is extremely helpful in both teaching and learning phonetic data analysis, since it saves phoneticians the trouble of having to learn a general-purpose programming language.

The remainder of this paper gives an idea of how to teach or understand an approach to semi-automatic phonetic data extraction, using one of the packages. Since computers cannot understand natural human language (as yet), computer programs must always be written in a computer language. However, it is impossible to give a thorough tutorial on writing a program for semi-automatic phonetic data extraction in this article, due to limitations of space. Instead, I will discuss three keys to understanding how to write such a program: variable manipulation, loop, and jump.

Note: The syntax of SFS and Praat scripting is fully described in their documentation, so the details can be obtained there if required. For a tutorial on Praat scripting, see, for example, Ishihara (2004).


2 Keys to programming for phonetic data extraction

Let us suppose the following situation, which a phonetics student is likely to encounter:

1. Speech data are stored as a number of files in a computer.

2. The data are annotated in such a manner that points of time at which some value may be taken are specified.

3. There are some values (e.g. F0, duration, formant frequencies) to be obtained from each file.

What we need, then, is a program which opens all the files one by one, extracts the values needed, and stores them in an output file such as a tabulated text file. The basic structure of such a program is essentially the same whatever acoustic values are being measured. Therefore, once you have learnt the basic structure and the syntax of SFS or Praat scripting, you can start writing programs for your own analyses. Three essentials underlie this basic structure (i.e. how a program works), and I think it is valuable for phonetics students to grasp them before they start writing their own programs in the syntax of SFS or Praat scripting. These essentials (variable manipulation, loop, and jump) are explained below.
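As a rough illustration of this open, extract and store structure, here is a minimal sketch written in Python rather than in SFS or Praat scripting, so that it can be run without either package. Everything in it is hypothetical: the file names, the annotated times (in milliseconds) and the helper names `extract_duration` and `write_table` are invented for the example, and a real script would read the annotations from disk.

```python
import csv
import io

# Hypothetical stand-in for annotation files on disk: each "file" maps
# a label to the time (in milliseconds) at which that label was placed.
ANNOTATIONS = {
    "utt01": {"p1": 10, "p2": 25},
    "utt02": {"p1": 120, "p2": 180},
}


def extract_duration(points):
    """Return the time between the points labelled 'p1' and 'p2'."""
    return points["p2"] - points["p1"]


def write_table(annotations, out):
    """Process every file in turn and write one tab-separated row per file."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["file", "duration_ms"])
    for name in sorted(annotations):
        writer.writerow([name, extract_duration(annotations[name])])


# An in-memory text buffer; a real script would open an output file instead.
table = io.StringIO()
write_table(ANNOTATIONS, table)
print(table.getvalue())
```

Only `extract_duration` would need to change in order to measure a different value; the surrounding structure stays the same.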

2.1 Variables: When we have obtained some phonetic value from the data, we nearly always need to perform some sort of calculation on it. Let us consider a case in which we want to measure the duration of a segment. If a segment starts at 10 milliseconds and ends at 25 milliseconds in a one-second-long utterance, and the beginning and the end of the segment are labelled ‘p1’ and ‘p2’ respectively, the duration of the segment, i.e. the difference between the times of ‘p1’ and ‘p2’, is 15 milliseconds. This is simple if we are working on a single file, because there is only one value each for ‘p1’ and ‘p2’. However, if we were working on multiple files, we would have many different values for ‘p1’ and ‘p2’. In order to deal with them, we use variables.

Variables are names that refer to values, and a value is assigned to each variable somewhere in a program. The variable to which a value has been assigned is then interpolated (i.e. replaced by its value) later in the program, for example to produce the output. In the duration measurement example given above, we can use variables to calculate the duration as shown below (I will use the syntax of Praat scripting in this article for consistency):

    a = 10
    b = 25
    dur = b - a
    print 'dur'

Variables ‘a’ and ‘b’ are assigned the values ‘10’ and ‘25’ respectively ('=' is here the operator for variable assignment, not an equals sign), and the result of the subtraction (b - a) is assigned to the variable 'dur'. The variable 'dur' is then interpolated in the last line to output the result of the subtraction. These four lines make Praat display '15' in its Info window. Note that, when multiple files are processed, a newly obtained value is assigned to each of the variables every time a file is processed, so a separate result is obtained for each file.
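For readers who know another language, the same four steps can be sketched in Python. This is offered only as a comparison and is not Praat syntax: the f-string in the second-to-last line plays roughly the role that the quoted 'dur' plays in the Praat print line, and the name `message` is introduced purely for illustration.

```python
# Assign the onset and offset times (in milliseconds) to variables.
a = 10
b = 25

# Assign the result of the subtraction to a third variable.
dur = b - a

# Substitute the value of the variable into the output text.
message = f"{dur}"
print(message)  # prints 15
```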

2.2 Loop: As mentioned above, we need some mechanism to repeat the duration measurement in order to process multiple files. We use a loop (also called repetitive execution) for this. There are various ways of writing a repetition in both SFS and Praat scripting. A typical one is the 'for' loop, which can be written in Praat scripting as follows:

    for ifile to numberOfFiles
        # (the annotation of file number ifile is opened and selected here)
        # The point indices 1 and 2 below assume that p1 and p2 are
        # the first and second points on tier 1.
        a = Get time of point... 1 1
        b = Get time of point... 1 2
        dur = b - a
        printline 'dur'
    endfor

'ifile' and 'numberOfFiles' above are variables holding numbers: 'ifile' counts the repetitions, starting from 1, and 'numberOfFiles' specifies how many times the instructions between 'for' and 'endfor' are repeated. 'Get time of point...' is a built-in function which obtains the time of a point, and the values assigned to the variables ‘a’ and ‘b’ change depending on where ‘p1’ and ‘p2’ are labelled in each file. Although a loop itself is not complicated (it is normally obvious which instructions in a program are to be repeated), we need to learn how to write repetitive execution precisely in the syntax of the language we use.
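The same repetition can be sketched in Python, again as a comparison rather than as Praat code. The list `files` is hypothetical data standing in for the annotated onset and offset times (in milliseconds) of three files; the counter runs from 1 to the number of files, as the Praat 'for' loop does.

```python
# Hypothetical (onset, offset) times in milliseconds, one pair per file.
files = [(10, 25), (40, 90), (5, 35)]
number_of_files = len(files)

durations = []
for ifile in range(1, number_of_files + 1):
    # New values are assigned to a and b on every pass through the loop.
    a, b = files[ifile - 1]
    dur = b - a
    durations.append(dur)
    print(dur)
```

The instructions inside the loop are written once but executed three times, each time with the values belonging to a different file.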

2.3 Jump: While instructions written in a program are normally executed sequentially, jump (also called conditional execution) is used in some cases where some of the instructions are only executed under certain conditions. In the duration measurement example above, it is necessary to make the computer find where a segment begins and ends in an utterance. In our data, relevant points of time, including the onset and the offset of the segment in question, are already annotated. Thus, what we need to do is to make the computer identify the labels annotated at the beginning and end of the relevant segment, among various labels in each file. This is achieved by using jump, and is written in Praat scripting as:

    for i to numberOfPoints
        labelOfPoints$ = Get label of point... 1 i
        if "p1" = labelOfPoints$
            a = Get time of point... 1 i
        elsif "p2" = labelOfPoints$
            b = Get time of point... 1 i
        elsif "p3" = labelOfPoints$
            c = Get time of point... 1 i
        endif
    endfor
    dur = b - a

It may be difficult for those who are not familiar with Praat scripting to understand the above excerpt from a program. What it tells the computer to do is to check whether a label matches 'p1', 'p2' or 'p3' and, if it does, to execute the appropriate block of instructions (e.g. if the label matches 'p1', then the time of 'p1' is assigned to the variable 'a'), repeating until all the labels in the file have been checked. After all the labels have been checked, the result of the subtraction is assigned to the variable 'dur' in the last line: dur = b - a.
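The label-matching logic can be sketched in Python as follows, once more as a comparison and not as Praat code. The list of (label, time) pairs is hypothetical, and the check for a missing label is an addition of this sketch that does not appear in the Praat excerpt.

```python
def segment_duration(points):
    """Return the time between the points labelled 'p1' and 'p2'.

    `points` is a list of (label, time) pairs, one per annotated point.
    """
    a = b = None
    for label, time in points:
        if label == "p1":
            a = time
        elif label == "p2":
            b = time
        # Points carrying any other label are skipped.
    if a is None or b is None:
        raise ValueError("labels 'p1' and 'p2' must both be present")
    return b - a


# Hypothetical annotation of one file, with times in milliseconds.
points = [("start", 0), ("p1", 10), ("p2", 25), ("p3", 40)]
print(segment_duration(points))  # prints 15
```

The explicit check is a design choice of the sketch: when many files are processed in a loop, a variable that is not reassigned keeps its previous value, so a missing label could otherwise go unnoticed.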

3 Concluding remarks

As pointed out at the beginning of the previous section, the basic structure of a program for semi-automatic phonetic data extraction is essentially the same whatever values are measured, and it is advantageous for phonetics students to understand the three keys (variable manipulation, loop and jump) at the very beginning of learning programming.

While it is always necessary to spend some time learning a computer language by trial and error in order to become fluent in it, I believe that a basic understanding of these three will be the first step for phonetics students towards writing efficient programs for their own projects. This will eventually lead them to perform more advanced phonetic data analyses, such as thorough exploitation of a large speech corpus.

*I would like to thank Kayoko Yanagisawa for her advice and valuable comments. Needless to say, all errors are my own.

References

Boersma, Paul & David Weenink (1992–2005) Praat: doing phonetics by computer. Available online from www.praat.org.

Huckvale, Mark (1987–2005) Speech filing system. Available online from www.phon.ucl.ac.uk/resource/sfs/.

Ishihara, Takeshi (2004) Semi-automatic data extraction in phonetics: an introduction to Praat scripting. Proceedings of Sophia University Linguistic Society, No. 18 (Sophia University, Japan).
