Grading Difficulties of Tasks by Analyzing the State of Speakers' Performance

Naoki Takei
Tokyo Institute of Technology
2-12-1, O-okayama, Meguro-ku, Tokyo, 152-8550, Japan
takei@ryu.titech.ac.jp

1. The purpose of this study

The purpose of this study is to show an objective way of grading tasks according to their difficulty by analyzing speakers' states of performance in executing tasks through principal component analysis (PCA), and to demonstrate the validity of this way of grading by comparing its results with subjective grading systems, such as ACTFL's oral proficiency interview (OPI).

2. Grading tasks

Various studies on task complexity and difficulty have pointed out many factors that affect them (Ellis, 2003; Nunan, 1989; Skehan, 1998). However, these studies did not provide clear and objective standards for grading task complexity or difficulty. Ellis (2003) stated, "grading tasks cannot follow a precise algorithmic process but rather must proceed more intuitively in accordance with a general assessment of task complexity" (p. 227). This is not to say that qualitative or intuitive grading by experienced teachers is unreliable, but it points to the need to study objective ways of measuring the complexity or difficulty of a task.

Robinson (2001) distinguished task difficulty from task complexity. However, speakers feel a task to be difficult regardless of the cause, so difficulty and complexity are not distinguished in this study. If the speaker feels that the task is difficult, phenomena such as hesitations, pauses, changes of speech rate, changes in the plan of speech, and communication strategies appear on the surface of speech performance. This study does not analyze the properties of the task itself, as other qualitative or subjective analyses of task difficulty do. Instead, the phenomena that appear on the surface of speech, that is, the state of performance, are analyzed to rate the difficulty of the task.

3.
The state of performance and difficulty of tasks

The quality of a speaker's performance is affected by the difficulty of the task, the speaker's proficiency, and other factors. The quality of performance, namely the state of performance, has various properties, such as the complexity of utterances, speech rate, and variation of expressions. It should therefore be possible to estimate the degree of difficulty of a task by analyzing the state of performance in executing it.

Generally speaking, as Difficulty becomes higher, the forms used in utterances become more complex and a wider range of vocabulary is needed. If the speaker has sufficient linguistic resources and the ability to process them, he or she can afford to use more complex forms and more varied words at a certain level of fluency and accuracy. As the difficulty of a task becomes higher, overall performance rises, as long as the speaker is proficient enough. If the various properties of a speaker's performance can be summarized into a single dimension, the relation between the state of performance and the difficulty of tasks can be drawn as the green line in Graph 1.

[Graph 1: State of performance. Labels: level of proficiency, state of performance, task difficulty.]

The vertical axis is the height of the state of performance, and the horizontal axis is the task difficulty. The higher the task difficulty, the higher the state of performance, as long as it does not exceed the level of performance that the speaker's proficiency can afford.

However, L2 learners often lack the L2 knowledge and the ability needed to process complex utterances. Hence, they need to use various strategies to cope with these communication problems. A Communication Strategy (hereafter CS) is a strategy to compensate for insufficient linguistic knowledge of a second language.
In addition, L2 speakers often simplify the content of messages or the linguistic forms, because they lack self-confidence or cannot afford to process complex forms and/or information. This is the strategy of simplification. When simplifications and CSs are executed, the forms the speaker uses are restricted to simpler ones. The variation of linguistic forms then does not increase even if the speech becomes longer, and erroneous utterances often appear, because speakers sometimes contrive new expressions through strategies such as word coinage and direct translation of words and expressions from their first language (Faerch & Kasper, 1983). In addition, speakers naturally need time to process these strategies. Together, these effects lower the overall state of performance.

[Graph 2: The crucial point. Labels: states of performance, level of proficiency, crucial point, task difficulty.]

Up to a certain level of Difficulty, speakers do not feel difficulty in executing tasks because their proficiency is high enough to cope with them, even if they use some CSs and simplifications. However, at a given point, Difficulty becomes higher than or equal to the speaker's proficiency. From here on, the state of performance does not become higher than before. After this point, the state of performance declines or at least remains unchanged even though Difficulty rises, because the speaker needs to execute various time-consuming strategies and faces serious communication problems. Let us name this point the crucial point.

[Graph 3: State of performance. Labels: state of performance, crucial point, task difficulty.]

Therefore, it is possible to assume (1), (2), and (3) as follows.

(1) The states of performance rise along with Difficulty as long as the level of Difficulty is lower than the crucial point.
(2) If Difficulty becomes higher than the crucial point, the states of performance do not rise but decline or at least remain unchanged.
(3) The values of the state of performance and of Difficulty at the crucial point are higher for speakers of high proficiency than for speakers of low proficiency.

Hypotheses (1) and (2) are visualized in Graph 3.

[Graph 4: The crucial points and proficiency. Labels: states of performance, more proficient speaker, crucial points, difficulties of tasks.]

The position of the crucial point is determined by proficiency. The higher the speaker's proficiency, the further to the upper right the crucial point is located, as in Graph 4.

4. Way of grading task difficulty

[Graph 5: Difference of performance]

Let us draw the graphs of three groups of speakers at different levels of proficiency. High is the graph of the state of performance against the difficulty of tasks for the speakers of highest proficiency, Low is that for the speakers of lowest proficiency, and Mid lies between High and Low. When we compare the states of performance of the three groups, the differences are not large at the point d1; at d2, after the crucial point of Low, the differences spread; and at d3, after the crucial point of Mid, the differences become largest. Therefore:

HYPOTHESIS (A) If Difficulty is low, the differences between subjects of different proficiency tend to be smaller; and if Difficulty is high, the differences tend to be larger.

If (A) is right, the differences between the mean states of performance of groups of subjects of different proficiency can serve as a measure for grading the Difficulty of tasks. This is not to insist that the state of performance behaves exactly in the way shown in the graphs; the graph is only a visualization of Hypothesis (A). What the hypothesis means is that proficient speakers have a higher tolerance of the Difficulty of a task, and less proficient speakers have a lower one. The state of performance of less proficient speakers begins to decline at a lower Difficulty and declines rapidly.
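Hypothesis (A) can be illustrated with a toy model. The sketch below is not taken from the study: it assumes a piecewise-linear curve in which performance rises one-for-one with Difficulty up to the crucial point and then falls at a fixed rate. The function names, the crucial points assigned to Low, Mid, and High, the decline rate, and the positions of d1, d2, and d3 are all hypothetical values chosen only to show how the spread between groups grows.

```python
def state_of_performance(difficulty, crucial_point, decline=0.5):
    """Toy piecewise-linear reading of hypotheses (1) and (2).

    Below the crucial point the state of performance rises with Difficulty;
    above it, it falls at the assumed rate `decline`.
    """
    if difficulty <= crucial_point:
        return difficulty
    return crucial_point - decline * (difficulty - crucial_point)


# Hypothetical crucial points; following hypothesis (3), a more proficient
# group has a later and higher crucial point.
CRUCIAL_POINTS = {"Low": 2.0, "Mid": 4.0, "High": 6.0}


def spread(difficulty):
    """Difference between the best and worst group at one Difficulty level."""
    values = [
        state_of_performance(difficulty, point)
        for point in CRUCIAL_POINTS.values()
    ]
    return max(values) - min(values)


# d1 lies before every crucial point, d2 after Low's, d3 after Mid's.
for label, difficulty in [("d1", 1.0), ("d2", 3.0), ("d3", 5.0)]:
    print(label, spread(difficulty))
```

In this simplified model the groups do not differ at all at d1, whereas the text only claims that the differences there are not large; the point of the sketch is the ordering of the spreads, not their absolute size.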
The state of performance is measured by (1) selecting the parameters that represent the state of performance, which is multidimensional data, and (2) summarizing the data into a single dimension by principal component analysis (PCA). The principal component scores represent the states of performance (Takei & Akahori 2005a).

5. Experiment

1. Tasks are set by having the interviewer ask questions. Interviews are recorded, and parameters representing the state of oral performance are measured.
2. The data are analyzed by PCA, and the result is compared with the ratings of the learners' proficiency according to the standard of ACTFL's oral proficiency interview.

Subjects are classified into three groups:
i) Native speakers of Japanese (17 subjects), shown as (N).
ii) Learners of Japanese at the advanced level according to ACTFL's standard (19 subjects), shown as (H).
iii) Learners of Japanese at the intermediate level according to ACTFL's standard (18 subjects), shown as (L).

Tasks:
BL: Description of daily life, such as the weekday schedule or the route to school or the workplace. This task is thought to be at the intermediate level according to the ACTFL standard (Buck, 1989).
NR: Narrating a story, such as explaining the contents of a novel or a film the subject knows. This task is thought to be at the advanced level according to the ACTFL standard.
OP: Giving an opinion on social, political, and cultural issues or problems. This task is thought to be at the advanced or superior level according to the ACTFL standard.

Parameters:
1. Complexity of Utterances (rGM): this parameter represents the degree of complexity of the forms used in the execution of tasks.
2. Semantic Variation (rUL): when the value of semantic variation is high, speakers move smoothly from one topic to another in their speech (Ito, 2002).
3. Grammatical Variation (rUG): this parameter represents the richness of grammatical forms in utterances.
4.
Modal Frequency (rPM): modality is a very important element of the Japanese language; this parameter shows the frequency of modalities.
5. Speech Rate (sSR): this parameter shows how fast or slow the speech is. If speakers pause often to think, this parameter becomes low.

Ways to calculate the parameters:

1) Complexity of Utterances (rGM): the number of tokens of functional words and of modalities of truth and value judgment (Masuoka 1991) per 100 semantic words. Namely:
   nFN: the number of functional words (tokens)
   nJM: the number of modalities of truth and value judgment (tokens)
   nSW: the number of semantic words (tokens)
   ∴ rGM = 100 * (nFN + nJM) / nSW
2) Semantic Variation (rUL): the number of different semantic words (vSW) per 100 semantic words.
   ∴ rUL = 100 * vSW / nSW
3) Grammatical Variation (rUG): the number of different functional words (vFN) and different modalities of truth and value judgment (vJM) per 100 semantic words.
   ∴ rUG = 100 * (vFN + vJM) / nSW
4) Modal Frequency (rPM): the frequency of modalities other than truth and value judgment, that is, the number of tokens of these modalities (nPM) per 100 semantic words.
   ∴ rPM = 100 * nPM / nSW
5) Speech Rate (sSR): the average number of moras uttered per second; the number of moras (nMR) is divided by the total time of the speaker's turns taken to complete the task (tMT).
   ∴ sSR = nMR / tMT

Analysis: The measured parameters are analyzed by principal component analysis for each task. The principal component that is thought to reflect the differences between the subjects' states of performance is identified from the pattern of the eigenvector values. The principal component score of each subject in each task is then calculated with the appropriate eigenvector, and the means of the principal component scores are compared between the groups of subjects at different proficiency levels.

6. Result and discussions

Table 1 shows the eigenvector of PCA.
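Before turning to the table, the five parameter formulas of Section 5 can be illustrated with a short sketch. The code below is a minimal example, not the procedure actually used in the study: the container `Counts`, the function `parameters`, and the sample counts are hypothetical and serve only to show how the formulas combine the raw counts. The field names follow the symbols in the formulas.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Raw counts for one speaker in one task (symbols as in the formulas)."""
    nFN: int    # functional words, tokens
    nJM: int    # modalities of truth and value judgment, tokens
    nSW: int    # semantic words, tokens
    vSW: int    # semantic words, different words
    vFN: int    # functional words, different words
    vJM: int    # modalities of truth and value judgment, different forms
    nPM: int    # modalities other than truth and value judgment, tokens
    nMR: int    # moras uttered
    tMT: float  # total time of the speaker's turns, in seconds


def parameters(c):
    """Return the five parameters of the state of performance as a dict."""
    if c.nSW <= 0 or c.tMT <= 0:
        raise ValueError("nSW and tMT must be positive")
    return {
        "rGM": 100 * (c.nFN + c.nJM) / c.nSW,  # Complexity of Utterances
        "rUL": 100 * c.vSW / c.nSW,            # Semantic Variation
        "rUG": 100 * (c.vFN + c.vJM) / c.nSW,  # Grammatical Variation
        "rPM": 100 * c.nPM / c.nSW,            # Modal Frequency
        "sSR": c.nMR / c.tMT,                  # Speech Rate (moras per second)
    }


# Invented counts, for illustration only; they are not data from the study.
example = Counts(nFN=150, nJM=10, nSW=200, vSW=120, vFN=30, vJM=4,
                 nPM=24, nMR=900, tMT=150.0)
print(parameters(example))
```

Because four of the five parameters are normalized by nSW, they are comparable across speakers who produce different amounts of speech; only sSR depends on time.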
Table 1: Eigenvector of PCA

              BL                       NR                       OP
       1st     2nd     3rd      1st     2nd     3rd      1st     2nd     3rd
rGM    0.589  -0.136  -0.483    0.421   0.603  -0.371    0.500  -0.461  -0.242
rUL    0.136   0.659   0.480    0.453  -0.497  -0.172    0.378   0.281   0.855
rUG    0.516   0.455  -0.336    0.501  -0.328  -0.420    0.497   0.267  -0.274
rPM    0.496  -0.084   0.570    0.411  -0.202   0.749    0.373   0.616  -0.343
sSR    0.351  -0.578   0.313    0.444   0.492   0.309    0.470  -0.507   0.133

The parameters used in this study represent a high state of performance when their values are high. According to Table 1, all values of the eigenvectors of the first principal component are positive and comparatively larger than the others. In the other eigenvectors, the values of some parameters are negative, and one or two specific parameters are too small or too large compared with the other values of the same eigenvector. Therefore, only the eigenvectors of the first principal component can represent the speakers' states of performance.

Table 2: Means of principal component scores

        N       H       L       N-H     N-L     H-L
BL     0.613   0.085  -0.669   0.527   1.281   0.754
NR     0.735   0.158  -0.861   0.577   1.596   1.019
OP     0.917   0.108  -0.980   0.809   1.897   1.089

[Graph 6: Means of principal component scores. Groups N, H, and L plotted for the tasks BL, NR, and OP; vertical axis from -1.50 to 1.50.]

The principal component score of each speaker in each task is calculated from the eigenvector of the first principal component. N is the native speakers of Japanese, H is the learners at the advanced level, and L is the learners at the intermediate level. N-H shows the difference between the native speakers and the learners at the advanced level, and so on. The table shows the means of the first principal component scores of each group of subjects (N, H, L) and the differences of these means between every two groups in the three tasks. The means of the principal component scores of each group of subjects in the three tasks are plotted in the graph.
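The grading step implied by Hypothesis (A) can be sketched with the group means reported in Table 2. The code below is an illustrative sketch rather than the authors' procedure: the names `MEANS`, `pairwise_differences`, and `rank_by_difference` are hypothetical, and the differences are recomputed from the rounded means, so they may differ from the tabulated differences in the last digit.

```python
# Group means of the first principal component scores, as given in Table 2.
MEANS = {
    "BL": {"N": 0.613, "H": 0.085, "L": -0.669},
    "NR": {"N": 0.735, "H": 0.158, "L": -0.861},
    "OP": {"N": 0.917, "H": 0.108, "L": -0.980},
}


def pairwise_differences(groups):
    """Differences of means between every two groups for one task."""
    return {
        "N-H": groups["N"] - groups["H"],
        "N-L": groups["N"] - groups["L"],
        "H-L": groups["H"] - groups["L"],
    }


def rank_by_difference(means, pair):
    """Order tasks from easiest to hardest by the chosen group difference.

    Under Hypothesis (A), a larger difference between groups of different
    proficiency indicates a more difficult task.
    """
    return sorted(means, key=lambda task: pairwise_differences(means[task])[pair])


for pair in ("N-H", "N-L", "H-L"):
    print(pair, rank_by_difference(MEANS, pair))
```

With these values, all three pairwise differences give the same ordering of the tasks, which is what allows a single grading to be read off the table.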
According to these results, the native speakers' mean principal component score is lowest in BL, higher in NR, and highest in OP; that is, it rises in the order BL, NR, OP. The intermediate learners' means, however, show the reverse pattern: they decline in the order BL, NR, OP. The advanced learners' means are almost unchanged. According to the graph, the differences among the three groups of speakers spread in the order BL, NR, OP. Therefore, according to the hypothesis, OP is the most difficult task, NR is the next, and BL is the easiest. According to ACTFL's OPI standard, OP is higher than or equal to NR, and BL is the easiest. Broadly speaking, the grading in this study gives the same result as ACTFL's OPI. This agreement supports the validity of the objective grading using PCA.

In principal component analysis, the values of the eigenvectors reflect the variability of the parameters. For example, in BL the value of Semantic Variation (rUL) is clearly smaller than those of the other parameters. If speakers fail to explain what they would like to say, they tend to repeat the same explanation in different ways. In this case, many of the same words are used again, so the variety of semantic words does not increase. Thus, if speakers have difficulty conveying what they have in mind, Semantic Variation (rUL) becomes lower. Because BL is not a difficult task, most subjects do not seem to have difficulty in executing it. Therefore, the variability of Semantic Variation (rUL) in BL is small. In this way, it is possible to learn the characteristics of a task in an objective way.

7. Future study

A more precise study is possible on the basis of the results of the statistical analysis. Although the values of the eigenvectors can suggest differences in the character of tasks, a precise analysis should be undertaken with other statistical methods, such as multiple comparisons.

References:

Buck, K. (ed.). (1989).
The ACTFL oral proficiency interview tester training manual. Yonkers: The American Council on the Teaching of Foreign Languages.
Ellis, R. (2003). Task-based language learning and teaching. Oxford: Oxford University Press.
Faerch, C., & Kasper, G. (1983b). Plans and strategies in foreign language communication. In Faerch, C., & Kasper, G. (eds.). Strategies in interlanguage communication (pp. 20-60). Burnt Mill, Harlow: Longman Group Ltd.
Ito, M. (2002). Keiryo Gengogaku Nyumon (An introduction to mathematical linguistics). Tokyo: Taishukan Publishing.
Masuoka, T. (1991). Modality no bunpoo (The grammar of modality). Tokyo: Kuroshio Publishing.
Nunan, D. (1989). Designing tasks for the communicative classroom. Cambridge: Cambridge University Press.
Robinson, P. (2001). Task complexity, task difficulty, and task production: Exploring interactions in a componential framework. Applied Linguistics, 22/1, 27-57.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University Press.
Takei, N., & Akahori, K. (2005). Analysis on the differences between native speakers and the learners of Japanese language on the performance and use of functional words. Mathematical Linguistics, 24/8, 382-396.