Computer-assisted learning of Croatian language stress system (CAL-CROLESS)

Nives Mikelić, Tomislava Lauc, Kruno Golubić
Department of Information Science, Faculty of Philosophy, University of Zagreb
I. Lučića 3, Zagreb, Croatia
nmikelic@ffzg.hr, tlauc@ffzg.hr, kgolubic@ffzg.hr

Abstract. The aim of the paper is to present computer-assisted learning of the Croatian language stress system as a base for the development of a computer-assisted Croatian orthography learning system. The system includes two main modules: the dictionary and the rule list for stress assignment and stress realization in Croatian. It can be used for knowledge pretesting, learning and final testing of the acquired knowledge. Two different learning approaches were taken: one was based on acquiring knowledge of the word-stress patterns that determine word-stress assignment, and the other was sample-based learning. The goal was to probe the fruitfulness of both methods for dealing with the stress assignment problem for Croatian.

I. INTRODUCTION

Until recently, computer-assisted language learning (CALL) was a topic of relevance mostly to those with a special interest in that area. Today, the majority of language instructors must begin to think about the implications of computers for language learning. In practice courseware, the computer serves as a vehicle for delivering instructional materials to the learner. Repeated exposure to the same material has been shown to be beneficial or even essential for learning. This makes the computer ideal for carrying out repeated exercises, since it does not get bored with presenting the same material and can provide immediate feedback. A computer can also present such material on an individualized basis, enabling learners to proceed at their own pace. Multimedia technology integrated with computer-assisted learning allows different media (text, graphics, sound, animation, and video) to be accessed on a single machine.
This creates a fairly authentic learning environment in which skills are easily integrated, since the variety of media makes it natural to combine reading, writing, speaking and listening in a single activity.

Although widely used for English and many other languages, computer-assisted language learning has not yet been much applied to Croatian. There is some software, published in English, for beginners in Croatian who want to learn basic phrases, colors, numbers, food, shopping, time, etc., but CAL-CROLESS is a pioneer in the application of this technology to Croatian orthography, and especially to the Croatian language stress system. The reason why we developed CAL-CROLESS is given in the next section.

II. MOTIVATION

How important is stress in the Croatian language? A small number of words in Croatian have no stressed syllable of their own: most prepositions and the word “ne” (not) in front of a verb are proclitics and attach to the next word, while certain pronoun and verb forms are enclitics and attach to the previous word. Apart from these, every word form has one stressed syllable (some compound words have more than one). Croatian has three dialects (Štokavian, Čakavian and Kajkavian) and they differ significantly in stress realization. Stress differs not only across local dialects, but even across idiolects. It is the primary distinguishing feature by which we can recognize the origin of a speaker.

Stressed syllables are called either rising or falling, and contain a long or a short vowel. Traditional notation in grammars and dictionaries combines these two features, using four stress marks: short falling (ȁ), long falling (ȃ), short rising (à) and long rising (á), shown here on the vowel a. The names of the stress marks suggest a pitch change on a given syllable: pitch ascends within long rising stressed vowels and drops during long falling ones.
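The four-mark notation described above maps directly onto Unicode combining characters, which is one way a program could store and recognize stressed words. The following is only a minimal sketch under that assumption; the names `Stress`, `mark` and `find_stress` are hypothetical and are not taken from CAL-CROLESS.

```python
import unicodedata
from enum import Enum


class Stress(Enum):
    """The four traditional stress marks as Unicode combining characters."""
    SHORT_FALLING = "\u030F"  # combining double grave accent
    LONG_FALLING = "\u0311"   # combining inverted breve
    SHORT_RISING = "\u0300"   # combining grave accent
    LONG_RISING = "\u0301"    # combining acute accent

    @property
    def is_falling(self):
        return self in (Stress.SHORT_FALLING, Stress.LONG_FALLING)

    @property
    def is_long(self):
        return self in (Stress.LONG_FALLING, Stress.LONG_RISING)


# Letters that can carry a stress mark; r is included because it can be syllabic.
_VOWELS = set("aeiour")


def mark(letter, stress):
    """Attach a stress mark to one letter and return the composed character."""
    return unicodedata.normalize("NFC", letter + stress.value)


def find_stress(word):
    """Return (letter index, Stress) for the stressed letter, or None if unmarked."""
    marks = {s.value: s for s in Stress}
    index, base = -1, ""
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) == 0:
            index, base = index + 1, ch      # a new base letter starts here
        elif ch in marks and base.lower() in _VOWELS:
            return index, marks[ch]          # mark sits on a vowel: this is the stress
    return None
```

Restricting the search to vowels matters because the letter ć decomposes into c plus an acute accent and would otherwise be mistaken for a long rising stress. With these definitions, `find_stress("znȁm")` reports a short falling stress on the third letter.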
Most native speakers of Croatian can tell a long stressed vowel from a short one, but do not reliably distinguish rising from falling stress. They also tend not to shift the lexical stress from one syllable to another when forming different word forms. Such errors in accentuation can lead to misunderstanding. That was one of the reasons for building a computer system to improve the learning of the Croatian language stress system. The other important reason was to investigate the results learners achieve in relation to their linguistic background and native dialect.

It has been shown that learners’ motivation and interest rise even when they do a typical fill-in-the-gap exercise on a computer rather than on paper or in a book. In a way, doing tests in this form gives traditional exercises a new format and results in a quite successful experience, since learners enjoy the mere fact of working with computers. Besides, the computer allows us to make exercises more attractive by means of colour, different letter styles, pictures, graphs, etc. Therefore, we tried to develop a CALL system for the Croatian stress system, considering the importance of presenting it in a visual way.

III. SYSTEM DEVELOPMENT

A. Phase 1.

CAL-CROLESS is a computer-assisted learning system developed to improve the recognition of Croatian prosody. The learning strategy includes an interactive course of prosody with progressive exercises and consistent feedback. The aim is also to make learners aware of prosody in general (lexical stress, rhythm and intonation) and of Croatian prosody in particular. A database of words uttered by a native speaker of Croatian serves as support for learning and as a reference for correcting the learner's productions. Words in the database are given in both written and spoken form, which together make up the dictionary. We address word-level stress detection in Croatian, where stressed syllables are characterized not only by power level, but also by pitch, duration and vowel quality.

General stress assignment and stress realization rules in the standard language are given in Table 1. Besides these general rules, there are more specific ones, such as the rule that a falling lexical stress can be carried over to a preceding word: ne + znȁm = nè_znam (I don’t know). In the standard language this happens when ne is added to a verb form, and in some preposition + object phrases: sȁ_mnōm (with me), sȁ_sobom (with oneself), ù_grad (in the town). Also, verbs derived by adding a prefix to an infinitive keep the lexical stress unchanged if the base verb has a rising stress. On the other hand, verbs derived by adding a prefix to an infinitive that has a falling lexical stress get the short rising stress on their prefix.
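The general rules of Table 1 and the prefixed-verb rule just described are mechanical enough to be written down as code. The sketch below is illustrative only and does not reproduce the rule list of CAL-CROLESS: the names `Accent`, `is_valid` and `add_prefix` are hypothetical, and it assumes that a falling stress moved onto a prefix lands on the prefix syllable adjacent to the base verb.

```python
from dataclasses import dataclass
from enum import Enum


class Stress(Enum):
    SHORT_FALLING = "short falling"
    LONG_FALLING = "long falling"
    SHORT_RISING = "short rising"
    LONG_RISING = "long rising"


FALLING = {Stress.SHORT_FALLING, Stress.LONG_FALLING}


@dataclass(frozen=True)
class Accent:
    syllables: int   # number of syllables in the word
    position: int    # 0-based index of the stressed syllable
    stress: Stress


def is_valid(accent):
    """Check an accent against the general rules of Table 1."""
    if not 0 <= accent.position < accent.syllables:
        return False
    if accent.syllables == 1:
        return accent.stress in FALLING       # monosyllables: falling stress only
    if accent.position == accent.syllables - 1:
        return False                          # never on the last syllable
    if accent.stress in FALLING:
        return accent.position == 0           # falling stress only on the first syllable
    return True                               # rising stress on first or inner syllables


def add_prefix(base, prefix_syllables):
    """Derive the accent of a prefixed infinitive from the accent of its base verb."""
    if prefix_syllables < 1:
        raise ValueError("a prefix must add at least one syllable")
    total = base.syllables + prefix_syllables
    if base.stress in FALLING:
        # Falling stress becomes a short rising stress on the prefix.
        # Assumption: it lands on the prefix syllable next to the base.
        return Accent(total, prefix_syllables - 1, Stress.SHORT_RISING)
    # Rising stress stays on the same syllable of the base, unchanged.
    return Accent(total, base.position + prefix_syllables, base.stress)
```

For example, glȅdati (three syllables, short falling on the first) with the prefix po gives `Accent(4, 0, Stress.SHORT_RISING)`, as in pògledati, while písati with the prefix na keeps its long rising stress on the base, as in napísati; both results pass `is_valid`.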
There are also rules for adjectives: a two-syllable comparative form has the short falling stress, while in a polysyllabic comparative the stress usually falls on the third syllable from the end of the word. The superlative of a two-syllable comparative always has the long falling stress on the prefix naj, while the superlative of a polysyllabic comparative may, besides the superlative stress, also inherit the stress of the comparative, and so on. All of the rules for stressed syllables listed above are built into the system.

Table 1. General stress assignment and stress realization rules in the Croatian language

STRESS RULES IN THE CROATIAN LANGUAGE
- Monosyllabic words may have only a falling stress
- Falling stress may occur only on the first syllable
- Two-syllable and polysyllabic words may have all 4 stresses on the first syllable
- Two-syllable words have stress on the first syllable; words of three or more syllables may have stress on any syllable except the last
- Stress can never occur on the last syllable of polysyllabic words
- Syllables which are neither first nor last (so-called inner syllables) may have only a rising stress

B. Phase 2.

The exercises were conceived as true learning processes rather than testing procedures. Up to this point the following features have been incorporated. The system analyzes the learner's response at every stage and gives consistent feedback based on a prediction of the most probable deviations. It detects deviations by comparing the learner's answer with a target. The learner is not placed in the somewhat demotivating position of being given a limited number of attempts before having to proceed to the next question: at any stage he can choose either to give up if his solution was wrong or to have as many further tries as he wishes. He can also choose between receiving immediate feedback on completion of each question and receiving complete statistics after answering each group of questions. The program offers explanations if the learner gives a wrong answer. In standard tests written on paper, learners are not offered any feedback until the whole exercise has been completed, which seems to be one of the major disadvantages in the learning process. Although a correction is suggested at every stage, it does not count towards the end result.

Each question in the testing part of the system has a button that plays the sound of the word, which the user is then supposed to write in the input text field. This is a kind of optional dictation, used because users pronounce the words differently depending on their dialect background. If the learner finds the listening part distracting, he does not have to use the sound at all. Nevertheless, we think that the listening module should be seen as an extension of the system that also offers the possibility of teaching and training the skill of listening to Croatian stressed words. At the end of each group of questions, the screen displays a count of the right answers.

The program starts with an initial quiz that tests the basic level of stress knowledge, i.e. it simply tests the ability to recognize lexical stress and assign the proper stress mark to monosyllabic words. Based on the results obtained, each learner continues along his own path. If all the questions in the initial quiz are solved correctly, the learner can proceed to the second, somewhat more difficult level. If, on the contrary, the learner knows nothing about stress in Croatian, the system takes him to the learning module for beginners.
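The attempt and scoring behaviour described above can be sketched as follows. This is a hedged illustration rather than the actual CAL-CROLESS logic: it reads "correction does not count towards the end result" as "only the first attempt is scored", it sends every learner without a perfect initial quiz to the learning module (the text only specifies the two extreme cases), and the names `Question`, `score` and `next_step` are hypothetical.

```python
import unicodedata
from dataclasses import dataclass, field


def _norm(text):
    """Normalize so that composed and decomposed stress marks compare equal."""
    return unicodedata.normalize("NFC", text.strip())


@dataclass
class Question:
    target: str                                   # the correctly stressed word
    attempts: list = field(default_factory=list)  # every answer the learner typed

    def answer(self, text):
        """Record an attempt (unlimited tries) and give immediate feedback."""
        self.attempts.append(_norm(text))
        return self.attempts[-1] == _norm(self.target)

    @property
    def counts_as_correct(self):
        # Assumption: only the first attempt is scored; later tries are practice.
        return bool(self.attempts) and self.attempts[0] == _norm(self.target)


def score(questions):
    """Number of questions answered correctly at the first attempt."""
    return sum(q.counts_as_correct for q in questions)


def next_step(questions):
    """Route the learner after the initial quiz."""
    if score(questions) == len(questions):
        return "level 2"
    return "learning module"
```

A learner who answers grȃd correctly at once but needs two tries for brȁt gets feedback on every try, ends the group with a score of 1, and is routed to the learning module.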
The learning module is where the two different learning approaches come into play. The deductive learning approach is integrated in a module consisting of short introductory notes on the different kinds of lexical stress in Croatian together with their pronunciation, followed by the main rules for monosyllabic words and some examples. The system then continues with a new group of questions at the second level, to which the learner responds at the keyboard, and the learning process continues.

The module built on the inductive learning approach consists of words together with their pronunciation. The words are quite similar to those contained in the initial test, but are organized in groups such that the similarity between the members of a group is high, as is the dissimilarity between any two groups. Learners are led to draw conclusions by themselves; these conclusions are in effect the rules given explicitly to those following the deductive learning module. The system proceeds with the second level, using the same group of questions as the deductive module. On completion of the second quiz, the learning process is again inductive. In this way, both groups of learners solve the same questions at each level, but the transition between levels is specific to each module.

In the deductive learning approach, explicit statements and examples are presented in a systematically organized way and learnt consciously. Statements are given in the form of rules, and learners have to memorize those rules. Examples are used to explain the theoretical points to learners and to assign mechanical tasks. Error correction is also considered quite relevant. The sample-based learning approach implies working with the stress system in an inductive way: learners work out rules from data, form hypotheses and test them.

C. Phase 3.

The final test is solved by both groups at the end of the seventh level, and the statistics are written to a database.
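As a rough illustration of what writing such statistics to a database and summarizing them could look like, the sketch below uses Python's built-in sqlite3 module purely as a stand-in for the database used by the system. The table layout, column names and function names are assumptions made for this example and are not taken from CAL-CROLESS.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small results store; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers ("
        " learner TEXT, approach TEXT, question_group TEXT, correct INTEGER)"
    )
    return conn


def record(conn, learner, approach, question_group, correct):
    """Store one scored answer exactly as received."""
    conn.execute(
        "INSERT INTO answers VALUES (?, ?, ?, ?)",
        (learner, approach, question_group, int(correct)),
    )
    conn.commit()


def correct_per_group(conn):
    """Number of correct answers for each group of questions."""
    rows = conn.execute(
        "SELECT question_group, SUM(correct) FROM answers"
        " GROUP BY question_group ORDER BY question_group"
    )
    return dict(rows.fetchall())


def overall_share_correct(conn, approach):
    """Share of correct answers over all learners following one approach."""
    (share,) = conn.execute(
        "SELECT AVG(correct) FROM answers WHERE approach = ?", (approach,)
    ).fetchone()
    return share
```

The first query gives a teacher a per-group view of which question groups were hardest, and the second gives a summary for a whole group of learners; `overall_share_correct` returns `None` when no answers have been recorded for the given approach.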
The way the statistics are collected is shown in Fig. 1. The system tracks cumulative results from a sequence of interactions and passes them along to the web server. This is performed in a cycle in which a Flash form collects data from the user and sends it to the database via an ASP page, which takes the data sent from the Flash interface and stores it in the database without any change. A Microsoft Access 2000 database is used to store the data. The database offers teachers and instructors two kinds of statistics: the first presents the different groups of questions together with the number of correct answers, while the second offers an overall summary of the results achieved by the group of learners as a whole. These statistics allow teachers and instructors to determine which aspects of the language stress system learners have found most difficult or have not understood correctly.

IV. DISCUSSION AND EVALUATION

A preliminary test of the system was carried out in the laboratory on two (unfortunately unequal) groups of students, under the same conditions. All were second- and third-year students of the Faculty of Philosophy, University of Zagreb. Some of them had a solid linguistic background, but the majority had little or no knowledge of lexical stress. In the beginning, we got the impression that they found the task of being tested on lexical stress and learning the lexical stress rules quite laborious. Nevertheless, they all made it to the final test.

The final test that followed the deductive learning approach gave the following statistics. Twenty-two students solved less than 50% of the final test. Of these, 22% achieved the same result as in the exercises in phase 2 (in other words, they learned nothing), 10% improved their knowledge, and 68% achieved worse results than in phase 2.
Only 5 students solved around 50% of the final test; of these, 40% improved their knowledge, 60% achieved worse results than in phase 2, and nobody achieved the same result as in phase 2. Eight students solved around 75% of the final test: 25% of them achieved the same result as in phase 2, 25% improved their knowledge, and 50% achieved worse results than in phase 2. Twenty-one students solved more than 80% of the final test: 52% of them achieved the same result as in phase 2, 48% improved their knowledge, and nobody achieved worse results than in the first part of the exercise.

The final test that followed the sample-based learning approach, on the other hand, gave the following statistics. Twenty-two students solved less than 50% of the final test, the same number as in the deductive learning approach. Of these, 31% achieved the same result as in phase 2 (in other words, they learned nothing), 19% improved their knowledge, and 50% achieved worse results than in phase 2. Only 2 students solved around 50% of the final test, but both improved their knowledge compared to phase 2. Three students solved around 75% of the final test; 2 of them achieved the same result as in phase 2, while one achieved worse results, and nobody improved his or her knowledge. Three students solved more than 80% of the final test; 2 of them achieved the same result as in phase 2, one improved his knowledge, and nobody achieved worse results than in phase 2.

Fig. 1. Representation of the FlashMX-server-database communication

Table 2. Comparison between deductive and inductive learning approach regarding the number of students that solved the test and the percentage of the final test success

Final test results    < 50%    around 50%    around 75%    > 75%    Total
DL approach           22       5             8             21       56 students
IL approach           22       2             3             3        30 students

Table 3. Comparison between DL and IL approach for students that achieved < 50% in the final test regarding their previous results (share of students whose results are equal to, better than or worse than their previous results)

Students that solved < 50% of the final test    Equal    Better    Worse
DL approach                                     22%      10%       68%
IL approach                                     31%      19%       50%

Table 4. Comparison between DL and IL approach for students that achieved 50% in the final test regarding their previous results (share of students whose results are equal to, better than or worse than their previous results)

Students that solved 50% of the final test      Equal    Better    Worse
DL approach                                     0%       40%       60%
IL approach                                     0%       100%      0%

Table 5. Comparison between DL and IL approach for students that achieved 75% in the final test regarding their previous results (share of students whose results are equal to, better than or worse than their previous results)

Students that solved around 75% of the final test    Equal    Better    Worse
DL approach                                          25%      25%       50%
IL approach                                          75%      0%        25%

Table 6. Comparison between deductive and inductive approach for students that achieved more than 80% in the final test regarding their previous results (share of students whose results are equal to, better than or worse than their previous results)

Students that solved more than 80% of the final test    Equal    Better    Worse
DL approach                                             52%      48%       0%
IL approach                                             75%      25%       0%

As one can see from the given statistics, these preliminary results are hard to compare because the two groups are unfortunately not of the same size. We consider this an obstacle to drawing general conclusions. Still, some points are worth emphasizing. Most of the students who had no knowledge of lexical stress got confused in the final test, after passing all seven levels. Some of them nevertheless learned something, and the inductive learning approach proved to be better at this stage. Students who had some previous knowledge of lexical stress also improved it by following the inductive learning approach, and they got less confused at the final test.
On the other hand, students that had some linguistic background never got confused, no matter which approach they followed, but the deductive learning approach proved to be more efficient for them. Therefore, we could say that, on the basis of the preliminary testing results alone, CAL-CROLESS has proved to be a good tool for students with some linguistic background, and that the deductive learning approach seems to be more suitable for them. The sample-based approach, on the contrary, appears to be better adapted to students with little or no knowledge of the field, but this conclusion needs further investigation.

During the testing we noticed that students start to feel weary somewhere near the beginning of the final exam. We therefore suggest introducing a short break after the last level of the learning process, so that students can access the final test only after the break and not immediately after the learning process. The overall impression was that students found listening to the words very helpful and also enjoyed this kind of interaction. Furthermore, they used the “check answer” option a lot, although it increased the time they needed to finish the whole test and although they knew they would be shown the results at the end of each set of questions. In spite of the fact that the task of learning the lexical stress was quite laborious, nobody complained at the end of the session, which suggests that CAL-CROLESS accomplished at least one of its tasks.

V. CONCLUSION AND FURTHER RESEARCH

Multimedia computing, the Internet, and the World Wide Web have given an incredible boost to Computer Assisted Language Learning applications. The benefits of adding a computer component to language learning are many, including multimodal practice with feedback, individualization in a large class, small group work on projects (either collaborative or competitive), the fun factor, etc.
We found that real-life skill-building in computer use and variety in the learning styles used made CAL-CROLESS very attractive to our students. One of the advantages of CAL-CROLESS is its sensitivity to the learner’s level of proficiency. It allows learners to take control of their own learning experience. Another advantage is that it gives a new role to the teaching materials for language stress. Teaching materials in written textbook form are usually passive, but in CAL-CROLESS, because of its interactivity, the materials adapt themselves to the requirements of the individual student.

The computer skills of learners are not important for the success of the learning, because the interface is very simple and yet very motivating, so that even learners with little or no knowledge of computers can easily go through the learning process. Regarding technical issues, we did not face any problems with different platforms or computer screen resolutions, because we used the Macromedia FlashMX editor and made an interface that loads quickly and looks the same on different platforms.

Further technical development would include the option for the learner to request a clue while solving each test. Also, at the end of each exercise the system would display not only the score, but also a version of the exercise showing the learner’s answers, both correct and incorrect. The result counter could be made much more precise: it could, for example, count second answers but weight them less than first attempts.

Furthermore, we plan to develop CAL-CROLESS further in a way that will bring new insights regarding the dialectal background of the user. Our system can easily track the typical mistakes users make while solving the tests, and could therefore be extended to help users correct the errors that come from their dialectal background.

Finally, our plan is to develop a complete computer-assisted Croatian orthography learning system based on the skills and techniques we learned, the problems we solved and the results we obtained while building the computer-assisted learning application for the Croatian language stress system.