SEAsite: Web-based Interactive Learning Resources for Southeast Asian Languages and Cultures

George Henry and Robert Zerwekh
Northern Illinois University

ABSTRACT

SEAsite is a web-based interactive learning resource site for Southeast Asian Languages (Indonesian, Tagalog, Thai, Khmer, Lao, Burmese, and Vietnamese). Its language learning materials feature second language (L2) script support, streaming audio, pictures, and interactive exercise types that allow learners to test their understanding. Many SEAsite resources about culture, politics, music, art, religion, and other subjects related to Southeast Asia are written in English. A nonstandard, but workable, system for rendering Southeast Asian orthographies in web pages and interactive exercises is described. Computer code to support display of L2 characters in Java applets is available to interested parties.

KEYWORDS

L2 Fonts, Language Learning, Interactive Exercises, SEAsite, Southeast Asian Languages

INTRODUCTION

Late in 1997, faculty from Northern Illinois University’s Center for Southeast Asian Studies began development of an Internet site dedicated to the delivery and promotion of Southeast Asian languages and cultures. Today, SEAsite (www.seasite.niu.edu) features materials for both beginning and intermediate students of Thai, Indonesian, Tagalog, and Vietnamese, with Burmese, Lao, and Khmer under development. In this paper, we will describe SEAsite, discuss some of its more unusual features, and explain how we solve the orthography problems posed by some of these languages when delivery of material is made exclusively via the Internet.
© 2002 CALICO Journal Volume 19 Number 3

With support from the National Security Education Program and the US Department of Education’s Title VI and International Research and Studies programs, SEAsite is now completing its fourth year of development and operation at Northern Illinois University. The language materials consist of text, audio, and pictures, with online dictionaries for Thai, Indonesian, and Tagalog, and interactive exercises and quizzes in many lessons. Each language represented also has extensive English content on the culture, history, religion, and peoples of the country where that language is spoken.

This site is used daily by hundreds of teachers, students, business people, and travelers. It is also used in our university’s “smart classrooms” where our Southeast Asian language teachers investigate ways to integrate web-based materials into classroom-based courses.

The Internet is especially suited for teaching and learning the less commonly taught languages of Southeast Asia. For Burmese, Lao, and Khmer, in particular, there are few locales in the United States where one can study these languages in a classroom environment. Any language instructional delivery system that does not involve humans working together will not be able to teach all of the communicative language skills. This situation is true not only of SEAsite, but also of books, videos, tapes, and CD-ROMs. Therefore, SEAsite focuses on what it can do best, that is, on the receptive skills of reading and listening, vocabulary acquisition, and, to a limited degree, the productive skill of writing.

The web is in many ways the best mode of production as well as of distribution of language learning materials. Web-based materials are interactive and can include practice and self-assessment activities.
They can offer color images, streaming audio, and video, all at the click of a mouse (admittedly at present at relatively modest quality and size). Unlike any other medium of instruction, web content can be continually updated and augmented so that users automatically see the new material the next time they access the site.

SEAsite FEATURES

Interactive exercise types (implemented in Java) have been developed for SEAsite that allow learners to test and to practice skills such as vocabulary acquisition, reading comprehension, and grammatical sentence construction. For Thai, Vietnamese, Burmese, Khmer, and Lao, all second language (L2) words are displayed in their native orthography. Furthermore, most of the exercise types allow questions to be posed via text or audio, in English or in L2. A description of each Java-based exercise type follows.

Multiple Choice

The multiple choice format, which is included in most computer-based systems featuring interactive testing, is a useful way to quickly test many kinds of knowledge and skills. While such questions are easy to construct and judge, they test recognition rather than production and may encourage random guessing rather than focus on content and meaning. One of the multiple choice exercises on SEAsite presents questions in sets of three and does not permit students to continue until they have answered all three questions in the current set correctly. Moreover, the program specifies only how many of the questions were answered incorrectly, not the specific question(s). Students must find and fix the errors, making random guessing difficult or tedious. In Figure 1, note that the “Judge” button is disabled. The button stays that way until all three questions in the current set are answered.

Figure 1. Multiple Choice Quiz Applet

The “Hint” button will supply a textual hint and can be disabled by the exercise author.
If no audio accompanies the question, the “Play” button is disabled.

Flashcards

Flashcards are a time-honored way for students to learn new vocabulary, a vital aspect of beginning language learning (Brown, 1994; Zimmerman, 1997). This exercise presents words in random order each time the exercise is viewed and lets students remove words as they master them. Figure 2 shows a flashcard for Thai with its corresponding image when the learner clicks the “Flip Card” button.

Figure 2. Flash Card Applet

Matching

Matching is another vocabulary acquisition device. While flashcards depend on a student’s introspective honesty in knowing whether a word is correctly recalled, matching gives immediate and explicit feedback. A student clicks one word (English or L2) and then its counterpart in the other language. If the selection is correct, immediate graphical feedback is given; otherwise nothing happens. Furthermore, the arrangement of the words on the screen is different each time the exercise is taken, so students cannot depend on a word’s position to remember a given match. Figure 3 displays a matching exercise from the Vietnamese site.

Figure 3. Matching Applet

Vocabulary Arcade Game

The Vocabulary Arcade Game is a third type of vocabulary practice exercise. The game presents an L2 target word together with several pictures moving across the screen. The user’s task is to click the picture of the word shown. If the “hit” is correct, the picture disappears and another word is displayed. Obviously, nouns are easily practiced in this way, but many other grammatical types can be illustrated and tested (e.g., certain verbs and adjectives). Figure 4 presents a Thai recognition exercise of color words.
Figure 4. Arcade Game Applet

Word Drag and Drop

The word-drag-and-drop format is a more flexible and sophisticated exercise type that presents a task involving manipulation of syllables, words, or phrases. The student uses the mouse to drag text chunks (single letters, syllables, words, or phrases, not all of which need be part of the correct answer) into an order that forms the correct answer. This production task does not involve any typing, an important point because typing of non-Roman orthographies can be extremely difficult. The question types presented in this exercise can be simple (e.g., asking the student to form a phrase using correct word order) or difficult (e.g., asking the student to form a complete sentence). Feedback is given in the form of edit markup symbols that indicate how to fix the answer (e.g., interchanging two words, eliminating an extra word, or substituting a different word). Figure 5 shows a student’s attempt to construct the sentence stated in English at the top of the display.

Figure 5. Word-Drag-and-Drop Applet

At the bottom of the display, the asterisk below the word indicates that that word is correct; the angled brackets indicate that the other two words need to be interchanged. The “Play” button, disabled in Figure 5, is enabled if the task has accompanying audio.

Picture Drag and Drop

Picture Drag and Drop is particularly well suited to listening comprehension questions. A set of pictures is presented on the screen (e.g., books, cups, table, or chair), and the learner is asked, for example, to “put the books on the table.” The learner uses the mouse to drag the pictures to accord with the spoken (or written) directive. Graphical feedback is given to indicate which objects are in the correct place and which ones are not. Figure 6 displays a picture-drag-and-drop example testing vocabulary.
Green lines around the object indicate that it is placed correctly (here, Manggis and Durian).

Figure 6. Picture-Drag-and-Drop Applet

Email Quizzes

Email quizzes allow for multiple-choice or short-answer questions which can be posed in written form, as an audio recording, or both and permit feedback as each question is answered (at the author’s discretion) (see Figure 7).

Figure 7. Email Quiz Applet

Normally, results are shown to the student after the last question is answered. Quiz results can be emailed to the student’s instructor.

Online Dictionaries

Online dictionaries containing over 5,000 words are on the sites for Indonesian, Tagalog, and (to a somewhat limited extent) Thai. Students have easy access to the dictionaries and may look up words while reading or listening. Most entries contain a short English definition, a longer English definition, an L2 definition, and sample sentences. Audio and pictures are available for selected words. Students may print the words looked up during any session (and their accompanying information).

The retrieval software for the Indonesian and Tagalog dictionaries determines and searches on the root of a requested word whose inflected form has been entered by the user. Both of these languages are heavily inflected with suffixes, infixes, and prefixes that qualify root forms. Beginning students often find it difficult to determine the root form to look up when a heavily inflected form is encountered. Since dictionaries are organized by root form, this difficulty can be a serious problem. Thus, for Indonesian, for example, if a student enters the word menyewakan, the program determines that the root is sewa, searches for sewa in the dictionary, and then presents that word’s information. For the Tagalog word itinakbo, the program determines and searches on the root form takbo.
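As an illustration only, the root lookup described above can be sketched in Java as affix stripping checked against a list of known roots. This is not the SEAsite retrieval code, which the paper does not show: the class name RootFinder, the tiny root lists, and the handful of affix rules are all assumptions chosen so that the two example words resolve. The Indonesian example is written with the standard spelling menyewakan.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

/** Toy root finder: strip a few affixes, then keep the first candidate found in a root list. */
class RootFinder {
    // Hypothetical root lists; a real dictionary would hold thousands of entries.
    static final Set<String> INDONESIAN_ROOTS = Set.of("sewa", "tulis", "baca");
    static final Set<String> TAGALOG_ROOTS = Set.of("takbo", "sulat", "basa");

    /** Indonesian: optionally drop -kan/-an/-i, then undo a few forms of the meN- prefix. */
    static Set<String> indonesianCandidates(String word) {
        Set<String> stems = new LinkedHashSet<>();
        stems.add(word);
        for (String suffix : new String[] {"kan", "an", "i"}) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 2) {
                stems.add(word.substring(0, word.length() - suffix.length()));
            }
        }
        Set<String> out = new LinkedHashSet<>(stems);
        for (String s : stems) {
            if (s.startsWith("meny")) {
                out.add("s" + s.substring(4));   // meny- absorbs a root-initial s
            } else if (s.startsWith("men")) {
                out.add(s.substring(3));         // men- before d, c, j
                out.add("t" + s.substring(3));   // men- absorbs a root-initial t
            } else if (s.startsWith("mem")) {
                out.add(s.substring(3));         // mem- before b
                out.add("p" + s.substring(3));   // mem- absorbs a root-initial p
            } else if (s.startsWith("me")) {
                out.add(s.substring(2));         // plain me- before l, m, n, r, w, y
            }
        }
        return out;
    }

    /** Tagalog: optionally drop the i- prefix, then remove an -in- or -um- infix. */
    static Set<String> tagalogCandidates(String word) {
        Set<String> out = new LinkedHashSet<>();
        out.add(word);
        String s = (word.startsWith("i") && word.length() > 3) ? word.substring(1) : word;
        out.add(s);
        // The infix sits right after the first consonant: t-in-akbo -> takbo.
        for (String infix : new String[] {"in", "um"}) {
            if (s.length() > 3 && s.startsWith(infix, 1)) {
                out.add(s.charAt(0) + s.substring(3));
            }
        }
        return out;
    }

    /** Returns the first candidate that is a known root, if any. */
    static Optional<String> findRoot(Set<String> candidates, Set<String> roots) {
        return candidates.stream().filter(roots::contains).findFirst();
    }

    public static void main(String[] args) {
        // prints Optional[sewa]
        System.out.println(findRoot(indonesianCandidates("menyewakan"), INDONESIAN_ROOTS));
        // prints Optional[takbo]
        System.out.println(findRoot(tagalogCandidates("itinakbo"), TAGALOG_ROOTS));
    }
}
```

Generating several candidates and letting the root list decide keeps each rule simple, at the cost of occasional false matches when an over-stripped form happens to be a real root.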
Streaming Audio

Streaming audio with the voice of a native speaker reading text passages is available in many of the lessons. In addition, the dictionary described above can be made available to students at the click of a button, or specialized glossaries may be provided in the case of unusual words or words that are not in the dictionary.

LANGUAGE AND CULTURE

Culture is often the gateway to learning a language. People generally become interested in learning less commonly taught languages either through travel abroad or experiences with speakers of those languages. All sites have extensive information on the culture, art, history, politics, and religion of their respective countries, much of it in both L2 and English.

The Indonesian site has material on Balinese and Javanese dance, the traditional Indonesian music of Gamelan, and a large section on how batik is made and used. Forty-one stories of the epic Ramayana are present (all with L2, English translation, and audio) in addition to 27 fables and folklore stories (with accompanying L2 audio). The site features many vocabulary lessons based on Indonesian concepts such as proverbs, family, the wildlife of the islands, and the foods and recipes of Indonesia. There is also a section on Reformasi, the political movement in early 1998 that led to the downfall of the Suharto government.

The Thai site has a section on the writing system of Thailand and the Thai alphabet. Animated graphics (.gif files) show how the characters are drawn, and a Java writing applet lets students practice their writing skills. Thai shadow puppets, famous temples, Thai cuisine, Thai classical music, and the Songkran (the Thai New Year Festival) are on the site, along with self-assessment exercises for tone discrimination of spoken Thai.

The Tagalog site contains structured language lessons for both beginning and intermediate students of the language.
For both instructional levels, separate lessons concentrate solely on vocabulary acquisition by using a variety of learning strategies. There are also sections on Philippine art, history, pre-colonial times, and Tagalog grammar. Finally, the Tagalog site has a chat room where visitors can send messages in Tagalog and a popular discussion forum for questions about the Tagalog language or features of Philippine life or culture. Our Tagalog staff regularly monitors this forum and provides answers or comments to questions.

The core of the Vietnamese site is a set of 20 language lessons for spoken Vietnamese. A pronunciation guide illustrates the consonants, vowels, and tones of Vietnamese, and another section has information on travel, food, currency, and current Vietnamese news.

We are also developing language and culture materials for Burmese, Khmer, and Lao, which are taught at only a few area studies centers. Published language teaching materials for Burmese, Khmer, and Lao are outdated and have few supplementary teaching resources. We will soon have both beginning and intermediate language learning materials for all three of these languages.

Finally, we are creating a free, searchable database of a variety of pictures from these Southeast Asian countries. Users will be able to enter a few keywords (e.g., “rice fields,” “cities,” and “temples”) and receive a web page with thumbnail pictures that match the search criteria. Clicking the thumbnail will deliver a larger version of the picture which users can save on their own computer.

ORTHOGRAPHY ISSUES

The display of non-Roman orthographies on the web is a complex and difficult issue. The solution described below, used for the SEAsite supported languages (Thai, Vietnamese, Lao, Burmese, and Khmer), evolved over the past several years and was constrained by the following requirements:

1. All interactive activities supported for Roman writing systems (Indonesian and Tagalog) should be equally supported for the non-Roman systems.
2. Users should be able to view non-Roman script with a minimum of trouble.
3. Both PC and Macintosh platforms should be supported.
4. Interactive activities should be more flexible than form-based web pages.
5. The system should be practical enough for authors to create large amounts of content.

The solution we have developed is not conceptually neat, nor does it conform to the emerging Unicode-based standards often advocated. Four years ago, when SEAsite began, the emerging Unicode techniques were not widely understood or supported. These standards are still not firmly in place and multiple display incompatibilities badly complicate L2 script display. Unicode and related HTML standards may be the future, but the present is complex and somewhat broken.

Our solution is based on two main ideas: (a) since Southeast Asian languages are alphabetic and contain a limited number of characters, custom fonts are used to display web page content via the FONT FACE tag; and (b) L2 characters are rendered in Java applets for interactive exercises.

We describe our admittedly less than ideal solution in some detail. At the beginning (1997), custom fonts seemed like a simple way to render text on web pages. A TrueType font was created for each language (and converted to a Macintosh font for Macintosh users), and the FONT FACE tag was used to specify L2 material. As a result, users have only to download and install the font (a few mouse clicks) to see the intended results. However, with new versions of browsers (first Internet Explorer, version 5 and then Netscape, version 6) we found that some L2 characters would no longer display. In fact, the official HTML specification was designed not to show certain characters of a given font but, rather, to display HTML-specified characters.
Older browser versions did what we wanted them to do, but newer ones conform (understandably, but inconveniently from our point of view) to the standard and do what they should do.

Faced with this new (to us) circumstance, we decided on the following solution. For those languages with more characters than could be displayed in a single font under the standards, we created two fonts: one with the most commonly used characters, and a second with the least commonly used characters. Using a web editor such as FrontPage, the author switches to the second font occasionally when a rare character is needed. Users now have to download and install two fonts for such a language. While this is somewhat awkward, it is not in practice a serious impediment to authoring.

Some would argue that L2 characters should simply never be used to replace Roman letters: an ASCII ‘a’ should always be some kind of an ‘a’ and should not be a Thai or Khmer letter. Indeed, if there were a simple and reliable way to avoid this kind of situation, that argument would have much merit.

Perhaps the most serious disadvantage is the fact that the HTML FONT FACE tag has been “deprecated,” which means it may not be supported by new browsers in the future. If this were to happen, it would break all pages supporting this tag, not just in SEAsite, but in all pages on the web. Thus, it seems unlikely to happen in the foreseeable future. If, however, it were to occur, it would probably be possible to write an automated program to translate current pages to some new encoding.

Early on, student interactivity was made a priority for SEAsite. For maximum flexibility, we decided that most such activities would be supported by Java programs (or applets) which run inside a web page, and several Java applets were designed to support instruction. Java applets are more powerful and flexible than HTML page-based JavaScript; however, for security reasons, applets cannot access fonts on the user’s computer.
Therefore some means was needed to present L2 non-Roman script within these applets. Fortunately, the basis for such a system was made available on the web in 1996 by a programmer named Kevin Hughes (who has since disappeared from the web) (see Note 1). Essentially, a Java “GraphicFont” is created as a .gif file. The rendering program, given a letter to display, identifies the part of the graphic that corresponds to the letter and copies it (quickly) onto the screen. Given this beginning, it was not difficult (in principle) to extend the idea to Southeast Asian languages. The major addition was to define and implement custom horizontal positioning of nonspacing characters for each language (i.e., placement of sub- and superscript tones and vowels and, in some cases, compound characters).

Each of the Java applets used by SEAsite is designed to be data-driven. That is, the content of the quiz or exercise is typed in by the author and read by the applet when the quiz is presented. In this way, one applet can be made to present many quizzes or exercises with different content (and in different languages). This scheme is necessary when developing large volumes of material since it is far too labor intensive to create a unique applet for each exercise.

The system, as it has evolved over the past four years to its present form, is far from elegant. It is also far from standard with respect to Unicode and HTML practices and standards. It is somewhat vulnerable to possible future developments in browsers and standards. Its sole virtue is that at present it does work well enough for authors of web pages as well as users.

CONCLUSION

The SEAsite project began with the notion that learning materials for Southeast Asian studies should include both language and cultural materials. Language materials should be highly interactive and should support display of the various orthographies of the languages studied.
Cultural materials should be designed primarily in English to attract the largest possible audience and perhaps motivate some of them to pursue language and area studies. After four years of development, our web server usage logs, a user survey form, and unsolicited emails indicate that we have succeeded. However, anecdotal reports of this nature are of limited use and generality.

It is probably time, now that the web has reached a modicum of maturity as an instructional medium, to conduct more formal analysis of its use. We plan to assess in a more formal manner the strengths and weaknesses of SEAsite materials and of the Internet itself as a learning medium. Formal surveys of users will gather information on students’ attitudes toward SEAsite and Internet instruction. A second area of research will focus on conducting studies of language learning itself in which the Internet is used (almost incidentally) as the medium of instruction. For example, does focused training in distinguishing Thai tones lead to improved comprehension? Does it also lead to improved production? Two advantages of conducting studies over the Internet are that a larger number of subjects can be included (always a problem for the less commonly taught languages) and that student responses can be collected in a central database for convenient analysis. Planning and preparation for both of these areas of research have begun and will be conducted over the next two years.

NOTE

1. A complete description of Hughes’ GraphicFont work can be found at www.seasite.niu.edu/jimstest/GraphicFont/graphic_font_information.htm. The authors would be pleased to answer any questions or supply code for the Southeast Asian extensions developed for the system described here.

REFERENCES

Brown, H. D. (1994). Teaching by principles: An interactive approach to language pedagogy. Englewood Cliffs, NJ: Prentice Hall Regents.

Zimmerman, C. B. (1997). Historical trends in second language vocabulary instruction. In J. Coady & T. Huckin (Eds.), Second Language Vocabulary Acquisition (pp. 5-19). Cambridge: Cambridge University Press.

AUTHORS’ BIODATA

George M. Henry (Ed.D., Instructional Technology; M.S., Computer Science, Northern Illinois University) is Associate Professor of computer science at Northern Illinois University. His research interests are in the areas of CALL and computer processing of Southeast Asian language orthography. He is a project Co-director for SEAsite.

Robert A. Zerwekh (Ph.D., Philosophy, University of Illinois; M.S., Computer Science, Northern Illinois University) is Associate Professor of computer science at Northern Illinois University. His research interests are in database design and implementation, in conducting education research via the web, and the web as a learning/instructional medium. He is a project Co-director for SEAsite.

AUTHORS’ ADDRESSES

George Henry
Department of Computer Science
Northern Illinois University
DeKalb, IL 60115
Phone: 815/753-6496
Fax: 815/753-0342
Email: henry@cs.niu.edu

Robert Zerwekh
Department of Computer Science
Northern Illinois University
DeKalb, IL 60115
Phone: 815/753-6949
Fax: 815/753-0342
Email: zerwekh@cs.niu.edu