Western Malayo-Polynesian Although Western Malayo-Polynesian is a convenient cover term for the Austronesian languages of the Philippines, western Indonesia (Borneo, Sumatra, Java-Bali-Lombok, Sulawesi), mainland Southeast Asia, Madagascar, and at least Chamorro and Palauan in western Micronesia, it is in effect a catchall category for the Malayo-Polynesian languages that do not exhibit any of the innovations characteristic of Central-Eastern Malayo-Polynesian and may very well contain several primary branches of Malayo-Polynesian. As mentioned previously, some of the largest and best-known Austronesian languages—including Ilokano, Tagalog, Cebuano, Malay, Acehnese, Toba Batak, Minangkabau, Sundanese, Javanese, Balinese, Buginese, Makasarese, and Malagasy—are Western Malayo-Polynesian. Central Malayo-Polynesian (CMP) The Central Malayo-Polynesian languages are found throughout much of eastern Indonesia, including the Lesser Sunda Islands from Sumbawa through Timor, and most of the Moluccas. Many of the changes that define this linguistic group cover most of the languages but do not reach the geographic extremes, and the group has therefore been questioned by some scholars. Few of the languages are large or well-known, but those for which fuller descriptions are available include Manggarai and Ngadha, spoken on the island of Flores; Roti, spoken on the island of the same name; Tetum, spoken on the island of Timor; and Buruese, spoken on the island of Buru in the central Moluccas. South Halmahera–West New Guinea (SHWNG) This small group of Austronesian languages is found in the northern Moluccan island of Halmahera and in the Doberai Peninsula (also called Vogelkop or Bird’s Head) of western New Guinea. Preliminary descriptions exist only for Buli of Halmahera and Numfor-Biak and Waropen of western New Guinea; most of the languages are known only from short word lists. 
Oceanic (OC) The Oceanic subgroup is the largest and best-defined of all major subgroups in Austronesian. It includes all the languages of Polynesia, all the languages of Micronesia (except Palauan and Chamorro), and all the Austronesian languages of Melanesia east of the Mamberamo River in Indonesian New Guinea. Some of the better-known Oceanic languages are Motu of southeastern New Guinea, Tolai of New Britain, Sa’a of the southeastern Solomons, Mota of the Banks Islands in northern Vanuatu, Chuukese (Trukese) of Micronesia, Fijian, and many Polynesian languages, including Tongan, Samoan, Tahitian, Maori, and Hawaiian. Yapese, long considered unplaceable, now appears to be Oceanic, although its place within Oceanic remains obscure. Lower-level subgroups Philippine languages One of several identifiable lower-level units within these major subgroups is the Philippine group within Western Malayo-Polynesian. It consists of Yami, spoken on Lan-yü (Botel Tobago) island off the southeastern coast of Taiwan; almost all the languages of the Philippine Islands; and the Sangiric, Minahasan, and Gorontalic languages of northern Sulawesi in central Indonesia. The Samalan dialects—spoken by the Sama-Bajau, the so-called sea gypsies in the Sulu Archipelago, and elsewhere in the Philippines—do not appear to belong to the Philippine group, and their exact linguistic position within the Austronesian family remains to be determined. Although the term Philippine language or Philippine-type language has been applied to such languages as Chamorro of the Mariana Islands or the languages of Sabah in northern Borneo, this label is typological rather than genetic. Polynesian languages Perhaps the best-known lower-level subgroup of Austronesian languages is Polynesian, which is remarkable for its wide geographic spread yet close relationship.
The “Polynesian triangle,” defined by Hawaii, Easter Island, and New Zealand, encloses Polynesia proper, an area about twice the size of the continental United States. In addition, some 18 Polynesian-speaking societies, the above-mentioned Polynesian Outliers, are found in Micronesia and Melanesia. The Polynesian languages generally are divided into two branches, Tongic (Tongan and Niue) and Nuclear Polynesian (the rest). Nuclear Polynesian in turn contains Samoic-Outlier and Eastern Polynesian. Maori and Hawaiian, two Eastern Polynesian languages that are separated by some 5,000 miles of sea, appear to be about as closely related as Dutch and German. The closest external relatives of the Polynesian languages are Fijian and Rotuman, a non-Polynesian language spoken by a physically Polynesian population on the small volcanic island of Rotuma northwest of the main Fijian island of Viti Levu; together with Polynesian, Fijian and Rotuman form a Central Pacific group. A number of proposals have been made regarding the immediate relationships of the Central Pacific languages; the majority of these suggest a grouping of Central Pacific with certain languages in central and northern Vanuatu, but these proposals remain controversial. Nuclear Micronesian Most of the languages of Micronesia are Oceanic, and, with the possible exception of Nauruan, which is still poorly described, they form a fairly close-knit subgroup that is often called Nuclear Micronesian. Palauan, Chamorro (Mariana Islands), and Yapese (western Micronesia) are not Nuclear Micronesian languages; the former two appear to be products of quite distinct migrations out of Indonesia or the Philippines, and, while Yapese probably is Oceanic, it has a complex history of borrowing and does not readily seem to form a subgroup with any other language. Aberrant languages Yapese is one of several problematic languages that can be shown to be Austronesian but that share little vocabulary with more typical languages. 
Other languages of this category are Enggano, spoken on a small island of the same name situated off the southwest coast of Sumatra, and a number of Melanesian languages. In the most extreme cases the classification of a language as Austronesian or non-Austronesian has shifted back and forth repeatedly, as with the Maisin language of southeastern Papua New Guinea (now generally regarded as an Austronesian language with heavy contact influence from Papuan languages). Other controversial or aberrant languages are Arove, Lamogai, and Kaulong of New Britain, Ririo and some other languages of the western Solomons, Asumboa of the Santa Cruz archipelago, Aneityum and some other languages of southern Vanuatu, several languages of New Caledonia, and Nengone and Dehu of the Loyalty Islands in southern Melanesia. Atayal of northern Taiwan is an example of a language once considered to be highly aberrant in vocabulary, but it is much less distinctive now that researchers have found that the Squliq dialect (which was chosen as representative of Atayal) exhibits idiosyncratic changes owing to a historical form of “speech disguise” characteristic of men’s speech. This feature is still preserved in the Mayrinax dialect of the Cʔuliʔ dialect cluster. Prehistoric inferences from subgrouping The view, current from roughly 1965 to 1975, that Melanesia is the area of greatest linguistic diversity in Austronesian and that the Austronesian homeland therefore must have been in Melanesia has been shown to be inconsistent both with the comparative method of linguistics and with archaeological indications that Austronesian speakers entered the western Pacific from island Southeast Asia about 2000 BCE. It has accordingly been abandoned by virtually all scholars. 
Both linguistic and archaeological evidence point to an initial dispersal of Austronesian languages from Taiwan several centuries after Neolithic settlers introduced grain agriculture, pottery making, and domesticated animals to the island from the adjacent mainland of China about 4000 BCE. By perhaps 3500 BCE, populations bearing a clear cultural resemblance to those in Taiwan had begun to appear in the northern Philippines, and within a millennium similar material traces appear throughout Indonesia. The linguistic evidence suggests a steady southward and eastward movement, with Austronesian speakers moving around the northern coast of New Guinea into the western Pacific about 2000 BCE. From the region of New Guinea and the Bismarck Archipelago settlers fanned out very rapidly, crossing the sea with highly seaworthy outrigger canoes. In Oceania the dispersal of Austronesian-speaking peoples is most closely associated archaeologically with the distribution of Lapita pottery. Because the earliest Lapita sites in Fiji and western Polynesia are only three or four centuries younger than the earliest dated Lapita site in western Melanesia, the colonization of Melanesia as far east as Fiji appears to have been accomplished within 15 or 20 generations. There is a puzzling thousand-year gap before the settlement of central and eastern Polynesia, with Hawaii being settled only within the past 1,500–1,700 years and New Zealand within roughly the past millennium. The settlement history of Micronesia is more complex: Palau and the Mariana Islands were settled by two migrations which were distinct from that associated with Lapita pottery. Most of the low coral atolls of the Caroline Islands were settled by 2000 BP, but some radiocarbon dates from the Marshall Islands suggest that Austronesian speakers may have reached the atolls of Micronesia not long after the settlement of Fiji and western Polynesia.
External relationships Speculation concerning the external relationships of Austronesian languages has ranged far and wide. In the first half of the 19th century Bopp, who was a distinguished Indo-Europeanist, became convinced of the relationship of Indo-European to Austronesian. This theme was taken up again in the 1930s by Brandstetter. In 1942 the American linguist Paul K. Benedict initiated the Austro-Tai hypothesis, a proposed connection between Austronesian and both the Tai languages and various minority (Kadai) languages on the mainland of Southeast Asia. Other researchers have proposed connections with Japanese (as has Benedict himself), the Papuan languages of New Guinea, various American Indian languages, Chinese, and Ainu. In short, almost every language family that might conceivably be related to Austronesian simply on grounds of a priori geographic proximity has been proposed as a relative, the one notable exception to date being Australian Aboriginal languages. Most of these proposals are speculative and have not achieved a general following. Benedict’s Austro-Tai hypothesis has perhaps received the widest attention in recent years, as it has been advocated in a large number of publications. However, in some ways the most compelling hypothesis for a wider language grouping that includes Austronesian is the Austric hypothesis, linking the Austroasiatic languages (the Munda languages of eastern India and the Mon-Khmer languages of mainland Southeast Asia) with Austronesian. The original hypothesis, first proposed in 1906 by Wilhelm Schmidt and long neglected by most linguists, has been greatly strengthened by more recent research. Structural characteristics of Austronesian languages Syntax Word order Although some linguists have questioned the usefulness of the notion of subject in Philippine languages, it remains a pivotal concept in typological studies of word order. The great majority of Formosan and Philippine languages are verb–subject–object (VSO) or VOS.
This statement is true of virtually all the Formosan languages, with the minor qualification that auxiliaries and markers of negation may precede the main verb. Some contemporary languages, such as Thao and Saisiyat, have SVO word order, but there are indications that this is a relatively recent adaptation to the similar word order of Taiwanese, the Chinese language with which the Formosan languages have been in longest contact. Most languages of western Indonesia—such as Malay, Javanese, or Balinese—are SVO. However, a smaller number of languages, including Malagasy, the Batak languages of northern Sumatra, and Old Javanese (as opposed to modern Javanese), begin sentences with a verb. The majority of Austronesian languages in both eastern Indonesia and the Pacific are also SVO. The major exceptions to this pattern are in coastal areas of New Guinea, where a number of Austronesian languages are SOV, and the Polynesian languages and Fijian, which are VSO. The SOV languages of New Guinea also exhibit other features universally characteristic of verb-final languages, such as the use of postpositions (e.g., “the house in”) rather than prepositions (“in the house”). It is generally agreed that these Austronesian languages evolved to their present state as a result of generations of contact with Papuan languages, which typically are SOV. Verb systems Perhaps the most fundamental distinction in the verb systems of Austronesian languages is the division into stative and dynamic verbs. Stative verbs often translate as adjectives in English, and in many Austronesian languages it is doubtful whether a category of true adjectives exists. Examples of stative verbs are ‘to be afraid,’ ‘to be sick/painful,’ ‘to be new,’ ‘to sleep/to be asleep,’ and colour words. In some languages the stative prefix ma- can be added to higher numerals, as in Maranao ma-gatos ‘one hundred.’ Dynamic verbs generally are more complex than stative verbs. 
Most Formosan and Philippine languages and many of the languages of Sulawesi have a large inventory of affixes used to create different nuances of meaning in verbal or nominal stems. Most noteworthy is the system of verbal focus, which has been the centre of controversy and the subject of many conflicting interpretations since 1917, when Leonard Bloomfield provided the first detailed description of Tagalog syntax. The major verbal focuses of Tagalog can be illustrated as follows: A sentence that focuses on the actor (subject) is marked by -um-; for example, b-um-ilí ang lalake ng tinapay sa tindahan ‘the man bought some bread at the store’ (literally, ‘buy ang man ng bread sa store’) or b-um-ilí si Maria ng tinapay sa tindahan ‘Maria is buying/bought some bread at the store’ (literally, ‘buy si Maria ng bread sa store’). A sentence that focuses on the patient (object) is marked by -in- in the past and by -in in the nonpast; for example, b-in-ilí ni Maria ang tinapay sa tindahan ‘Maria bought the bread at a/the store’ (literally, ‘bought ni Maria ang bread sa store’) or bilh-ín ni Maria ang tinapay sa tindahan ‘Maria is buying the bread at a/the store.’ A sentence that has a locative focus is marked by -an; for example, b-in-ilhán ng babae ng tinapay ang tindahan ni Aling Maria ‘the woman bought some bread at Maria’s store’ (literally, ‘bought ng woman ng bread ang store’). A sentence with an instrumental or benefactive focus is marked by i-; for example, i-b-in-ilí ni Maria ng tinapay ang pera nang tatay-niyá ‘Maria bought some bread with her father’s money’ or i-b-in-ilí ni Maria ng tinapay si Juan ‘Maria bought (some) bread for Juan.’ In each of the above sentences one noun is marked as being in focus. Focused personal nouns (proper names or common nouns that can be used as proper names, such as ‘Mother’ or ‘Father’) are preceded by si.
Focused common nouns are preceded by ang, and the combination is commonly called the “ang-phrase.” The syntactic relationship that the focused noun bears to the verb is signaled by the focus affix (e.g., actor, patient). Moreover, focused noun phrases are definite, or old information, while nonfocused noun phrases may be either definite or indefinite. The speaker’s choice of focus thus depends to a large extent on discourse factors. Similar systems of encoding syntactic relationships are widespread in Formosan and Philippine languages, in the languages of Sabah (formerly North Borneo), in those of northern Sulawesi (northern Celebes), in the Chamorro language of western Micronesia, and in Malagasy. Somewhat less similar systems with some of the same features are found in the Batak languages of northern Sumatera (northern Sumatra) and in Old Javanese. One school holds that focus is voice. Under this interpretation such languages as Tagalog have only one active voice but three types of passives: a direct passive, a local passive, and an instrumental or benefactive passive. A second school holds that focus is case-marking: the case roles of subjects are marked by the focus affix on the verb. What distinguishes focus systems from the simple active-passive voice systems of such languages as Malay or modern Javanese is their ability by means of verbal affixation to express prepositional phrases as subjects. When the prepositional phrase is not in focus it is expressed as a preposition followed by a noun rather than as an ang-phrase: compare the third example above, b-in-ilh-án ng babae ng tinapay ang tindahan ‘the woman bought the bread at the store,’ where ang tindahan ‘the store’ is in focus and the locative relationship is expressed by the verb suffix -an, with any of the other sentences that contain tindahan ‘store,’ where the locative relationship is expressed by the preposition sa. 
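The affixation patterns in the Tagalog examples above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of Tagalog morphology as a whole: the function names `infix` and `focus_forms` and the parameter `suffixing_stem` are hypothetical, stress marks are omitted, the stem is assumed to begin with at most one consonant, and the alternation between bili and bilh- before a suffix is supplied by hand from the examples rather than derived by rule.

```python
VOWELS = set("aeiou")


def infix(stem, affix):
    """Place an infix such as 'um' or 'in' after the stem's initial consonant.

    Simplifying assumption: a vowel-initial stem takes the affix as a prefix.
    """
    if stem[0] in VOWELS:
        return affix + stem
    return stem[0] + affix + stem[1:]


def focus_forms(stem, suffixing_stem):
    """Build the focus forms illustrated in the text for a single verb.

    `suffixing_stem` is the stem shape used before a suffix (bilh- for bili);
    it is passed in explicitly because this sketch does not model that change.
    """
    return {
        "actor": infix(stem, "um"),                            # -um-
        "patient_past": infix(stem, "in"),                     # -in- (past)
        "patient_nonpast": suffixing_stem + "in",              # -in (nonpast)
        "locative_past": infix(suffixing_stem + "an", "in"),   # -an plus past -in-
        "instrumental_benefactive_past": "i" + infix(stem, "in"),  # i- plus past -in-
    }


forms = focus_forms("bili", "bilh")
for focus, form in forms.items():
    print(focus, form)
```

Under these assumptions the sketch yields the unaccented shapes of the forms cited above (bumili, binili, bilhin, binilhan, ibinili). What it cannot show is the part of the system that matters for the voice and case-marking interpretations: the choice of which noun phrase receives ang or si.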
One feature of the verb systems of many Austronesian languages is particularly noteworthy: nonsubject actors and possessors are marked in the same way (in Tagalog these are marked with the particle ni). As a result ‘was bitten by the dog’ and ‘the dog’s biting (of something)’ have identical structures. Because of this ambiguity the focus affixes in most focus languages create both verbs and nouns. Where focus has been lost, as in much of Indonesia and the Pacific, the remnant affixes may be used only to create nouns. Pronouns Almost all Austronesian languages distinguish two forms of ‘we’: an inclusive form (listener included) and an exclusive form (listener excluded). Many languages in the Philippines have a special dual inclusive (‘you and me’). In addition to singular and plural numbers, some Oceanic languages distinguish a dual number (‘we two,’ ‘you two,’ ‘the two of them’). A few Oceanic languages distinguish a fourth number that is greater than two but smaller than a typical plural. Historically, this number derives from the Proto-Austronesian word for ‘three,’ but it may in fact apply to numbers up to five and so is sometimes called “paucal” (‘a few’). Gender is rarely if ever distinguished. Probably the most spectacular pronominal feature in Austronesian languages is the expression of possessive marking in Oceanic languages. In many of the languages of Melanesia, nouns are marked for one of two types of possessive relationship, generally called “inalienable” and “alienable.” Inalienable categories include body parts, certain kinship relationships, and such “spiritual” aspects of an individual as his shadow (often associated with the soul) and his name.
Inalienable possession is marked by suffixing a possessive pronoun to the possessed noun, as in Fijian na mata-na ‘his eye’ (literally, ‘[article] eye-his’) or na tama-qu ‘my father.’ Alienable possession is expressed by suffixing the possessive pronoun to a generally preposed classifying particle that specifies any of several possible relationships between the possessed noun and the possessor, as in Fijian na nona vale ‘his house’ (literally, ‘[article] neutral-his house’), na ke-na ika ‘his fish (to eat)’ (‘[article] edible-his fish’), and na me-na dovu ‘his sugarcane (to suck the juice from)’ (‘[article] drinkable-his sugarcane’). The distinction between neutral and edible possession is widespread in Oceanic languages, and it appears in a few languages of eastern Indonesia. The further distinction of drinkable possession has a more limited distribution. The Polynesian languages have a somewhat different system of possessive marking. The most prominent feature of this system is the contrast between what are sometimes called “dominant” and “subordinate” possession. In dominant possession the possessor generally has a relationship of control, as with Hawaiian ka ki‘i a Lani ‘the picture taken or painted by Lani,’ while in subordinate possession this sense of control does not exist, as in ka ki‘i o Lani ‘the picture taken or painted of Lani.’ Numbers and number classifiers Most Austronesian languages have a decimal system of counting, as illustrated in the Table. Others, such as Ilongot of the northern Philippines and some of the languages of the Lesser Sunda Islands in eastern Indonesia, have quinary systems (i.e., systems based on five). In the New Guinea area several Austronesian languages have radically restructured number systems that probably result from intensive contact with neighbouring Papuan languages. 
An example is Gapapaiwa of Milne Bay, with sago ‘one,’ ruwa ‘two,’ aroba ‘three,’ ruwa ma ruwa ‘four’ (literally, ‘two and two’), miikovi ‘five’ (‘hand finished’), miikovi ma sago ‘six,’ miikovi ma ruwa ‘seven,’ and so on. In such systems counting is often limited to relatively small quantities. A number of the languages of Indonesia and the Pacific use number classifiers in counting objects, as with Bahasa Indonesia se-buah rumah ‘a house’ (literally, ‘one-fruit house’), se-orang guru ‘a teacher’ (literally, ‘one-person teacher’), or se-batang rokok ‘a cigarette’ (literally, ‘one-trunk cigarette’). In some languages of Micronesia the traditional counting systems were highly complex, with upwards of 30 number classifiers that distinguished counted objects by their shape, animateness, and other features. Spatial orientation Some Austronesian languages have terms for the cardinal directions east, west, north, and south, but in most cases these appear to have developed after European contact and may sometimes be due to inaccurate reporting by Europeans. The system of directional orientation found in many Austronesian languages is constructed on two axes, a land-sea axis and a monsoon axis. The land-sea axis is very widespread among Austronesian-speaking peoples. Two widely separated examples are Thao (central Taiwan) tana-saya ‘uphill, toward the mountains,’ tana-raus ‘downhill, toward the sea’ and Hawaiian mauka ‘toward the mountains,’ makai ‘toward the sea.’ The monsoon axis is geographically more restricted, but the earlier reconstructed terms *habaRat ‘west monsoon’ and *timuR ‘southeast monsoon’ have been preserved in languages outside the monsoon region, though with change of meaning (e.g., Samoan afā ‘storm, gale, hurricane,’ timu ‘be rainy’).
Demonstrative pronouns often distinguish two forms of ‘there.’ In some languages these correspond to second-person and third-person pronominal reference: ‘there (near the listener)’ versus ‘there (near a third person).’ In other languages a distinction is made between a referent that is visible versus a referent that is not visible. Morphology and canonical shape Verb morphology The Austronesian languages of Taiwan, the Philippines, northern Borneo, and Sulawesi and some other languages (such as Malagasy, Palauan, and Chamorro) are characterized by a very rich morphology, which functions in both verb-forming and noun-forming processes. Some languages use affixation to encode many types of syntactic relationships that are expressed in most other languages through the use of free words. Thao of central Taiwan, for example, allows aspect markers to be attached to prepositional phrases, as in in-i-nay yaku ‘I was here’ (literally, ‘[past]-location-this I’). In Thao, relative clauses are expressed through attributive constructions that may use complex nouns derived by affixation, as in m-ihu a s-in-aran-an yanan sapaz ‘the place where you walked has footprints’ (‘your [ligature-past]-walking-place has footprints’). Most of the so-called focus affixes in such languages have both verbalizing and nominalizing functions. Many of the languages of Sulawesi and eastern Indonesia have prefixed subject markers on the verb. In some languages these co-occur with full free pronouns marking the subject and so function like a system of agreement.
In some of the languages of western Melanesia, such as Motu, the verb complex consists of a prefixed subject marker, the verb stem, and a suffixed object marker, together with free nouns or pronouns marking subject and object, producing structures such as ‘the man the dog he-kicked-it’ for ‘the man kicked the dog.’ In a case such as this, the structure of the verb complex provides a clue that the current SOV order of sentence constituents has developed from an earlier SVO order. Reduplication Reduplication takes numerous forms and has a great variety of functions in Austronesian languages. Partial reduplication of a verb stem is used to mark the future tense in both Rukai of Taiwan and Tagalog of the Philippines, as in Tagalog l-um-akad ‘walk’ but la-lakad ‘will walk’ or s-um-ulat ‘write,’ su-sulat ‘will write.’ Full reduplication is used to mark plurality of nouns in Bahasa Indonesia, as with anak ‘child’ but anak anak ‘children.’ In many languages reduplication is used together with affixation to express a variety of semantic nuances. The pattern seen in Indonesian anak anak-an ‘doll’ or orang orang-an ‘scarecrow’ (orang ‘person’) is only one of many that occur in various languages. Submorphemes Linguists have generally maintained that the smallest meaning-bearing units of language structure are morphemes, elements that are isolated by the contrast of partially similar words, as in berry: cranberry (hence both cran and berry are morphemes of English). However, English words such as glow, glimmer, glisten, glitter, glare, glint, gloss, and the like exhibit a recurrent association of sound and meaning without contrast. Many Austronesian languages, particularly in insular Southeast Asia, show similar types of recurrent sound-meaning associations that are not defined by contrast. In the great majority of cases, these consist of the last syllable of a morpheme. 
A clear illustration is seen in Malay, where about 40 two-syllable words end in -pit and roughly half of these have meanings that can be characterized as referring to the approximation of two surfaces, as in (h)apit ‘pressure between two disconnected surfaces,’ capit ‘pincers,’ mencepit ‘to nip,’ dempit ‘pressed together, in contact,’ gapit ‘nipper, clamp,’ kempit ‘carry under the arm,’ and limpit ‘in layers.’ Canonical shape The term canonical shape refers to the clearly marked preferences that some languages show for number of syllables, sequencing of consonants and vowels, and so on in the construction of words. Many Austronesian languages show a clear preference for a disyllabic (two-syllable) canonical shape in content words (words that have a reference rather than a purely grammatical function). Where this preference is violated by the operation of other forces, it often reasserts itself through special mechanisms. Javanese əri ‘thorn’ passed through a stage in which it was ri but gained a schwa to meet the preferred two-syllable canonical shape. Many other quite varied examples of this type can be shown for languages throughout the Austronesian family. In view of the disyllabic canonical target in Austronesian languages, the words that represent certain meanings are often conspicuous for their length. An example is the word for ‘butterfly’: Paiwan (Taiwan) quLipepe, Puyuma (Taiwan) Halivanvan, Bunun (Taiwan) talikoan, Ilokano (Philippines) kulibangbang, Tagalog (Philippines) alibangbang, Iban (Borneo and Malaysia) kelebembang, Tae’ (Sulawesi) kalubambang, Sichule (Sumatra) alifambang, Gani (Halmahera) kalibobo, Numbami (north coast of New Guinea) kaimbombo. This word contains a prefix or family of prefixes that almost invariably is fossilized, thus creating a much longer word than is typical of Austronesian languages. 
The same phenomenon is seen with certain other meanings, such as ‘ant,’ ‘firefly,’ ‘leech’ (two types), ‘echo,’ ‘dizzy,’ ‘rainbow,’ ‘whirlpool/whirlwind,’ and ‘hair whorl.’ In the Philippines clusters consisting of “heterorganic” consonants (consonants produced at different places in the mouth) are common in the middle of words (Tagalog hagpós ‘loose, slack,’ puknát ‘unglued, detached’), but this is not typical of Austronesian languages in most other areas, where consonants tend to alternate with vowels in CVCV sequences. Most Austronesian languages do not permit final palatal consonants, although in a few cases these have developed through secondary change. Other languages have a severely restricted inventory of possible final consonants in relation to consonants in other positions, as with Makasarese of southern Sulawesi, where the only possible final consonants are the velar nasal -ŋ and the glottal stop (a consonant produced by suddenly closing the vocal cords so as to interrupt the outward flow of air from the lungs). In most Oceanic languages and some Austronesian languages in other areas, all words end in a vowel. This is the result of either of two types of change: loss of final consonants or addition either of an “echo” vowel or of an invariant “supporting” vowel. Fijian and the Polynesian languages show open final syllables as a result of the first type of development; Mussau of western Melanesia and Malagasy show open final syllables as a result of the second type (see Table). Phonetics and phonology Size of phoneme inventory Most Austronesian languages have between 16 and 22 consonants and 4 or 5 vowels. Exceptionally large consonant inventories are found in the languages of the Loyalty Islands in southern Melanesia, and exceptionally small consonant inventories in the Polynesian languages. 
Hawaiian has the second smallest inventory of phonemes, or distinctive sounds, of any known language, with just eight consonants (p, k, ‘ [glottal stop], m, n, l, h, and w) and five vowels (a, e, i, o, and u). Vowel systems in Austronesian languages tend to be simple. Many languages in Taiwan, the Philippines, and Indonesia have just four contrasting vowels: i, u, a, and e, an indistinct mid-central vowel. The great majority of Oceanic languages have a five-vowel system: i, u, e, o, and a. Larger vowel systems are found in a number of Nuclear Micronesian languages, in some of the languages of Melanesia (such as Sakao of north-central Vanuatu), and in a few of the Chamic languages. Phonetic types In view of the large number of Austronesian languages it is not surprising that observers have recorded a wide range of speech sounds, including some that are quite rare in the world’s languages. Some Formosan languages have a uvular stop (written q), which is a consonant sound produced by drawing the backmost part of the tongue down to touch the wall of the pharynx. A number of the languages of Borneo and some other areas have unusual nasal consonants belonging to either of two types: “preploded” nasals, in which nasal consonants are heard as /-pm/, /-tn/, and /-kng/ at the end of a word, and what might be called “postploded” nasals /-mb/, /-nd-/, or /-ngg-/, in which a nasal consonant between vowels is followed by a stop that is almost too short to hear. Preglottalized or implosive consonants are found in several of the languages of central Taiwan, in a number of the languages of northwestern Borneo, in the Chamic languages of mainland Southeast Asia, and in several languages of the Lesser Sunda Islands. In Fijian and many other languages of Melanesia, voiced stops b, d, and g are automatically preceded by a nasal: mb, nd, and ngg. 
Perhaps the most unusual consonant types reported in Austronesian are prenasalized bilabial trills, made by trilling the lips following an m, and apico-labial stops (nasals and fricatives), which are made by touching the upper lip with the tip of the tongue. The former are quite common in the languages of Manus Island in the Admiralty Islands of western Melanesia, and the latter are found in a number of languages scattered throughout central Vanuatu. Many Austroasiatic languages of the Mon-Khmer family found on mainland Southeast Asia distinguish two voice registers, a breathy, or “sepulchral,” voice (made by relaxing the vocal cords) and a clear voice (made by tensing the vocal cords). As a result of generations of bilingualism this feature has been acquired by most of the Chamic languages. Together with other Mon-Khmer characteristics, these areal adaptations in the Chamic languages caused Schmidt in 1906 to incorrectly classify them as “Austroasiatic mixed languages.” Where they have been further exposed to languages with lexical tone, as Eastern Cham (in contact with Vietnamese) or Tsat (in contact with both Chinese and Tai-Kadai tone languages on Hainan Island in southern China), at least two Chamic languages have become largely monosyllabic and tonal. Tonal contrasts are also reported for a few Austronesian languages in two widely separated parts of New Guinea and in southern New Caledonia. Despite contact with Chinese, which in some cases must date back at least three centuries, none of the aboriginal languages of Taiwan are tonal. 
Many languages in the Philippines use stress to distinguish words that are otherwise identical in form, as in Tagalog sábat ‘design woven into cloth or matting’ versus sabát ‘stop pin or lug.’ Some languages outside the Philippines use accent contrasts to distinguish different forms of the same word, as in Toba Batak (northern Sumatra) gógo ‘push hard!’ versus gogó ‘strong’ or díla ‘tongue’ versus dilá ‘a big talker.’ The origin and history of accent contrasts remain one of the major unresolved problems in the study of the Austronesian languages.
Lexical semantics and sociolinguistics
Lexical semantics
Many common words in Austronesian languages are not easily translated into English or most other European languages. Examples of noncorrespondence can be seen in the comparison of several Malay words to English meanings: (1) one to many: Malay kaki corresponds to both ‘foot’ and ‘leg’ in English, (2) many to one: Malay rambut and bulu both correspond to English ‘hair,’ the former referring exclusively to hair of the head and the latter to body hair, downy feathers, plant floss, and the like, and (3) some combination of many to one and one to many: Malay adik corresponds to both ‘brother’ and ‘sister’ in English but is used only to refer to siblings younger than the speaker; Malay kakak also means both ‘brother’ and ‘sister’ but is used to refer to older siblings. In many Austronesian languages there is no general term for the verbs ‘to cut’ or ‘to carry,’ or for the noun ‘root,’ but rather numerous terms to specify the type of activity or type of structure in much greater detail than is typical in European languages.
Speech levels and honorific registers
Javanese and several languages in close contact with it (including at least Sundanese and Balinese) have developed a linguistic reflection of social stratification. Javanese uses three speech levels, distinguished by choice of vocabulary. 
The primary distinction is between Kromo, a high form used when speaking to social superiors, and Ngoko, a low or neutral form used when speaking to social equals or inferiors. Further subdivisions are recognized within Kromo, and in addition a small number of words called Madya (Middle) contain elements of both Kromo and Ngoko styles. In Samoa a special vocabulary is used when addressing persons of chiefly rank.
Male-female speech differences are covert in many languages, evident chiefly in the greater frequency with which speakers of one sex use particular forms; in some languages, however, gender-associated differences become conventionalized and rigid. The most notable case reported for an Austronesian language is in the Mayrinax dialect of Atayal in northern Taiwan, where women’s speech is historically a more conservative variety and men’s speech shows unpredictable changes in pronunciation owing to the addition of entire syllables to earlier word forms. These innovations present in Atayal men’s speech may have originated as a form of speech disguise. In Tagalog and some other languages of the Philippines, as well as in Malay, forms of “backward speech” (which have as their primary purpose the concealment of messages) have been reported for adolescents. Such phenomena are functionally not unlike English pig Latin. Iban of northwestern Borneo shows an unusually large number of words with what appear to be reversals of the meanings found in cognates in other languages. This, too, may reflect an earlier tradition of speech disguise that succeeded in altering some meanings of the language for all speakers.
Reconstruction and change
Grammar
Proto-Austronesian (PAN) probably had a verb–object–subject (VOS) word order. Four PAN affixes are commonly recognized: *Si- marked instrumental focus (abbreviated IF), *-um- actor focus (AF), *-an locative focus (LF), and *-en patient focus (PF). In addition, the infix *-in- marked completive (c) aspect or past tense. 
The completive infix could co-occur with *Si-, *-um-, and *-an, but, in the completive form of the patient focus, *-in- was used without the suffix *-en, and *-in- thus simultaneously marked two functions: *k-um-aen i aku (AF) ‘I am eating,’ *k-um-in-aen i aku ‘I was eating,’ *kaen-en ni aku (PF) ‘is eaten by me/what I am eating,’ *k-in-aen ni aku (PFc) ‘was eaten by me/what I ate.’ This fusion of functions in the infix *-in-, when used with the patient focus, has been carried down to many attested languages, including languages that no longer have a focus system. Most views of grammatical change in Austronesian assume that Philippine-type focus systems continue a type of structure that was present from the earliest recoverable period. Not only do widely scattered languages, including Malagasy and Chamorro, have such systems, but many other languages have what appear to be fragments of a formerly more fully integrated system of particles and affixes. For example, in Tagalog the particle si, indicating actor focus for personal nominals, is syntactically opposed to ni, marking genitive/agentive. In Malay, a nonfocus language with a simple active-passive verb contrast corresponding to the focus systems of Philippine languages, ni has disappeared and the particle si has no grammatical function other than simply marking personal names or attributes used as names with a mildly pejorative connotation, as in si Ahmad ‘Ahmad’ or si Gemuk ‘Chubby’ (compare gemuk ‘obese’). It is generally agreed that the focus affixes (with the possible exception of *-um-) had both verbalizing and nominalizing functions. A more extreme view, not widely held, maintains that the focus affixes were originally used only to create nominals and were reinterpreted as verbal affixes in the separate histories of many daughter languages. Proto-Oceanic diverged widely from this type of syntax. 
It appears to have been SVO, and most of the focus morphology of Proto-Austronesian was either lost or reinterpreted as nominalizing morphology. A major debate that has continued for three decades concerns the classification of various of the Polynesian languages as either accusative (having both transitive and intransitive subjects distinguished from objects) or ergative (having intransitive subjects and objects distinguished from transitive subjects). Differing theory-dependent definitions of these terms have not facilitated agreement.
Morphology
The morphology of verbal focus has attracted the most attention in Austronesian studies, but other areas of morphology are also of interest. One such area is that of Ca-reduplication, a pattern of derivation in which the first consonant of the stem is copied and followed by a fixed vowel (stereotypically *a). This pattern was first recognized with the numbers, where *esa ‘one,’ *duSa ‘two,’ *telu ‘three,’ *Sepat ‘four,’ *lima ‘five,’ and the like are matched by a corresponding set of numbers *a-esa, *da-duSa, *ta-telu, *Sa-Sepat, *la-lima. The unreduplicated set was used in serial counting or in counting nonhuman objects, and the reduplicated set in counting human beings. In some daughter languages (such as Tagalog) elements from both sets have survived and have been combined into a single set. In addition, Ca-reduplication was used rather productively to derive instrumental nouns from verbs.
Phonology
Proto-Austronesian probably had the following consonant inventory: voiceless stops *p, *t, *C, *c, *k, and *q; voiced stops *b, *d, *z, *j, and *g; nasals *m, *n, *ñ, and *ŋ; fricatives *s, *S, and *h; liquids *l, *N, *r, and *R; and semivowels *w and *y. *C and *c probably were alveolar and palatal affricates; *q was a uvular stop. The *z was most likely the voiced counterpart of *c, while *j appears to have been a voiced palatalized velar stop, a segment without any counterpart elsewhere in the system. 
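As a brief aside on the Ca-reduplication pattern described under Morphology, the derivation of the reduplicated numerals can be sketched in a few lines of Python. This is only an illustration: the function name `ca_reduplicate` is invented here, stems are written without the asterisk in the orthography used above, and the handling of vowel-initial stems (prefixing a bare a-) is an assumption inferred from the single pair *esa and *a-esa.

```python
VOWELS = set("aeiou")

def ca_reduplicate(stem: str) -> str:
    """Return the Ca-reduplicated form of a reconstructed stem.

    Hypothetical helper for illustration; capital S is kept distinct
    from s, as in the reconstructions cited in the text.
    """
    if not stem:
        raise ValueError("empty stem")
    first = stem[0]
    # Consonant-initial stems copy the consonant and add the fixed vowel a.
    # Vowel-initial stems have no consonant to copy, so only a- is added
    # (an assumption based on *esa : *a-esa).
    copy = "" if first in VOWELS else first
    return f"{copy}a-{stem}"

if __name__ == "__main__":
    for stem in ["esa", "duSa", "telu", "Sepat", "lima"]:
        print(f"*{stem} -> *{ca_reduplicate(stem)}")
```

Run on the five numerals cited above, the sketch reproduces the reduplicated set *a-esa, *da-duSa, *ta-telu, *Sa-Sepat, and *la-lima. It makes no claim about how the pattern applied to other stems.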
The *s probably was a palatal and *S an alveolar sibilant; although conventionally written with the symbol for a nasal, *N is more likely to have been a liquid of some kind; *r seems to have been an alveolar tap, and *R an alveolar or uvular trill. There were just four vowels: *i, *u, *a, and *ə (the schwa, a neutral mid-central vowel). In addition the semivowels *w and *y combined with *a, *i, and *u to form diphthongs *-aw, *-ay, *-iw, and *-uy. The principal changes from this system to that of Proto-Malayo-Polynesian (the hypothetical ancestor of all non-Formosan Austronesian languages) are the merger of *C and *t as PMP *t, the merger of *N and *n as PMP *n, and the shift of *S to PMP *h (and of *eS to *ah). A number of other mergers occurred in Proto-Oceanic, including the merger of *p and *b and of *k and *g; the merger of the palatals *s, *c, *z, and (in all Oceanic languages outside the Admiralty Islands of western Melanesia) *j; and the merger of *e and *-aw as Proto-Oceanic *o. These changes are illustrated in the Table.
Vocabulary
About 5,000 unaffixed stems have been reconstructed for Proto-Austronesian, Proto-Malayo-Polynesian, or Proto-Western-Malayo-Polynesian. Although the Indo-European languages have a far richer textual tradition, probably no language family excels Austronesian in the richness of vocabulary reconstructed through the comparative method. The vocabulary of a language reflects the collective experience of its speakers, making reference to both their natural world and their culture. The reconstruction of vocabulary and the identification of loanwords thus can provide insight into the natural environment and culture of prehistoric language communities and the nature of their linguistic contacts. 
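The Proto-Austronesian to Proto-Malayo-Polynesian changes listed under Phonology (the mergers of *C with *t and of *N with *n, and the shifts of *S to *h and of *eS to *ah) can be pictured as a small set of ordered rewrite rules. The Python sketch below is a simplification under stated assumptions: the names `PAN_TO_PMP_RULES` and `pan_to_pmp` are invented for illustration, forms are written without the asterisk, e stands for the schwa as in *eS, and the rules are applied mechanically with no conditioning environments or exceptions.

```python
# Ordered rewrite rules (hypothetical encoding, for illustration only).
# The sequence eS must be rewritten before the general S > h rule,
# otherwise *eS would surface as *eh rather than *ah.
PAN_TO_PMP_RULES = [
    ("eS", "ah"),  # *eS > *ah
    ("C", "t"),    # *C merges with *t
    ("N", "n"),    # *N merges with *n
    ("S", "h"),    # remaining *S > *h
]

def pan_to_pmp(form: str) -> str:
    """Apply the listed PAN > PMP changes to a form, in order."""
    for old, new in PAN_TO_PMP_RULES:
        form = form.replace(old, new)
    return form

if __name__ == "__main__":
    for form in ["duSa", "telu", "lima"]:
        print(f"*{form} -> *{pan_to_pmp(form)}")
```

Applied to *duSa ‘two’ the rules give *duha, while forms that contain none of the affected segments, such as *telu and *lima, pass through unchanged. Because the rules are applied blindly, the output for any other form is what the rules predict, not a claim about the reconstruction accepted in the literature.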
Reconstructed vocabulary shows clearly that the speakers of Proto-Austronesian had grain crops, including rice and millet; that they lived in settled villages of houses raised on piles; that they practiced weaving on simple back looms; that they domesticated dogs, pigs, and probably chickens; and that they were in contact with the sea and its resources. Familiarity with many tropical food plants can be inferred for speakers of Proto-Malayo-Polynesian. These include the coconut, banana, yam, sugarcane, pandanus, taro, sago, and breadfruit. Of these only sugarcane, pandanus, and wild taros of the genus Alocasia can safely be inferred for Proto-Austronesian, which probably was spoken on both sides of the Tropic of Cancer. Reconstructions for ‘boat,’ ‘sail,’ and ‘paddle’ can be attributed to Proto-Austronesian, but terminology specific to the outrigger can be assigned only to Proto-Malayo-Polynesian, a language that was probably spoken somewhere in the northern Philippines in the period 3500–3000 BCE.
Lexicostatistics, a controversial method for studying word replacement in relation to subgrouping, often distinguishes a subset of terms called “basic vocabulary.” Lists of basic vocabulary words typically include those for body parts, terms for everyday natural phenomena (sky, wind, rain, sun, star, earth, stone, water, tree), basic kin terms (father, mother, child), and some others. Although lexicostatistical theory assumes a universally constant rate for the replacement of basic vocabulary, replacement rates in Austronesian languages appear to show considerable variation. Malay and its closest relatives (Iban, Minangkabau, and so on), many Philippine languages, and some languages in eastern Indonesia (Manggarai of the Lesser Sundas, Yamdena of the southern Moluccas) show very high concentrations of vocabulary items that have a wide distribution in the Austronesian family. 
It is inferred from this that they have replaced basic vocabulary at a slower rate than other languages. By contrast, languages in the South Halmahera–West New Guinea group and many of the Austronesian languages of western Melanesia show far lower concentrations of widely distributed forms, and it is inferred that they have experienced more rapid rates of basic vocabulary replacement. Some Oceanic languages—including several in the southeastern Solomons, Fijian, Polynesian (especially Samoan and Tongan), and the Chuukic (Trukic) languages of Micronesia—also have relatively large concentrations of widely distributed forms and have for this reason traditionally been highly valued as witnesses in comparative linguistics.
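The comparison of "concentrations of widely distributed forms" can be made concrete with a simple proportion: out of a list of basic-vocabulary meanings, how many are expressed by a word that continues a widely distributed form. The Python sketch below assumes that a linguist has already made a yes-or-no judgment for each meaning. The function name `retention_share` and the two word lists are hypothetical placeholders, not data from any real language.

```python
def retention_share(judgments: dict[str, bool]) -> float:
    """Fraction of meanings whose word continues a widely distributed form."""
    if not judgments:
        raise ValueError("no meanings supplied")
    # True counts as 1 and False as 0, so the sum is the number retained.
    return sum(judgments.values()) / len(judgments)

# Placeholder judgments for two unnamed, hypothetical languages.
language_a = {"sky": True, "wind": True, "rain": True, "stone": False}
language_b = {"sky": False, "wind": True, "rain": False, "stone": False}

print(retention_share(language_a))  # 0.75
print(retention_share(language_b))  # 0.25
```

On this reading, the hypothetical language A would be inferred to have replaced basic vocabulary more slowly than language B. A real study would use a much longer list and would depend on the cognacy judgments themselves, which are the contested part of the method.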