Genetic relationships II
Li2 Language variation: Historical linguistics
David Willis

The Wave Theory
Assumptions behind the family-tree model:
• homogeneous parent language
• sound change is regular
• parent language splits suddenly and cleanly into two daughter languages
Instead we observe:
• dialect continua (with recently imposed divisions into standard languages)
• mixture of shared and independent innovation in neighbouring varieties
• conflicting evidence for subgrouping
• irregular development of creoles

The Wave Theory
The satem : centum split
SATEM LANGUAGES: Indo-Iranian (Sanskrit), Baltic, Slavic, Avestan, Armenian, Albanian
CENTUM LANGUAGES: Italic (Latin), Greek, Celtic, Germanic
Another split
INST. / DAT. PL. WITH [b]: Indo-Iranian (Sanskrit), Celtic, Italic (Latin)
INST. / DAT. PL. WITH [m]: Germanic, Baltic, Slavic

The Wave Theory: Example
[figures not reproduced: example sound changes and the trees they imply]
These sound changes produce contradictory trees.

The Comparative Method beyond IE
Problems with the languages of Australia
• most (> 80%) Australian languages are classified as Pama-Nyungan
• subgroupings within Pama-Nyungan have been established, but not a full family tree
• it has been argued that there is so much borrowing in Australian languages that systematic correspondences are obscured

Punctuated equilibrium (Dixon 1997)
• language groups have existed for most of human history without disruption
• in periods of equilibrium languages change slowly and largely by borrowing from one another, creating linguistic areas crossing genetic families
• long periods of equilibrium are interspersed with short periods of punctuation (invasion, migration, new technology etc.)
• splits of the type found in family trees happen only in periods of punctuation, hence the Comparative Method is applicable only to periods of punctuation

Punctuated equilibrium: Problems
• convergence and split can occur simultaneously, e.g.
in ancient Anatolia, the non-Indo-European languages were converging with the Indo-European (Anatolian) ones while, at the same time, the Indo-European (Anatolian) languages were splitting from one another
• punctuated equilibrium is justified by the claim that there are 'too few' languages (only 6500) if languages had been splitting in the manner of the Comparative Method for 100,000 years; however, this underestimates the role of language death, and ignores the huge recent population growth which has increased language diversity in the last 10,000 years

Punctuated equilibrium
• The Comparative Method can be applied partially even to Australian languages
Conclusions:
• Bardi and Yawuru are related (both Nyulnyulan family)
• Bardi (Nyulnyulan) and Karajarri (Marrngu (Pama-Nyungan)) are not

Punctuated equilibrium
Alternative view:
• Family Tree model works well for relationships between languages but not for relationships between dialects
• Dialects/closely related languages form chains of mutually comprehensible varieties
• This leads to conflicting subgroupings, even though the varieties are genetically related (compare problems with subgrouping in Romance or Indo-Aryan languages)

Long-distance reconstruction
General view:
• even under the best conditions the Comparative Method allows reconstruction only as far back as 6000–8000 years ago
• further reconstruction is impossible because the shared set of cognates between any two related languages will have diminished so much after this time period that it will be indistinguishable from chance resemblances
• language families may well be related beyond this time depth, but we can never demonstrate those relationships successfully
Can we do better?
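
To put rough numbers on the general view above, the sketch below estimates how much shared core vocabulary two related languages might still have after a given time. It borrows the constant-retention assumption described under Glottochronology in these notes (81-86% of core vocabulary retained per thousand years), an assumption the notes themselves criticise, so the figures are illustrative only. The function name and the independence of the two languages' losses are assumptions made for this example.

```python
def expected_shared_fraction(millennia: float, retention_rate: float) -> float:
    """Expected fraction of a core-vocabulary list still cognate in two languages.

    Assumes (hypothetically) that each language independently keeps
    `retention_rate` of its core vocabulary per thousand years, so an item
    survives in both with probability retention_rate ** (2 * millennia).
    """
    if millennia < 0:
        raise ValueError("millennia must be non-negative")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must lie strictly between 0 and 1")
    return retention_rate ** (2.0 * millennia)


if __name__ == "__main__":
    # Time depths from the notes (6000-8000 years) and both ends of the
    # 81-86% retention range.
    for millennia in (6, 8):
        for rate in (0.81, 0.86):
            share = expected_shared_fraction(millennia, rate)
            print(f"{millennia}k years, retention {rate:.2f}: "
                  f"{share:.1%} of the list still shared")
```

Under these assumptions the shared portion of the list falls from roughly a sixth to a few per cent across that range of time depths, which is the kind of residue the general view regards as hard to tell apart from chance resemblance.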

Long-distance reconstruction: Nostratic
• claims to be using conventional Comparative Method (identifying sound correspondences and reconstructing proto-phonemes on this basis)
• groups together Indo-European, Afro-Asiatic [= hypothesised grouping of Semitic, Berber, Chadic, Cushitic and Ancient Egyptian], Kartvelian, Uralic, Dravidian, Altaic [= Turkic, Mongolian and Tungusic]
Example (Trask 1996: 383): Proto-Nostratic **/k/
1. PN **küni 'wife, woman' > PIE *gwen-, Proto-Afro-Asiatic *k(w)n, *knw 'wife, woman', Proto-Turkic *küni 'one of the wives' (in polygamy)
2. PN **kälU 'female in-law' > PIE *gjlou- 'brother's wife', Proto-Afro-Asiatic *kl(l) 'sister-in-law, bride', ?Proto-Kartvelian *kal- 'woman', Proto-Altaic *käli(n) 'wife of younger brother or son; sister's husband', Proto-Dravidian *kal- 'father's brother's wife'
3. PN **kamu 'grasp, grab, squeeze' > PIE *gem- 'grab, take, squeeze', Proto-Afro-Asiatic *km- 'grab, take, squeeze', Proto-Altaic *kamu- 'seize, grab, squeeze', Proto-Uralic *kamo- 'handful', Proto-Dravidian *kamV- 'grab, take, hold'
• works by establishing proto-forms, then by applying the Comparative Method to the proto-languages
• distribution of cognates is indistinguishable from chance (Ringe 1995)

Long-distance reconstruction: Mass comparison
Methodology:
• collect lots of words from the languages you are interested in
• look for resemblances between words
• declare any languages with resemblances related
• seemed to work for African languages (Greenberg 1963), but highly controversial for Amerind (the claim that all languages of the Americas except Eskimo-Aleut and Na-Déné are related, Greenberg 1987)

Long-distance reconstruction: Mass comparison
• cognates are identified on the basis of phonetic similarity alone, but it is highly unlikely that true cognates would be phonetically similar at the degree of separation postulated, cf.
English five and its cognates French cinq, Russian pjat', Armenian hing; and English two and Armenian erku (PIE *dw- > *tg- > *tk- > *rk- > erk-); or German Feuer 'fire' and French feu 'fire', and English day and Spanish día 'day', which are not cognates
• after a certain period of time, lexical replacement will remove all cognates between two related languages, making it impossible to identify the link between them
• borrowing is difficult to eliminate: even basic vocabulary items can be borrowed, e.g. Finnish has borrowed tytär 'daughter' from Germanic, and English has borrowed the 'basic vocabulary' items person, grease and mountain from French

Long-distance reconstruction: Mass comparison
• using large numbers of languages increases the possibility that chance is responsible for the similarities (that is, in a group of twenty languages, you are more likely to discover four with a similar word than in a group of four), but no statistical check on this is offered. Greenberg claims the reverse, which is clearly wrong:
  'The method of multilateral comparison is so powerful that it will give reliable results even with the poorest of materials. Incorrect material should have merely a randomizing effect.' (Greenberg 1987)
• there is no check on the semantic shifts allowed, e.g. Greenberg's Amerind hypothesis used a group of 'cognates' with the range of meanings 'body / belly / heart / skin / meat / be greasy / fat / deer'
• there is no check on the degree of phonetic similarity permitted
• similarities due to onomatopoeia (in words such as 'suck', 'sneeze' etc.)
and nursery forms (mama, papa) are not consistently ruled out

Long-distance reconstruction: Mass comparison
• the role of chance is so great that any genuine similarities will inevitably be obscured, especially if the forms involved are of CVC structure and any consonant at the same point of articulation is allowed to match any other consonant at that point of articulation
• in practice the evidence offered is full of elementary philological errors and the known early history of languages is ignored
• morphological structure is added or eliminated at will in order to increase the plausibility of a relation, e.g. Dravidian (Tamil) melku 'chew' has the morphological structure mel-ku, which decreases the proposed similarity with Indo-European milk-type words
• evidence is second-hand through dictionaries: errors build up

Glottochronology
• compile a list of basic vocabulary in two or more languages you are interested in (use a Swadesh list or something similar)
• use basic vocabulary: this is most resistant to borrowing, and the items can be identified in most languages
• identify what proportion of the vocabulary on the list is cognate in pairs of languages
• calculate the time since the languages diverged, assuming that after one thousand years, a language will have retained 81-86% of its core vocabulary

Glottochronology
• cognates are identified by inspection and not by the Comparative Method
• translating vocabulary lists is not straightforward: different possible translations
• semantic shifts obscure cognates:
  English: head
  German: Haupt '(metaphorical) head'; Kopf
  French: chef 'chief, boss'; tête
  OK to allow two cognates that have moved apart in meaning?
• lexical replacement doesn't operate at a constant rate
• languages don't split apart at a particular date (there is a potentially long period when they drift apart), so the final calculation is meaningless
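
The final calculation step of glottochronology can be sketched as follows. The notes give only the retention assumption (81-86% per thousand years); the formula used here, t = ln(c) / (2 ln(r)), is the one usually associated with the method and is an assumption as far as these slides go, as are the function name and the 50% example figure. The factor 2 reflects the idea that both languages lose vocabulary independently.

```python
import math


def divergence_time_millennia(shared_fraction: float, retention_rate: float) -> float:
    """Estimated time since divergence, in thousands of years.

    shared_fraction: proportion of list items judged cognate in both languages.
    retention_rate:  assumed proportion of core vocabulary one language keeps
                     per thousand years (the notes cite 0.81-0.86).
    Uses t = ln(c) / (2 * ln(r)); the 2 is there because both languages change.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must lie strictly between 0 and 1")
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    # Hypothetical pair of languages sharing half of the list as cognates.
    for rate in (0.81, 0.86):
        t = divergence_time_millennia(0.50, rate)
        print(f"retention {rate:.2f}: about {t * 1000:.0f} years since divergence")
```

The same 50% cognate share gives noticeably different dates at the two ends of the 81-86% range, which illustrates how sensitive the result is to the constant-rate assumption criticised above.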