Guus Kroonen
Roots of Europe Course
18 November 2014
Copenhagen University

A source that no longer exists
“The Sanskrit language […] is of a wonderful structure, more perfect than Greek, more copious than Latin, and more exquisitely refined than either; yet bearing to both of them a stronger affinity […] than could possibly have been produced by accident; so strong, indeed, that no philologer could examine all three without believing them to have sprung from some common source which perhaps no longer exists.”

Regular Sound Correspondences

Is this the whole story?
“There is a similar reason, though not quite so forcible, for supposing that both the Gothic and the Celtic, though blended with a very different idiom, had the same origin with the Sanskrit.” (also William Jones, 1786)

Linguistic Substrates (A>Ab>aB>B)

How Indo-European is Germanic?
• 0% non-Indo-European (Schuhmann 2012): “No word that can only be explained as a substrate word. The myth that in Germanic there is a particularly high percentage of substrate words should be given up once and for all.”
• 15% without a clear IE etymology, 4-5% non-Indo-European (Kroonen 2013)
• 10-50% non-Indo-European (Roberge 2010)
• 33% non-Indo-European (Hawkins 2009)
• 60% non-Indo-European (Beekes, p.c.)

Vennemann: Atlantic/Semitidic

Vennemann: Vasconic

Methodological fallacies
• Baldi & Page (2006):
– Considering known/attested languages only
– Absence of systematic sound correspondences
– Downplaying of semantic differences
– Lexical cherry-picking
• Ergo: Vennemann’s corpus probably consists largely of false positive matches:
– Old Norse Baldr (a god), Hebr. Baᶜal ‘lord’
– G Rabe, E raven < *hraban-, Arab. ġurāb- ‘raven’
– E knife, OFr. canif, Bsq. kanibet
– G Eis-vogel, Bsq. *iz ‘water’

Lexical Cherry-Picking (Trask 1997)
Basque | Hungarian
-a def. article | a def.
article
aita father | atya father
bake peace | béke peace
egiaz truthfully | igaz true, real
erreka stream | árok ditch, trench
hiru three | három three
kohat bellows (of a forge) | kohó forge
kontu care, attention | gond care, attention
etc.
Sixty “matches” after only a couple of hours of work!
Conclusion: without regular sound correspondences you can probably link any two languages.

Prehistoric Loanword Methodology
• No clear Indo-European etymology
– Beekes (passim)
• Specific semantic domains (e.g. local flora & fauna, geographical terms, etc.)
– Polomé (1986), Hawkins (2009), Schrijver (1997)
• Discrepant phonotactics vis-à-vis Indo-European
– Polomé (1986, 1989, 1990), Hamp (1979), Huld (1990), Salmons (1992, 2004), Boutkan (1998), Lubotsky (2001), Matasović (2012)
• Recurring non-Indo-European patterns:
– Kuiper (1995), Schrijver (1997, 2007, 2012), Witzel (1999), Kroonen (2012), Beekes (2014)

Prehistoric Loanword Methodology
• Comparison of three pre-historic loanword case studies in current historical linguistics:
– Germanic
– Celtic
– Saami
– Greek
– Vedic
• Three more linguistically falsifiable tools:
– Recurring sound alternations within a language
– Recurring non-inherited morphs
– Irregular sound correspondences within a language sub-group or between related neighboring languages

Lacking etymology = loanword
• More than half of the Germanic lexicon is of non-IE provenance (Beekes, p.c.)
– Because the IE etymology is lacking
• Heggarty (2013, Talking Neolithic Workshop, MPI-EVA): “Why does a word without an etymology have to be a substrate word?”
– An IE word may have been preserved in one single daughter language and lost elsewhere

Isolating Semantic Fields
• Seafaring terminology without clear etymologies (J.A. Hawkins 2009):
– *nurþra- ‘to the north’
– *saiwi- ‘sea’
– *baita- ‘boat’
– *segla- ‘sail’
– *skipa- ‘ship’
– etc.

Isolating Semantic Fields
• Seafaring terminology without clear etymologies (J.A. Hawkins 2009):
– *nurþra- ‘to the north’, cf. Gr.
enérteros ‘lower’
– *saiwi- ‘sea’ < PIE *séikʷ- ‘to drip, flow’
– *baita- ‘boat’ < PGm. *bītan- ‘to dig out’
– *segla- ‘sail’, cf. OIr. séol ‘sail’ < *segh-lo-
– *skipa- ‘ship’ << Lat. scyphus << Gr. σκύφος ‘vessel’
– etc.
• Virtually all examples are false negatives (cf. Schuhmann 2012)

Non-Inherited Phonotactics
• PIE did not have a *b, so all Proto-Germanic words with *p (Grimm’s Law) must be from a non-Germanic, non-Celtic Indo-European language (Kuhn’s “Nordwestblock”, 1959; 1962)
– *plōga- ‘plow’
– *piþan- ‘pith; root’
– *pissōn- ‘to piss’
• Note the iconicity problem
– *pinka- ‘little finger’ (= PIE *penkʷe ‘5’?)

Partraige in Ireland (Schrijver 2000)
• Part-raige means “Crab People”, cf. part-án ‘crab’ (with suffix as in e.g. scat-án ‘herring’)
• Together with Catt-raige “Cat People”, Art-raige “Bear People”, Gab-raige “Goat People” etc. they appear as so-called aithechthuatha, i.e. ‘vassal peoples’ = subjected tribes
• The Partraige populated the infertile and mountainous region around Loch Mask, which has the hallmarks of a refuge area.
– NB: This is almost exactly where the last Irish-speaking communities are located in our time

Loch Mask in Co. Mayo and Co. Galway

Words with p and unlenited stops
• part-án ‘crab’, pell ‘horse’, petta ‘pet’, pluc ‘cheek’, pata ‘hare’
• NB: In Latin loanwords, p is substituted by kʷ until the fifth century, as Irish did not have this sound:
– Lat. Pascha >> OIr. Cásc, purpura >> corcur, Patricius >> Cothriche, planta >> clann ‘offspring’
• From the sixth century onwards, p is retained:
– Patricius >> Pádraic ‘Patrick’, pācem >> póc ‘kiss’

Language of the “Crab People”
• A non-Indo-European language spoken in the marginally habitable areas of Ireland
• It survived until at least the sixth century
– Otherwise **cortán is expected for actual partán
• It was the source of many Irish words containing p or unlenited stops
• The number of items belonging to fishing terminology is strikingly high, cf.
bradán ‘salmon’, scadán ‘herring’, gliomach ‘lobster’

Non-Saami Layer (Aikio 2012)
• 1/3 of the Saami lexicon is non-Uralic
• Semantic fields: local flora & fauna, topography, climate
• Non-Uralic phonotactics in North Saami:
– uffir ‘rocky seashore’, skuolfi ‘owl’, fierbmi ‘fishing net’ (no *f in PFU)
– skávdu ‘2-year-old seal’, spáhčču ‘bunch of sinew-thread’, skier’ri ‘dwarf birch’ (initial clusters not allowed in PFU)

Non-Saami Layer (Aikio 2012)
• Irregular simplification of clusters in the dialects:
– N láhhpu vs. L sláhhpo ‘thick sinew-thread’, N liessu ‘lair of a fox’ vs. S plieasoe ‘den, lair’, etc.
• Irregular alternation of s and š between West and East Saami:
– S saasne ‘rotten tree’ vs. N šošnn ‘dead pine-tree’, S satnje ‘fishing net’ vs. Sk. šaannj ‘rag’, etc.
• Identification of non-Saami morphs:
– *-ērē ‘mountain’: N top. Gealbir, Hoalgir, Jeahkir, Nuhppir, Nussir, Ruohtir, Váhčir, etc.
A. Aikio, 2012, An essay on Saami ethnolinguistic prehistory, p. 64.

Conclusions (Aikio 2012)
• A non-Uralic language was spoken in Lapland when the different Saami languages arrived there, before around 500 AD.
• Words adopted from this language by the Saami were contemporaneous with the latest Old Norse loanwords (600 AD at the latest)
• It is possible that preaspiration spread from this language to both Saami and Nordic.
– For preaspiration, cf. Icelandic rokk [rᴐʰk] ‘rock’.

Preaspiration in Northern Europe

Non-Inherited Layer in Greek
• “1000 Pre-Greek etyma” (Beekes 2010)
• Semantic fields: local flora & fauna, “landscape terms”, agriculture, architecture, social stratification, religion, names
• A wide variety of non-IE features in the phonotactics, e.g. non-IE geminates, CVCVC-root structure instead of PIE CVC-:
– thálatta ‘sea’
– Odusseús ‘Ulysses’
– bélekkos ‘chickpea’

Irregular Alternations
• Many forms of obscure dialectal alternations:
– dáphnē : láphnē ‘laurel’ (d:l, cf. Lat. laurus)
– blẽkhnon : blẽkhron ‘fern’ (b, cf. OSw.
brækne)
– abrutós : ámbruttos ‘sea urchin’ (prenasalization, irregular gemination)
– kolúbdaina : kolúmbaina ‘kind of crab’ (bd:mb)
– agerrakábos : agrákabos : agerrákomon ‘bunch of grapes’ (b, m:b, single:double r)

Non-Inherited Morphs
• The suffix -inth- / -īth- / -īd- (prenasalization):
– gálinthos : gálithos : gélinthos : gérinthos ‘chickpea’
– hélmis, gen. hélminthos : hélmingos : acc. hélmitha : pl. líminthes ‘intestinal worm, helminth’
– trémithos : términthos : terébinthos ‘turpentine tree’
– huákinthos ‘hyacinth’
– labúrinthos : Myc. dapurito ‘labyrinth’
– áglis, gen. áglithos ‘garlic’
– órobos : erébinthos ‘pea; chickpea’ (suffixation)

Comparing Neighboring Substrates
• By tracing irregular correspondences between related languages, you can identify non-Indo-European elements (as in the Saami family)
• Schrijver (1997) discovered that quite a few non-Indo-European words have an a-prefix in one language, but zero in another.
– G Amsel ‘blackbird’ < *a-msl- : Lat. merula < *mesl-
– ON ørt ‘ore’ < *a-rud- : Lat. raudus < *raud-
– Welsh erfin ‘turnip’ < *a-rp- : Lat. rāpum < *rāp-
• NB: prefixed forms may lose their root vowel

[Diagram: PROTO-INDO-EUROPEAN with branches GERMANIC, CELTIC, GREEK, ITALIC; LANGUAGE X (with a-prefixation); regions SCAND., C.EUR., BALKANS, ANATOLIA, …]

Comparing Neighboring Substrates
(Columns: Item, Greek, Latin, Celtic, Germanic, Balto-Slavic)
– pea: Gr. órobos < *orob- : erébinthos < *ereb-indh-; Lat. ervum < *erw-; G Erbse < *orw-īd-
– sand: Gr. ámathos : psámmathos : psámmos < *sam(-n̥dh)-; Lat. sabulum < *sadh-; E sand, MHG sampt < *samdh-
  Note that Pre-Gm. *md > PGm. *nd: *hunda- ‘100’ < *ḱmt-ó- vs. G sanft < *sam(f)þ- < *sóm-t-
– gourd / cucumber: Lat. cucurbita < *kukurbit-; OE hwerwette < *kʷerkʷád-
– lentil: Gr. láthuros < *ln̥dh-ur-; Lat. lēns, lentis < *ln̥t- (G Linse = Lat.
lent-)

Comparing Neighboring Substrates
(Columns: Item, Greek, Latin, Celtic, Germanic, Balto-Slavic)
– bean: Lat. faba < *bhabh-; G Bohne < *bhaw-(n)-; OCS bobъ < *bhabh-
– hemp: Gr. kánnabis < *kannabi-; E hemp < *kanabi-; Ru. konoplja < *kanapi-
– bison, wisent: G Wisent < *wi-sundh-; Ru. zubr < *dzumbr-, dial. izubr < *(u)i-dzumbr-; OPru. wissambras < *wi-sombh-; Lith. stum̃bras < *stumbr-; Latv. sumbrs < *(t)sumbr- (Kroonen 2012)
– crayfish, crab: Gr. kám(m)aros, kábouros < *kam(m)ar-, *kabar-; ON humar < *kumar-
  The suffix of kábouros was no doubt remodeled after índouros ‘mole’, skíouros ‘squirrel’, kíllouros ‘wagtail’, kóllouros ‘a fish’, sílouros ‘catfish or sturgeon’.
– lead: Gr. mólubdos, mólibos < *molubd-, *molib-; Lat. plumbum < *plumdh-; Celtic lúaide < *ploud(h)-; G Blei < *mlīw-

Comparing Neighboring Substrates
(Columns: Item, Greek, Latin, Celtic, Germanic, Balto-Slavic)
a-prefixation: *CVC- : *a-CC-
– blackbird: Lat. merula < *mesal-; W mwyalch < *mesal-; G Amsel < *a-msl-
– sturgeon: G Störe < *str-; Ru. osëtr < *a-setr-
– sedge: MIr. seisc < *sesk-; E sedge < *sak-; Ru. osóka < *a-sak-
– turnip: Gr. ráp(h)us < *rap(h)-; Lat. rāpa < *rāp-; W erfin < *a-rp-; G Rübe < *rāp-; Ru. répa < *rēp-
– ore: Lat. raudus < *raud-; OHG aruz < *a-rud-
– clover: OIr. seamar < *semar-; ON smári < *smēr-
  Georg. sam-qura ‘clover’, lit. “3-ear”: a false positive? Borrowing as *semh₁r- / *smeh₁r- conceivable?

Vedic Substrate
• Roughly 4% of the Vedic lexicon is non-IE (Kuiper 1955)
• Semantic fields: local flora & fauna, agriculture, artisanship, names
• Non-IE features in the phonotactics, e.g.
non-IE syllable structure or lack of regular retroflexion of s after r, u, k, i:
– busa- ‘chaff, fog?’
– bīsa- ‘oven/pit with coals, volcanic cleft’
– musala ‘pestle’
– kusīda- ‘lending money’

Recurring Non-IE Morphs
• Possible non-IE prefixes:
– jar-tila ‘wild sesame’, Atharvaveda tila ‘sesame’
– kumāra ‘boy, young man’, kuliśa ‘axe’, kuluṅga ‘antelope’, kulāya ‘nest’
– kimīda ‘demon’, śimidā ‘female demon’, kīnāśa ‘ploughman’
– kākambīra ‘a tree’, kakardu ‘wooden stick’, kapardin ‘with a hair-knot’, karpāsa ‘cotton’, kavandha ‘barrel’
• Compared to the article in Khasi (Austroasiatic), masc. u-, fem. ka-, pl. ki- (Pinnow 1959: 14; Kuiper 1995; Witzel 1999)

A Universal LW Detection Method
STAGE
1 isolated words
2 specific semantic fields
3 irregular phonotactics
4 irregular correspondences
5 systematic irregularity
6 recurring non-inh. morphs
7 links to neighb. substrates
8 source identified
Columns: Saami, Gmc., Greek, Sanskrit, Inherited
+ + + + + + - + + +/+ +/+ + - + + + + + + + - + + + + + +/- +/+/-(+) -( ) +

Discussion
• Roland Schuhmann (University of Jena): “No word that can only be explained as a substrate word.”
• Martin Haspelmath (MPI-EVA): “According to Indo-Europeanists, when a word can be either an inherited word or a loanword, an Indo-European origin must always be preferred.”
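
Illustration: stage 3 (irregular phonotactics) as a mechanical filter
The two North Saami criteria quoted from Aikio (2012), no *f in PFU and no initial consonant clusters in PFU, can be applied as a simple screen. The Python sketch below is only an illustration under stated assumptions: the function names, the vowel list, and the orthography-based approach are hypothetical and not part of the talk, and the control word guolli ‘fish’ is added here as an example of an inherited Uralic word. A flag marks a word as a candidate for closer study, not as a proven loanword.

```python
import unicodedata

# Assumption: a rough set of vowel base letters for the orthographic forms.
VOWEL_BASES = set("aeiouyæø")


def base(ch: str) -> str:
    """Lowercase a character and drop its diacritics (á -> a, č -> c)."""
    return unicodedata.normalize("NFD", ch.lower())[0]


def flags(word: str) -> list[str]:
    """Return the phonotactic criteria that a written form violates."""
    # Keep letters only, so the apostrophe in skier’ri is ignored.
    letters = [base(ch) for ch in word if ch.isalpha()]
    found = []
    if "f" in letters:
        found.append("contains f")
    if (
        len(letters) >= 2
        and letters[0] not in VOWEL_BASES
        and letters[1] not in VOWEL_BASES
    ):
        found.append("initial cluster")
    return found


if __name__ == "__main__":
    words = ["uffir", "skuolfi", "fierbmi", "skávdu", "spáhčču", "skier’ri", "guolli"]
    for w in words:
        print(w, flags(w))
```

A screen like this works on spelling, not on phonemes, so it would need a proper transcription layer before being used on a real lexicon.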