Chapter 5
The evolution of referential signs
David Dwyer, Michigan State University
February 3, 2011

This paper is a working draft. Comments are welcome. Do not quote without permission.

1. Introduction

Jackendoff (1999), following Bickerton (1990), expanded on a three-stage sequence for the development of human grammar, beginning with the use of “symbols” and their development, followed by Bickerton’s protolanguage, and ending in modern language. Jackendoff does not discuss how these symbols, which in this paper I call referential signs, might have developed, nor does he discuss their properties. This paper explores these properties and identifies the changes that these signs underwent in becoming the words of modern human language.1

The common ancestors of humans, chimpanzees and bonobos (hominini) lived 5 to 6 million years ago. Since that time humans and pan (chimpanzees and bonobos) have evolved quite differently. Humans developed upright posture and the referential sign; pan did not. This paper focuses on the development of the referential sign in proto-humans, a sign that is qualitatively different from the signals used by other homininae.2

Although pan do not have the vocal apparatus to generate words the way humans do, they can learn to communicate using the signs of American Sign Language. Given that the hominini and even the homininae have many of the intellectual prerequisites for learning referential signs, we can assume that these abilities were possessed by our common ancestor. In fact, gorillas also have the ability to learn referential signs, so this argument can be extended to include the homininae.

These observations raise the following questions about referential signs:

1. How are referential signs different from other types of signs?
2. What intellectual abilities are needed for learning referential signs?
3. What representational differences are there between the signs that humans and pan use?
4. What behavioral, physiological and structural abilities enabled the shift to the production of vocal signs?
5. How did referential signs develop in humans?
6. When did referential signs develop?

The remainder of the paper takes up these questions.

1 One of the crucial problems we encounter in our exploration of the evolution of language is the failure to provide precise characterizations of basic constructs. Most articles on this topic fail to define terms like language, syntax and communication and often use them interchangeably, with the consequence that we do not always know when we agree or disagree conceptually. Bickerton’s protolanguage (discussed in section 4), for example, has been so loosely characterized that some take it to be some sort of primitive syntax. The discussion here involves the development of referential signs and not syntax. I will use Mead’s (1934) term symbolic interaction to avoid the debate as to whether this is language or not.

2 The terminology in this area is awkward. For example, the branch representing the human line of development is not strictly homo (hominid) until about midway between the time it separated from pan and the present. Becoming human is generally associated with bipedal locomotion, but this too is murky because Lucy (Australopithecus afarensis) and Ardi (Ardipithecus ramidus) are human ancestors, but are not hominids.

2. Properties of the referential sign

Referential signs constitute only one of the sign systems used in human language. The other sign systems are tactic and representational. Tactic sign systems can be atactic, paratactic or syntactic. An atactic sentence consists of a single word. A paratactic sentence consists of two words which stand in one of several possible case relationships. The words in a syntactic sentence, which are potentially infinite in number, stand in a specific case relationship.
In a syntactic sentence, the specific case relationship is specified for each constituent. Representational sign systems use vocal, manual, or other gestures. In this paper, I discuss the vocal representation system in some detail. For a further discussion of tactic systems see Dwyer (1986 and 2009 ms).

Signs versus symbols

Cassirer (1944) distinguishes between two types of signs: signals and symbols. Both the signal and the symbol (which I call an analytic sign or word) are signs in the Saussurian sense of the word because they consist of a signifier and a signified.3 The signifier is the token to which a concept, the signified, is attached. For example, the letters t-r-e-e signify the concept tree. Mead (1934) gives the example of the mother hen who clucks when her chicks stray too far from her. Here, the clucking of the mother hen signifies something like my chicks are too far away.

For Cassirer, the distinction between signal and referential sign had to do with the relationship between the signifier and the signified. For a signal, the connection was instinctive; for a referential sign it was not instinctive and had to be learned. As I show below, the referential sign is an artificial construct, and this is why there is no essential link between signifier and signified and why it has to be learned.4 Users are free to attach any concept to a signifier, which means that we can name concepts as long as we have signifiers to represent them. This is why the signifier for the concept water is water in English, eau in French, agua in Spanish and njai in Mende.

“Signals are ‘operators’; symbols are ‘designators’” (Cassirer 1944:32). “In short, we may say that the animal possesses a practical imagination and intelligence whereas man alone has developed a new form: a symbolic imagination and intelligence” (Cassirer 1944:34). Thus Cassirer and Mead believed that this distinction marked the key difference between humans and other animals.

3 Various authors use the terms sign and symbol, as well as signal, quite differently. Saussure’s sign, for example, is virtually the same as the symbol of White (1949), Mead (1934), and Cassirer (1944). To avoid confusion, I shall avoid the use of the word symbol, except when citing others, and use the term sign as a general term encompassing both signals and referential signs.

4 Saussure characterized a sign as consisting of a signifier and a signified. For the referential sign, the signified is the concept and the signifier is the gesture (manual, facial or vocal) that represents it. Saussure also emphasized that, unlike signals, referential signs have no fixed relationship between the signifier and the signified.

As we learn more about the signing of nonhumans, we discover that the transition between signal and referential sign is not the discrete barrier that we, like Cassirer and Mead, assumed, but one of a number of incremental steps. For example, the discovery that all hominini possess the ability to use referential signs means both that humans are not alone in this ability and that this ability precedes the development of the referential sign. In addition, Cheney and Seyfarth’s (1990) work showed that the signs vervets use are not purely instinctive, because the acquisition of these signals required some learning.

Because of the arbitrary connection between the signifier and the signified, we cannot react instinctively to a referential sign as we can to a signal. And because a referential sign is issued intentionally, we interpret it not only by identifying it, but by trying to discover why the speaker sent the message. Was the intention to warn us, inform us, deceive us or something else?
Because of this, our entering into the world of referential signs transforms our way of interacting with others, for we no longer react to the other’s signals, but to the intention behind these signals. This form of symbolic interaction also increases our ability to influence the other verbally and our interest in what the other understands. This conclusion applies to all homininae: as a result of learning to use referential signs, chimpanzees, bonobos and gorillas too have been transformed in a like way.

3. Similarities and differences between human and ape signing

To understand the evolution of the referential sign it is necessary to recognize both the similarities and the differences between ape and human signing. Similarities include intentionality, learnability, and expressive and referential sentences; differences include signals versus referential signs, the representation of the signifiers of referential signs, and physiological differences in the vocal tract and brain.

Communicative channel

Ape signifiers use vocal, visual (including facial and body gestures), and tactile channels, while human signs are predominantly vocal. Furthermore, while ape vocalizations involve a wide range of sounds (whrrs, lip smacks), human vocalizations are almost exclusively syllabic. Although apes can produce syllables, they are capable of only one syllable per utterance, while humans are capable of several dozen per utterance.5 The human reliance on the vocal tract reflects several changes in the human vocal apparatus that we explore below. While the ape vocal tract is capable of producing some vowels, [u] and [ǝ], it cannot produce the full set of vowels that the modern human vocal tract can, [i, e, a, o, u and ǝ], nor can apes add consonants before or after a vowel to form a complex syllable.

5 Apes can produce longer utterances by using both ingressive (inhaling) phonation and egressive phonation.

Behavioral comparisons

Comprehension.
Many signing hominini show an ability to understand vocal and gestural referential signs. Even though apes do not have the vocal apparatus to produce referential signs, they can produce them using the hand gestures of American Sign Language.

Learnability. The other apes are capable of learning to use referential signs, even though they do not use them in the wild. However, the maximum vocabulary size reported for a chimpanzee (Washoe) is 250 signs.6 While impressive, this number is small when compared to the human three-year-old with about 1000 words, the human five-year-old with about 5000 words and the human adult with a vocabulary numbering in the thousands.

Intentionality. The development of the capacity to use referential signs includes the capacity to intentionally control the production of signs.7 Because feral apes do not use referential signs, researchers once assumed that they did not have the ability to control the issuance of signs. This changed when Washoe was taught to use the hand gestures of American Sign Language. While much of ape signing involves requests for food, apes also comment on things they see in magazines and on their relationships with others, and the bonobo Kanzi has been reported to negotiate with his human caretakers (Savage-Rumbaugh et al n.d.). All of these activities involve the intentional use of signs. This means that intentionality of this sort preceded the development of the referential sign and can be considered an exaptation for it.

Along with the intention to communicate comes the awareness that what you say can influence the other’s behavior. This is seen most dramatically in the intent to deceive, but efforts to assist by informing are equally significant. With the awareness that the other’s behavior can be influenced comes an increased interest in the other.

Phatic Communication. Humans like to talk even when they have nothing to say.
Talking about the weather is rarely done because people want to share crucial information, but rather, as Malinowski (1923) points out, to interact with others. I call this the gift of the gab. While this ability is more pronounced in humans, chimpanzees also engage in this type of activity.

Analytic statements. In addition to the distinction between signal and symbol, Cassirer (1944:29) developed a second opposition, based on the work of Révész (1940). His distinction between emotional and propositional language is similar to the distinction between referential and expressive language proposed by Kita (1997). For Kita, a referential sentence is descriptive of an experience, while an expressive conjures up the emotive and perceptual activation in an experience. An analytic statement may describe “an experience, but not a rendition of an experience itself. Thus, one may conceptualize unpleasantness in the referential dimension without actually feeling any unpleasantness.” An expressive statement does the opposite.

Although Kita uses the term referential in opposition to expressive, Dwyer and Moshi (2002) prefer the term analytic, not only because the term referential is used to describe the sign, but because an analytic sentence, while using referential signs, actually analyzes a situation and breaks it down into its components such as agent, action and object. Nevertheless, the categories of analytic and expressive provide humans with two modes for communicating their experience. We can either describe it analytically or express our experience of it expressively.

6 See appendix 2 for a composite listing of the words learned by the chimpanzees Washoe and Nim and the gorilla Koko.

7 I use the word intention to mean the willful use of a word.
In our paper (Dwyer and Moshi 2002), we argued that true ideophones were not part of the referential scheme and that ideophones were verbal expressives, as were hand gestures and facial expressions. Although English ideophones are sparse when compared to those of other languages, they do exist. Examples include bang, pow, kerplunk, and zing. Ideophones can be used alongside referential statements: The egg fell to the floor: splat. The hand and facial gestures that accompany much of our conversation are expressives.

Thus, following Kita, an analytic sentence describes an experience by removing the subject’s emotional association from it and by breaking it down into words related to each other by case relations. Expressive language, on the other hand, represents an experience through imagery. It is an emotional, as opposed to a rational, representation of experience; it arises from one’s emotions and gives an indication of one’s emotional state in a given situation. These two modes of expression are reflected in the terms reason and intuition, and are at the heart of the distinction between prose and poetry. While poetry is presented in analytic sentences, it is intended to be understood expressively.

This distinction can also be found in Austin’s (1969) classification of the three dimensions of a sentence: the locutionary, the illocutionary, and the perlocutionary. Austin’s locutionary consists of the grammatical meaning of the sentence, that is, the analytical mode. The expressive mode is represented by the other two dimensions: the illocutionary, how the speaker intends the listener to take the statement (as a question, a request, an apology, a warning, etc.), and the perlocutionary, how the speaker feels about the statement and the illocutionary dimension.
Austin pointed out that a sentence consists of more than the expression of information (the locutionary dimension), because it involves an intended listener who is not only to understand what is said (locutionary), but to understand what the sender expects the listener to do, and to understand how strongly the sender feels about the statement. From this perspective, expressives lack the referential (locutionary) dimension, but referential statements still contain the expressive illocutionary and perlocutionary dimensions.

Once they have learned to sign, apes have no difficulty in using referential signs analytically. Examples needed here.

Semiological Comparisons

The complex signifier. A simple signifier is one that cannot be broken into smaller elements. For example, the purr of a cat cannot be broken down into smaller elements that form part of the signifier of another sign. In contrast, the signifiers of human referential signs are complex. For example, the signifier for the word cat, be it the sequence of letters C-A-T, the sequence of phonemes /k-æ-t/ or the gestures of American Sign Language,8 is complex. Each of these significations of the concept of cat is complex because each signifier can be broken into smaller elements, called phonemes, and because each of these units can be used in signifying other words. The sequence c-a-t can be recombined to represent the word act; the sequence /k-æ-t/ can be recombined as either /æ-k-t/ (act) or /t-æ-k/ (tack).

8 In American Sign Language the word cat is signified by holding the right thumb and forefinger above the lip and moving the hand to the right as though one were pulling a whisker.

The phonemic principle is based on the Saussurian principle of the semiological system, in which the number of signs in a given system is finite. The individual sounds found in the sequences /k-æ-t/, /æ-k-t/ and /t-æ-k/ are such signs and are termed phonemes.
Because the set of phonemes for a given language is finite, each phoneme, as Saussure pointed out, derives its value from the fact that it is not any of the other signs in the system. In the above example from English, /k/ is not /t/ and not /æ/, etc. The phonemic principle means that, for any language, the phonemes used to represent signifiers are drawn from a finite set and that the signifier of a referential sign consists of a string of one or more phonemes. A typical inventory of phonemes for a modern human language is given in the table below.

Consonants (C)
                          Labial    Dental     Palatal    Velar
Stops: Voiceless          p         t          č          k
Stops: Voiced             b         d          j          g
Fricatives: Voiceless     f         s          š          x
Fricatives: Voiced        v         z          ž          ɣ
Nasals                    m         n          ň          ŋ
Liquids                             l and ř

Vowels (V)
                          Front     Back       Rounded
High                      i                    u
Mid                       e                    o
Low                                 a
Glides                    y                    w

Phonotactics. Not all strings of phonemes are possible, because there are restrictions on which sequences may occur. The syllable provides the key to understanding these restrictions, because a signifier consists of a string of one or more syllables and each phoneme is attached to one of these syllables. Most typically a syllable has a nucleus consisting of a vowel (V) and an optional onset and coda consisting of one or more consonants (C). The phonemes of a language are classified as either V or C. Thus typical syllabic structures are CV (known as an open syllable) and CVC (known as a closed syllable). The phonotactics of a language describe the permissible syllable structures. Many languages have additional complications, including consonant clusters, complex nuclei (diphthongs) and subclasses of consonants and vowels, that result in more complex phonotactics, but the syllable structure described here is sufficient for our purposes.

The development of the complex syllable and the phoneme makes it possible to represent thousands of referential signs using a small inventory of phonemes.
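The combinatorial point can be made concrete with a rough count. The sketch below is an illustration only, not part of the original analysis: it takes the 22 consonants and 5 vowels from the inventory table (leaving out the glides for simplicity, which is an assumption made here) and enumerates every CV and CVC string as if no further phonotactic restriction applied. The names CONSONANTS, VOWELS and syllables are hypothetical, and the resulting figures are upper bounds rather than counts for any real language.

```python
from itertools import product

# Inventory copied from the table above; glides are omitted (an assumption
# made only to keep the illustration small).
CONSONANTS = ["p", "t", "č", "k", "b", "d", "j", "g",
              "f", "s", "š", "x", "v", "z", "ž", "ɣ",
              "m", "n", "ň", "ŋ", "l", "ř"]
VOWELS = ["i", "e", "a", "o", "u"]


def syllables(structure):
    """Return every phoneme string that fits a structure such as 'CV' or 'CVC'."""
    slots = [CONSONANTS if slot == "C" else VOWELS for slot in structure]
    return ["".join(combo) for combo in product(*slots)]


open_syllables = syllables("CV")      # open syllables
closed_syllables = syllables("CVC")   # closed syllables
all_syllables = open_syllables + closed_syllables

print("CV syllables: ", len(open_syllables))
print("CVC syllables:", len(closed_syllables))
# Two-syllable signifiers, if any syllable could follow any other.
print("Two-syllable strings:", len(all_syllables) ** 2)
```

Under these assumptions, 27 phonemes yield 110 open syllables and 2,420 closed syllables, and two-syllable strings already number in the millions, far more than any vocabulary requires. Actual languages use only a fraction of this space.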
This representational potential is even greater with the development of the capacity to produce several syllables in one breath. These developments are responsible for what I call the shift to vocality, that is, the shift to using exclusively vocally based signifiers for referential signs.

Physiological comparisons

The brain

Two areas of the brain, known as Broca’s area and Wernicke’s area, are central to the processing of language, though of course many other parts of the brain contribute to the process of objectifying thought. Broca’s area is located in the area of the brain which controls face and mouth movements and the articulation of complex syllables. This involves not only the movements of the tongue and jaw, but also the sequencing of these movements. Generally speaking, Broca’s area has to do with the sequencing of language, both syntactic and phonetic strings. Wernicke’s area has to do with the recognition (and monitoring) of speech as well as being the place where referential signs are stored. It was once thought that apes did not have these areas, but recent studies (see below) report that they exist in other apes and some monkeys as well, though these areas have been found to be larger and more developed in humans.

The arcuate fasciculus is a nerve bundle leading from Wernicke’s area to Broca’s area. It is larger and more developed in humans than in other apes and primates. Apparently9 it consists of two separate pathways according to Ploog (2002), reported by de Boer (to appear). One, called the “cingulate vocalization pathway,” is concerned mostly with expressing emotions and is found in most primates. The second, called the “neocortical vocal pathway,” has to do with fine motor control and appears to be “more developed in species that are more closely related to humans.” These two pathways correspond to the expressive and analytic functions of speech.

Length of phonation

Phonation is the process of generating sound by passing air from the lungs across the vocal cords. The length of phonation is the time that a given species can phonate in a single breath. Humans have a much greater length of phonation than do the other apes. This is due to changes in the lungs and the larynx. The chimpanzee lung has a volume that is consistent with the amount of oxygen that can be absorbed in a breath. In contrast, the human lung holds four times the volume. This greater capacity allows for longer phonation with a single breath.10 In addition, the human larynx is narrower than the trachea (wind pipe), meaning that the vocal cords are much smaller than those of the chimpanzee. Because it takes less air for the human larynx to phonate, the length of phonation is greater than that of the chimpanzee. These two changes account for the increased length of phonation in humans.11

9 It is not clear from Ploog whether these two pathways are part of the arcuate fasciculus or are other pathways.

10 It also allows one to hold one’s breath, another requirement of longer phonation.

11 With respect to breathing and gathering an adequate oxygen supply, the narrowing of the larynx is actually maladaptive.

Vowel production

The hominid vocal tract is capable of producing syllables. Syllables consist of a vocalic nucleus (vowel) as described above. Vowels are produced by the vocal tract, which acts as an acoustic filter that shapes sounds produced by the larynx. The modern human vocal tract consists of three acoustic filters: the pharynx, the oral cavity and the nasal cavity.

An acoustic filter acts to reinforce certain frequencies and to suppress (filter out) others. This ability is a function of the length of the filter and whether it is closed or open at one or both ends. The reinforced frequencies are known as formants.
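The dependence of the reinforced frequencies on the length of the filter and on which ends are open can be sketched with the textbook idealization of a uniform tube. This is an illustration under stated assumptions, not a model taken from the paper: the speed of sound (350 m/s), the function name tube_resonances, and the treatment of the vocal tract as a lossless uniform tube closed at the larynx and open at the lips are all assumptions made here.

```python
# Assumed speed of sound in warm, moist air, in centimeters per second.
SPEED_OF_SOUND_CM_PER_S = 35_000


def tube_resonances(length_cm, closed_at_one_end=True, count=3):
    """Return the first `count` resonant frequencies (Hz) of a uniform tube.

    A tube closed at one end resonates at odd quarter-wavelength multiples,
    (2n - 1) * c / (4 * L). A tube open at both ends resonates at
    n * c / (2 * L).
    """
    c = SPEED_OF_SOUND_CM_PER_S
    if closed_at_one_end:
        return [(2 * n - 1) * c / (4 * length_cm) for n in range(1, count + 1)]
    return [n * c / (2 * length_cm) for n in range(1, count + 1)]


# Compare a 17 cm tube with a shorter 10 cm tube (the latter is an arbitrary
# illustrative length, not a measurement of any species).
for length in (17.0, 10.0):
    print(length, [round(f) for f in tube_resonances(length)])
```

For a 17 cm tube closed at one end, this idealization gives resonances at roughly 500, 1500 and 2500 Hz; a shorter tube shifts all of them upward, and changing the end condition changes their spacing. Real vocal tracts are neither uniform nor lossless, so these values are only a first approximation.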
While there are many formants produced for any given acoustic filter, the two formants with the lowest frequencies are sufficient to identify the vowel. For example, a schwa, the vowel sound in the word but, is produced by an unobstructed vocal tract. The same result can be achieved using an open tube 17 cm long (the length of the human vocal tract) and a sound source. While the larynx provides a fundamental frequency, it also produces overtones, also known as harmonics, and it is these overtones that the filter shapes into formants.

There are two competing configurations of a 17 centimeter long acoustic filter: (1) the single resonator and (2) the coupled, double resonator. The single resonator has the capability of producing a schwa [ǝ] and an [u]-like vowel. The schwa, as mentioned above, involves an unobstructed tube roughly 17 centimeters long, while the [u] involves only the oral cavity and lip rounding, which both lengthens and closes off the tube.12 Interestingly, Hoover, a harbor seal, could also produce two vowels that resembled [ǝ] and [u], which likewise reflects the capabilities of a 17 centimeter long resonator.

The double resonator is capable of producing the cardinal vowels [i] (as in he), [a] (as in ha), and [u] (as in who).13 In adult humans, these vowels are easily articulated with gross movements of the tongue and lips. The [a] involves a closed pharynx and an open oral cavity. The [i] is the reverse, with an open pharynx and a closed oral cavity. The articulation of the [u] involves an open back half of the oral cavity and a closed front half, which is achieved with lip rounding. The pharynx contains a duplicate of the oral cavity. These configurations produce formants which are acoustically maximally distinct, as shown in the sidebar.

12 I need some study to support this assertion.

13 Actually the human vocal tract consists of three coupled resonators, because the nasal cavity is also an important component of this system, as is explained below.

This arrangement allows simple movements of the tongue to produce vowels with substantially different formants. For [i] the first formant is low and the second is high; for [a] they are both midrange; and for [u] they are both low. These three vowels, known as the cardinal vowels, represent the maximally distinct vowels of human language, meaning that the other vowels of human language, both acoustically and articulatorily, fall within this range. The vast majority of human languages also have the contrasts [e] (day) and [o] (go). They are formed by the tongue being not so high in the oral cavity for [e] and not so far back for [o]. Many languages have more; English, for example, has 11 vowel contrasts. With the single-tube oral cavity, only two vowels are possible, while with the coupled, two-tube vocal tract, a much larger vowel inventory is possible.

In humans, the pharynx is about the same length as the oral cavity. In chimpanzees, the pharynx is much smaller. During the course of human evolution, the pharynx lengthened through the lowering of the larynx while the oral cavity shortened, producing a two-tube filter in which the two tubes are of equal length.14 Wind (1983) suggested that the original creation of the pharynx as a second resonator occurred as a consequence of upright posture, because upright posture resulted in a different positioning of the head on the body.

The vowel [ǝ] can be produced by both configurations, while the articulation of [a] requires the coupled resonator configuration. Although [ǝ] can be produced by the coupled, two-tube resonator, it is less commonly used in human languages, presumably because of its acoustic similarity to [a].

When the acoustic properties of the human two-tube vocal tract were first studied, it was thought that this configuration (a coupled resonator of two tubes of equal length) was essential for the articulation of human vowels.
With this assumption, Lieberman and Crelin (1971) concluded that because Neanderthals had a relatively short pharynx, they could not have produced the vowels of modern human languages. More recent research (Boe et al 2002) has shown that Neanderthals were capable of producing these vowels, albeit with much greater difficulty than humans. De Boer (2006) also shows that earlier hominini could produce several vowels, but probably not the maximally distinct cardinal vowels, which require a coupled resonator configuration. This evidence suggests that we need to distinguish between a modern human vocal tract, in which the coupled resonator is genetically fixed, and earlier versions in which the features of a double resonator can be approximated. Given the greater ease of articulation of the genetically fixed double resonator, natural selection would have favored it in sign users. The facility of the fixed configuration would also explain the shift to vocality. It is also possible that the two-tube resonator was better suited to the articulation of consonants. Also, because of the similarity between [a] and [ǝ], modern vowel systems tend not to use the [ǝ], though English is an exception.15

14 These two tubes do not have to be of equal length to produce the cardinal vowels, but it is clear that the process of vocal tract change ended when the two tubes attained equal length. That is, there was no point in further lengthening the pharynx.

The vibrating larynx produces the sound source for the vocal tract. In contrast to the other apes, the human larynx has an opening that is narrower than the trachea (wind pipe), resulting in shorter vocal cords. Shorter vocal cords require less energy to phonate than longer ones. While this development can be seen as positive with respect to phonation, it does make it more difficult to inhale air for breathing.
In addition to being shorter, the vocal cords, which vibrate to produce sound (phonation), are thinner than those of other apes, and this development too increased the length of phonation. The lung, which powers the larynx, increased in volume but not in the ability to absorb oxygen. The additional capacity would allow for longer phonation. Both of these developments increased the length of time of phonation.

Contrastive nasal sounds

As mentioned above, the human vocal apparatus has a third resonator, the nasal cavity. Unlike the oral-pharyngeal resonators, the nasal cavity cannot be changed in length, but it can be opened and closed using the velum at the back of the oral cavity, something that chimpanzees cannot do. When the velum is open, there is an additional nasal formant at about 250 Hz which gives both consonants and vowels a nasal quality. When the velum is closed, vowels and consonants are oral. The development of the ability to control the velum is a precondition for the development of oral consonants, which form the basis of the complex syllable.

Oral Consonants

Oral consonants are produced with the velum closed and by interference with the flow of air in the oral resonator, so much so that it interrupts the filtering capacity of the resonator and hence the syllabic character of the sound. This interruption may be complete, in the case of the stops (typically p, t and k and b, d and g), or partial, in the case of the fricatives (typically f, s, š and x, and v, z, ž and ɣ). Voiced consonants are produced with phonation, while voiceless consonants are produced without phonation. The nasal consonants are produced like the voiced oral consonants, but with the velum open. Consonants can be added to the beginning of the syllable (onset) or the end (coda), producing a complex syllable.16

Chimpanzees do not produce consonants, and attempts to teach them to make these sounds have been unsuccessful.
In the early 1950s, Keith and Catherine Hayes (1951) began an experiment to attempt to teach a chimpanzee named Vicki to produce syllables. After considerable training and coaxing, Vicki was able to say the words “mama,” “papa” and “cup” (phonetically [mamə], [papə] and [kəp]). Vicki had a number of problems articulating these words. First, her vowels were whispered (voiceless), reflecting her difficulty in controlling her larynx. Second, Vicki could not produce a velic closure, which meant that all sounds would have a nasal coloring. To produce a nonnasal sound, she had to cover her nose manually to block the passage of air through the nasal resonator. Third, to produce the bilabial sounds [m] and [p], Vicki had to manually close her lips. This means that the only consonant that Vicki could produce was [k], and this only with her nose manually covered. Without the manual blocking of the nasal resonator, she would have produced a sound much like the modern [ŋ] as in sing.17 This view is supported by the higher frequency of velars in most lexical systems. Thus one may conclude that alveolars and labials were added later.18

15 Not everyone supports the importance of the double resonator in the development of human orality. For example, Ohala (2000), a phonetician, claims that the development of the two-tube vocal tract “is independent of and thus irrelevant to the evolution of speech.”

16 In addition, an onset or coda can contain a liquid (l and r) or a glide (y and w), which are not true consonants. This is because their obstruction of the vocal resonator is not enough to disturb the filtering effect.

17 This observation conflicts with Lieberman’s claim that Neanderthals not only could not have articulated the cardinal vowels, but also could not have articulated the velar consonants [k] and [g] (reported by Jurmain 1997).

18 A film clip of Vicki producing these words appears in the Nova program entitled, “First Signs of Washoe”.

Summary of similarities and differences

Communicative similarities and differences between humans and other apes

Domain: Physiological (the vocal tract)
Similarities: A vocal apparatus capable of producing noncomplex syllables and the vowels [ǝ] and [u]. This apparatus includes the lungs, which can power the larynx to produce sound that can be filtered by the oral resonator.
Differences: The larynx became a more efficient phonator; a genetically fixed coupled resonator evolved; the lungs increased in volume; the tongue became more agile; velic closure allowed purely oral sounds. These developments led to the ability to produce complex syllables.

Domain: Physiological (the brain)
Similarities: Have common modules in the brain attributable to language (Broca’s area, Wernicke’s area and the arcuate fasciculus).19
Differences: Increased ability to control vocal articulations (Broca’s area); increased word storage capacity (Wernicke’s area); increased ability to send messages from Wernicke’s area to Broca’s area (arcuate fasciculus).

Domain: Behavioral
Similarities: Can learn referential signs; a word memory of at least 250 words. Can use referential signs intentionally. Can recognize gestural signifiers, and some can recognize vocal signifiers. Possess the ability to use analytic sentences.
Differences: An increased urge to communicate phatically.

Domain: Semiological
Similarities: Uses all types of gestures for signaling.
Differences: Relies predominantly on the vocal mode; the appearance of the referential sign; the phonemic principle; a vocally based representation system using complex syllables to signify referential signs.
19 Although these modules were not developed sufficiently to produce complex syllables, they have the neural hardware to comprehend and to produce analytic signs as well as to produce voiced syllables.

These changes cannot be understood as autonomous developments but as dialectically intertwined developments, not only of these features, but with other developments as I show below.

4. The evolution of the referential sign

The preceding section described the similarities and differences between the communication of chimpanzees and humans. The similarities are important because they help to define the point of divergence. The differences indicate the changes that the Homo lineage underwent.

The source of the referential sign

Cassirer’s discussion suggests that signals and referential signs are very different types of signs. Burling (1993:132) argues that signals “are too narrowly constrained by biology to be converted into learned and conventionalized signals,” by which he means referential signs. But if referential signs are not derived from signals, how did they arise? Given that other apes have no difficulty in learning and using referential signs, it is clear that our ancestors were capable of learning to use signs. But how is it that our ancestors had the ability to learn these signs when such signs were not part of their environment?

Natural signs

Semiological signs are similar to what I call natural signs. All the apes, and I suspect most mammals, have the ability to interpret things and events in their environment for indications of danger, safety, food and other needs. Natural signs are not simply responses to environmental stimuli, but are conceptualizations about the environment. For example, a predator like a leopard will appear in many different shapes and contexts. Nevertheless, the observer recognizes all these manifestations as the same concept.
Having done so, the observer can consider what the appearance of this predator means and what needs to be done about it. The difference between natural signs and semiological signs is not in their conceptualization but in their issuance. While users have no control over the issuance of natural signs, they do with semiological signs. They can decide which sign to send or not to send a sign at all. This is an ability possessed by all hominoids.

It is clear from this discussion that proto humans were preadapted for learning referential signs. For this reason, we can rule out a genetic change as the source of the referential sign.

[Margin note: It does not seem clear from the discussions that humans were preadapted.]

An alternative scenario

As an alternative, let me offer the following scenario. First, bipedal locomotion (and upright posture) was the first major development after the hominini separated from pan.20 Bipedal locomotion enabled the carrying of infants, of food from its source, and of tools.

The capacity to carry infants means that the infants do not have to hold on as do chimpanzees and bonobos, and this has several consequences. Carrying means that infants can be born with greater dependence on their parents. Compared to the chimpanzee and bonobo, the human infant is less able to care for itself and cannot hold on to its mother. Thus carrying means that the infant can continue to gestate ex utero. Given the limitations imposed by the diameter of the birth canal, this development would allow the head and, more importantly, the brain to continue to grow after birth. Lock and Bogan (2005:5) note that “the obstetrical dilemma was eased when some amount of skull and brain growth---and motor development---were adaptively deferred into the postnatal period, increasing infant dependency and the need of postnatal care.” Carrying may also be associated with loss of body hair since infants no longer need to hang on, a task that would be made more difficult by upright posture.

Carrying food enables more permanent settlements and involves the sharing of food, and possibly division of labor.

Carrying can also lead to more sophisticated tool manufacture. A tool maker is confronted with the question of what to do with the tool after making it. If I plan to leave it at the use site, then it does not make sense to put too much time and effort into its manufacture. However, if I plan to carry it with me, it does. In fact, carrying also allows for tool specialization.21 Neither the bipedal Ardipithecus ramidus (-4.4 MY) nor Australopithecus afarensis (-3 MY) has been found with manufactured tools, although such tools may have been constructed from wood and other perishable materials and hence not detected.

Tool use

For this reason, I propose the hypothesis that tool use may have played an important role in this development. While it has been suggested that language may have been needed to instruct others in tool making, other evidence suggests that learning to make the kind of tools made in the lower paleolithic by Homo habilis and Homo ergaster could be done through imitation (Hewes 1993, Ambrose 2001, Widgen 2004). The earliest manufactured stone tools, known as the “Oldowan tradition,” date from 2.5 million years ago. Examples in the sidebar, from left to right, include an end chopper, a heavy-duty scraper, a hammer stone, a flake chopper, a bone point, and a horn core tool or digger.22

In contrast to the view that symbolic interaction was a prerequisite for tool manufacture, I propose the opposite, that tools are the exaptative predecessor of words. This is because, as a manufactured item, a tool can be proffered as the signifier for a referential sign. We think of tools as something to do something with, but a tool can be proffered too. For example, when I see someone brandishing a spear, I ask what intent that person has. Could it mean: (1) “let’s go hunting;” (2) “let’s make tools;” (3) “go away or I will hurt you with my tool;” or any number of other things?23 When proffered, a tool is interpreted in much the same way as a sign, that is, by asking what the intention of the profferer is.24 At this point, the spear, with its interpretation, becomes an incipient referential sign. For an incipient referential sign to become a true referential sign, the meaning needs to become fixed by convention.

Property                      Tools      Signs
Manufactured objects          Yes        Yes
Involve intentionality        Yes        Yes
Intentionally produced        Yes        Yes
Capable of bearing meaning    Yes        Yes
Referential statements        Potential  Yes

While tools do appear in other species, early human tools show a greater variety, both in form and in function. In this way, tools are likely to have provided the stimulus for developing referential signs.25

The capacity to recognize referential signs is widespread among other species. Dogs are good at recognizing their names and those of other objects. Chimpanzees, bonobos and gorillas have learned to recognize the gestures of American Sign Language and many have been shown to recognize human vocal words.

This scenario explains the evolution of referential signs as the consequence of exaptation and not the result of a genetic change. To be sure, modern humans have developed a greater ability to recognize and use referential signs than the great apes, and this does represent a genetic change.

20 The normal gait of the chimpanzees is a knuckle walk which involves an almost upright posture with support from the arms and fisted hands. Chimpanzees can walk bipedally, although unlike Ardipithecus ramidus and A. afarensis, their anatomy does not show any major adaptations toward bipedalism.
21 Bipedal locomotion, as Armstrong et al (1994) point out, would have freed the hands for signing as well.
22 http://www.handprint.com/LS/ANC/stones.html
But as argued in chapter 02, this change we view ge

Although socialized chimpanzees and bonobos have the ability to learn analytic signs, no evidence of such usage has been found in their free ranging kin. The conclusion that analytic signs are a human development raises the question of when and how. First of all, we can rule out a genetic change, for it is clear that chimpanzees have no difficulty in learning and using analytic signs. If the development of analytic signs was not due to a genetic development, then we have to look for some conceptual development.

Tools, Broca’s area and sign production

Greenfield (1991), Ambrose (2001) and others have noted another important connection between tool use and referential signs. The fine muscle control needed to manufacture tools is located in Broca’s area. Broca’s area is better known as one of the two major modules of the brain most closely connected with language, Wernicke’s area being the other. Specifically, Broca’s area is associated with the articulation of words and the production of syntax. Thus the increased use and manufacture of tools could have placed increased pressure on the development of Broca’s area, a development that could well have been an exaptation not only for the development of the vocal apparatus but subsequently for syntax.

23 Needless to say, the richer the tool kit, the richer the sign system.
24 George Herbert Mead (1934) proposes that the assignment of meaning to an object involves the imaginary completion of the signing act. Compare Berger and Luckmann’s (1967) description in which someone has thrown a knife at someone else and it sticks in the bedstead above him. After the thrower has fled, the presence of the knife continues to remind one of the thrower’s intent.
25 This argument is similar to the Tool-Cue Model of Byers (1999), who proposed that tools have the potential to serve as icons.
Paratax

The term paratax is similar to, but not identical to, Bickerton’s (1990) protolanguage. Bickerton sees protolanguage as an incipient form of syntax and includes, in addition to the sentences of signing apes and two-year-old humans, early versions of pidgins. The analysis of paratax (Dwyer 1986 and 2009), by contrast, concludes on the basis of the formal properties displayed by the sentences of signing apes and young humans that these do not represent syntactic grammars, and for this reason excludes incipient pidgins.

Almost as soon as humans and other apes learn referential signs they produce paratactic sentences. The first such sentences in all populations are deictic paratax (Greenfield et al 2008), meaning that one of the words in the paratactic sentence is a referential sign and the other is a pointing gesture. Deictic paratax allows the individual to intentionally draw the other’s attention to an entity identified by the deictic gesture and comment on it using the referential sign. Deictic paratax is subsequently replaced by full paratax involving two referential signs. While full paratax involves the same topic and comment as deictic paratax, it has the advantage of including topics that are not physically present, whereas the deictic gesture must point to something physically present.

The development of signifiers

Although protohumans had the capacity to learn about 250 referential signs, these signs still had to be developed, and this means finding new signifiers to attach to concepts (signifieds). At this time, as several authors have suggested,26 these referential signs would have been represented through a variety of channels including manual, facial, and vocal gestures. The development of new signifiers is not an easy process, and at this early stage vocabulary grew slowly. Each new sign would require a new signifying gesture from one of these channels until vocality produced a much easier mechanism for generating signifiers.
During this period it is likely that natural selection favored the ability to learn words, an interest in communicating with others and the ability to generate new signifiers.

The transition to vocality

The transition to vocality-based signifiers involved four interrelated developments described in the section on similarities and differences. In this section I review what these developments were. Using the source-filter model, we see changes in the source, the filter and the attenuators.

Changes in the vocal tract

Changes in the source involved changes in the larynx, which became smaller. This allowed for longer utterances and the possibility of a sequence of several syllables. This would allow the production of a totally vocal paratactic sentence.

Changes in the attenuators involved velic control and modifications to the tongue and lips. Velic control enabled the coupling and decoupling of the nasal resonator. The decoupling of the nasal resonator through the closure of the velum enabled the articulation of purely oral sounds, which is a precondition for the articulation of true consonants. The tongue is essential for modifying the resonators to produce different vowels as described above. In addition, the tongue can interfere with the oral resonator to inhibit resonance and produce dental and velar consonants, either stops or fricatives. The rounding of the lips alters the acoustic properties of the oral resonator by increasing its length and by effectively closing it off at one end. In addition, the lips can, like the tongue, inhibit resonance to produce labial stops and fricatives.

Changes in the filter involved the development of the coupled, triple resonator. Velic control enabled the coupling and decoupling of the nasal resonator.

26 Armstrong, 1999; Armstrong, Stokoe & Wilcox, 1995; Campbell, 2000; Corballis, 1992, 2002; Givòn, 1995; Hewes, 1973, 1996; Rizzolatti & Arbib, 1998.
The descent of the larynx involved the emergence of the pharynx. The configuration of this triple resonator allowed the easy articulation of the cardinal vowels as well as the intermediate vowels [e] and [o], and [ɛ] and [ɔ]. This configuration also made it easier to attenuate the oral cavity for the articulation of consonants.

The development of the complex syllable with true consonants

The development of the complex syllable involved the attachment of consonants to either end of the syllable’s vocalic nucleus, though the hominini other than humans show no ability to produce true consonants owing to the lack of a velic closure (which separates oral from nasal sounds).

At this point, let me speculate. Given that Vicki was capable of producing a velar closure, the control of the velic closure would create the possibility of a [k/ŋ] distinction. And given that chimpanzees are capable of glide (nonsyllabic vowel) onsets, as in the waa bark, complex syllables beginning with [w] could develop. With the development of the vowel [i], the glide [y] would also emerge. These developments would produce 12 different syllables as potential signifiers.

ki ka ku
ŋi ŋa ŋu
wi wa wu
yi ya yu

The development of words with complex phonological signifiers would not have eliminated gestural signifiers; in fact, they exist today. However, as the vocal apparatus became more adept at producing complex syllables with more contrast, it would become the primary mechanism for representing referential signs.

With a longer period of phonation, two syllable utterances became possible, including two-syllable words and paratactic (analytic) sentences. New words could have arisen from reinterpreting a paratactic sentence as a word, a common human process known as lexification. Washoe has coined several lexified two word sentences, including dirty monkey, to describe a monkey she didn’t like, and candy-drink, to describe a watermelon. From this process, polymorphemic, polysyllabic words arise.
Almost all words in modern language with two or more syllables are morphologically complex or can be discovered to have been so, even though they are no longer transparent to their users and are taken to be morphologically simple. Most languages have what have come to be called cranberry morphemes. Such morphemes combine with recognizable morphemes to produce a new word. We know that a cranberry is a berry even though we do not know what a cran is. The same can be said for the rasp in raspberry. Such morphemes arise because, although the free form of the word has fallen out of usage, the morpheme survives in compounds. Cranberry morphemes are of interest here because they show that strings of syllables can have representational, as opposed to referential, value and can be used to distinguish one word from another. Thus in time it is possible that these polymorphemic words would have lost their transparency and would be taken to be monomorphemic.

Subsequent consonantal developments would include additional points of articulation: a labial stop creating a [p/m] contrast; an alveolar stop creating a [t/n] contrast; and a palatal stop creating a [č/ñ] contrast. These developments would allow 18 more complex syllables.27 These oral consonants are called stops.

pi pa pu    mi ma mu
ti ta tu    ni na nu
či ča ču    ñi ña ñu

The consonants [p], [t] and [č] are voiceless. The term voiceless means that, unlike vowels, they are produced without the vocal cords vibrating. In the production of vowels, the vocal cords are positioned to allow spontaneous voicing so that when air from the lungs is passed through the larynx, the vocal cords vibrate spontaneously. However, because the articulation of consonants impedes this flow of air, the vocal cords stop vibrating, and this is why these consonants are voiceless.

It is possible to adjust the vocal cords so that they continue to vibrate even though the oral resonator is occluded. When this is done, the voiced counterparts of the consonants described above are produced. This produces an opposition between [p] and [b]; [t] and [d]; [č] and [j]; and [k] and [g]. This would add another 12 complex syllables to the inventory.

bi ba bu
di da du
ji ja ju
gi ga gu

It is also possible to interfere with spontaneous voicing with only a partial blocking of the oral cavity. This partial blocking creates local turbulence at the point of articulation. This turbulence creates a noise that distinguishes the sound from the stop, thus distinguishing a [p] from an [f]; a [t] from an [s]; a [č] from an [š]; and a [k] from an [x] in the voiceless series, and a [b] from a [v]; a [d] from a [z]; a [j] from a [ž]; and a [g] from an [ɣ] in the voiced series, adding 24 more contrastive syllables. These consonants are called fricatives.

The examples here use open syllables, because closed syllables are not as common in the world’s languages, and probably developed at a later time. For example, it is often the case that closed syllables are derived from a sequence of two open syllables in which the final vowel is lost: CVCV → CVC. There are of course many other developments, such as the development of the liquids [l] and [r] and consonant clusters (CCV), which further increase the inventory of contrastive syllables.

Although not all 66 syllables would have developed with the onset of the complex syllable, this exercise does show the potential of the vocal apparatus to produce signifiers. With 66 contrasts in a single syllable, a two syllable sequence would have 4356 contrasts or potential signifiers.

27 The palatal point of articulation is less common than the others (labial, alveolar and velar) and it is quite likely that palatals were not part of early complex syllables. This observation is supported by the fact that palatal consonants commonly arise when an alveolar or velar consonant precedes the vowel [i] and sometimes [u], a process known as palatalization.
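The running totals in this exercise (12, 18, 12 and 24 syllables, 66 in all, and 4356 two syllable sequences) can be checked mechanically. The following Python sketch is offered only as an illustration of the arithmetic and not as a claim about the actual order of development. It assumes the three vowels [i a u] used in the syllable charts and open CV syllables only; the stage labels and the names VOWELS, ONSET_STAGES, open_syllables and inventory are hypothetical and introduced here for convenience.

```python
from itertools import product

# Assumption: the three vowels used in the syllable charts.
VOWELS = ["i", "a", "u"]

# Assumption: onset consonants grouped in the order the text introduces
# them. The stage labels are illustrative, not the author's terms.
ONSET_STAGES = {
    "velars and glides": ["k", "ŋ", "w", "y"],
    "further stops and nasals": ["p", "m", "t", "n", "č", "ñ"],
    "voiced stops": ["b", "d", "j", "g"],
    "fricatives": ["f", "s", "š", "x", "v", "z", "ž", "ɣ"],
}


def open_syllables(onsets, vowels=VOWELS):
    """Return every open (CV) syllable built from the given onsets."""
    return [c + v for c, v in product(onsets, vowels)]


def inventory(stages=ONSET_STAGES):
    """Return per-stage syllable counts and the full syllable list."""
    counts = {}
    syllables = []
    for name, onsets in stages.items():
        new = open_syllables(onsets)
        counts[name] = len(new)
        syllables.extend(new)
    return counts, syllables


counts, syllables = inventory()
# Any syllable may follow any other, so two syllable forms number n * n.
two_syllable_forms = len(syllables) ** 2

print(counts)
print(len(syllables), two_syllable_forms)
```

The count of two syllable forms is simply the square of the single syllable inventory, which is why each added consonant series enlarges the pool of potential signifiers much faster than it enlarges the syllable inventory itself.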
Changes in the brain

As the representational system expands, greater pressure is likewise imposed on the communicative channel, in this case the auditory one including the vocal tract.

When did referential signs appear?

While it is impossible to directly determine when referential signs emerged, it is possible to infer that their use is correlated with the development of the vocal tract and the language areas of the brain. The evidence for these physical developments leads de Boer (to appear) and others to conclude “that adaptations for speech must have first appeared in the common ancestor of Homo sapiens and Homo neanderthalensis and that complex speech must have had at least 400 000 years to evolve.” The evidence de Boer cites includes developments in Homo ergaster (1 MY-200 KY): enlargement of Broca’s area, though possibly as early as H. habilis (2-1 MY); oral cavities that are either within or close to human range; and enlargement of the hypoglossal canal, though smaller than that of modern humans. The development of the pharynx through the lowering of the larynx to modern human dimensions and modern hearing abilities, however, was detected only in early Homo sapiens.

The date of 400,000 BP is considerably before the date posited for syntax, be it 50,000 BP for the Eurocentric hypothesis or 100,000 BP for the Afrocentric hypothesis. These conclusions are consistent with Jackendoff’s (1999) two stage model.

The evidence for changes in the brain and vocal tract in H. ergaster means that natural selection favoring these changes was in progress. And the fact that these changes were in progress strongly suggests that H. ergaster was using referential signs. Although the fossil evidence cannot show most of the physiological changes, let alone the behavioral and semiological changes, it is highly likely that most of these took place in H. ergaster and that these changes were in place when Homo sapiens appeared on the scene.
The diversification of referential signs

With the discovery that things can be named comes the discovery that everything can have a name.28 The words learned by three signing apes include things and actions that are of relevance to them. In appendix 2, I have listed all 279 words learned by the signing apes: Nim (Terrace 1979); Koko (Patterson 1981); and Washoe (Gardner and Gardner 1989).29 I have broken them down into categories so that the reader may see the types of words that are of relevance to these apes. The range of concepts is not restricted to the names of things and actions, though that is an important category, but extends to social relationships (including pronouns, games and people), emotions, senses, properties (often modifiers), times and locations. Strikingly absent are words associated with social institutions such as roles (mother, father, teacher) and institutional names (such as school), though they may not have come up in the environment in which these beings were raised. Also absent are words indicating thought processes (I think, I expect), imagined events (future, other realities), and indirectly perceived entities (God and gravity). While these missing categories may be accidental in that the concepts behind these signs did not appear in the learner’s environment, it is far more likely that these signs were incomprehensible to these apes and hence unlearnable. Kanzi (a bonobo), as reported by Savage-Rumbaugh et al. (n.d.), was one of the few apes who learned to use the time words today and tomorrow. Although Kanzi did master these time concepts, they appear to be incomprehensible to other apes.

28 This is the discovery that Helen Keller made when she discovered that “everything has a name,” that her world was populated by referential signs and not just signals.

5. Summary of the progression

The beginning of the referential sign can be traced to the development of upright posture.
This resulted in two exaptations. One consequence was the development of the pharynx and the three resonator system. The other was the development of sophisticated tools which served as the model for the referential sign. Once the concept of the referential sign developed, parataxis was possible, mostly deictic parataxis at first. There was also pressure to produce more signifiers to enable more referential signs. This resulted in the development of vocality, which involved changes in the vocal tract, the brain and the development of the complex syllable including the phonemic principle.

There are several consequences of the development of the referential sign which need to be explored in other papers. One paper addresses how symbolic interaction using referential signs leads to a more complex self. Another makes it clear that paratax is not a primitive form of syntax. Another

However, in closing, I want to be clear that

Paratax

The following sequence summarizes the arguments advanced in this paper and the conclusion that the process of becoming human involved the interaction of several areas: the body, the mind, language and culture.

1) tools30 → analytic signs and increased fine muscle development → increased articulatory control and exaptation for syntax.
2) analytic signs → symbolic interaction (paratactic)
3) pre-syntactic symbolic interaction → the complex self;
   a) symbolic interaction enables intersubjectivity (the awareness of common knowledge), interest in what the other knows, differential knowledge, connecting time and space, labeling abstract concepts, negotiating, cross-modal perception.
4) analytic signs (pre-syntactic) → syntactic signs (simple and nested)
5) syntactic signs → syntactic symbolic interaction
6) syntactic symbolic interaction and the complex self → institutions (culture)

29 The letters following each word indicate which apes acquired the sign.
30 Note that italicized words are exaptations.

Appendix 1: Keita’s referential and expressive statements (from Keita 1997:387)

A referential statement
- [Has a] function-argument schema, such as ‘action (agent, patient)’, and ‘motion (theme)’.
- [Is] amodal in that its format of information is not specific to any cognitive modality (e.g., vision, olfaction, kinesthesis, etc.).
- [Is] decontextualized in the sense that it is removed from subjective experience.
- [Is] about a certain experience, but not a rendition of an experience itself. Thus, one may conceptualize unpleasantness in the referential dimension without actually feeling any unpleasantness.

An expressive statement
- [represents] different facets of an experience... These include the affective, emotive and perceptual activation in an experience but do not include the rational construal of it based on such things as agentivity and causality.
- Iconicity is an important architectural principle in this dimension, and thus various facets of an experience do not stand in syntagmatic relationships. Rather, they are merely spatiotemporally contiguous.
- [presents] various kinds of information from different cognitive modalities [which] remain modality-specific, creating the subjective effect of evoking an image or “re-experience.”
- [is] what some authors (Jakobson 1956; Lyons 1977) called the “expressive function” of language[, which] is subsumed in the affecto-imagistic dimension.
- [consists] of different units and architectural principles of representation.