Chapter 06 The Evolution of the lexical Sign

Chapter 5
The evolution of referential signs
David Dwyer, Michigan State University
February 3, 2011
This paper is a working draft. Comments are welcome. Do not quote without permission.
1. Introduction
Jackendoff (1999), following Bickerton (1990), expanded on a three-stage sequence for the
development of human grammar, beginning with the use of “symbols” and their development,
followed by Bickerton’s protolanguage, and ending in modern language. Jackendoff avoids a
discussion of how these symbols, which in this paper I call referential signs, might have
developed. He also does not discuss their properties. This paper explores these properties and
identifies the changes that these signs underwent in becoming the words of modern human language.1
The common ancestors of humans, chimpanzees and
bonobos (hominini) lived 5 to 6 million years ago.
Since that time humans and pan have evolved quite
differently. Humans developed upright posture
and the referential sign, which pan (chimpanzees
and bonobos) did not. This paper focuses on the
development of the referential sign in proto-humans,
which is qualitatively different from the signals used
by other homininae.2
Although pan do not have the vocal apparatus to
generate words the way humans do, they can, using
the signs of American Sign Language, learn to
communicate with them. Given that the hominini and
even the homininae have many of the intellectual prerequisites for learning referential signs, we
can assume that these abilities were possessed by our common ancestor. In fact, gorillas also have
the ability to learn referential signs, so this argument can be extended to include the homininae.
These observations raise the following questions about referential signs:
1. How are referential signs different from other types of signs?
1
One of the crucial problems we encounter in our exploration of the evolution of language has to do with the failure to
provide precise characterizations of basic constructs. Most articles on this topic fail to provide any definition of terms
like language, syntax and communication and often use them interchangeably, with the consequence that we do not
always know when we agree or disagree conceptually. Bickerton’s protolanguage (discussed in section 4), for
example, has been so loosely characterized that some take it to be some sort of primitive syntax. The discussion here
involves the development of referential signs and not syntax. I will use Mead’s (1934) term symbolic interaction
to avoid the debate as to whether this is language or not.
2
The terminology in this area is awkward. For example, the branch representing the human line of development is not
strictly homo (hominid) until about midway between the time it separated from pan and the present. Becoming
human is generally associated with bipedal locomotion, but this too is murky because Lucy (Australopithecus
afarensis) and Ardi (Ardipithecus ramidus) are human ancestors, but are not hominids.
5. The evolution of the referential sign
2. What intellectual abilities are needed for learning referential signs?
3. What representational differences are there between the signs that humans and pans use?
4. What behavioral, physiological and structural abilities enabled the shift to the production of
vocal signs?
5. How did referential signs develop in humans?
6. When did referential signs develop?
The remainder of the paper takes up these questions.
2. Properties of the referential sign
Referential signs constitute only one of the sign systems used in human language. The other sign
systems are tactic and representational. Tactic sign systems can be atactic, paratactic or syntactic.
An atactic sentence consists of a single word. A paratactic sentence consists of two words which stand
in one of several possible case relationships. A syntactic sentence can contain a potentially
unlimited number of words, and the specific case relationship is specified for each
constituent. Representational sign systems use vocal, manual, or
other gestures. In this paper, I discuss the vocal representation system in some detail. For a
further discussion of tactic systems see Dwyer (1986 and 2009 ms).
Signs versus symbols
Cassirer (1944) distinguishes between two types of signs: signals and symbols. Both the signal
and the symbol (which I call a referential sign or word) are signs in the Saussurian sense of the word
because they consist of a signifier and a signified.3 The signifier is the token to which a concept,
the signified, is attached. For example, the letters t-r-e-e signify the concept tree. Mead (1934) gives
the example of the mother hen who clucks when her chicks stray too far from her. Here, the
clucking of the mother hen signifies something like my chicks are too far away. The distinction
between signal and referential sign for Cassirer had to do with the relationship between the signifier
and signified. For a signal, the connection was instinctive; for a referential sign it
was not instinctive and had to be learned.
As I show below, the referential sign is an artificial construct, and this is why there is no essential
link between signifier and signified and why it has to be learned.4 Users are free to attach
any concept to a signifier, which means that we can name concepts as long as we have signifiers to
represent them. This is why the signifier for the concept water is water in English, eau in French,
agua in Spanish and njai in Mende.
3
Various authors use the terms sign and symbol, as well as signal, quite differently. Saussure’s sign, for example, is
virtually the same as the symbol of White (1949), Mead (1934), and Cassirer (1944). To avoid confusion, I shall avoid
the use of the word symbol, except when citing others, and use the term sign as a general term encompassing both
signals and referential signs.
4
Saussure characterized a sign as consisting of a signifier and a signified. For the referential sign, the signified is the
concept and the signifier is the gesture (manual, facial or vocal) that represents it. Saussure also emphasized that
unlike signals, there is no fixed relationship between the signifier and signified for referential signs.
“Signals are ‘operators’; symbols are ‘designators’” (Cassirer 1944:32). “In short, we may say
that the animal possesses a practical imagination and intelligence whereas man alone has
developed a new form: a symbolic imagination and intelligence” (Cassirer 1944:34). Thus
Cassirer and Mead believed that this distinction marked the key difference between humans and
other animals. As we learn more about the signing of nonhumans we will discover that the
transition between signal and referential sign is not the discrete barrier that we, like Cassirer and
Mead, assumed, but one of a number of incremental steps. For example, the discovery that all
hominini possess the ability to use referential signs means both that humans are not alone in this
ability and that this ability precedes the development of the referential sign. In addition, Cheney
and Seyfarth’s (1990) work showed that the signals vervets use are not purely instinctive, because
the acquisition of these signals required some learning.
Because of the arbitrary connection between the signifier and signified we cannot react
instinctively to a signifier as we can with a signal. And because the issuance of a referential sign is
done intentionally, we interpret this sign not only by identifying it, but by trying to discover
why the speaker sent the message. Was the intention to warn us, inform us, deceive us or
something else? Because of this, our entering into the world of referential signs transforms our
way of interacting with others, for we no longer react to the other’s signals, but to the intention
behind these signals. This form of symbolic interaction also increases our ability to influence the
other verbally and our interest in what the other understands. This conclusion applies to all
homininae: as a result of learning to use referential signs, chimpanzees, bonobos and
gorillas too have been transformed in a like way.
3. Similarities and differences between human and ape signing
To understand the evolution of the referential sign it is necessary to recognize both differences
and similarities between ape and human signing. Similarities include intentionality, learnability,
and expressive and referential sentences; differences include signals versus referential signs, the
representation of the signifiers of referential signs, and physiological differences in the vocal tract
and brain.
Communicative channel
Ape signifiers use vocal, visual (including facial and body gestures), and tactile channels, while
human signs are predominantly vocal. Furthermore, whereas ape vocalizations involve a wide range of
sounds (whirrs, lip smacks), human vocalizations are almost exclusively syllabic. Although apes
can produce syllables, they are capable of only one syllable per utterance, while humans are
capable of several dozen per utterance.5 The human reliance on the vocal tract reflects several
changes in the human vocal apparatus that we explore below. While the ape vocal tract is capable
of producing some vowels, [u] and [ǝ], it cannot produce the full set of vowels that the modern
human vocal tract can, [i, e, a, o, u and ǝ], nor can it add consonants before or after a vowel to
form a complex syllable.
Behavioral comparisons
5
Apes can produce longer utterances by using both ingressive (inhaling) phonation and egressive phonation.
Comprehension. Many signing hominini show an ability to understand vocal and gestural
referential signs. Even though apes do not have the vocal apparatus to produce referential signs,
they can produce them using the hand gestures of American Sign Language.
Learnability. The other apes are capable of learning to use
referential signs, even though they do not use them in the wild. However, the maximum
vocabulary size reported for a chimpanzee (Washoe) is 250 signs.6 While impressive, this number
is small when compared to the human three-year-old with about 1000 words, the human five-year-old
with about 5000 words and the human adult with a vocabulary numbering in the thousands.
Intentionality. The development of the capacity to use referential signs includes the capacity to
intentionally control the production of signs.7 Because feral apes do not use referential signs,
researchers once assumed that they did not have the ability to control the issuance of signs. This
changed when Washoe was taught to use the hand gestures of American Sign Language. While
much of ape signing involves requests for food, apes also comment on things they see in
magazines and on their relationships with others, and the bonobo Kanzi has been reported to negotiate with
his human caretakers (Savage-Rumbaugh et al n.d.). All of these activities involve the intentional
use of signs. This means that intentionality of this sort preceded the development of the referential
sign and can be considered an exaptation for the development of the referential sign.
Along with the intention to communicate is the awareness that what you say can influence the
other’s behavior. This is seen most dramatically in the intent to deceive, but efforts to assist by
informing are equally significant. With the awareness that the other’s behavior can be influenced,
comes the increased interest in the other.
Phatic Communication. Humans like to talk even when they have nothing to say. Talking about
the weather is rarely done because people want to share crucial information, but rather, as
Malinowski (1923) points out, to interact with others. I call this the gift of the gab. While this
ability is more pronounced in humans, chimpanzees also engage in this type of activity.
Analytic statements. In addition to the distinction between signal and symbol, Cassirer (1944:29)
developed a second opposition, based on the work of Révész (1940). His distinction between
emotional and propositional language is similar to the distinction between referential and
expressive language proposed by Kita (1997). For Kita, a referential sentence is descriptive of an
experience, while an expressive conjures up the emotive and perceptual activation in an
experience.” An analytic statement may describe “an experience, but not a rendition of an
experience itself. Thus, one may conceptualize unpleasantness in the referential dimension without
actually feeling any unpleasantness.” An expressive statement does the opposite.
Although Kita uses the term referential in opposition to expressive, Dwyer and Moshi (2002)
prefer the term analytic, not only because the term referential is used to describe the sign, but
because an analytic sentence, while using referential signs, actually analyzes a situation and breaks
6
See appendix 2 for a composite listing of the words learned by the chimpanzees Washoe and Nim and the gorilla
Koko.
7
I use the word intention to mean the willful use of the word.
it down into its components such as agent, action, object. Nevertheless, the categories of analytic
and expressive provide humans with two modes for communicating their experience. We can
either describe it analytically or convey our experience of it expressively. In our (Dwyer
and Moshi 2002) paper, we argued that true ideophones were not part of the referential scheme and
that ideophones were verbal expressives, as were hand gestures and facial expressions. Although,
when compared to other languages, English ideophones are sparse, they do exist. Examples
include bang, pow, kerplunk, and zing. Ideophones can be used alongside referential statements:
The egg fell to the floor: splat. The hand and facial gestures that accompany much of our
conversation are expressives.
Thus, following Kita, an analytic sentence describes an experience by removing the subject’s
emotional association from it and by breaking it down into words related to each other by case
relations. Expressive language, on the other hand, represents an experience through imagery. It
is an emotional, as opposed to a rational, representation of experience; it arises from one’s
emotions and gives an indication about one’s emotional state in a given situation. These two
modes of expression are reflected in the terms reason and intuition, and are at the heart of the
distinction between prose and poetry. While poetry is presented in analytic sentences, it is
intended to be understood expressively.
This distinction can also be found in Austin’s (1969) classification of the three dimensions of a
sentence: the locutionary; the illocutionary; and the perlocutionary. Austin’s locutionary consists
of the grammatical meaning of the sentence, that is, the analytical mode. The expressive mode is
represented by the other two dimensions: the illocutionary, how the speaker intends the listener to
take the statement (as a question, a request, an apology, a warning, etc.), and the perlocutionary,
how the speaker feels about the statement and the illocutionary dimension.
Austin pointed out that a sentence consists of more than the expression of information (the
locutionary dimension), because it involves an intended listener who is not only to understand
what is said (locutionary), but to understand what the sender expects the listener to do, and to
understand how strongly the sender feels about the statement. From this perspective, expressives
lack the referential (locutionary) dimension, but referential statements still contain the expressive
illocutionary and perlocutionary dimensions.
Once they have learned to sign, apes have no difficulty in using referential signs analytically.
Examples needed here.
Semiological Comparisons
The complex signifier. A simple signifier is one that cannot be broken into smaller elements. For
example, the purr of a cat cannot be broken down into smaller elements that form part of the signifier
of another sign. In contrast, the signifiers of human referential signs are complex. For example,
the signifier for the word cat, be it the sequence of letters C-A-T, the sequence of phonemes
/k-æ-t/ or the gestures of American Sign Language,8 is complex. Each of these significations of the
concept of cat is complex because each signifier can be broken into smaller elements, called
phonemes, and because each of these units can be used in signifying other words. The sequence
c-a-t can be recombined to represent the word act; the sequence /k-æ-t/ can be recombined as either
/æ-k-t/ (act) or /t-æ-k/ (tack).
8
In American Sign Language the word cat is signified by the holding of the right thumb and forefinger above the lip
and moving the hand to the right as though one was pulling a whisker.
The phonemic principle is based on the Saussurian principle of the semiological system, in which the
number of signs in a given system is finite. The individual sounds found in the sequences
/k-æ-t/, /æ-k-t/ and /t-æ-k/ are such signs and are termed phonemes. Because the set of phonemes for a
given language is finite, each phoneme, as Saussure pointed out, derives its value from the fact that it is not any of
the other signs in the system. In the above example from English, /k/ is not /t/ and not /æ/, etc.
The phonemic principle means that, for any language, the phonemes used to represent
signifiers are drawn from a finite set and that the signifier of a referential sign consists of a
string of one or more phonemes.
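The recombination of a fixed set of phonemes into different signifiers can be illustrated with a minimal sketch. The three-word lexicon and the function name recombinations are hypothetical and serve only to restate the cat, act and tack example; nothing here is a model of English phonology.

```python
from itertools import permutations

# Hypothetical mini-lexicon keyed by phoneme sequence (illustration only).
LEXICON = {
    ("k", "æ", "t"): "cat",
    ("æ", "k", "t"): "act",
    ("t", "æ", "k"): "tack",
}

def recombinations(phonemes):
    """Return the lexicon entries spelled by some ordering of `phonemes`."""
    found = {}
    for order in permutations(phonemes):
        if order in LEXICON:
            found[order] = LEXICON[order]
    return found

words = recombinations(("k", "æ", "t"))
for sequence, word in sorted(words.items(), key=lambda item: item[1]):
    print("/" + "-".join(sequence) + "/", word)
```

Of the six possible orderings of the three phonemes, only those listed in the toy lexicon count as signifiers, which is the sense in which the same finite units are reused across different signs.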
A typical inventory of phonemes for a modern human language is given in the table below.
Consonants (C)
                        Labial   Dental    Palatal   Velar
Stops: Voiceless        p        t         č         k
Stops: Voiced           b        d         j         g
Fricatives: Voiceless   f        s         š         x
Fricatives: Voiced      v        z         ž         ɣ
Nasals                  m        n         ň         ŋ
Liquids                          l and ř

Vowels (V)
          Front   Back Rounded
High      i       u
Mid       e       o
Low               a
Glides    y       w
Phonotactics. Not all strings are possible because there are restrictions on which sequences are
possible. The syllable provides the key to understanding these restrictions, because a signifier
consists of a string of one or more syllables and each phoneme is attached to one of these syllables.
Most typically a syllable has a nucleus consisting of a vowel (V) and an optional onset and coda
consisting of one or more consonants (C). The phonemes of a language are classified as either V
or C. Thus typical syllabic structures are CV (known as an open syllable) and CVC (known as a
closed syllable). The phonotactics of a language describe the permissible syllable structures.
There are additional complications to the phonotactics of many languages including consonant
clusters, complex nuclei (diphthongs) and subclasses of consonants and vowels that result in more
complex phonotactics, but the syllable structure described here is sufficient for our purposes.
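The idea that phonotactics amounts to a set of permissible syllable shapes can be sketched as follows. The small inventory, the restriction to CV and CVC, and the function names are assumptions made for illustration; they do not describe any particular language.

```python
# Toy inventory, assumed for illustration; not a description of a real language.
VOWELS = {"i", "e", "a", "o", "u"}
CONSONANTS = {"p", "t", "k", "b", "d", "g", "s", "m", "n", "l"}

# Permitted syllable shapes: open (CV) and closed (CVC), as in the text.
PERMITTED_SHAPES = {"CV", "CVC"}

def syllable_shape(phonemes):
    """Map a sequence of phonemes to its C/V skeleton, e.g. k-a-t -> 'CVC'."""
    shape = []
    for p in phonemes:
        if p in VOWELS:
            shape.append("V")
        elif p in CONSONANTS:
            shape.append("C")
        else:
            raise ValueError(f"{p!r} is not in the toy inventory")
    return "".join(shape)

def is_permitted(phonemes):
    """True if the syllable's C/V skeleton is one of the permitted shapes."""
    return syllable_shape(phonemes) in PERMITTED_SHAPES

for candidate in (["t", "a"], ["k", "a", "t"], ["k", "t", "a"]):
    print("-".join(candidate), syllable_shape(candidate), is_permitted(candidate))
```

A language with consonant clusters or diphthongs would simply add further shapes (for example CCV) to the permitted set.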
This development of the complex syllable and the phoneme makes it possible to represent
thousands of referential signs using a small inventory of phonemes. This potential is even greater
with the development of the capacity to produce several syllables in one breath. These
developments are responsible for what I call the shift to vocality, that is, the shift to using
exclusively vocally based signifiers for referential signs.
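A rough count shows how quickly a small inventory yields a large number of possible signifiers. The sketch below uses the 22 consonants (stops, fricatives, nasals and liquids) and 5 vowels of the inventory table and assumes, unrealistically, that any consonant may serve as onset or coda; real phonotactics would lower these figures.

```python
# Inventory sizes taken from the typical inventory above (glides excluded).
CONSONANTS = ["p", "t", "č", "k", "b", "d", "j", "g",
              "f", "s", "š", "x", "v", "z", "ž", "ɣ",
              "m", "n", "ň", "ŋ", "l", "ř"]
VOWELS = ["i", "e", "a", "o", "u"]

n_c, n_v = len(CONSONANTS), len(VOWELS)

# Assumption: every consonant can be an onset or a coda.
cv_syllables = n_c * n_v          # open syllables (CV)
cvc_syllables = n_c * n_v * n_c   # closed syllables (CVC)
syllables = cv_syllables + cvc_syllables

def signifier_count(max_syllables):
    """Number of distinct signifiers of 1..max_syllables syllables."""
    return sum(syllables ** k for k in range(1, max_syllables + 1))

print("distinct syllables:", syllables)
print("signifiers of up to two syllables:", signifier_count(2))
```

Even under much tighter restrictions than assumed here, one-syllable signifiers alone number in the thousands, and each additional syllable per utterance multiplies the total.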
Physiological comparisons
The brain
Two areas of the brain, known as Broca’s area and Wernicke’s area, are central to the processing of
language, though of course, many other parts of the brain contribute to the process of objectifying
thought. Broca’s area is located in the area of the brain which controls face and mouth movements
and the articulation of complex syllables. This involves not only the movements of the tongue and
jaw, but also the sequencing of these movements. Generally speaking, Broca’s area has to do with
the sequencing of language, both syntactic and phonetic strings. Wernicke’s area has to do with
the recognition (and monitoring) of speech as well as being the place where referential signs are
stored. It was once thought that apes did not have these areas, but recent studies (see
below) report that they exist in other apes and some monkeys as well, though these areas have been found to
be larger and more developed in humans.
The arcuate fasciculus is a nerve bundle leading from Wernicke’s area to Broca’s area. It is larger
and more developed in humans than in other apes and primates. Apparently9 it consists of two
separate pathways, according to Ploog (2002) as reported by de Boer (to appear). One, called the
“cingulate vocalization pathway,” is concerned mostly with expressing emotions and is found
in most primates. The second, called the “neocortical vocal pathway,” has to do with fine motor
control and appears to be “more developed in species that are more closely related to humans.”
These two pathways correspond to the expressive and analytic functions of speech.
Length of phonation
Phonation is the process of generating sound by passing air from the lungs across the vocal cords.
The length of phonation is the time that a given species can phonate in a single breath. Humans
have a much greater length of phonation than do the other apes. This is due to changes in the lungs
and the larynx. The chimpanzee lung has a volume that is consistent with the amount of oxygen
that can be absorbed in a breath. In contrast, the human lung holds four times the volume. This
greater capacity allows for longer phonation with a single breath.10 In addition, the human larynx
is narrower than the trachea (wind pipe), meaning that the vocal cords are much smaller than those
of the chimpanzee. Because it takes less air for the human larynx to phonate, the length of
phonation is greater than that of the chimpanzee. These two changes account for the increased
length of phonation in humans.11
Vowel production
The hominid vocal tract is capable of producing syllables. Syllables consist of a vocalic nucleus
(vowel) as described above. Vowels are produced by the vocal tract which acts as an acoustic
filter which shapes sounds produced by the larynx. The modern human vocal tract consists of
three acoustic filters, the pharynx, the oral cavity and the nasal cavity.
9
It is not clear from Ploog whether these two pathways are part of the arcuate fasciculus or are other pathways.
10
The greater lung capacity also allows one to hold one’s breath, another requirement of longer phonation.
11
With respect to breathing and gathering an adequate oxygen supply, the narrowing of the larynx is actually
maladaptive.
An acoustic filter acts to reinforce certain
frequencies and to suppress (filter out) others.
This ability is a function of the length of the
filter and whether it is closed or open at one or
both ends. The reinforced frequencies are
known as formants. While there are many
formants produced for any given acoustic
filter, the two formants with the lowest
frequencies are sufficient to identify the
vowel. For example, a schwa, the vowel
sound in the word but, is produced by an
unobstructed vocal tract. The same result can be
achieved using an open tube 17 cm long (the length
of the human vocal tract) and a sound source.
While the larynx provides a fundamental frequency,
it also produces overtones, also known as harmonics
and it is these overtones that the filter shapes into
formants.
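The dependence of formants on tube length can be illustrated with the standard textbook idealization of a uniform tube closed at one end (the larynx) and open at the other (the lips), whose resonances fall at odd multiples of c/4L. This idealization, the speed of sound of 350 m/s and the function name tube_formants are assumptions of the sketch and are not taken from the sources cited here.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength relation F_n = (2n - 1) * c / (4 * L).
    Both the closed/open idealization and c = 350 m/s are assumptions.
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]

# A 17 cm tube, the vocal tract length given in the text.
formants = tube_formants(0.17)
for number, frequency in enumerate(formants, start=1):
    print(f"F{number}: {frequency:.0f} Hz")
```

Under these assumptions the two lowest resonances of a 17 cm tube lie near 500 Hz and 1500 Hz, an evenly spaced pattern usually associated with a schwa-like vowel; a shorter or longer tube shifts all resonances proportionally.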
There are two competing configurations of a 17-centimeter-long
acoustic filter: (1) the single resonator and (2) the
coupled, double resonator. The single resonator has the
capability of producing a schwa [ǝ] and an [u]-like vowel.
The schwa, as mentioned above, involves an unattenuated
tube roughly 17 centimeters long, while the [u] involves
only the oral cavity and lip rounding, which both lengthens
and closes off the tube.12 Interestingly, Hoover, a harbor
seal, could also produce two vowels that resembled [ǝ] and [u], which likewise reflects the
capabilities of a 17-centimeter-long resonator.
The double resonator is capable of producing the cardinal vowels [i] (as in he), [a] (as in ha), and
[u] (as in who).13 In adult humans, these vowels are easily articulated with gross movements of the
tongue and lips. The [a] involves a closed pharynx and an open oral cavity. The [i] is the reverse
with an open pharynx and a closed oral cavity. The articulation of the [u] involves an open back
half of the oral cavity and a closed front half which is achieved with lip rounding. The pharynx
contains a duplicate of the oral cavity. These configurations produce formants which are
acoustically maximally distinct as shown in the
sidebar.
12
I need some study to support this assertion.
13
Actually the human vocal tract consists of three coupled resonators, because the nasal cavity is also an important
component of this system as is explained below.
This arrangement allows simple movements of the tongue to produce vowels with
substantially different formants. For [i] the first formant is low and the second is high; for [a] they are both
midrange; and for [u] they are both low. These three vowels, known as the cardinal vowels,
represent the maximally distinct vowels of human language, meaning that the other vowels of
human language, both acoustically and articulatorily, fall within this range. The vast majority of
human languages also have the contrasts [e] (day) and [o] (go). They are formed by the tongue
being not so high in the oral cavity for [e] and not so far back for [o]. Many languages have more;
English, for example, has 11 vowel contrasts. With the single-tube oral cavity, only two vowels are
possible, while with the coupled, two-tube vocal tract, a much larger vowel inventory is possible.
In humans, the pharynx is about the same
length as that of the oral cavity. In
chimpanzees, the pharynx is much smaller.
During the course of human evolution, the
pharynx lengthened by the lowering of the
larynx while the oral cavity shortened
producing a two-tube filter in which the two
tubes are of equal length.14 Wind (1983)
suggested that the original creation of the
pharynx as a second resonator occurred as a
consequence of upright posture because upright posture resulted in a different positioning of the
head on the body.
The vowel [ǝ] can be produced by both configurations, while the articulation of [a] requires the
coupled resonator configuration. Although [ǝ] can be produced by the coupled, two tube
resonator, it is less commonly used in human languages, presumably because of its acoustic
similarity to [a].
When the acoustic properties of the human two-tube vocal tract were first described, it was thought that this
configuration (a coupled resonator of two tubes of equal length) was essential for the articulation
of human vowels. With this assumption, Lieberman and Crelin (1971) concluded that because
Neanderthals had a relatively short pharynx, they could not have produced the vowels of modern
human languages. More recent research (Boe et al 2002) has shown that Neanderthals were
capable of producing these vowels, albeit with much greater difficulty than humans. De Boer
(2006) also shows that earlier hominini could produce several vowels, but probably not the
maximally distinct cardinal vowels, which require a coupled resonator configuration.
This evidence suggests that we need to distinguish between a modern human vocal tract in which
the coupled resonator is genetically fixed, and earlier versions in which the features of a double
resonator can be approximated. Given the greater ease of articulation of the genetically fixed
double resonator, natural selection would have favored it in sign users. The
14
These two tubes do not have to be of equal length to produce the cardinal vowels, but it is clear that the process of
vocal tract change ended when the two tubes attained equal length. That is, there was no point in further lengthening the
pharynx.
facility of the fixed configuration would also explain the shift to vocality. It is also possible that
the two-tube resonator was better suited to the articulation of consonants. Also, because of the
similarity between [a] and [ǝ] modern vowel systems tend not to use the [ǝ], though English is an
exception.15
The vibrating larynx produces the sound source for the vocal tract. In contrast to the other apes,
the human larynx has an opening that is narrower than the trachea (wind pipe), resulting in shorter
vocal cords. Shorter vocal cords require less energy to phonate than longer ones. While this
development can be seen as positive with respect to phonation, it does make it more difficult to
inhale air for breathing. In addition to being shorter, the vocal cords, which vibrate to produce
sound (phonation), are thinner than those of other apes, and this development too increased the
length of phonation. The lung, which powers the larynx, increased in volume but not in the ability
to absorb oxygen. The additional capacity would allow for longer phonation. Both of these
developments increased the length of time of phonation.
Contrastive nasal sounds
As mentioned above, the human vocal apparatus has a third resonator, the nasal cavity. Unlike the
oral-pharyngeal resonators, the nasal cavity cannot be changed in length, but it can be opened and
closed using the velum at the back of the oral cavity, something that chimpanzees cannot do.
When the velum is open, there is an additional nasal formant at about 250 Hz, which gives both
consonants and vowels a nasal quality. When the velum is closed, vowels and consonants are oral.
The development of the ability to control the velum is a precondition to the development of oral
consonants which form the basis of the complex syllable.
Oral Consonants
Oral consonants are produced with the velum closed and by the interference of the flow of air in
the oral resonator, so much so that it interrupts the filtering capacity of the resonator and hence the
syllabic character of the sound. This interruption may be complete in the case of the stops
(typically p, t and k and b, d and g), or partial in the case of the fricatives (typically f, s, š and x,
and v, z, ž and ɣ). Voiced consonants are produced with phonation, while voiceless consonants are
produced without phonation. The nasal consonants are produced like the voiced oral consonants,
but with the velum open. Consonants can be added to the beginning of the syllable (onset) or end
(coda), producing a complex syllable.16
Chimpanzees do not produce consonants and attempts to teach them to make these sounds have
been unsuccessful. In the early 1950s, Keith and Catherine Hayes (1951) began an experiment to
attempt to teach a chimpanzee named Vicki to produce syllables. After considerable training and
15
Not everyone supports the importance of the double resonator in the development of human orality. For example,
Ohala (2000), a phonetician, claims that the development of the two-tube vocal tract “is independent of and thus
irrelevant to the evolution of speech.”
16
In addition, an onset or coda can contain a liquid (l and r) or a glide (y and w), which are not true consonants. This
is because their obstruction of the vocal resonator is not enough to disturb the filtering effect.
coaxing, Vicki was able to say the words “mama,” “papa” and “cup” (phonetically [mamə], [papə] and
[kəp]).
Vicki had a number of problems articulating these words. First, her vowels were whispered
[voiceless], reflecting her difficulty in controlling her larynx. Second, Vicki could not produce a
velic closure, which meant that all sounds would have a nasal coloring. To produce a nonnasal
sound, she had to cover her nose manually to block the passage of air through the nasal resonator.
Third, to produce the bilabial sounds [m] and [p] Vicki had to manually close her lips. This means
that the only consonant that Vicki could produce was [k], and this only with her nose manually
covered. Without the manual blocking of the nasal resonator, she would have a sound much like
the modern [ŋ] as in sing.17 This view is supported by the higher frequency of velars in most
lexical systems. Thus one may conclude that alveolars and labials were added later.18
Summary of similarities and differences
Communicative similarities and differences between humans and other apes

Domain: Physiological (the vocal tract)
Similarities: A vocal apparatus capable of producing noncomplex syllables and the vowels [ǝ] and
[u]. This apparatus includes the lungs which can power the larynx to produce sound that can be
filtered by the oral resonator.
Differences: The larynx became a more efficient phonator; a genetically fixed coupled resonator
evolved; the lungs increased in volume; the tongue became more agile; velic closure allowed purely
oral sounds. These developments led to the ability to produce complex syllables.

Domain: Physiological (the brain)
Similarities: Have common modules in the brain attributable to language (Broca’s area, Wernicke’s
area and the arcuate fasciculus).19
Differences: Increased ability to control vocal articulations (Broca’s area); increased word
storage capacity (Wernicke’s); increased ability to send messages from Wernicke’s area to Broca’s
area (arcuate fasciculus).

Domain: Behavioral
Similarities: Can learn referential signs; a word memory of at least 250 words. Can use referential
signs intentionally. Can recognize gestural signifiers and some can recognize vocal signifiers.
Possess the ability to use analytic sentences.
Differences: An increased urge to communicate phatically.

Domain: Semiological
Similarities: Uses all types of gestures for signaling.
Differences: Relies predominantly on the vocal mode. The appearance of the referential sign. The
phonemic principle. A vocally based representation system using complex syllables to signify
referential signs.
17
This observation conflicts with Lieberman’s claim that Neanderthals could have articulated neither the
cardinal vowels nor the velar consonants [k] and [g] (reported by Jurmain 1997).
18
A film clip of Vicki producing these words appears in the Nova program entitled, “First Signs of Washoe”.
19
Although these modules were not developed sufficiently to produce complex syllables, they have the neural
hardware to comprehend and to produce analytic signs as well as to produce voiced syllables.
These changes should be understood not as autonomous developments but as dialectically intertwined
ones, linked not only to one another but also to other developments, as I show below.
4. The evolution of the referential sign
The preceding section described the similarities and differences between the communication of
chimpanzees and humans. The similarities are important because they help to define the point of
divergence. The differences indicate the changes that the homo lineage underwent.
The source of the referential sign
Cassirer’s discussion suggests that signals and referential signs are very different types of signs.
Burling (1993:132) argues that signals “are too narrowly constrained by biology to be converted
into learned and conventionalized signals” by which he means referential signs. But if referential
signs are not derived from signals how did they arise? Given that other apes have no difficulty in
learning and using referential signs, it is clear that our ancestors were capable of learning to use
signs. But how is it that our ancestors had the ability to learn these signs when such signs were not
part of their environment?
Natural signs
Semiological signs are similar to what I call natural signs. All the apes, and I suspect most
mammals have the ability to interpret things and events in their environment for indications of
danger, safety, food and other needs. Natural signs are not simply responses to environmental
stimuli, but are conceptualizations about the environment. For example, a predator like a leopard
will appear in many different shapes and contexts. Nevertheless, the observer recognizes all these
manifestations as the same concept. Having done so, the observer can consider what the
appearance of this predator means and what needs to be done about it.
The difference between natural signs and semiological signs is not in their conceptualization but in
their issuance. While users have no control over the issuance of natural signs, they do with
semiological signs. They can decide which sign to send or not to send a sign at all. This is an
ability possessed by all hominoids.
[Marginal note: It does not seem clear from the discussions that humans were preadapted.]
It is clear from this discussion that proto humans were preadapted for learning referential signs.
For this reason, we can rule out a genetic change as the
source of the referential sign.
An alternative scenario
As an alternative, let me offer the following scenario. First, bipedal locomotion (and upright
posture) was the first major development after the hominini separated from the pan.20 Bipedal
locomotion enabled the carrying of infants, of food from its source, and of tools.
[Sidebar: Lock and Bogan (2005:5) note that “the obstetrical dilemma was eased when some amount of
skull and brain growth---and motor development---were adaptively deferred into the postnatal
period, increasing infant dependency and the need of postnatal care.”]
20
The normal gait of the chimpanzees is a knuckle walk which involves an almost upright posture with support from
the arms and fisted hands. Chimpanzees can walk bipedally although unlike Ardipithecus ramidus and A. afarensis,
The capacity to carry infants means that the infants do not have to hold on as do chimpanzees and
bonobos and this has several consequences.
Carrying means that infants can be born with greater dependence on their parents. Compared to
the chimpanzee and bonobo, the human infant is less able to care for itself and cannot hold on to
its mother. Thus carrying means that the infant can continue to gestate ex utero. Given the
limitations imposed by the diameter of the birth canal, this development would allow the head and
more importantly the brain to continue to grow after birth. Carrying may also be associated with
loss of body hair since infants no longer need to hang on, a task that would be made more
difficult by upright posture.
Carrying food enables more permanent settlements and involves the sharing of food, and possibly
division of labor.
Carrying can also lead to more sophisticated tool manufacture. A tool maker is confronted with the
question of what to do with the tool after making it. If I plan to leave it at the use site, then it does
not make sense to put too much time and effort into its manufacture. However, if I plan to carry it
with me, it does. In fact, carrying also allows for tool specialization.21
Neither the bipedal Ardipithecus ramidus (-4.4 MY) nor Australopithecus afarensis (-3 MY) has
been found with manufactured tools, although such tools may have been constructed from wood
and other perishable materials and hence not detected.
Tool use
For this reason, I propose the hypothesis that tool use
may have played an important role in this development.
While it has been suggested that language may have been
needed to instruct others in tool making, other evidence
suggests that learning to make the kind of tools made in
the lower paleolithic by Homo habilis and Homo
ergaster could be done through imitation (Hewes 1993,
Ambrose 2001, Widgen 2004). The earliest
manufactured stone tools, known as the “Oldowan tradition,” date from 2.5 million years ago.
Examples in the sidebar from left to right, include an end chopper, a heavy-duty scraper, a hammer
stone, a flake chopper; a bone point, and a horn core tool or digger.22
In contrast to the view that symbolic interaction was a prerequisite for tool manufacture, I propose
the opposite, that tools are the exaptative predecessor of words. This is because, as a manufactured
item, a tool can be proffered as the signifier for a referential sign. We think of tools as something
to do something, but a tool can be proffered too. For example, when I see someone brandishing a
spear, I ask what intent that person has. Could it mean: (1) “let’s go hunting;” (2) “let’s make
tools;” (3) “go away or I will hurt you with my tool;” or any number of other things?23 When
proffered, a tool is interpreted in much the same way as a sign: what is the intention of the
profferer?24 At this point, the spear, with its interpretation, becomes an incipient referential
sign. For an incipient referential sign to become a true referential sign, the meaning needs to
become fixed by convention.
20 (continued)
their anatomy does not show any major adaptations toward bipedalism.
21
Bipedal locomotion, as Armstrong et al (1994) point out, would have freed the hands for signing as well.
22
http://www.handprint.com/LS/ANC/stones.html
Property                      Tools       Signs
Manufactured objects          Yes         Yes
Involve intentionality        Yes         Yes
Intentionally produced        Yes         Yes
Capable of bearing meaning    Yes         Yes
Referential statements        Potential   Yes
While tools do appear in other species, early human tools show a greater variety, both in form and
in function. In this way, tools are likely to have provided the stimulus for developing referential
signs.25
The capacity to recognize referential signs is widespread among other species. Dogs are capable of
recognizing their own names and the names of other objects. Chimpanzees, bonobos and gorillas have
learned to recognize the gestures of American Sign Language and many have been shown to recognize
human vocal words.
This scenario explains the evolution of referential signs as the consequence of exaptation and not
the result of a genetic change. To be sure, modern humans have developed a greater ability to
recognize and use referential signs than the great apes and this does represent a genetic change.
But as argued in chapter 02, this change we view ge[...]
Although socialized chimpanzees and bonobos
have the ability to learn analytic signs, no evidence of such usage has been found in their free
ranging kin. The conclusion that analytic signs are a human development raises the question of
when and how. First of all, we can rule out a genetic change, for it is clear that chimpanzees have
no difficulty in learning and using analytic signs. If the development of analytic signs was not due
to a genetic development, then we have to look for some conceptual development.
Tools, Broca’s area and sign production
Greenfield (1991), Ambrose (2001) and others have noted another important connection between
tool use and referential signs. The fine muscle control needed to manufacture tools is located in
Broca’s area. Broca’s area is better known as one of the two major modules of the brain most
closely connected with language, Wernicke’s area being the other. Specifically, Broca’s area is
associated with the articulation of words and the production of syntax. Thus the increased use and
manufacture of tools could have placed increased pressure on the development of Broca’s area, a
development that could well have been an exaptation not only for the development of the vocal
apparatus but subsequently for syntax.
23
Needless to say, the richer the tool kit, the richer the sign system.
24
George Herbert Mead (1934) proposes that the assignment of meaning to an object involves the imaginary
completion of the signing act. Compare Berger and Luckmann’s (1967) description of a case in which someone has
thrown a knife at a person and it sticks in the bedstead above him: after the thrower has fled, the presence of the
knife continues to remind one of the thrower’s intent.
25
This argument is similar to the Tool-Cue Model of Byers (1999), who proposed that tools have the potential to
serve as icons.
Paratax
The term paratax is similar, but not identical, to Bickerton’s (1990) protolanguage. Bickerton
sees protolanguage as an incipient form of syntax and includes in it, in addition to the sentences
of signing apes and two-year-old humans, early versions of pidgins. The account of paratax (Dwyer
1986 and 2009), by contrast, concludes from the formal properties displayed by the sentences of
signing apes and young humans that these do not represent syntactic grammars, and for this reason
excludes incipient pidgins. Almost as soon as humans and other apes learn referential signs they produce
paratactic sentences. The first such sentences in all populations are deictic paratax (Greenfield et
al 2008), meaning that one of the words in the paratactic sentence is a referential sign and the other
is a pointing gesture. Deictic paratax allows the individual to intentionally draw the other’s
attention to an entity identified by the deictic gesture and comment on it using the referential sign.
Deictic paratax is subsequently replaced by full paratax involving two referential signs. While full
paratax involves the same topic and comment as deictic paratax, it has the advantage of including
topics that are not physically present. This is because the deictic gesture must point to something
physically present.
The development of signifiers
Although protohumans had the capacity to learn about 250 referential signs, these signs still have
to be developed and this means finding new signifiers to attach to concepts (signifieds). At this
time, as several authors have suggested,26 these referential signs would have been represented by a
variety of channels including manual, facial, and vocal gestures. The development of new
signifiers is not an easy process, and at this early stage vocabulary would have grown slowly. Each
new sign would require a new signifying gesture from one of these channels until vocality produced
a much easier mechanism for generating signifiers. During this period it is likely that natural
selection favored the ability to learn words, an interest in communicating with others and the
ability to generate new signifiers.
The transition to vocality
The transition to vocality based signifiers involved four interrelated developments described in the
section on similarities and differences. In this section I review what these developments were.
Using the source-filter model, we see changes in the source, the filter and the attenuators.
Changes in the vocal tract
26
Armstrong, 1999; Armstrong, Stokoe & Wilcox, 1995; Campbell 2000, Corballis, 1992, 2002; Givòn, 1995; Hewes,
1973, 1996; Rizzolatti & Arbib, 1998.
Changes in the source involved changes in the larynx which became smaller. This allowed for
longer utterances and the possibility of a sequence of several syllables. This would allow the
production of a totally vocal paratactic sentence.
Changes in the attenuators involved velic control and modifications to the tongue and lips. Velic
control enabled the coupling and decoupling of the nasal resonator. The decoupling of the nasal
resonator through the closure of the velum enabled the articulation of purely oral sounds which is a
precondition for the articulation of true consonants. The tongue is essential for modifying the
resonators to produce different vowels as described above. In addition the tongue can interfere
with the oral resonator to inhibit resonance and produce dental and velar consonants, either stops
or fricatives. The rounding of the lips alters the acoustic properties of the oral resonator by
increasing its length and by effectively closing it off at one end. In addition the lips can, like the
tongue, inhibit resonance to produce labial stops and fricatives.
Changes in the filter involved the development of the coupled, triple resonator. Velic control
enabled the coupling and decoupling of the nasal resonator. The descent of the larynx involved the
emergence of the pharynx. The configuration of this triple resonator allowed the easy articulation
of the cardinal vowels as well as the intermediate vowels [e] and [o], and [ɛ] and [ɔ]. This
configuration also made it easier to attenuate the oral cavity for the articulation of consonants.
The development of the complex syllable with true consonants
The development of the complex syllable involved the attachment of consonants to either end of
the syllable’s vocalic nucleus, though all hominini show no ability to produce true consonants
owing to the lack of a velic closure (which separates oral from nasal sounds).
At this point, let me speculate. Given that Vicki was capable of producing a velar closure, the
control of the velic closure would create the possibility of a [k/ŋ] distinction. And given that
chimpanzees are capable of glide (nonsyllabic vowel) onsets, as in the waa bark, complex syllables
beginning with [w] could develop. With the development of the vowel [i], the glide [y] would also
emerge. These developments would produce 12 different syllables as potential signifiers.
ki ka ku
ŋi ŋa ŋu
wi wa wu
yi ya yu
The development of words with complex phonological signifiers would not have eliminated
gestural signifiers; in fact, they exist today. However, as the vocal apparatus became more adept at
producing complex syllables with more contrast, it would have become the primary mechanism for
representing referential signs.
With a longer period of phonation, two syllable utterances became possible including two-syllable
words and paratactic (analytic) sentences. New words could have arisen from reinterpreting a
paratactic sentence as a word, a common human process known as lexification. Washoe has
coined several lexified two word sentences including dirty monkey – to describe a monkey she
didn’t like and candy-drink to describe a watermelon. From this process, polymorphemic,
polysyllabic words arise. Almost all words in modern language with two or more syllables are
morphologically complex or can be discovered to have been so, even though they are no longer
transparent to their users and are taken to be morphologically simple.
Most languages have what have come to be called cranberry morphemes. Such morphemes
combine with recognizable morphemes to produce a new word. We know that a cranberry is a
berry even though we do not know what a cran is. The same can be said for the rasp in raspberry.
Such morphemes arise because although the free form of the word has fallen out of usage, the
morpheme survives in compounds. Cranberry morphemes are of interest here because they show
that strings of syllables can have representational, as opposed to referential, value and can be used
to distinguish one word from another. Thus in time it is possible that these polymorphemic words
would have lost their transparency and would be taken to be monomorphemic.
Subsequent consonantal developments would include:
Additional points of articulation: a labial stop creating a [p/m] contrast; an alveolar
stop creating a [t/n] contrast; and a palatal stop creating a [č/ñ] contrast. These
developments would allow 18 more complex syllables.27 These oral consonants
are called stops.
pi pa pu
mi ma mu
ti ta tu
ni na nu
či ča ču
ñi ña ñu
The consonants [p], [t] and [č] are voiceless. The term voiceless means that unlike vowels, they
are produced without the vocal cords vibrating. In the production of vowels, the vocal cords are
positioned to allow spontaneous voicing so that when air from the lungs is passed through the
larynx, the vocal cords vibrate spontaneously. However, because the articulation of consonants
impedes this flow of air the vocal cords stop vibrating and this is why these consonants are
voiceless.
It is possible to adjust the vocal cords so that they continue to vibrate even though the oral
resonator is occluded. When this is done, the voiced counterparts of the consonants described above
are produced. This produces an opposition between [p] and [b]; [t] and [d]; [č] and [j]; and [k]
and [g]. This would add another 12 complex syllables to the inventory.
bi ba bu
di da du
ji ja ju
gi ga gu
It is also possible to interfere with spontaneous voicing with only a partial blocking of the oral
cavity. This partial blocking creates local turbulence at the point of articulation. This turbulence
creates a noise that distinguishes the sound from the stop, thus distinguishing a [p] from an [f]; a
[t] from an [s]; a [č] from an [š]; and a [k] from an [x] in the voiceless series, and a [b] from a
[v]; a [d] from a [z]; a [j] from a [ž]; and a [g] from an [ɣ] in the voiced series, adding 24 more
contrastive syllables. These consonants are called fricatives.
The examples here use open syllables, because closed syllables are not as common in the world’s
languages, and probably developed at a later time. For example, it is often the case that closed
syllables are derived from a sequence of two open syllables in which the final vowel is lost: CVCV
→ CVC.
27
The palatal point of articulation is less common than the others (labial, alveolar and velar) and it is quite likely that
palatals were not part of early complex syllables. This observation is supported by the fact that palatal consonants
commonly arise when an alveolar or velar consonant precedes the vowel [i] and sometimes [u], a process known as
palatalization.
There are of course many other developments, such as the development of the liquids [l] and [r]
and consonant clusters (CCV), which further increase the inventory of contrastive syllables.
Although not all 66 syllables would have developed with the onset of the complex syllable, this
exercise does show the potential of the vocal apparatus to produce signifiers. With 66 contrasts in
a single syllable, a two syllable sequence would have 4356 contrasts or potential signifiers.
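The arithmetic behind these totals can be checked with a short sketch. The Python below is an illustration only and not part of the original argument: it assumes, as in the syllable tables above, that every onset combines freely with the three vowels [i], [a] and [u]. The stage labels and variable names are hypothetical conveniences introduced here.

```python
from itertools import product

# Vowels used in the syllable tables of this section.
VOWELS = ["i", "a", "u"]

# Onset consonants grouped by the developmental stage proposed in the text.
# The stage labels are illustrative names, not terms from the paper.
ONSET_STAGES = {
    "velar_and_glides": ["k", "ŋ", "w", "y"],
    "new_places_of_articulation": ["p", "m", "t", "n", "č", "ñ"],
    "voiced_stops": ["b", "d", "j", "g"],
    "fricatives": ["f", "s", "š", "x", "v", "z", "ž", "ɣ"],
}


def open_syllables(onsets, vowels=VOWELS):
    """Return every consonant-vowel (CV) syllable for the given onsets."""
    return [c + v for c, v in product(onsets, vowels)]


# Number of new CV syllables contributed by each stage.
stage_counts = {
    name: len(open_syllables(onsets)) for name, onsets in ONSET_STAGES.items()
}

# Full one-syllable inventory and the number of two-syllable sequences.
all_syllables = [
    s for onsets in ONSET_STAGES.values() for s in open_syllables(onsets)
]
one_syllable_total = len(all_syllables)
two_syllable_total = one_syllable_total ** 2

print(stage_counts)
print(one_syllable_total, two_syllable_total)
```

Under these assumptions the four stages contribute 12, 18, 12 and 24 syllables, which sum to the 66 single-syllable contrasts and the 66 × 66 = 4356 two-syllable sequences cited above.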
Changes in the brain
As the representational system expands, greater pressure is likewise imposed on the
communicative channel, in this case the auditory one including the vocal tract.
When did referential signs appear?
While it is impossible to directly determine when referential signs emerged, it is possible to infer
that their use is correlated with the development of the vocal tract and the language areas of the
brain. The evidence for these physical developments leads de Boer (to appear) and others to conclude
“that adaptations for speech must have first appeared in the common ancestor of Homo
sapiens and Homo neanderthalensis and that complex speech must have had at least 400 000 years
to evolve.” The evidence de Boer cites includes developments in Homo ergaster (1 MY-200 KY):
enlargement of Broca’s area, though possibly as early as H. habilis (2-1MY); oral cavities that are
either within or close to human range; and enlargement of the hypoglossal canal, though smaller
than that of modern humans. The development of the pharynx through the lowering of the larynx
to modern human dimensions and modern hearing abilities, however, was detected only in early
Homo sapiens.
The date of 400,000 BP is considerably before the date posited for syntax, be it 50,000 BP for the
Eurocentric hypothesis or 100,000 BP for the Afrocentric hypothesis. These conclusions are
consistent with Jackendoff’s (1999) two stage model. The evidence for changes in the brain and
vocal tract in H. ergaster means that natural selection favoring these changes was in progress. And
the fact that these changes were in progress strongly suggests that H. ergaster was using referential
signs. Although the fossil evidence cannot show most of the physiological changes, let alone the
behavioral and semiological changes, it is highly likely that most of these took place in H. ergaster
and that these changes were in place when Homo sapiens appeared on the scene.
The diversification of referential signs
With the discovery that things can be named comes the discovery that everything can have a name.28
The words learned by three signing apes include things and actions that are of relevance to them.
In appendix 2, I have listed all the words learned (279) by the signing apes: Nim (Terrace 1979);
Koko (Patterson 1981); and Washoe (Gardner and Gardner 1989).29 I have broken them down into
categories so that the reader may see the types of words that are of relevance to these apes. The
range of concepts is not restricted to simply the names of things and actions, though that is an
important category, but extends to social relationships (including pronouns, games and people),
emotions, senses, properties (often modifiers), times and locations. Strikingly absent from this are
words associated with social institutions such as roles (mother, father, teacher) and institutional
names (such as school), though they may not have come up in the environment in which these beings
were raised. Also absent are words indicating thought processes (I think, I expect), imagined events
(future, other realities), and indirectly perceived entities (God and gravity). While these missing
categories may be accidental in that the concepts behind these signs did not appear in the learner’s
environment, it is far more likely that these signs were incomprehensible to these apes and hence
unlearnable. Kanzi, as reported by Savage-Rumbaugh et al. (n.d.), was one of the few apes (a
bonobo) who learned to use the time words today and tomorrow. Although Kanzi did master
these time concepts they appear to be incomprehensible to other apes.
28
This is the discovery that Helen Keller made when she discovered that “everything has a name,” that her world was
populated by referential signs and not just signals.
5. Summary of the progression
The beginning of the referential sign can be traced to the development of upright posture. This
resulted in two exaptations. One consequence was the development of the pharynx and the three
resonator system. The other was the development of sophisticated tools which served as the model
for the referential sign.
Once the concept of the referential sign developed, parataxis was possible, mostly deictic parataxis
at first. There was also pressure to produce more signifiers to enable more referential signs. This
resulted in the development of vocality which involved changes in the vocal tract, the brain and the
development of the complex syllable including the phonemic principle.
There are several consequences of the development of the referential sign which need to be
explored in other papers. One paper addresses how symbolic interaction using referential signs
leads to a more complex self. Another makes it clear that paratax is not a primitive form of syntax.
Another However, in closing, I want to be clear that Paratax
The following sequence summarizes the arguments advanced in this paper and the conclusion that
the process of becoming human involved the interaction of several areas, the body, the mind,
language and culture.
1) tools30 → analytic signs and increased fine muscle development → increased articulatory
control and exaptation for syntax.
2) analytic signs → symbolic interaction (paratactic)
3) pre-syntactic symbolic interaction → the complex self;
a) symbolic interaction enables intersubjectivity (the awareness of common knowledge),
interest in what the other knows, differential knowledge, connecting time and space,
labeling abstract concepts, negotiating, cross-modal perception.
4) analytic signs (pre-syntactic) → syntactic signs (simple and nested)
5) syntactic signs → syntactic symbolic interaction
6) syntactic symbolic interaction and the complex self → institutions (culture)
29
The letters following each word indicate which apes acquired the sign.
30
Note that italicized words are exaptations.
Appendix 1: Keita’s referential and expressive statements
(from Keita 1997:387)
A referential statement
- [Has a] function-argument schema, such as ‘action (agent, patient)’, and ‘motion (theme)’.
- [Is] amodal in that its format of information is not specific to any cognitive modality (e.g.,
vision, olfaction, kinesthesis, etc.).
- [Is] decontextualized in the sense that it is removed from subjective experience.
- [Is] about a certain experience, but not a rendition of an experience itself. Thus, one may
conceptualize unpleasantness in the referential dimension without actually feeling any
unpleasantness.

An expressive statement
- [represents] different facets of an experience... These include the affective, emotive and
perceptual activation in an experience but do not include the rational construal of it based on
such things as agentivity and causality.
- Iconicity is an important architectural principle in this dimension, and thus various facets of
an experience do not stand in syntagmatic relationships. Rather, they are merely spatiotemporally
contiguous.
- [presents] various kinds of information from different cognitive modalities remain
modality-specific, creating the subjective effect of evoking an image or “re-experience.”
- [is] what some authors (Jakobson 1956; Lyons 1977) called the “expressive function” of language
is subsumed in the affecto-imagistic dimension.
- [consists] of different units and architectural principles of representation.