Explaining Colour Term Typology - Linguistics and English Language

Explaining Colour Term Typology
 Basic colour terms are colour words which:
- are highly salient
- have extensions that aren’t included within those of any other
colour term
So in English the basic colour terms are red, orange, yellow,
green, blue, purple, pink, brown, grey, black and white.
There are many more colour words that aren’t basic, e.g.
crimson, navy, chartreuse, magenta and turquoise.
 What needs explaining?
- Prototype properties
- The range of colours denoted by each colour word varies
between languages.
- As does the number of basic colour terms, from 2 to 11 (or
maybe 12).
- But there are cross-linguistic regularities in colour term
systems.
Prototype Categories
Colour words are prototype categories.
 Some colours are better examples of the word red than others
are.
We can say:
'a good red'

'sort of red'

'slightly red'
 There is usually a single colour which is the best example of the
colour word (the prototype).
 The further a colour is from the prototype the less good it is as
an example of the colour category.
 Colour categories have fuzzy boundaries.

It's not clear exactly which colours are members of the
category.

Some colours are marginal members.
Colour Term Typology
Berlin and Kay (1969) found clear cross-linguistic regularities in
colour term systems.
They proposed that languages evolved from having only 2 basic
colour terms, and gradually added more over time until they
reached a ceiling of a maximum of 11 basic terms. (Languages
never lose basic colour terms.)
 The prototypes of basic terms from all languages fall into
discrete clusters.
 People are very consistent in their choice of prototype (but not
in where they place boundary colours).
[Figure: Berlin and Kay's (1969) sequence in which basic colour terms are added: white, black -> red -> green or yellow -> green and yellow -> blue -> brown -> purple, pink, orange, grey]
Kay and Maffi (1999)
- Their results are based on the World Colour Survey, which has
collected data from 110 languages.
- There are six foci (black, white, red, yellow, green and blue),
which are the prototypes of most colour terms in most
languages.
- Languages almost always partition the colour space so that
each colour will be named by one basic colour term –
exceptions to this are rare but do exist.
white-red-yellow + black-green-blue
white + red-yellow + black-green-blue
white + red-yellow + black + green-blue
white + red + yellow + black + green-blue
white + red + yellow + black + green + blue
white + red + yellow + black-green-blue
white + red + yellow + green + black-blue
white + red + yellow-green-blue + black
white + red + yellow-green + blue + black
- But there are exceptions to this evolutionary order.
MacLaury (1997)
- In languages with composite green-blue, the focus is sometimes in
green, sometimes in blue, and sometimes in either green or blue.
- It’s common for purple, pink, orange and brown to emerge
before the green-blue term has split into green and blue;
sometimes this even happens before yellow-red splits.
- The order of emergence of derived categories tends to be
purple, brown, pink, grey then orange, but there’s lots of
variation in this order.
- Co-extension – sometimes two colour terms overlap, covering
more or less the same range of colours, but with foci in
different places.
- Different speakers of the same language can be at different
evolutionary stages – and in some languages most speakers
don’t fit into any evolutionary classification.
- Some colour systems are based mainly on lightness, while
most are based more on hue.
Explaining Language Typology
What causes this kind of typological pattern?
- Individual psychology.
[Figure: Chomsky’s Language Acquisition Device. Primary Linguistic Data -> Language Acquisition Device -> Individual's Knowledge of Language]
- Or an interaction of psychological and social processes.
[Figure: Hurford's Diachronic Spiral. Primary Linguistic Data -> Language Acquisition Device -> Individual's Knowledge of Language -> Arena of Language Use -> Primary Linguistic Data]
 Typological patterns may be the result of evolutionary biases
over a number of generations, not simply a restriction on the
kinds of language which are learnable.
Explanations of Colour Term Typology
Is colour term typology due to innate learning biases/UG which
restrict the kinds of colour term systems which can be learned?
Or are there just evolutionary pressures which tend to favour the
development of some kinds of colour categories over a number of
generations?
How to test this:
(1) Create a computational model of colour term acquisition.
- Can this model learn attested colour term systems but not
unattested ones?
(2) Create multiple copies of the model.
- Each copy will be an artificial person.
- They can talk to each other over several generations, and learn
from one another. (Every so often an older person will die and
be replaced by a new person who doesn’t know any
language.)
- Do the languages which emerge in these simulations have the
same properties as real languages?
Computational Evolutionary Linguistics:
Expression-Induction Models
- These models should be distinguished from models of
biological evolution.
- They simulate historical/diachronic/cultural/glossogenetic
change.
There have been several such models of syntax.
Kirby (1999):
- An individual tries to express a set of meanings.
If they have constructions in their language that allow a meaning
to be expressed then they use that.
Otherwise they just make up a new language form.
- A new learner then takes this set of utterances and tries to
learn the underlying language.
- The learner then becomes the speaker, and tries to express a
new set of meanings to a new learner.
This process is repeated over many generations, and after a time
a compositional syntactic system emerges.
Kirby used phrase structure rules and logical predicates and
symbols to represent language, but Batali (1998) produced
similar results using neural nets.
Tony Belpaeme (2002):
Factors influencing the origins of colour categories
A previous expression-induction model of colour term evolution.
 A population of agents was created, each of which could learn
colour categories using an adaptive network (a type of neural
network).
 The agents try to communicate by attaching words to the
colour categories – they try to use a word that discriminates a
target colour from its context.
- If communication is successful then the hearer will strengthen
the association between the target colour and the colour word
used.
- If communication is unsuccessful, then the speaker will tell
the hearer the correct colour.
 Periodically an agent will die and be replaced by a new agent,
so as to simulate evolution over several generations.
- Over time a coherent language evolves.
- But the languages don’t conform to typological restrictions.
A Bayesian Model of Colour Term Acquisition
The model learns using Bayesian inference.
- Bayesian Inference is a statistical procedure based on Bayes’
rule, which was proved by Bayes (1763).
- Though Bayesian inference was developed much more
recently.
Bayesian inference allows us to calculate the probability of a
hypothesis given some relevant data.
- But we can only do this if we know
(1) how likely each hypothesis is a priori.
(2) how likely we would have been to observe the data if the
hypothesis is correct.
- If we know both (1) and (2), we can calculate exactly how
probable each hypothesis is using Bayes’ rule.
$$P(h \mid d) = \frac{P(h)\,P(d \mid h)}{P(d)}$$
General consequences:
- Other things being equal, hypotheses with higher a priori
probabilities also have higher a posteriori probabilities.
- Hypotheses which accurately predict the data are more likely
than those which don’t.
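As a small worked illustration of Bayes' rule and the two consequences just listed, the following Python sketch computes posteriors over a toy pair of hypotheses. The hypothesis names and all of the numbers are invented for illustration; they are not taken from the colour term model.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: P(h | d) = P(h) P(d | h) / P(d).

    priors      -- dict mapping hypothesis -> P(h)
    likelihoods -- dict mapping hypothesis -> P(d | h)
    """
    # P(d) is the sum of P(h) P(d | h) over all the hypotheses.
    p_data = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / p_data for h in priors}


# Hypothetical numbers: both hypotheses predict the data equally well,
# so the one with the higher prior keeps the higher posterior.
same_fit = posterior({"h1": 0.7, "h2": 0.3}, {"h1": 0.2, "h2": 0.2})

# Equal priors: the hypothesis that predicts the data better wins.
same_prior = posterior({"h1": 0.5, "h2": 0.5}, {"h1": 0.1, "h2": 0.4})
```

In the first call the posteriors simply equal the priors; in the second, h2 ends up four times as probable as h1 because it predicted the data four times as well.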
Using Bayesian Inference as a Psychological Model
In this case the a priori probabilities assigned to hypotheses will
correspond to a person’s belief in how likely each possibility is
before they begin learning.
Data can be anything people observe from which they could learn.
 But why is it likely that people learn colour words with
Bayesian inference?
- It’s arguably the optimal way to learn.
- Bayesian learning replicates much of the empirical evidence
concerning colour terms.
- Other computational models already provide some evidence
that people are Bayesian learners.
Tenenbaum and Xu (2000) showed that a Bayesian model of
learning the meanings of concrete nouns made very similar
generalisations to those made by people.
- So this provides empirical support for the proposal that people
learn word meanings using Bayesian inference.
Griffiths and Tenenbaum (2000) showed that people seem to use
Bayesian inference when judging the frequency of periodic events
from examples.
What Evidence do Children use to Learn Colour
Words?
 Children aren't taught language explicitly.
 They learn by observing other people's speech.
But to learn meanings they must be able to work out what a word
was used to mean.
 This would be the particular colour a word was used to identify.
So the input to learning must be examples of colours and the words
that were used to identify those colours.
 Learning consists of generalising from these examples to the full
range of colours which a colour word can be used to identify.
The Conceptual Colour Space
Physically, visible light varies from red, with the longest wavelengths, to
violet, with the shortest.
But that's not how we perceive colour.
 Perceptually colour has a three dimensional structure.

- Hue
- Saturation
- Lightness
[Figure: the hue circle, running through red, orange, yellow, yellow-green, green, turquoise, blue and purple]
At present the model is concerned only with the hue dimension.
The Bayesian Model
Children a priori assume that each word denotes a continuous
range of colour, so a possible hypothesis is like this:
[Figure: a hypothesis shown as a continuous stretch of the colour space, labelled "extent of denotation of word"]
All such ranges of the colour space are considered to be equally
likely a priori.
But we can't be sure that all the examples are accurate:
- Accurate examples must come within the hypothesis.
- Erroneous examples can appear anywhere.
The learner must decide on a probability which corresponds to how
confident they are that each example is accurate (in the results
reported here that’s 0.5).
[Figure: a high probability hypothesis and a low probability hypothesis]
To work out just how likely it is that a colour word can denote any
particular colour, we can just add up the probability of all the
hypotheses which include that colour in their denotations.
- This is called hypothesis averaging.
- And the model is a Bayes optimal classifier.
We can equate the probability that a word denotes a colour with the
colour's degree of membership in the colour category.
- This will define a fuzzy set, where the degree of membership
can vary between 1 (full membership) and 0 (not a member at
all).
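The learning procedure described above can be sketched in Python roughly as follows. This is only a minimal illustration under stated assumptions, not the model's actual implementation: hue is treated as a straight line of 40 discrete points numbered from 0 (no wrap-around), accurate examples are assumed to be spread evenly within a hypothesis's range, erroneous ones evenly over all hues, and the function name `membership` and the example hues are made up.

```python
def membership(examples, n_hues=40, p_accurate=0.5):
    """Degree of membership of every hue in one word's category.

    examples   -- hue indices (0 .. n_hues - 1) the word was observed with
    p_accurate -- learner's confidence that any one example is accurate
    """
    degree = [0.0] * n_hues
    total = 0.0
    # Every continuous range [start, end] is one hypothesis. The prior is
    # the same for all of them, so it cancels out when we normalise.
    for start in range(n_hues):
        for end in range(start, n_hues):
            size = end - start + 1
            weight = 1.0
            for x in examples:
                # Accurate examples must fall inside the range (assumed
                # uniform within it); erroneous ones can fall anywhere.
                inside = 1.0 / size if start <= x <= end else 0.0
                weight *= p_accurate * inside + (1 - p_accurate) / n_hues
            total += weight
            # Hypothesis averaging: credit this hypothesis's weight to
            # every hue that it includes.
            for hue in range(start, end + 1):
                degree[hue] += weight
    return [d / total for d in degree]


# Five made-up examples of one colour word.
green = membership([18, 20, 21, 22, 24])
```

The returned list is a fuzzy set: values near 1 for hues around the examples, falling off gradually on either side, which is the prototype-like shape the slides go on to show.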
Learning the English Colour Words
[Figure: degree of membership in colour category (0 to 1) plotted against hue (red at left to purple at right), with curves for RED, ORANGE, YELLOW, GREEN, BLUE and PURPLE]
The fuzzy denotations of the English colour words after five
examples of each.
 Prototypes
 Gradation of membership
 Fuzzy boundaries
Learning Berinmo Colour Words
[Figure: degree of membership in colour category (0 to 1) plotted against hue (red at left, to purple at right), with curves for Kel, Mehi, Nol and Wor]
Denotations of Berinmo colour words after ten examples of each.
 Berinmo has 5 colour words, but 'wap' only includes light colours,
so doesn't appear on the graph.
 Green and blue are both included in the term 'nol'.
 The dark term, 'kel' extends into much lighter colours, but only
for purple hues.
Learning with Unreliable Evidence
[Figure: degree of membership in colour category (0 to 1) plotted against hue (0 to 100, red at left to purple at right). Curves: 5 accurate; 5 accurate with 5 random; 10 accurate; 10 accurate with 10 random. Markers: first 5 accurate examples; first 5 random examples; next 5 accurate examples; next 5 random examples]
Learned denotations for English 'green'
When 50% of the data is random noise:
- With 10 examples, the prototype is roughly correct, but the
category boundary is very unclear.
- With 20 examples, performance approaches that with only accurate
examples.
 So the model can learn in realistic and not just idealised
situations.
Does the Model Sufficiently Constrain the Learnable
Languages?
- The model can learn real colour systems from natural
languages.
- But it can equally well learn colour term systems that don’t
have any of the properties of those typically seen in real
languages.
[Figure: degree of membership in colour category (0 to 1) plotted against colour (red at left, to purple at right)]
This colour term system:
- doesn’t partition the colour space
- and the term on the right doesn’t have prototype properties
because the person has seen so many examples of it.
 So the acquisitional model can’t explain why colour term
systems have these properties.
Evolving Colour Term Systems
Ten copies of the model were made, to simulate a community of
ten speakers.
Each person starts knowing one example of a colour word, but
each person knows a different word.
- A speaker is chosen.
- A hearer is chosen.
- A colour is chosen.
- The speaker says the word which they think is most likely to be
a correct label for the colour, based on all the examples which
they have observed so far.
- The hearer hears the word, and remembers the corresponding
colour.
Occasionally older people will die and be replaced by new people.
One time in a thousand the speaker will make up a completely new
word.
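The simulation loop just described might be sketched in Python along the following lines. Several details are assumptions made only to keep the sketch short and runnable: a coarse line of 20 hues, a death rate of 1 in 100 interactions, replacing the oldest person at each death, and letting a speaker who knows no words yet make one up. The names `degree`, `simulate` and `community` are hypothetical.

```python
import random

N_HUES = 20        # assumed coarse hue line, to keep the sketch fast
P_ACCURATE = 0.5   # learner's confidence that an example is accurate


def degree(examples, hue):
    """Probability that a word with these examples can denote this hue."""
    including = total = 0.0
    for start in range(N_HUES):
        for end in range(start, N_HUES):
            size = end - start + 1
            weight = 1.0
            for x in examples:
                inside = 1.0 / size if start <= x <= end else 0.0
                weight *= P_ACCURATE * inside + (1 - P_ACCURATE) / N_HUES
            total += weight
            if start <= hue <= end:
                including += weight
    return including / total


def simulate(iterations=300, n_people=10, p_death=0.01, p_invent=0.001, seed=0):
    rng = random.Random(seed)
    # Each person starts knowing one example of a different word.
    people = [{word: [rng.randrange(N_HUES)]} for word in range(n_people)]
    born = [0] * n_people
    next_word = n_people
    for t in range(1, iterations + 1):
        speaker, hearer = rng.sample(range(n_people), 2)
        hue = rng.randrange(N_HUES)
        lexicon = people[speaker]
        if not lexicon or rng.random() < p_invent:
            # Make up a completely new word (also assumed for a speaker
            # who has not heard any words yet).
            word, next_word = next_word, next_word + 1
        else:
            # Say the word judged most likely to be a correct label.
            word = max(lexicon, key=lambda w: degree(lexicon[w], hue))
        # The hearer remembers the colour alongside the word.
        people[hearer].setdefault(word, []).append(hue)
        if rng.random() < p_death:
            # Assumed: the oldest person dies; the newcomer knows no language.
            oldest = min(range(n_people), key=lambda i: born[i])
            people[oldest] = {}
            born[oldest] = t
    return people


community = simulate()
```

Each entry of `community` is one person's memory, mapping word identifiers to the hues they were heard with. A run this short only shows the mechanics; it is far too brief to say anything about which systems emerge.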
Evolutionary Results
A coherent colour term system emerges which is shared by all
people, although the exact ranges of colour which each person uses
each word to denote vary a bit.
The following graph shows the denotations learned by one person
in a typical simulation. On average people observed 60 colour
words during their life, and this person was near to the end of his
lifespan.
[Figure: degree of membership in colour category (0 to 1) plotted against colour (red at left, to purple at right)]
- The colour words partition the colour space.
- And each colour word has prototype properties.
But what about Typological Patterns?
MacLaury (1999) – colour terms are predominantly focussed on
certain colours (data from World Colour Survey).
Heider (1972) found that focal colours were named more rapidly
than non-focal colours.
When a Munsell chip was shown to a subject for 5 seconds, and
then 30 seconds later the subject was asked to select the chip s/he
had been shown from an array of Munsells, subjects were more
accurate for focal colours than for non-focal colours.
- This was true both for American undergraduates and monolingual Dani speakers, even though Dani only has two basic
colour terms.
When Dani were taught names for colours, they made more
mistakes on learning labels for non-focal colours than for focal
colours.
 So, focal colours are easier to remember, and are more salient
than non-focal colours.
Neurophysiologically Determined Foci
There are six elemental sensations, corresponding to focal white,
black, red, yellow, green and blue.
- This is supported by neurophysiological studies, measuring
the response rates of cells in the retina of macaque monkeys
using electrodes. (Although Kay and Maffi (1999) claim that
those studies don’t put the foci in quite the right places.)
- And by psychological studies, showing that some colours are
seen as special.
- And also by studies which show clusters of colour term foci
on certain Munsell chips.
There is some evidence that focal green and blue appear more
similar to each other than do red and yellow, yellow and green,
or blue and red (MacLaury, 1997).
Adding Learning Biases
A new version of the model was created in which people were
more likely to remember focal examples than non-focal ones.
- All focal colours would be remembered.
- But only 5% of non-focal colours were remembered.
Changes were made to the model so it would allow for the
uneven frequencies of colour examples.
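The memory bias could be expressed as a simple filter on what a hearer stores. The Python sketch below is illustrative only: the function name `remembered`, the hue numbering from 1 to 40 and the four evenly spaced focus positions are assumptions, while the 5% retention rate for non-focal colours is the figure given above.

```python
import random


def remembered(hue, foci, p_nonfocal=0.05, rng=random):
    """Return True if an observed example of this hue is kept in memory."""
    if hue in foci:
        return True                       # all focal colours are remembered
    return rng.random() < p_nonfocal      # only 5% of non-focal ones are


rng = random.Random(1)
foci = {5, 15, 25, 35}                    # hypothetical, evenly spaced foci
observed = [rng.randrange(1, 41) for _ in range(10000)]
kept = [h for h in observed if remembered(h, foci, rng=rng)]
```

Although colours are observed uniformly, the examples that survive in `kept` are heavily concentrated on the foci, which is the kind of uneven frequency the model has to allow for.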
Firstly the foci were evenly spaced, and 20 simulations were
carried out in each of the conditions of people remembering on
average 20, 30, 60, 90, and 120 colour examples during their
lifetimes.
For the purposes of the analyses, only terms for which a person
had seen at least four examples were included.
- A colour term was considered to name all those colours for
which the person considered it to be the best colour word out
of all those that they knew.
[Figure: number of colour terms focussed on each colour (0 to 180), plotted against colour (red at left to purple at right)]
The colour terms are predominantly focussed on the innate foci.
[Figure: number of colour terms focussed on each colour (0 to 50), plotted against colour (red at left to purple at right)]
This is the same graph, but for simulations without innate foci –
there are no colours which are predominantly chosen as colour
term prototypes.
[Figure: number of colour terms which have their prototypes and innate foci in these positions (0 to 400). Series: category prototype; innate focus. Shows whether the prototype and foci are in the left, middle, right or intermediate fifths of the colour term's range]
The category prototype tends to be in the middle of the term, as
does the innate focus.
[Figure: number of colour terms with prototype and innate foci in these positions (0 to 400), plotted against prototype and innate foci location. Series: category prototype; innate focus]
With no innate foci, the prototype still tends to be in the middle
of the term, but the foci locations can be anywhere.
[Figure: average number of colour terms known by an adult (0 to 12), plotted against average number of examples remembered during a person's lifetime (30, 60, 90, 120, All). Series: with innate foci; without innate foci]
People who use colour terms more often tend to have more
colour terms in their languages.
- Though adding foci seems to reduce the number of colour
terms.
But why do we see typological patterns?
Languages have red, yellow, green and blue terms before any
derived terms emerge (though quite often derived terms will
emerge before blue and green split).
We should see yellow-red composites in systems with only two
terms, and green-blue terms should be seen very frequently.
We should see some yellow-green terms, but no red-blue ones.
The only three colour composite should be yellow-green-blue.
The most common derived term should be purple.
Orange should be less frequent, but can appear before purple.
We shouldn’t see lime or turquoise terms at all.
Changing Innate Foci Locations
The hues are numbered from 1 to 40.
- Red was placed at hue 5.
- Yellow at 17
- Green at 24
- Blue at 30
The simulations were repeated: again 20 runs were done in each
condition, plus 20 extra ones with a life expectancy of 150.
D D D D D*D D D D C C C C C C C C*C C A A A A A*A A B B B B*B B B B B B B D D D
D D D D D*D D D D D D A A C C C C*C A A A A A A*A A A B B B*B B B B B B B B B D B
D D D D*D D D D D A A C C C C C*C C A A A A A*A A A B B B*B B B B B B B B*B B D D
D D D*D D D D D D C C C C C C*C C C C A A A A A*A B B B*B B B B B D D D D D
The languages were then analysed as follows:
- Only people over half the average age were included.
- The name that they would give to each hue was found (as
shown above – each row is one person).
- Each term was then classified. If it contained innate foci, it
would be classed as red, yellow-green, blue-red-yellow etc.
- Terms which were in between innate foci were classed as
orange, lime, turquoise or purple.
- If people disagreed, the classification supported by the most
people was chosen, and, if this was tied, a classification
containing fewer foci was preferred over one with more.
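The classification steps above could be sketched in Python as follows. The focus positions are the ones given earlier (red 5, yellow 17, green 24, blue 30 on hues 1 to 40). Everything else is an assumption made for illustration: the function names, treating the hues as a circle so that the stretch from blue round to red counts as purple, and writing composite names in circle order starting after the gap (so that a term covering blue and red reads "blue-red").

```python
from collections import Counter

# Innate foci in order round the hue circle, and the derived category
# lying between each focus and the next one.
FOCI = [("red", 5), ("yellow", 17), ("green", 24), ("blue", 30)]
BETWEEN = ["orange", "lime", "turquoise", "purple"]


def classify(hues):
    """Label a term from the set of hue numbers (1..40) that it names."""
    inside = [hue in hues for _, hue in FOCI]
    if not any(inside):
        h = min(hues)
        # Find the gap between neighbouring foci that h falls in; hues
        # after blue or before red belong to the purple gap.
        for i in range(3):
            if FOCI[i][1] < h < FOCI[i + 1][1]:
                return BETWEEN[i]
        return "purple"
    # Start from a focus whose predecessor on the circle is not included.
    start = next((i for i in range(4) if inside[i] and not inside[i - 1]), 0)
    order = [(start + k) % 4 for k in range(4)]
    return "-".join(FOCI[i][0] for i in order if inside[i])


def n_foci(label):
    """Number of innate foci named in a classification label."""
    names = {name for name, _ in FOCI}
    return sum(part in names for part in label.split("-"))


def consensus(labels):
    """Most widely supported label; on a tie, prefer fewer foci."""
    counts = Counter(labels)
    top = max(counts.values())
    return min((l for l in counts if counts[l] == top), key=n_foci)
```

For example, `classify(set(range(20, 38)))` gives `"green-blue"`, and `consensus(["green-blue", "blue"])` gives `"blue"`, because the two labels are tied and `"blue"` contains fewer foci. If a tie remains after that, this sketch just takes the first candidate, which is an arbitrary choice.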
Frequencies of Term Types
Red 116
Yellow 97
Green 86
Blue 90
Red-Yellow 4
Yellow-Green 4
Green-Blue 16
Blue-Red 1
Yellow-Green-Blue 11
Green-Blue-Red 2
Purple 34
Orange 25
Lime 4
Turquoise 4
System Type Frequencies
Only system types which occur more than once are included.
 6 colour systems:
orange, purple, red, yellow, green, blue 7
 5 colour systems:
purple, red, yellow, green, blue 8
orange, red, yellow, green, blue 7
 4 colour systems:
red, yellow, green, blue 42
 3 colour systems:
red, yellow, green-blue 9
red, blue, yellow-green 2
red, blue, yellow-green-blue 2
 2 colour systems:
red, yellow-green-blue 8
red-yellow, green-blue 4
yellow, green-blue-red 2
Only the systems with green-blue-red categories are unattested.
Grue Category Foci
Where is the prototype in green-blue composite terms?
Is it sometimes mainly in green, sometimes mainly in blue, and
sometimes there is no strong bias either way?
There were 16 grue categories in the simulations:
- In 8, at least 75% of people put the focus nearest blue.
- In 1, at least 75% of people put the focus nearest green.
- In 7, there was no clear majority as to which focus was
preferred.
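The 75% criterion used in this tally can be written as a short Python function. This is a sketch only: the function name `grue_focus_bias` and the input format (one "green" or "blue" per person, naming the innate focus nearest to where that person put the category's focus) are assumptions.

```python
def grue_focus_bias(choices, threshold=0.75):
    """Classify one green-blue category from its speakers' focus choices."""
    share_blue = choices.count("blue") / len(choices)
    if share_blue >= threshold:
        return "blue"
    if 1 - share_blue >= threshold:
        return "green"
    return "no clear majority"
```

So `grue_focus_bias(["blue"] * 8 + ["green"] * 2)` returns `"blue"`, while an even split returns `"no clear majority"`.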
Sometimes a person would choose both a green and blue prototype
as almost equally good examples of the category.
- But usually the innate focus nearest to the centre of the category
is chosen as the best example.
Is the Evolutionary Hypothesis Supported?
Do languages add colour terms but never lose them?
- With an average of 90 examples being observed during a
person’s lifetime, after 18000 iterations (20 lifetimes) we have
red, yellow, green, blue and green-blue.
- Then from 27000 till 72000 we have red, yellow, green, blue.
- At 81000 purple is added.
- But at 90000 it’s been lost.
If we reduce the life expectancy so people see an average of 60
examples, we still get mainly red, yellow, green, blue systems,
but at one stage we briefly get a red, yellow, green-blue system.
So, it seems that the number of colour terms depends mainly on
how often people use colour terms during their lifetimes.
 But we should expect random drift, where sometimes colour
terms will be gained, and sometimes lost.
So maybe the evolutionary hypothesis only holds when societies
are increasing in technological complexity.
What about the Future?
Can we have more than 11 Basic Colour Terms?
If people use colour words more often in the future, will new
terms become basic, such as turquoise and lime?
New simulations were done, where people would observe on
average 180, 210 or 240 colour examples during their lifetimes.
We see 74 orange, 30 lime, 24 turquoise and 114 purple.
There are 60 red, 63 yellow, 60 green and 63 blue.
There are also 2 yellow-green and one each of green-blue and
blue-red and yellow-green-blue.
 So we do see lime and turquoise terms emerging.
But if we look at overall systems, the most common system is
orange, purple, purple, red, yellow, green, blue.
- So maybe we’re more likely to get another basic purple term
before lime or turquoise become basic.
There aren’t many systems with lime or turquoise which don’t
have at least two purple or orange terms.
References
Batali J. (1998). Computational simulations of the emergence of grammar. In
James R. Hurford, Michael Studdert-Kennedy and Chris Knight (Eds.)
Approaches to the Evolution of Language: Social and cognitive biases.
Cambridge: Cambridge University Press.
Bayes, T. (1763). An Essay Towards Solving a Problem in the Doctrine of Chances.
Philosophical Transactions, Volume 53, pages 370-418.
Belpaeme, Tony (2002) Factors influencing the origins of colour
categories. Ph.D. Thesis, Artificial Intelligence Lab, Vrije Universiteit
Brussel.
Berlin, B. & Kay, P. (1969). Basic Color Terms. Berkeley: University of
California Press.
Chomsky, N. (1972). Language and Mind. New York, NY: Harcourt Brace
Jovanovich Inc.
Dowman, M. (2001). A Bayesian Approach to Colour Term Semantics.
Lingu@scene, Volume 1.
Griffiths, T. L. & Tenenbaum, J. B. (2000). Teacakes, Trains, Taxicabs and
Toxins: A Bayesian Account of Predicting the Future. In L. R. Gleitman & A. K.
Joshi (Eds.) Proceedings of the Twenty-Second Annual Conference of the
Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates.
Heider, E. R. (1972). Universals of Color Naming and Memory. Journal of
Experimental Psychology, 93:10-20.
Hurford, J. R. (1987). Language and Number: The Emergence of a Cognitive
System. New York, NY: Basil Blackwell.
Kay, P. & Maffi, L. (1999). Color Appearance and the Emergence and Evolution
of Basic Color Lexicons. American Anthropologist, Volume 101, pages 743-760.
Kirby, S. (1999). Learning, Bottlenecks and the Evolution of Recursive Syntax. In
E. Briscoe (Ed.) Linguistic Evolution through Language Acquisition:
Formal and Computational Models. Cambridge: Cambridge University Press.
MacLaury, R. E. (1997). Color and Cognition in Mesoamerica: Construing
Categories as Vantages. Austin, Texas: University of Texas Press.
Tenenbaum, J. B. & Xu, F. (2000). Word Learning as Bayesian Inference. In L.
R. Gleitman & A. K. Joshi (Eds.) Proceedings of the Twenty-Second Annual
Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum
Associates.