Morphology in Word Grammar

Default inheritance, Word Grammar morphology and
French clitics[1]
Richard Hudson, draft March 2014
Abstract
After a general introduction to the theory of Word Grammar (WG), including a
discussion of why a cognitive perspective is important, the paper focuses on two
issues: the theory of default inheritance, and (as an example of defaults in action) a
detailed analysis of the morphosyntax of French clitics. For default inheritance (DI),
there are six potential problems which the paper addresses, and solves:
1. generality (how to generalize beyond morphology)
2. reliability (how to ensure monotonicity)
3. certainty (how to recognise and resolve conflicts)
4. relevance (how to avoid inheriting irrelevant properties)
5. economy (how to avoid storing inherited properties)
6. ‘sensibleness’ (how to avoid silly classifications).
WG avoids or solves these problems by assuming a network structure rather than
attribute-value matrices, and by restricting DI to tokens. For French clitics, the
analysis takes clitics as words realized by affixes which each have a ‘hostform’, a
schematic morphological structure containing ordered position slots. Each clitic has
an abstract syntactically-oriented relation to its hostform (such as ‘subject’ or ‘3rd-person direct object’) which is mapped onto one of the position slots by general rules,
but these general mappings vary between the default hostform, found (surprisingly)
with affirmative imperatives, and the exception, found with all other verbs. According
to this analysis, clitics show the same default orders as their non-clitic equivalents:
subject before the verb, and direct object followed by indirect after the verb.
1. Theoretical background
Perhaps the most important and distinctive characteristic of Word Grammar (as
described, among other places, in Hudson 2007, Hudson 2010, Gisborne 2010; called
‘WG’ in the rest of this chapter) is its cognitive orientation. Not only does it assume
that linguistic structures are ultimately conceptual structures, but it also joins other
versions of cognitive linguistics in rejecting cognitive modularity. Instead of treating
language structure as 'sui generis', WG treats it as just an example of ordinary
cognitive structure, with similar properties to the structures we use for remembering
events, people, social relations and so on. Similarly, language processing is just an
example of general-purpose processes such as attention, classification, binding and
inference.
Why should linguists, and morphologists in particular, concern themselves
with cognition? After all, the mainstream tradition of morphology produces abstract
analyses of patterns such as verb paradigms which can easily be seen as existing in
their own right, without any connection to other parts of the world. It is very tempting
to think we can study the formal properties of morphological patterns without
considering how they relate either to people or to the rest of language, leaving
[1] Thanks to Nik Gisborne for detailed comments on an earlier draft.
processing matters to the psychologists (and syntax to the syntacticians). A hundred
years ago this may well have been a wise position as a defence against speculative
psychology, but now that cognitive science is so well developed it is indefensible.
This is especially so when we consider a psychological notion such as ‘defaults’,
which makes no sense outside human cognition; defaults are part of our everyday
analysis of reality, but they are arguably no more part of that reality itself than the
categories to which they attach.
Moreover, language itself self-evidently resides in people’s brains, so it is part
of our minds, whether or not we are interested in minds, and any theory of how
language is organised, whether intentionally cognitive or not, must eventually be
reconciled with a theory of how minds are organised. In other words, “a theory that
aspires to account for language as a biologically based human faculty should seek a
graceful integration of linguistic phenomena with what is known about other human
cognitive capacities and about the character of brain computation” (Jackendoff
2011:586). In order to minimise rethinking at a later stage, it would be wise to prepare
for it by immediately integrating at least some of the most elementary findings of
cognitive science (such as memory networks, spreading activation and default
inheritance).
Psycholinguists and linguists are all 'partners in the broader linguistic
enterprise.' (Ferreira 2005) Discussing psycholinguists, Ferreira warns: 'their focus is
on processing, but the representations presumably being generated are linguistic.
Therefore, it would be foolish to ignore insights from linguistic theory about the
nature of those structures.' For linguists, the same argument applies but arguably even
more strongly as we seek to develop more sophisticated theories of language
structure.
A cognitive orientation to morphology does not simply mean applying
psycholinguistics to morphological phenomena, as in the recent debate about the
single route versus the dual route for processing regular and irregular forms (Pinker
1998). There has been a great deal of productive psycholinguistic work on
morphology (Marslen-Wilson 2006), but it has been conducted against the
background of mainstream linguistic theories which were not designed with cognitive
issues in mind; for instance, virtually all this work has assumed that ‘the lexicon’ is
distinct from ‘the grammar’ (or ‘the rules’), whereas cognitive linguists agree in
rejecting this distinction. One consequence of the conservative assumptions made by
psycholinguists about language structure is that psycholinguistic research has had
very little effect on theories of linguistic structure – and in particular, very little effect
on theories of morphology.
This is the intellectual background to the theory of WG, which tries to
combine reasonably uncontroversial findings of cognitive science with well motivated
assumptions about language structure. For instance, take the simple fact that knowing
a language is remembering a lot of facts – that the word pronounced /kæt/ means
‘cat’, that prepositions allow a noun as their complement, that past-tense verbs locate
the situation described at a point in time before the moment of speaking, and so on
and on. How does human cognition handle facts? An uncontroversial answer, called
the Network Notion, is that each fact links two concepts in a network (Reisberg
2007:252), so conceptual knowledge consists of a network of concepts. This answer
also implies a definition of what a ‘concept’ is: simply a node in the network, an atom
without any internal structure. This is very different from the ‘network’ notion of
morphology in Network Morphology (Brown, Hippisley) or Construction
Morphology (Booij), in which the nodes are words or word forms with internal
structure. The Network Notion recognises nothing but linked atoms, where each atom
is the meeting point of at least two relations to other atoms. In this view, the word
CAT is the meeting point of links to the meaning ‘cat’, to the form {cat}, to the word
class 'noun', and so on, each of which is in turn simply an atomic meeting point.
One of the issues in cognitive science is whether the network is simply an
associative network - a collection of undifferentiated associations - or something more
sophisticated, with links of different types. At least in AI, the consensus is that links
themselves need to be classified, and this is certainly the view from linguistics, where
the various links from CAT are traditionally distinguished clearly in terms of relations
such as 'meaning', 'realisation' and so on - the 'attributes' of any theory which uses
attribute-value feature structures.
Similarly in WG, where there are some very general and fundamental link-types:
• ‘is-a’ (e.g. ‘Richard is-a linguist’)
• argument and value (e.g. the 'meaning' relation between CAT and 'cat' has CAT as its argument and 'cat' as its value)
• identity (e.g. Richard is identical to the person who wrote this paper)
• quantity (e.g. the number of legs we expect a cat to have is exactly 4).
These are the elementary relations, and there may be no others. In contrast, the
number of more specific relation-types is open-ended because relations such as
‘father’ or ‘meaning’ are merely learnable concepts, like entities such as ‘cat’ or the
word CAT. What distinguishes relational concepts from non-relational concepts is
their 'argument' link to one entity and their 'value' link to another.
These ideas are illustrated in Figure 1, which shows part of the network which
defines the concept ‘cat’. Each of the square boxes names an entity, while the ovals
name relations. Each relation is linked to its value by an arrow pointing at the value,
and to its argument by a simple curve without a point. Finally, the small triangle
indicates the ‘is-a’ relation iconically, with its broad base on the super-category and
its (smaller) apex pointing towards the (smaller) sub-category. In words, Smudge is a
cat, a general category which is defined as the meaning of the word CAT and as the
purrer in the typical act of purring.
[Diagram omitted: Smudge, ‘cat’, ‘purring’ and the word CAT, linked by the ‘meaning’ and ‘purrer’ relations and by is-a triangles]
Figure 1: A network for 'cat'
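As a rough illustration of the Network Notion, the primitives just described (atomic nodes, typed links and ‘is-a’) can be sketched in a few lines of Python. The class and method names here are invented for the illustration and are not part of WG:

```python
# A minimal sketch of a WG-style symbolic network: every concept is an
# atomic node (here just a string), and every fact is a typed link between
# two nodes. (Illustrative only; Network and its methods are invented names.)

class Network:
    def __init__(self):
        self.links = []   # each fact: (argument, relation, value)
        self.isa = {}     # node -> its super-category node

    def add_isa(self, sub, sup):
        self.isa[sub] = sup

    def add_fact(self, argument, relation, value):
        self.links.append((argument, relation, value))

    def facts_about(self, node):
        # all (relation, value) pairs whose argument is this node
        return [(r, v) for (a, r, v) in self.links if a == node]

net = Network()
net.add_isa('Smudge', 'cat')              # Smudge is-a cat
net.add_fact('CAT', 'meaning', 'cat')     # the word CAT means 'cat'
net.add_fact('purring', 'purrer', 'cat')  # a cat is the purrer in purring
```

Note that nothing in this sketch is procedural: the triples simply record declarative facts, and the same structure serves whether we start from the word CAT or from the concept ‘cat’.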
The conceptual networks presented in diagrams like this, unlike neural
networks, are not intended to be models of brain structures. It seems almost certain
that the brain does not allocate a single neuron to each concept, but linguistic analysis
depends crucially on the assumption that we represent each linguistic concept
separately, whether it is a phoneme, a word, a word class, a relation or a meaning. In
short, the networks of WG are ‘symbolic networks’ with one node per concept, and
vice versa. Morphological analysis requires very clear and stable representations of
concepts such as Figure 1 rather than the much more diffuse and opaque ‘sub-symbolic’ representation of concepts found in ‘distributed’ models of neural networks
(Onnis and others 2006); indeed, we might even use morphology as evidence for
separating the mind and the brain.
On the other hand, the clarity of a symbolic network such as Figure 1 should
not mislead us into thinking that such networks are stable and unchanging. As part of
a human mind, a symbolic network is highly dynamic. There are three reasons for
this. One is that we are using the network for processing our experiences, for planning
actions and for thinking. All three kinds of use require changes to the network as we
create new nodes for our new experiences, plans and thoughts, and as we enrich these
nodes in the ways explained below; for example, when we see a cat, we must assign it
a new node even if we eventually classify it as an example of a cat that we already
know. Most such nodes disappear within minutes, or even seconds, of appearing.
Another cause of changes in the network is learning, whether learning new low-level
concepts (such as the neighbour’s new cat) or some higher-level generalisation
induced from existing concepts. And the third cause is the activation in the underlying
brain circuits which guides and follows our thinking. We know that concepts are
easier to retrieve if we have used them more frequently or more recently (Ellis
2002a), and that an experience can ‘prime’ related concepts (Reisberg 2007:257), so
we can be reasonably sure that activation levels vary both on a long-term scale and in
the short term. These changes of activation affect the underlying brain-cells directly,
but their indirect effect on the mental network is profound as they guide us in
retrieving information. We shall see below how important they are in WG when
considering the logic of default inheritance.
Where do procedures fit into this view of the mind? The network itself is, of
course, purely declarative, even when parts of it are changing. For instance, the link
which shows that ‘cat’ is the meaning of the word CAT is simply there, as a
declarative fact; it is not a procedure, nor is it the instruction for a procedure. As a
fact, it is quite neutral as to directionality or timing, so it is just as relevant to speaking
as it is to listening. It allows a speaker to find the word for 'cat', just as it allows a
listener to find the meaning of CAT. This principle is especially important for
morphology, where there is a strong procedural tradition expressed for instance in
classroom formulae for building verbs such as: “first take the infinitive, then knock off
the final –r, then add ... ”. This approach is represented at a theoretical level by
Paradigm Function Morphology (Stump 2006a), but declarative approaches offer a
viable alternative. The choice between declarative and procedural approaches is
clearly a fundamental question for research.
Like other branches of cognitive linguistics, WG assumes that knowledge
(including language) is based on the learner’s experiences far more than on innate
knowledge. When applied to language, what we know is based on other people’s
usage. Usage-based learning explains why some words are far more active than others
(Bybee 2010), as explained above, but it also means that we can store a great deal of
fine detail about the items we used – about their social context, their phonetic details,
and their linguistic context. There is a great deal of psycholinguistic evidence that
'language acquisition is a process of dynamic emergence and that learners' language is
a product of their history of usage in communicative interaction.' (Ellis 2002b:297),
but common experience also confirms that we know, and use, a great deal of item-specific information; for instance, I know that the word ilk is very limited and I would
certainly notice any example and remember something about the speaker and social
context. However, if generalisations are based on memorised exemplars, it follows
logically that memory for detail must go well beyond idiosyncrasies such as the
properties of ilk. As children, we must have induced regular patterns such as those for
plural nouns from a collection of very similar particular cases; and there is no
evidence that we forget these exemplars after creating the generalisation.
To summarise the argument so far, the cognitive orientation of WG leads to
the Network Notion, which in turn leads to:
• a view of mental structures as distinct from brain structures,
• a view of concepts as atoms defined only by their relations to other concepts,
• a view of relations either as basic links, such as ‘is-a’ or ‘argument’, or as relational concepts in an open-ended and hierarchically organised list,
• a view of networks as constantly changing, with new nodes being added and lost or learned, and with underlying activation levels changing with experience.
2. The logic of default inheritance
If knowledge is held in a declarative network, it is important to know what mental
tools we have for exploiting it; and, if language knowledge is just ordinary
knowledge, we may expect these same tools to provide what we need in processing
language. Once again the cognitive orientation matters because it rules out any theory
of language processing which requires assumptions specific to language, such as a
dedicated ‘morphology module’ which might apply between other modules dedicated
to syntax and to phonology; it also rules out processing models which apply just to
speaking or just to hearing (e.g. Levelt and others 1999).
WG provides five general procedural tools for exploiting the network (Hudson
2010:70-101).
• Node-creation: this allows us to create new nodes for handling elements of
ongoing experience, as well as for new concepts that we create for induced
generalisations. Node-creation also creates new links to the new node. We
shall see below that node-creation includes the whole of default inheritance for
enriching the newly created nodes.
• Binding: this includes the familiar anaphoric binding patterns discussed in
syntax and logic, such as the binding of a pronoun by its antecedent; but it is
much more general, because it allows us to bind any two existing nodes as ‘the
same’, without actually merging them into a single node. For instance, this is
the effect of hearing a sentence such as That building is the post office, or
indeed of simply realising that the building concerned (already known) is the
post office. It may even extend into perception, where 'the binding problem' is
the challenge of explaining how we coordinate colour and shape in vision
(Reisberg 2007:55). It is unclear whether binding is in fact a separate process
from node-creation, because the logical effect of identifying nodes A and B
can be achieved by creating a new node C with 'is-a' links to both A and B.
• Activation: this affects the activation level of a given mental node (transmitted
via the neurons that underlie it). The activation is an observable physical
reality in neuroscience, and the way it changes in mental networks is one of
the main research themes of cognitive neuroscience. One point of general
agreement is that it spreads out from one mental node to all its neighbours in a
rather indiscriminate way, which gives rise to 'priming' effects in experiments
in which hearing one word makes a related word easier to retrieve; for
example, the word nurse can be shown to prime doctor thanks to the activation
of the former spreading indiscriminately to the latter (Reisberg 2007: 251-7).
• Attention: this is closely related to the notion of activation because attention
seems to be at least in part a matter of channeling activation so that concepts
that we focus attention on receive extra activation (Reisberg 2007:112).
Attention is important in any theory of language structure and processing if we
think of language as a means for channeling the hearer's attention. For
instance, I say dog in order to make you 'think of' a dog - i.e. in order to get
you to pay attention to that concept, thanks to the 'meaning' link between the
two. When it comes to morphology, the links between forms and meanings
become much more complicated, but they still serve as a route map for the
hearer's attention.
• Default inheritance: this is the process by which new nodes are enriched by
inference, and the topic of the present section. In essence, default inheritance
allows generalisations to have exceptions. The logic of default inheritance
comes from Artificial Intelligence (Luger and Stubblefield 1993: 387-9), but
the basic insight is the same as in the massive literature on prototypes in
cognition (Rosch 1978, Taylor 1995).
However intuitively obvious it may seem, default inheritance has been
subject to a great deal of theoretical debate in the research literature of both logic and
computational linguistics (Bouma 2006, Briscoe and others 1993, Carpenter 1992,
Daelemans and Desmedt 1994, Flickinger 1987, Lascarides and others 1996, Luger
and Stubblefield 1993, Pelletier and Elio 2005, Russell and others 1993, Touretzky
1986, Vogel 1998). This literature identifies serious problems, especially for attempts
to implement default inheritance in computer programs. These problems are so
serious that some would argue that default inheritance cannot actually underlie human
reasoning. This conclusion is disappointing, considering how obvious the basic idea
is: defaults generalise except when overridden; even more simply, generalisations
allow exceptions.
The classic non-linguistic example of exceptionality is the fact that penguins
are birds, but don’t fly; as mentioned earlier, examples like this are often discussed in
the cognitive literature on ‘prototype effects’ - the finding that we recognise some
examples of a concept as 'better' or clearer (so for instance a sparrow is a better
example of a bird than a penguin is, and a table is a clearer example of furniture than
an ashtray). Prototype effects are taken as evidence that general concepts such as 'bird'
or 'furniture' are defined by their typical (or prototypical) members rather than by
necessary and sufficient conditions. WG accepts this analysis, and explains it in terms
of default inheritance logic. If the logic allows any property of any concept to be
exceptional, it follows that any example of any concept may be exceptional;
consequently, examples can be ranked according to how many exceptional properties
they have.
The exceptionality of penguins can be represented in WG notation as in Figure
2, where once again the small triangle signals the is-a relation. In prose, a penguin is-a
bird, and in this diagram we are also imagining a ‘token’ – i.e. some exemplar entity –
which is-a penguin. The typical bird flies (is the flier in the activity of flying), and
conversely, the typical flier is a bird. The fact that this flying happens is indicated by
the number (‘#’) ‘>0’, meaning that one would expect to find some examples of flying
for every bird. However, exceptionally, penguins don’t fly; this is shown by the ‘0’
number, meaning that one would expect no instance of flying in the case of penguins.
Because the token is-a penguin, it inherits 0 rather than >0 for flying – in other words,
we don’t expect it to fly. Apart from the WG notation that it introduces, the main
point of this diagram is the conflict between 0 and >0 which is resolved in favour of 0
by the logic of default inheritance because penguins are exceptions, and exceptions
always defeat defaults.
[Diagram omitted: ‘bird’, ‘penguin’ and the token linked by is-a triangles, with the flier/flying relation and the number (‘#’) values ‘>0’ for bird and ‘0’ for penguin and the token]
Figure 2: Penguins as an exception in WG notation
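The way this conflict is resolved can be made concrete as a bottom-up walk over the is-a hierarchy, in which the first value found (the most specific one) wins. This is only an illustrative sketch with invented names, not a WG implementation:

```python
# Default inheritance as bottom-up search: starting from the token, walk up
# the is-a chain and return the first value found for the relation -- so an
# exception stored on 'penguin' defeats the default stored on 'bird'.
# (Illustrative sketch; the names and data structures are invented.)

isa = {'token': 'penguin', 'penguin': 'bird'}

# expected number ('#') of instances of flying for each category
properties = {
    ('bird', 'flying#'): '>0',    # the typical bird flies
    ('penguin', 'flying#'): '0',  # exceptionally, penguins do not
}

def inherit(node, relation):
    # climb the is-a chain; the nearest stored value is inherited first
    while node is not None:
        if (node, relation) in properties:
            return properties[(node, relation)]
        node = isa.get(node)
    return None

print(inherit('token', 'flying#'))  # the exception '0' wins over '>0'
```

Because the search starts at the token and stops at the first match, the default on ‘bird’ is never even reached; this is the sense in which the exception always defeats the default.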
The immediate question is what role default inheritance (DI) can play in
morphology, but we must first address the general objections to DI mentioned above.
The following paragraphs will identify all the problems with DI of which I am aware,
and offer solutions.
The first problem is generality: we need a domain-general theory of DI that
applies throughout cognition, and not just in morphology. This is a serious problem
for any theory which defines default inheritance in terms of attribute-value matrices,
because as the name implies, these only allow values to be overridden. If the example
of penguins not flying is correctly analysed here, it cannot be expressed as an
exceptional value, but as an exceptional argument: 'penguin' has an exceptional
property in which the exceptionality lies in the argument (no flying) rather than in the
value (the flier). Of course, it could be argued that this fact could be reworded so as to
reverse the directionality, giving something like ‘a penguin’s means of locomotion is
not flying’, but this misses the point that penguins simply cannot fly – it’s not just that
they prefer swimming. In any case, it would surely be wrong to base the theory of DI
on the assumption that arguments can never be exceptional. WG avoids this problem
by using a network which is not equivalent to an attribute-value matrix. A WG
network is not a DAG (directed acyclic graph): it allows cycles, in which a
pair of nodes A and B may be related in both directions; in fact,
cycles are common and play an important part in WG.
The second problem is reliability: how to cope with a ‘non-monotonic’ logic
which allows earlier inferences to be overridden by later ones. If later overriding is
possible, no inference is reliable, because at any point we may discover some
overriding inference. For instance, if the token in Figure 2 were to inherit from ‘bird’
before it inherited from ‘penguin’, the first inference would turn out later to be
invalid. The solution is obvious: design the inheritance process in such a way that the
token inherits from ‘penguin’ before it inherits from ‘bird’. Consequently, WG
incorporates an algorithm for inheritance (Hudson 2010:90) which consists of two
processes:
• ‘the searcher’, which searches for inheritable properties;
• ‘the copier’, which copies them down to the inheriting node.
The bottom-up search strategy is guaranteed by a very simple principle: only newly-created nodes inherit; in other words, inheritance is part of the node-creation process
mentioned above, and since tokens are always, by definition, at the foot of the is-a
hierarchy, there is only one direction in which the searcher can go: up. This principle
of Token-only Inheritance will also solve a number of other problems as we shall see
below. In short, the supposed non-monotonicity of DI is an illusion; it is actually
strictly monotonic, so every inference is reliable. For example, in the case of
penguins, ‘cannot fly’ is inherited first, so ‘can fly’ is never inherited. However, it
should also be noted that this solution is only available in a cognitively based theory
of morphology which includes psychological processes such as node-creation.
The third problem is certainty: recognising clashes and winners. DI is all
about resolving the conflicts between defaults and exceptions, so it is important both
to recognise when there is a conflict, and to know which of the competitors should
win. In a case like Figure 2, both the conflict and the outcome are straightforward,
because ‘>0’ and ‘0’ are competing as the values of the same relation, and ‘0’ must
win because it is inherited first. However, there are two situations where the outcome
is less obvious.
One is multiple inheritance, where a token inherits from two sources – an
important scenario in inflectional morphology, which relies on multiple inheritance
from both a lexical category (a lexeme) and an inflectional one (e.g. dogs inherits
from both DOG and Plural). In general, multiple inheritance works smoothly
precisely because inherited properties are orthogonal - i.e. each source contributes a
different set of properties (e.g. in morphology, the lexeme contributes the base and the
inflection the affixes); but conflicts can arise, as in the famous ‘Nixon diamond’
(Touretzky 1986), in which the American president Richard Nixon inherited a
positive attitude to war from his membership of the Republican party, but a negative
one from his Quaker background. But where such conflicts arise, they reveal a
problem not in the logic, but in the world. People do hold conflicting beliefs and behave
in contradictory ways, and multiple inheritance simply explains why the beliefs and
behaviours are contradictory. Such conflicts may even arise in morphology, as in the
case of the *I amn’t gap, which I believe can be explained by the conflict between
inheriting the form am from ‘first person singular’ and aren’t from ‘negative’
(Hudson 2000). If this analysis is correct, the logic of default inheritance explains
why the conflict cannot be resolved. In contrast, Network Morphology, with its base
in the default logic of DATR, cannot explain the gap because DATR is designed so
that such conflicts can never arise.
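The contrast between orthogonal and conflicting multiple inheritance can be sketched as follows; the names and data structures are invented for the illustration and make no claim to be a WG implementation:

```python
# Multiple inheritance: a token inherits from two super-categories at once.
# When the two sources supply orthogonal properties they simply merge (as
# with DOG + Plural); when both supply values for the same relation (the
# 'Nixon diamond'), the conflict is real and the logic rightly leaves it
# unresolved. (Illustrative sketch; invented names.)

properties = {
    'Republican': {'war': 'positive'},
    'Quaker':     {'war': 'negative'},
    'DOG':        {'base': '{dog}'},
    'Plural':     {'suffix': '{s}'},
}

def multi_inherit(supers):
    merged, conflicts = {}, set()
    for sup in supers:
        for rel, val in properties.get(sup, {}).items():
            if rel in merged and merged[rel] != val:
                conflicts.add(rel)   # record the clash rather than hide it
            merged[rel] = val
    return merged, conflicts

dogs, ok = multi_inherit(['DOG', 'Plural'])         # orthogonal: no conflict
nixon, clash = multi_inherit(['Republican', 'Quaker'])  # clash on 'war'
```

The point of recording the clash rather than silently picking a winner is exactly the point made in the text: an unresolvable conflict such as *I amn't* is a fact about the network, not a defect in the logic.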
The other area of uncertainty is where a term (i.e. value or argument) of one
relation is defined indirectly by a term of another relation – i.e. in attribute-value
terminology, by ‘re-entrancy’ (Bouma 2006). Figure 3 shows a typical example of this
situation, in which an English verb’s default present-tense fully-inflected form
(abbreviated to ‘full’) is the same as its base (as in they walk), but the verb BE is an
exception, because its present-tense form is {are} rather than the base {be}. As usual,
the triangles indicate is-a links, so the nodes labelled A and B are examples of the
‘full’ relation, and D is an example of {are}.
[Diagram omitted: ‘verb’, ‘present’, BE and ‘BE, present’ with their ‘full’ and ‘base’ relations, the forms {be} and {are}, the nodes A, B and C, and the token D which is-a {are}]
Figure 3: By default, a verb’s present form is its base, but BE is an exception
The uncertainty lies around the relation ‘base’: in relation to 'present', C binds
its base to its ‘full’, so is the same true of {are} in relation to ‘BE, present’? At the
point where the searcher for this token inherits from ‘present’, one might expect it
also to inherit the binding of the ‘full’ and base relations; and though it is easy to
imagine situations where it doesn’t matter, there may be situations where it does
matter, and it is important for the token either to inherit the extra relation, or not to
inherit it. Once again, the solution is obvious: let the network decide. If {are} is the
base as well as the ‘full’ of ‘BE, present’, show this in the network; if not, not (as in
Figure 3). Thus, when the token D inherits from {are}, it also inherits all the relevant
relations, but it doesn’t later add any further relations to {are}. Put more formally:
• if the searcher for token t finds a property [T, R, X], where t is-a T and R is a relation with T and X as its terms,
• and if there is already a property with t and x (an example of X) as its terms,
o then the copier creates a copy [t, r, x], where r is an example of R;
• and if there is already a property with t as its term and r as its relation,
o then the copier does nothing (i.e. the existing property overrides the new one);
• otherwise, the copier creates a copy [t, r, x], where x is a new example of X.
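These three cases can be sketched in code as follows; the data structures (tuples for properties, a dictionary for is-a links, and a trailing apostrophe to mark an example of a category) are invented for the illustration:

```python
# A sketch of the copier's three cases (hypothetical data structures):
# properties are (argument, relation, value) triples, and 'isa' maps an
# example node to the category it instantiates.

def copy_property(t, R, X, token_props, isa):
    """Copy the inherited property [T, R, X] down to token t."""
    # Override case: t already has a property whose relation is an example
    # of R, so the existing (more specific) property wins; do nothing.
    for (_, r, _) in token_props:
        if isa.get(r) == R:
            return token_props
    # Binding case: t already has a term that is an example of X, so bind
    # the copied relation to that existing node.
    for (_, _, v) in token_props:
        if isa.get(v) == X:
            token_props.append((t, R + "'", v))
            return token_props
    # Default case: create a fresh example of X and link t to it.
    new_x = X + "'"
    isa[new_x] = X
    token_props.append((t, R + "'", new_x))
    return token_props

# Token D inherits the 'full' property of 'BE, present', whose value is {are}:
isa = {}
props = copy_property('D', 'full', '{are}', [], isa)
```

In the final call, D has no existing properties, so the default case applies: a new example of {are} is created and attached to D by an example of the ‘full’ relation.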
The fourth problem for DI is relevance: how to avoid irrelevant enrichment. The
reason for enriching tokens is to provide information which we can’t derive simply
from observation. The problem is that most of the information we could inherit is
simply irrelevant in most situations. If you see a cat sunning itself, it is relevant to
know that cats like to be stroked, but not that they suckle their babies; and if you see
penguins in the zoo, it may be relevant to know that they can’t fly, but not to know
that they have skin. This is a serious problem in a realistic model of inheritance
because the process of searching and copying takes time and resources, and because
of the sheer quantity of information that we know about general categories, and could
therefore inherit. The WG solution exploits its cognitive basis by invoking activation
levels and the fact that these apply to relational concepts as well as to entities. In any
situation some relations and entities will be more active than others, reflecting
different degrees of relevance, so the searcher simply ignores properties in which the
relation is below some threshold of activation.
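The idea of an activation threshold can be sketched as a simple filter over inheritable properties; the activation values and names below are invented for the illustration:

```python
# Relevance filtering: the searcher only passes on properties whose relation
# is currently active enough. (Illustrative sketch; the activation values
# and threshold are invented, not measured.)

activation = {'stroking': 0.8, 'suckling': 0.1, 'skin': 0.05}

cat_properties = [
    ('cat', 'stroking', 'liked'),
    ('cat', 'suckling', 'babies'),
    ('cat', 'skin', 'present'),
]

def relevant_properties(props, threshold=0.5):
    # ignore any property whose relation falls below the activation threshold
    return [(a, r, v) for (a, r, v) in props
            if activation.get(r, 0) >= threshold]

print(relevant_properties(cat_properties))  # only the 'stroking' fact survives
```

On seeing a cat sunning itself, only the highly active ‘stroking’ relation clears the threshold, so the facts about suckling and skin are never searched or copied.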
The fifth problem is economy: how to avoid filling our memory with
inheritable information. One might think that once we have inherited some fact, we
might store it away for use on future occasions, and this does seem to be true for facts
that we use very frequently; but very frequently used facts may be only a small
minority of the total, and it seems that most inheritable facts simply disappear after
we have inherited them. For instance, there is evidence from the past tenses of English
verbs that frequently-used forms are stored while those that are used less frequently
are not stored, but created as needed (Bybee 1995). The WG explanation once again
invokes the Token-only Inheritance principle: if only tokens inherit, then inherited
properties are not attached to permanent categories in memory, but vanish with the
tokens to which they are attached. This leaves us with a question about how
inheritable facts attach to very frequently accessed entities, but at least we have
changed the problem.
Finally, we have the problem of sensibleness: how to avoid unmotivated
classifications. The problem has been presented like this:
... if we define a penguin as a bird that does not fly, what is to prevent us from
asserting that a block of wood is a bird that does not fly, does not have
feathers and does not lay eggs? (Luger and Stubblefield 1993:389)
The solution, again, is rather obvious: recognise that this is a problem for
classification, not for DI. It is classification that establishes the initial is-a link
between a token and some stored type, and DI is only responsible for enriching the
token on the basis of that classification. What is needed is a sensible theory of
classification such as the WG theory based on activation levels (Hudson 2010:93-8).
The conclusion to which this discussion leads is that the supposed problems of
DI are all easy to solve in a cognitively-based theory such as WG, so DI is indeed a
suitable logic for work in morphology. The rest of this chapter will illustrate the
benefits of being able to refer to defaults in morphology.
3. Morphology and realization
WG includes a theory of morphology which is inferential and realizational, so it
belongs to the Word and Paradigm family (Stump 2001:3). In other words,
morphological structure is not just a case of syntactic structure within the word.
Indeed, this choice is almost forced on WG by its theory of syntax, which is based on
word-word dependencies rather than on phrase structure. A phrase-structure analysis
of Cows moo might consider whether the division of cows into {cow} and {s} is the
same kind of split as when cows moo is split into cows and moo, but this option is not
available in a theory which does not recognise any units of syntax larger than the
word.
The Network Notion means that realization is a relation rather than a process;
so the plural of COW has a 'realization' relation to {cow} and {s}, rather than being
realized by the process of adding {s} to {cow}. It could be argued that these two
descriptions are just different metaphors for the same pattern, but there is an important
difference. If realization is a process, then it takes place in time and when more than
one process is involved, they can be ordered in time, as ordered rules: first do this,
then do that. But as I mentioned in section 1, ordered procedures are problematic in
linguistics because of the asymmetry of speaking and listening. What is needed is a
time-neutral, declarative statement which can be exploited equally easily when
speaking or listening. Moreover, if some complex forms are stored rather than
created, ordered rules for creating them are irrelevant but their structures still need to
be recorded; for example, if we store the form {{walk}{ed}} as the past tense of
WALK, we need to be able to describe this structural pattern rather than providing a
recipe for creating it. Consequently, WG morphology has no ordered rules or
procedures, but analyses structures simply in terms of static declarative relations, such
as the relations ‘full’ and ‘stem’ in Figure 3.
One of the controversial issues in realizational theories is whether words are
realized by phonological objects or by special morphological objects, such as morphs
(a term that I prefer to the much more abstract ‘morpheme’ which tends to encroach
on the territory of morphosyntactic categories). At one time WG favoured direct
realization by phonology (Hudson 1984:53, Hudson 1990:181), but the current WG
answer accepts special morphological objects, i.e. a level of morphology separating
syntax from phonology. The evidence for this position comes from several different
directions, all independently pointing to the same conclusion:
 The argument from conflicting segmentation: Imagine a small child whose
vocabulary already includes both know and nose and is now confronted with
an example of knows. Since knows and nose are homophones they must have
the same phonological structure and segmentation, namely the syllable /nəʊz/.
But it is highly likely that the child also recognises the similarity between
knows and know in their shared realisation /nəʊ/; but in the phonological
structure, this is not a unit - it is merely part of a syllable. The only way in
which the child's mind can recognise the similarity between knows and know
is by dividing the former's realisation in a non-phonological way; in short, by
recognising the morph {know}, co-existing with, and mapped onto, the
phonological pattern /nəʊz/.
 The argument from recycling: The general principle that ‘the rich get richer’,
which underlies so much human behaviour (Barabasi 2009), means that we try
to ‘recycle’ existing meanings (Hudson and Holmes 2000) and forms in new
words. This is most clearly seen in folk etymology, where ordinary people (not
linguists) reanalyse complex words in terms of existing words, regardless of
the meaning. For instance, when hamburger (originally, a sausage from
Hamburg) was adopted in English, its first syllable was identified with the
form of ham, even though the sausage didn't actually contain ham, leading
eventually to creations such as cheeseburger. In short, {ham} is a morph
shared by both ham and hamburger rather than just a bit of phonology.
 The argument from priming: psychological experiments show that
phonological priming (e.g. nurse – verse) dies out much faster than
morphological priming (e.g. contain – retain) (Frost and others 2000). This
shows that morphological and phonological patterns must be cognitively
distinct.
These psychological arguments strongly support the traditional ‘morphomic’ view
(Aronoff and Volpe 2006, Blevins 2003) that morphology is a distinct level between
syntax and phonology, with its own units (morphs, wordforms and intermediate-sized
units), its own classes (root, prefix, suffix, etc.) and its own relations (realization,
stem, full, etc.). Typically, then, words have meaning and are realized by morphology,
not phonology; morphs are basically meaningless but realize words and are realized
by phonology (and graphology); and phonology realizes morphology and is realized
by phonetics. This typical arrangement is shown in Figure 4. However, there is no
reason to believe that our minds are incapable of representing other relations, such as
a direct relation between phonology and meaning – indeed, this seems to be precisely
what we have in intonation where 'tunes' carry meaning independently of the words
and syntactic structures with which they combine. (Another such case is the existence
of ‘phonesthemes’ such as glitter, glisten, glow, gleam, glare and glint, where the
pattern /gl/ is related to the idea of bright light.)
[Figure 4 (diagram): the concept 'cat' is the meaning of the word CAT; the realization of CAT is the morph {cat}; and the realization of {cat} is the syllable /kæt/.]
Figure 4: four levels: semantics, syntax, morphology, phonology
The architecture of language outlined above is a model of language
representation, the kinds of mental structures that we need to represent the things we
hear and say (and read and write, if we add a level of graphology in parallel with
phonology). It distinguishes three kinds of entities apart from meanings (syntactic,
morphological and phonological or graphological) related by realization relations – a
conservative model of language architecture. But two familiar elements of traditional
morphology are missing.
First, as in other branches of cognitive linguistics, there is no attempt to
separate ‘the grammar’ from ‘the lexicon’. In many models these are the names for
supposedly separate parts of the underlying system, involving either general rules or
specific lexical entries; but if cognitive linguistics is right, then properties are
inherited down an inheritance hierarchy which has no boundary between ‘general’
and ‘specific’. Generality is simply a matter of degree, with the same item – such as
the lexeme CAT – inheriting its properties from concepts ranging from very general
(‘word’) to very specific (‘CAT’).
And secondly, as in Network Morphology (Hippisley) and other sub-varieties
of Declarative Morphology (Neef 2006), there are constraints but no rules. Instead of
rules, we have the generalisations which are available for inheritance and which act as
constraints on possible representations but do not ‘create’ them. One advantage of
avoiding rules in this way is that the information is neutral between production and
perception, whereas a rule such as ‘to form the plural, add {s}’ applies to production,
and needs to be rephrased for perception along the lines of ‘if you see a word ending
in {s}, recognise it as plural’. Another advantage of the static analysis is its neutrality
between storage and creation, as required by any theory which allows forms to be
either stored or created as needed. In both cases, the structures are covered by the
same generalisations, so the morphological structure of, say, walks is {{walk}+{z}},
regardless of whether it is stored ready-made or created as needed.
As mentioned earlier, this rule-free approach contrasts with Paradigm
Function Morphology (Stump), where rules play a fundamental role. A single PFM
rule may be logically (and psychologically) equivalent to a relation, but it is unclear
whether the same is true of entire blocks of rules. Since Stump invokes his analysis of
French pronominal clitics as evidence for rule blocks, section 5 of this chapter shows
that the same data can be analysed without recourse to rule blocks. First, though, we
need a brief introduction to the WG theory of morphosyntax.
4. Morphosyntax in WG
This section explains how WG handles morphosyntax, the syntax-oriented part of
morphology which maps syntactic categories onto morphological structures consisting
of morphs. Morphophonology, which relates these morphs to their phonological
realisations, is discussed briefly elsewhere (Hudson 2007:81-100) but is relatively
underdeveloped in WG.
The basis of the WG approach is to rely heavily on a rich set of relations,
starting with the ‘realization’ relation and its sub-types. If {cat} is the realization of
the lexeme CAT, then what is the relation between {cat} and the plural of CAT,
called ‘CAT, plural’? Clearly there is a relation, and it too is an example of
realization, but equally clearly, it cannot be exactly the same kind of relation as the
first. In short, ‘realization’ breaks down into a number of more specific relations such
as ‘base’ and ‘full’ (meaning ‘fully inflected form’) which we have encountered
already. It will be recalled that WG allows relational concepts as well as entity
concepts to be classified and sub-classified in a hierarchy. In this analytical system,
therefore, the ‘full’ of ‘CAT, plural’ is {{cat}{s}}, while its ‘base’ is {cat}; and ‘fully
inflected form’ and ‘base’ are both sub-types of the more general relation
‘realization’. These two relations cover the morphosyntax of the word cats, which is
inherited in a regular way from the inflectional category ‘plural’ as shown in Figure 5.
[Figure 5 (diagram): 'CAT, plural' is-a both 'plural' and CAT; its 'base' is {cat}, and its 'full' is the 's-variant' of that base, namely {{cat}{s}}.]
Figure 5: Cats as an inflection of cat
The crucial role in this analysis is played by another relation, called 'variant'. It
is exemplified here by 's-variant', which defines the relation between a plural noun's
fully-inflected form and its base. (The same relation also defines the singular form of
a present-tense verb.) The diagram doesn't try to add any further details about the
regular pattern, but it shows that in the case of CAT, the s-variant is {{cat}{s}}. In
contrast, the s-variant of MOUSE would be the irregular {mice}. The 'variant' relation
also provides a convenient mechanism for syncretism which avoids the arbitrary
choices of direction involved in rules of referral (Zwicky 1985b). For example, the
syncretism of the ‘perfect participle’ and ‘passive participle’ in English is expressed
by invoking the same 'en-variant' relation in both cases, rather than by arbitrarily
selecting one as basic and deriving the other from it.
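The difference between a rule and a static relation can be sketched in Python. This is an illustration only; the curly-bracket strings mimic the notation for morphs used above, and the dictionary representation of lexemes is mine.

```python
# Sketch of the 'variant' relation as a static, direction-neutral mapping:
# the default s-variant is the base wrapped up with {s}, while an irregular
# lexeme simply stores its own s-variant, which overrides the default.

def s_variant(lexeme):
    if 's_variant' in lexeme:                  # stored exception (e.g. MOUSE)
        return lexeme['s_variant']
    return '{' + lexeme['base'] + '{s}' + '}'  # default: {{base}{s}}

cat = {'name': 'CAT', 'base': '{cat}'}
mouse = {'name': 'MOUSE', 'base': '{mouse}', 's_variant': '{mice}'}

assert s_variant(cat) == '{{cat}{s}}'   # regular: created as needed
assert s_variant(mouse) == '{mice}'     # irregular: stored
```

Because the mapping is declarative, it can be exploited in either direction (speaking or listening), and the same relation serves plural nouns and singular present-tense verbs alike.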
This example shows how WG treats a simple example of inflectional
morphology; the most controversial characteristic of WG inflectional morphology is
probably the use of inflectional word-types such as ‘plural’ instead of
morphosyntactic features (Hudson 2010:44). Features are permitted by WG, but they
coexist with the much more flexible is-a hierarchy, so in principle a distinction such
as that between singular and plural can be accommodated in either way:
 hierarchically, by recognising 'plural' as a subclass of 'noun', with singular nouns as
the default.
 in terms of the feature (or attribute) 'number' and two values, 'singular' and 'plural'.
Features are well motivated in only one case: when there is agreement (X and Y both
have the same value for the feature Z). Only then can we be sure that the logic of the
system requires a feature. In contrast, hierarchical subclassification is a normal part of
any classification system, and in morphology it has the great advantage of providing a
natural way of showing markedness by identifying the unmarked case as the default.
Indeed, even when a feature is justified by agreement, it can be combined with a
hierarchical classification; for example, English agreement rules recognise a feature
'number', whose value is by default 'singular', but (exceptionally) 'plural' for plural
nouns.
The peculiarity of inflectional morphology is that the morphosyntactic
categories that it distinguishes are only available via morphology. In contrast,
derivational morphology is one of the mechanisms for relating ordinary lexemes
which are classified in terms of ordinary lexical classes. For example, it is
derivational morphology that recognises the morph {farm} in both the ordinary verb
FARM and the ordinary noun FARMER. Similarly, compounding relates two bases to
a third which includes them both, as where the base of FARMHOUSE contains the
bases of both FARM and HOUSE. The WG structures for these examples are shown
in Figure 6 with most labels omitted for simplicity; ‘1’ means ‘the first part’.
[Figure 6 (diagram): FARMER is the 'agentive' of FARM, and {farmer} is the 'er-variant' of {farm}; {farmhouse} contains {farm} as its first part ('1').]
Figure 6: Derivation by suffixation and compounding
In addition to inflectional and derivational morphology, morphosyntax handles
various kinds of mismatch between syntactic and morphological structure, including
fusion and cliticization (Camdzic and Hudson 2007, Hudson 2007:104-15). Fusion
maps two words onto a single morph. For example, French au replaces the expected à
le, as in au jardin, ‘to the garden’, compared with à la maison, ‘to the house’, and in
English the abbreviated form of you are is arguably a single morph with the same
pronunciation as your (or possibly even the same morph as in your).
[Figure 7 (diagram): the words À and LE are jointly realized by the single morph {au}; similarly, 'BE, present' and YOU are jointly realized by {your}.]
Figure 7: Fusion
In contrast, cliticization maintains the expected one-one relation between
words and morphs, but reduces the word's realisation to a mere affix instead of the
expected full wordform. At the level of syntax, a clitic has a normal place in the
dependency structure of syntax. But in morphology, as a mere affix it is often said to
need a ‘host’ word to support it. Take example (1).
(1) The boys’ll fix it.
Here ‘WILL,present’ is realized by a suffix {’ll}, so this attaches to the preceding
word for support. But it cannot be an extra part of the wordform {{boy}{s}} because
this is already ‘full’ – i.e. fully inflected, without any space for a further suffix.
Consequently, the clitic must create its own morphological ‘host’ containing both it
and the supporting word: {{{boy}{s}}{‘ll}}. Strictly speaking, therefore, a clitic's
host is not a word but a hostform, a purely morphological entity without any
counterpart in syntactic structure (e.g. boys'll is not a syntactic unit). As a theoretical
construct, it is a normal part of morphological structure: a formal template which
accommodates morphs in a rigid and sometimes arbitrary order (Stump 2006b), in
contrast with the 'layered' structures which are also found. In a language like English,
hostforms are very simple, just as morphological structure in general tends to be; so
an English hostform has only two slots, which are filled according to the ordinary
word-order rules of syntax - in other words, we have just 'simple clitics' (Zwicky
1985a). As we shall see below in the analysis of French, the template can be much
more complex, just as the ordinary inflectional morphology is. Indeed, it is tempting
to speculate about a connection between these two types of complexity: maybe
'special clitics' such as those found in French are only found in languages that also
have complex templates for inflectional morphology?
A simplified syntactic and morphological structure for example (1) is shown
in Figure 8, where the node labelled ‘X’ is the host of the clitic affix {’ll}, and
provides two ordered ‘slots’ (actually, relations) labelled ‘0’ (for the main one, the
'anchor') and ‘+1’ (for the affix). This is actually just the same kind of morphological
structure as the one that relates the node labelled ‘{{boy}{s}}’ to its parts, {boy} and
{s}, but these relations are omitted for simplicity.
[Figure 8 (diagram): in The boys'll fix it, the 'full' of boys is {{boy}{s}} and the 'full' of 'll is {'ll}; the hostform X is the host of {'ll}, with {{boy}{s}} as its '0' (the anchor) and {'ll} as its '+1'.]
Figure 8: A simple clitic: ... boys'll ...
This simple example provides the basic ingredients for the analysis of French
clitics in the next section:
 an affix rather than a full wordform as the fully inflected form (‘full’) of the
clitic.
 a hostform, which is stored as a completely schematic morphological form
with a structure but no specific phonological realization of its own, but which
in particular instances brings together the morphological realizations of the
clitic and its host.
 a ‘host’ relation linking this hostform to the clitic affix.
 a relation such as ‘+1’ from the hostform to the clitic affix
 a relation labelled ‘0’ from the hostform to its anchor, a full wordform.
This is enough apparatus to analyse simple clitics, but special clitics, where the clitic
is in a special position, need a little more.
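The simple-clitic apparatus just listed can be sketched in a few lines of Python. The slot numbers and forms follow Figure 8; the dictionary representation is mine, not WG notation.

```python
# Sketch of a simple-clitic hostform: numbered slots hold the anchor
# wordform ('0') and the clitic affix ('+1'); linearization is just
# a matter of sorting the filled slots.

hostform = {
    0: '{{boy}{s}}',   # anchor: the supporting full wordform
    1: "{'ll}",        # the clitic affix (the 'full' of WILL, present)
}

sequence = [hostform[slot] for slot in sorted(hostform)]
assert sequence == ['{{boy}{s}}', "{'ll}"]   # ... boys'll ...
```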
5. French clitic pronouns
The challenges of French clitic pronouns are well known because of the ordering and
co-occurrence constraints that go beyond anything we might explain in terms of
syntax or semantics (Hawkins and Towell 2001:67-71). Since they have been
presented by Stump as a test-case for the explanatory power of Paradigm-Function
Morphology, it may be interesting to compare the PFM analysis with one based on
WG; the following analysis builds on the one in Hudson 2001 and Hudson 2007:111-13, but introduces some significant improvements.
The special clitics that are attached to verbs (not all of which are pronouns) are
illustrated in the following examples:
(2) Il ne le lui y donnerait pas.
he not it to-him there would-give not
He wouldn’t give him it there.
(3) Il ne te le donnerait pas.
he not to-you it would-give not
He wouldn’t give you it.
(4) Il y en mangerait.
he there from-it would-eat
He would eat some of it there.
These examples include the following:
 il: a subject pronoun
 ne: a negative marker, paired (here) with pas.
 le: a direct-object pronoun.
 lui: an indirect-object pronoun.
 y: a word meaning ‘there’ or ‘to it’.
 en: a word meaning ‘from it’ or ‘of it’.
(Some of these ‘words’ could be analysed as fusions of a preposition de, ‘of, from’ or
à, ‘to, at’, with a pronoun following the model of au for *à le mentioned above; for
instance, lui = *à le (human), en = *de le (non-human). For present purposes, this
possibility is irrelevant; all that matters is that the morphological realization is an
affix.)
These facts call for an analysis in terms of a template in which the order of
morphs is simply stipulated as a linear sequence:
Table 1: French clitic pronouns

A (subject): je, tu, il, elle, nous, vous, ils, elles
B (neg): ne
C (non-third or reflexive object): me, te, se, nous, vous
D (third-person non-reflexive direct object): le, la, les
E (third-person non-reflexive indirect object): lui, leur
F (to): y
G (of): en
As one might expect, since only one position is available for each group, only one
member of each group is allowed, even if syntax allows two to combine – a clear
example of the effects of morphology mentioned above. This rules out examples such
as (5), in which two pronouns from group C are combined.
(5) *Il se me présentera.
he himself to-me will-introduce
He will introduce himself to me. (Or: He will introduce me to himself.)
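Table 1's stipulated ordering, and the one-member-per-slot restriction it enforces, can be sketched as follows. The form-to-slot table is abbreviated and illustrative: it omits forms such as nous and vous that are ambiguous between slots A and C.

```python
# Sketch of the clitic template: each form maps to a lettered slot; a
# cluster is linearized by slot order, and two clitics competing for one
# slot (as in *Il se me présentera) are simply unrealizable.

SLOT = {'il': 'A', 'ne': 'B',
        'me': 'C', 'te': 'C', 'se': 'C',
        'le': 'D', 'la': 'D', 'les': 'D',
        'lui': 'E', 'leur': 'E',
        'y': 'F', 'en': 'G'}

def order_cluster(clitics):
    slots = [SLOT[c] for c in clitics]
    if len(slots) != len(set(slots)):
        raise ValueError('two clitics compete for one slot')
    return sorted(clitics, key=SLOT.get)

# The order is stipulated by the template, not by syntax:
assert order_cluster(['ne', 'il', 'lui', 'le', 'y']) == \
       ['il', 'ne', 'le', 'lui', 'y']          # as in (2)

try:
    order_cluster(['se', 'me'])                # as in (5)
    raise AssertionError('should have been rejected')
except ValueError:
    pass
```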
According to Stump's presentation, ‘ethic datives’ are allowed as exceptions, as in (6).
(6) Il te nous a passé un de ces savons.
he ‘you’ to-us has passed one of those soaps
He gave us an incredible telling-off.
The obvious possibility is that ethic datives have an extra position slot between B and
C. This slot also seems to be available, for some speakers, in what Stump calls ‘clitic-climbing’, where a dative pronoun belongs to an adjective as in (7).
(7) Jean me te semble fidèle.
John to-me to-you seems faithful
John seems to me to be faithful to you.
This detail is largely irrelevant here and can be left to future research.
However there are some challenging complications which require a more
sophisticated theory of morphosyntax:
 Typically, these forms all stand before the verb as in (2) to (4), but if the verb is an
affirmative imperative, all but A and B follow the verb, as in Fais-le!, ‘Do it!’
(but: Ne le fais pas! ‘Don’t do it!’)
 In affirmative imperatives, the order of groups C and D is reversed, as in Donnez-le-moi! ‘Give it to me!’ In this case, the ordinary form of me and te (you) is
replaced by moi and toi.
 Although groups C and E occupy different positions, they cannot combine as in
*Il se lui présentera. ‘He will introduce himself to him.’
 If the pronouns belong syntactically to a non-finite verb depending on an auxiliary
verb (avoir ‘have’ or être ‘be’), they attach morphologically to the auxiliary (as in
Il le leur a envoyé, ‘He has sent it to them.’), and likewise if the non-finite verb
depends on faire, ‘make’ (as in Il les lui fera manger, ‘He will make him eat
them’).
 If the verb to which the pronouns belong syntactically depends on laisser, ‘let’,
envoyer, ‘send’ or a perception verb, the pronouns may attach morphologically
either to their own verb or to the verb on which this depends, as in either Tu les lui
laisses lire? or Tu la laisses les lire?, both meaning ‘Do you let her read them?’
(Hawkins and Towell 2001:70).
In addition to these morphosyntactic details, clitic pronouns show peculiarities
in morphophonology which point clearly to their being involved in a morphological
structure rather than merely in a syntactic structure mapped directly onto phonology.
For example, the object pronoun le or la can be omitted provided it would
immediately precede the pronoun lui (Bonami and Boyé 2007):
(8) Paul la lui apportera.
Paul it to-him will-bring
(9) Paul lui apportera.
The object pronoun must be present in the syntax, because it is required by the
valency of apportera, but the condition for its omission is the presence of lui in the
morphology.
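The condition just stated is easy to express over the morphological string. The following sketch is illustrative only: the forms are simplified (the verb's internal structure is ignored), and the function name is mine.

```python
# Sketch of the morphophonological omission in (8)/(9): a 3rd-person
# direct-object form may be dropped exactly when it would immediately
# precede {lui} in the hostform; the object remains present in the
# syntax throughout, as required by the verb's valency.

def omit_object(forms):
    kept = []
    for i, form in enumerate(forms):
        next_is_lui = i + 1 < len(forms) and forms[i + 1] == '{lui}'
        if form in ('{le}', '{la}') and next_is_lui:
            continue                     # optional omission applied
        kept.append(form)
    return kept

assert omit_object(['{la}', '{lui}', '{apportera}']) == \
       ['{lui}', '{apportera}']          # (8) reduced to (9)
assert omit_object(['{la}', '{apportera}']) == \
       ['{la}', '{apportera}']           # no lui, so no omission
```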
How should French clitic pronouns be treated? A widespread view is that
since these constraints are clearly morphological rather than syntactic, they must be
handled by the machinery of inflectional morphology (Miller and Sag 1997, Bonami
and Boyé 2007), an assumption shared by Stump's very sophisticated analysis in
terms of PFM. However, there is an alternative which deserves consideration: that the
clitic pronouns are, in fact, clitics - syntactic units realised directly by affixes. In other
words, they are not part of inflectional morphology, even though their analysis shares
the same morphological apparatus such as templates. The key difference between the
two approaches concerns the categories involved. The inflectional analysis requires
extra morphosyntactic features mediating between syntax and morphology, whereas
the clitic analysis does not. For example, (8) is syntactically and semantically very
similar to (10).
(10) Paul apportera la viande à Jean.
Paul will-bring the meat to John
In the inflectional analysis, the verb in (8) carries morphosyntactic features which
reflect the presence of the two pronouns and which can be invoked by special
syntactic rules which explain how the full NPs and PPs of (10) can be replaced by the
mere features of (8). In contrast, the clitic analysis dispenses with these mediating
features; so instead of saying, for example, that la realizes a feature which is
interpreted syntactically and semantically as though it was an ordinary object, the
clitic analysis says that la is an ordinary object with the morphological peculiarity of
being realised by an affix, {la}.
The clitic analysis is cognitively much more plausible. After all, the relation
between the morphological form and its non-clitic counterparts is fully transparent - even to the extent of using ordinary determiners le, la, les as pronouns - and the
easiest way to understand the choice between (say) la viande and the clitic la is to
give them the same syntactic status, just as in the English choice between the meat
and it. It would be very strange if learners of French could only understand this rather
obvious pattern by postulating inflectional features of the verb.
I now present a WG version of the clitic analysis. We already have the
apparatus introduced earlier for simple clitics. The key concept is 'hostform', a
morphological template which is extremely schematic in the grammar but which in
particular cases has a concrete realization just like any other concept; so in (8) the
hostform is la lui apportera. Although in a network analysis it is just an atom, it has a
link to each of the 'slots' which may be part of it and whose order is defined by labels
such as ‘0’ (for the host’s anchor) and ‘+1’ (for the first part after 0). If we apply this
framework of analysis to the elementary ordering in Table 1, we can say that French
clitics have a host whose network structure consists of eight parts (not all of which are
present on every occasion), numbered from ‘-7’ to ‘-1’ and (finally) ‘0’, the anchor.
This kind of ordering is very familiar in general cognition, as it represents our ability
to order random pairs of numbers – to know, for example, that 3 is ordered in between
1 and 7. Somehow, then, each clitic must be mapped onto one of these ordered parts;
and the anchor verb is always mapped onto ‘0’.
One of the most challenging facts is the apparently exceptional behaviour of
affirmative imperatives, allowing examples like (11) to break the normal pattern in
(12).
(11) Donnez-le-moi!
Give it to-me
(12) Paul me le donnera.
Paul to-me it will-give
To start with I shall ignore affirmative imperatives as an exception, but the discussion
will in fact lead to a very different view in which they actually represent the basic
pattern - a complete reversal of the obvious analysis.
Where do hostforms come from? It is easy to see that each clitic could be
stored with a schematic hostform in which it already has its position, but suppose two
clitics both depend on the same verb: if each of them has a different hostform, how
will these combine to define their relative positions? The solution lies in the operation
of binding that we have already invoked, and which binds selected concepts to other
concepts to show that they are, in some sense, the same. The mechanism for binding
is very general, applying in language to operations as diverse as anaphoric binding
and parsing, not to mention a host of applications outside language; it relies heavily
on activation levels. In the case of clitics, it merges any two hostforms that are highly
active, so if we have two clitics, each with its own position in its own hostform,
binding would merge the two hostforms into a single one which assigns them
different positions - in other words, a 'clitic cluster'. For example, in the sequence la
lui apportera, la and lui each need a hostform with a verb as its anchor, and at an
early point in the process of speaking or hearing that is all the processor knows; but
since they are both highly active at the same time, and their potential hostforms have
compatible properties, and in any case there is only one candidate verb, the two
hostforms get bound together.
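Binding of hostforms can be sketched as a merge of partial slot assignments. The slot numbers follow the text (la at '-4', lui at '-3', the anchor at '0'); treating compatibility as simply the absence of a doubly-filled slot is my simplification.

```python
# Sketch of hostform binding: each clitic projects its own schematic
# hostform with one filled slot; binding merges compatible hostforms
# into a single clitic cluster, and the anchor verb binds to slot 0.

def bind(h1, h2):
    if set(h1) & set(h2):
        raise ValueError('incompatible hostforms: a slot is filled twice')
    merged = dict(h1)
    merged.update(h2)
    return merged

la = {-4: '{la}'}        # 3rd-person direct object, slot -4
lui = {-3: '{lui}'}      # 3rd-person indirect object, slot -3

cluster = bind(la, lui)               # the 'clitic cluster'
cluster[0] = '{{{apport}{er}}{a}}'    # the most active verb binds to 0

linear = [cluster[slot] for slot in sorted(cluster)]
assert linear == ['{la}', '{lui}', '{{{apport}{er}}{a}}']  # la lui apportera
```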
But even if we can assign all clitics to a single hostform, how will this help to
position them before the verb? The answer is again obvious: the merged hostform has
a schematic anchor which must be a verb, so we bind the most active verb to it.
Notice that this mechanism only generates hostforms when they are needed by a clitic.
The result is a single hostform which has bidirectional relations to each clitic but a
single relation to the verb. For example, in the case of la lui apportera, this whole
sequence is the hostform of the clitic la, with the latter as its '-4' (as explained below)
and the verb as its anchor (labelled '0').
I can now start to illustrate the structures that I believe we need for special
clitics in French. Figure 9 shows the structure for (13).
(13) Il la leur présentera.
he her to-them will-introduce
He will introduce her to them.
In the diagram, the node labelled ‘X’ is the merged host of the three clitics'
realizations, and assigns each form a position number relative to the verb; '3SMS' is a
syntactic representation of il ('third singular masculine subject'). It is clear how this
structure will guarantee the right order of elements, but what is less clear is why the
requirements of morphology should always override those of syntax (Sadock 1991).
However intuitively obvious it may be, this doesn't seem to follow from the normal
principles of default inheritance and requires more work on the underlying logic of
the WG system.
[Figure 9 (diagram): the merged hostform X has {il} as its '-7', {la} as its '-4', {leur} as its '-3' and {{{présent}{er}}{a}} as its '0'; these morphs are the realizations of the words 3SMS (il), 3SFO (la), 3PI (leur) and 'PRÉSENTE, future, 3sg' respectively.]
Figure 9: Partial structure for Il la leur présentera.
Linking the verb to the hostform allows its properties to affect those of the
hostform. As mentioned earlier, affirmative imperative verbs affect clitics differently
from other verbs, with enclitics rather than proclitics. To explain these differences we
assign affirmative imperatives to a special kind of hostform, with different effects on
ordering. This will solve the problem, but it also creates a different problem: how to
map the clitics to the position slots. If {leur}, for example, inherits the label ‘-3’
because this is what it needs when it is attached before the verb, how can it have a
different number such as ‘+2’ when it follows the verb? The answer is to introduce a
more abstract set of relations to mediate between clitic forms and their position. Thus
instead of assigning ‘-3’ directly to {leur}, we first identify {leur}, within the
hostform, as the realization of ‘third-person non-reflexive indirect object’
(abbreviated to ‘3io’), and then link this category to different positions according to
the type of hostform. This means that in Figure 9 we should add even more relations
between ‘X’ and the clitics, including a ‘3io’ arrow from ‘X’ to {leur}, which would
by default be paired with a ‘-3’ position. Exceptionally, however, the ‘3io’ arrow
would be paired in a post-verb hostform with a ‘+2’ position. The latter analysis for
(14) is shown in Figure 10.
(14) Présentez la leur!
introduce her to-them
Introduce her to them!
[Figure 10 (diagram): the words la and leur are realized by {la} and {leur}; the hostform X is their host, with {{présent}{ez}} as its '0', {la} as both its '3do' and its '+1', and {leur} as both its '3io' and its '+2'; the anchor verb is 'PRÉSENTE, imperative'.]
Figure 10: An affirmative imperative, Présentez la leur!
We are now ready to reverse the obvious analysis, as promised earlier. Why
should affirmative imperatives be different from all other verbs in their effect on
clitics? Seen simply in terms of morphosyntactic features, this peculiarity is indeed
peculiar. But seen in terms of clitics, these verbs have two important and highly
relevant properties: they allow neither a subject pronoun nor a negative ne – the first
two clitics in the ordering of Table 1. Both of these clitics have functional reasons for
preceding the verb:
 The subject clitic precedes it because its position is one of the devices used to
distinguish declarative and interrogative clauses (e.g. Tu m'aimes 'You love me'
versus M'aimes-tu? 'Do you love me?')
 The negative clitic ne precedes the verb so that the verb can separate it from the
post-verbal particle or pronoun with which it is normally paired (e.g. Tu ne
m'aimes pas 'You do not love me'. Je ne dors jamais 'I never sleep').
Suppose we assume that these two clitics have to precede the verb, and, further, that
all the clitics form a single clitic cluster. The result is that all the clitics are dragged
before the verb to act as proclitics. These functional pressures are admittedly fairly
weak, and in particular the pressure from the subject clitic, because this is only
loosely attached to the other clitics. Not only can it be inverted in interrogatives
without affecting the position of other clitics, as just mentioned, but unlike the objects
it can be shared by two verbs (Bonami and Boyé 2007), as in (15) compared with
(16).
(15) Il lira ce livre et le critiquera.
he will-read this book and it will-criticize
‘He will read this book and criticize it.’
(16) * Il le lira aujourd’hui et critiquera demain.
he it will-read today and will-criticize tomorrow.
‘He will read it today and criticize it tomorrow.’
Nevertheless, it is clear, and generally agreed, that the subject pronouns really are
clitics; for example, it is impossible to coordinate them (as in *Il et elle viennent, 'He
and she come'). And given that the other clitics clearly form a clitic cluster, it seems
reasonable to assume that a non-inverted subject is also part of this cluster.
My suggestion, therefore, is that it is the subject and negative clitics that are
responsible for the proclitic ordering, without which clitics would be enclitics. In
other words, contrary to our assumptions so far, clitics are enclitics by default, but the
default is overridden where the hostform contains a subject and/or negative. This
suggestion would be quite counterintuitive if we applied it to verb classes, as there is
no other reason for taking affirmative imperatives as the default inflection for verbs;
but the suggestion actually applies not to verbs but to hostforms. Since hostforms are
separate from their anchor verbs, it is quite possible that the default hostform has a
non-default verb as its anchor.
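The proposed reversal of defaults can be stated very compactly. As a minimal sketch (the labels 'subj' and 'neg' are my shorthand for the subject and ne clitics; the function is not part of WG):

```python
# A sketch of the suggestion: clitics are enclitics by default, but a
# hostform containing a subject and/or negative clitic overrides the
# default, dragging the whole cluster before the verb.

def cluster_position(clitic_relations):
    """'enclitic' by default; 'proclitic' when the hostform contains a
    subject ('subj') or negative ('neg') clitic (the override)."""
    if {"subj", "neg"} & set(clitic_relations):
        return "proclitic"
    return "enclitic"

# Présentez la leur!  (affirmative imperative: no subject, no ne)
assert cluster_position({"3do", "3io"}) == "enclitic"
# Il la leur présentera.  (subject present, so the default is overridden)
assert cluster_position({"subj", "3do", "3io"}) == "proclitic"
```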
Further support for this analysis comes from the order in which enclitics
appear with affirmative imperatives: exactly the same as the order found in syntax,
where objects, complements and adjuncts follow the verb and only subjects precede it.
Moreover, direct objects precede indirect objects; for example, the order in (17) is the
same as the syntactically required order in (18), in contrast with the order found with
proclitics as in (19) and (20).
(17) Donnez le nous! 'Give it us!'
(18) Donnez le livre à Paul! 'Give the book to Paul!'
(19) Il nous le donnera. 'He will give it to us.'
(20) *Il le nous donnera. 'He will give it to us'.
What this means is that by default French clitics are actually simple clitics, following
the ordinary word-order rules of syntax, rather than special clitics. It is only in the
non-default case that they require special ordering rules.
We now turn to another curiosity of Table 1: the fact that groups C (me, te, se,
nous, vous) and E (lui, leur) cannot combine even though they occupy different
position slots and are completely compatible in syntax. This odd restriction is
illustrated in (21) to (24).
(21) Je présenterai Jean à Marie. ‘I will introduce John to Mary.’
(22) Je le lui présenterai. ‘I will introduce him to her.’
(23) Je me présenterai à Marie. ‘I will introduce myself to Mary.’
(24) *Je me lui présenterai. ‘I will introduce myself to her.’
Combining a member of C with a member of E is just as impossible as combining two
members of C, as in *Je me te présenterai. ‘I will introduce myself to you.’
In this case we may find an explanation in the abstract relations which allow
the same clitic to occur either as a proclitic or as an enclitic. Following a suggestion
made by Stump, suppose we assign C and E the same abstract category, which we can
call 'io' since they are all able to act as indirect object, even though group C can also
act as direct object. This shared category explains why C and E are mutually
exclusive. As for the differences in position, these can be explained by taking one of
the positions as the default, with the other as an exception. We shall see shortly that
there are in fact good reasons to take E as the default for 'io', whose properties are
overridden by pronouns in group C (called ‘1/2/r’ for ‘first- or second-person or
reflexive’).
If these suggestions are right, then the relevant grammar is shown, in part, in
Figure 11. In words, enclitic pronouns, found in the enclitic hostform of an
affirmative imperative, show the default ordering, with third-person direct objects
before other pronouns (so-called ‘io’, which includes direct objects). All other kinds
of verb have a proclitic hostform. This hostform includes a subject (which may be responsible
diachronically for the proclitic positioning) and locates first- and second-person, and
reflexive, ‘io’ pronouns in a different position from non-reflexive third-person
indirect objects (the default case). For simplicity, the diagram omits other clitics such
as the negative.
Figure 11: A grammar for French clitic pronouns
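The grammar just described can be sketched in a few lines of code, under stated assumptions: the dictionary representation and function names are mine, not WG notation. Each hostform type maps relation labels to numbered position slots; '1/2/r' is a subcase of 'io', so it inherits the enclitic position by default but overrides the proclitic one; and the C/E incompatibility follows because both groups compete to fill the single 'io' relation.

```python
# A minimal sketch of the Figure 11 grammar (representation mine).
GRAMMAR = {
    "enclitic":  {"3do": +1, "io": +2},
    "proclitic": {"subj": -7, "3do": -4, "io": -3, "1/2/r": -5},
}

ISA = {"1/2/r": "io"}  # relation subclassification: '1/2/r' isa 'io'

def position(relation, hostform):
    """Find the relation's slot, inheriting from its super-relation
    (the default) when the hostform has no specific entry (override)."""
    slots = GRAMMAR[hostform]
    while relation not in slots:
        relation = ISA[relation]
    return slots[relation]

def cluster(fillers, hostform):
    """Place {relation: form} fillers in slot order; two fillers that
    both count as 'io' (e.g. me + lui) clash whatever their slots."""
    placed = {}
    for rel, form in fillers.items():
        key = ISA.get(rel, rel)          # '1/2/r' still counts as 'io'
        if key in placed:
            raise ValueError(f"*{form}: relation {key!r} already filled")
        placed[key] = (position(rel, hostform), form)
    return [form for _, form in sorted(placed.values())]

# Je le lui présenterai: le (-4) before lui (-3)
assert cluster({"3do": "le", "io": "lui"}, "proclitic") == ["le", "lui"]
# Donnez le nous!: le (+1) before nous (+2, inherited from 'io')
assert cluster({"3do": "le", "1/2/r": "nous"}, "enclitic") == ["le", "nous"]
# *Je me lui présenterai: me ('1/2/r') and lui ('io') both fill 'io'
try:
    cluster({"1/2/r": "me", "io": "lui"}, "proclitic")
except ValueError:
    pass  # correctly ruled out
```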
The analysis so far has solved all the morphological problems, so we are ready
to turn to the syntax. I have assumed so far that each clitic depends, in syntactic
structure, directly on its host verb, but this need not be so.
(25) Je les ai trouvés. ‘I have found them.’ (les depends on trouvés)
(26) Tu la laisses les lire? ‘Do you let her read them?’
(27) Tu les lui laisses lire? ‘Do you let her read them?’ (les depends on lire)
(28) Jean en mange beaucoup. ‘John eats a lot of it’ (en depends on beaucoup, ‘a
lot’)
(29) Jean lui a été fidèle. ‘John has been faithful to her.’ (lui depends on fidèle)
In all these cases, the pronoun’s host is a finite verb higher up the dependency chain
than the word on which it depends. This ‘clitic climbing’ is easy both to describe and
to explain if we assume that some verbs, such as auxiliaries, have hostforms of their
own, and that this hostform is available for binding to any available clitics as
described earlier. Thus if ai in (25) has a hostform, and general binding tries to bind
the hostform of les to any other active hostform, then les will attach to ai rather than
to trouvés. A similar explanation applies to verbs such as laisser ‘let’ and faire
‘make’, which allow clitic climbing. In the case of laisser, climbing is optional, so we get
the choice between (26) and (27). In contrast, non-verbs such as beaucoup and fidèle
don't have a hostform, so their clitic complement must attach to a verb as in (28) and
(29). All these syntactic complications turn out to be rather simple.
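The climbing mechanism can be approximated as a simple search up the dependency chain. This sketch compresses WG's hostform-to-hostform binding into 'find the nearest word up the chain that has a hostform'; the Word class and the 'parent' link are invented for illustration.

```python
# A simplified sketch of clitic climbing: a clitic attaches to the
# nearest word, climbing up the dependency chain from the word it
# depends on, that has a hostform of its own.

class Word:
    def __init__(self, form, has_hostform, parent=None):
        self.form = form
        self.has_hostform = has_hostform
        self.parent = parent          # the word this word depends on

def host_of(word):
    """Climb from the word a clitic depends on to its eventual host."""
    while word is not None:
        if word.has_hostform:
            return word
        word = word.parent
    return None                       # no available host

# Je les ai trouvés: les depends on trouvés, but only the auxiliary
# ai has a hostform, so les climbs to ai.
ai = Word("ai", has_hostform=True)
trouves = Word("trouvés", has_hostform=False, parent=ai)
assert host_of(trouves).form == "ai"

# Jean lui a été fidèle: neither fidèle nor été has a hostform,
# so lui climbs to the finite auxiliary a.
a = Word("a", has_hostform=True)
ete = Word("été", has_hostform=False, parent=a)
fidele = Word("fidèle", has_hostform=False, parent=ete)
assert host_of(fidele).form == "a"
```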
The proposed analysis of French clitics has solved all the problems that I listed
earlier, but this was only possible because of certain theoretical premises of WG:
• the logic of default inheritance (which allows enclitics as default)
• the process of binding (which allows hostforms to bind to each other)
• the subclassification of relations (which allows '1/2/r' to be a subcase of the 'io'
relation)
• the assumption that the grammar is sensitive to cognitive contexts such as
activation levels.
References
Aronoff, Mark and Volpe, Mark 2006. 'Morpheme', in Keith Brown (ed.)
Encyclopedia of Language and Linguistics, Second Edition. Oxford: Elsevier,
pp.274-276.
Barabasi, Albert L. 2009. 'Scale-Free Networks: A Decade and Beyond', Science 325:
412-413.
Blevins, J. P. 2003. 'Stems and paradigms', Language 79: 737-767.
Bonami, Olivier and Boyé, Gilles 2007. 'French pronominal clitics and the design of
Paradigm Function Morphology', in Geert Booij (ed.) On-line Proceedings of
the Fifth Mediterranean Morphology Meeting (MMM5) Fréjus 15-18
September 2005, University of Bologna.
Bouma, Gosse 2006. 'Unification, Classical and Default', in Keith Brown (ed.)
Encyclopedia of Language & Linguistics (Second Edition). Oxford: Elsevier,
pp.231-238.
Briscoe, Ted, Copestake, Ann, and De Paiva, Valeria 1993. Inheritance, defaults, and
the lexicon. Cambridge: Cambridge University Press
Bybee, Joan 1995. 'Regular Morphology and the Lexicon', Language and Cognitive
Processes 10: 425-455.
Bybee, Joan 2010. Language, Usage and Cognition. Cambridge: Cambridge
University Press
Camdzic, Amela and Hudson, Richard 2007. 'Serbo-Croat Clitics and Word
Grammar', Research in Language (University of Lodz) 4: 5-50.
Carpenter, Bob 1992. The Logic of Typed Feature Structures. Cambridge: Cambridge
University Press
Daelemans, Walter and De Smedt, Koenraad 1994. 'Default inheritance in an object-oriented
representation of linguistic categories', International Journal of Human-Computer Studies 41: 149-177.
Ellis, Nick 2002a. 'Frequency effects in language processing: a review with
implications for theories of implicit and explicit language acquisition.', Studies
in Second Language Acquisition 24: 143-188.
Ellis, Nick 2002b. 'Reflections on frequency effects in language processing', Studies
in Second Language Acquisition 24: 297-339.
Ferreira, Fernanda 2005. 'Psycholinguistics, formal grammars, and cognitive science',
The Linguistic Review 22: 365-380.
Flickinger, Daniel 1987. Lexical rules in the hierarchical lexicon. PhD dissertation,
Stanford University.
Frost, Ram, Deutsch, Avital, Gilboa, Orna, Tannenbaum, Michael, and Marslen-Wilson,
William 2000. 'Morphological priming: Dissociation of phonological,
semantic, and morphological factors', Memory & Cognition 28: 1277-1288.
Gisborne, Nikolas 2010. The event structure of perception verbs. Oxford: Oxford
University Press
Hawkins, Roger and Towell, Richard 2001. French Grammar and Usage. London:
Arnold
Hudson, Richard 1984. Word Grammar. Oxford: Blackwell.
Hudson, Richard 1990. English Word Grammar. Oxford: Blackwell.
Hudson, Richard 2000. '*I amn't.', Language 76: 297-323.
Hudson, Richard 2001. 'Clitics in Word Grammar', UCL Working Papers in
Linguistics 13: 243-294.
Hudson, Richard 2007. Language networks: the new Word Grammar. Oxford: Oxford
University Press
Hudson, Richard 2010. An Introduction to Word Grammar. Cambridge: Cambridge
University Press
Hudson, Richard and Holmes, Jasper 2000. 'Re-cycling in the Encyclopedia', in Bert
Peeters (ed.) The Lexicon/Encyclopedia Interface. Amsterdam: Elsevier,
pp.259-290.
Jackendoff, Ray 2011. 'What is the human language faculty?: Two views', Language
87: 586-624.
Lascarides, Alex, Briscoe, Ted, Asher, Nicholas, and Copestake, Ann 1996. 'Order
independent and persistent typed default unification',
Linguistics and Philosophy 19: 1-90.
Levelt, Willem, Roelofs, Ardi, and Meyer, Antje 1999. 'A theory of lexical access in
speech production', Behavioral and Brain Sciences 22: 1-45.
Luger, George and Stubblefield, William 1993. Artificial Intelligence. Structures and
strategies for complex problem solving. New York: Benjamin Cummings
Marslen-Wilson, William 2006. 'Morphology and Language Processing', in Keith
Brown (ed.) Encyclopedia of Language & Linguistics, Second edition. Oxford:
Elsevier, pp.295-300.
Miller, P. H. and Sag, Ivan 1997. 'French clitic movement without clitics or
movement', Natural Language & Linguistic Theory 15: 573-639.
Neef, Martin 2006. 'Declarative Morphology', in Keith Brown
(ed.) Encyclopedia of Language & Linguistics (Second Edition). Oxford:
Elsevier, pp.385-388.
Onnis, Luca, Christiansen, Morten, and Chater, Nick 2006. 'Human Language
Processing: Connectionist Models', in Keith Brown (ed.) Encyclopedia of
Language & Linguistics (Second Edition). Oxford: Elsevier, pp.401-409.
Pelletier, Jeff and Elio, Renee 2005. 'The case for psychologism in default and
inheritance reasoning', Synthese 146: 7-35.
Pinker, Steven 1998. 'Words and rules', Lingua 106: 219-242.
Reisberg, Daniel 2007. Cognition. Exploring the Science of the Mind. Third media
edition. New York: Norton
Rosch, Eleanor 1978. 'Principles of categorization', in Eleanor Rosch & Barbara
Lloyd (eds.) Cognition and categorization. Hillsdale, NJ: Lawrence Erlbaum,
pp.27-48.
Russell, Graham, Ballim, Afzal, Carroll, John, and Warwick-Armstrong, Susan 1993.
'A practical approach to multiple default inheritance for unification-based
lexicons', in Ted Briscoe, Valeria De Paiva, & Ann Copestake (eds.)
Inheritance, defaults and the lexicon. Cambridge: Cambridge University
Press, pp.137-147.
Sadock, Jerrold 1991. Autolexical Syntax: A theory of parallel grammatical
representations. Chicago: University of Chicago Press
Stump, Gregory 2001. Inflectional Morphology: A Theory of Paradigm Structure.
Cambridge: Cambridge University Press
Stump, Gregory 2006a. 'Paradigm Function Morphology', in
Keith Brown (ed.) Encyclopedia of Language & Linguistics (Second
Edition). Oxford: Elsevier, pp.171-173.
Stump, Gregory 2006b. 'Template Morphology', in Keith Brown (ed.) Encyclopedia
of Language & Linguistics. Oxford: Elsevier, pp.559-562.
Taylor, John 1995. Linguistic Categorization: Prototypes in linguistic theory. Oxford:
Clarendon
Touretzky, David 1986. The Mathematics of Inheritance Systems. Los Altos, CA:
Morgan Kaufmann
Vogel, Carl 1998. 'A Generalizable Semantics for a Default Inheritance Reasoner'.
Zwicky, Arnold 1985a. 'Clitics and particles', Language 61: 283-305.
Zwicky, Arnold 1985b. 'How to describe inflection', in Mary Niepokuj, Mary Van
Clay, Vassiliki Nikiforidou, & Deborah Feder (eds.) Proceedings of the 11th
annual meeting of the Berkeley Linguistics Society. Berkeley: Berkeley
Linguistics Society, pp.372-386.