
WHERE, IF ANYWHERE,
ARE PARAMETERS?
FREDERICK J. NEWMEYER
UNIVERSITY OF WASHINGTON,
UNIVERSITY OF BRITISH COLUMBIA,
AND SIMON FRASER UNIVERSITY
 Crosslinguistic variation — a bit of an
embarrassment.
 If there is a universal grammar, then why
aren’t all languages exactly the same?
 If grammars are shaped functionally, then
why is there so much morphosyntactic
variation?
 Earliest work — not much focus on variation.
 The goal — show the sameness of all languages at an
underlying level and formulate universal principles.
 Surface variation attributed to language-particular
rules or filters.
‘Modern work has indeed
shown a great diversity in the
surface structure of languages.
... Insofar as attention is
restricted to surface structures,
the most that can be expected is
the discovery of statistical
tendencies, such as those
presented by Greenberg
(1963).’ (Chomsky 1965: 118)
• The last sentence of the quote was
considered notorious by some.
NOAM CHOMSKY IN
THE 1960s
Klima (1964):
Dialects differ
according to how
their rules are
ordered.
EDWARD KLIMA, 1931-2008
Bach (1965): Markedness
conventions apply to
languages that are
‘typologically consistent’
and ‘typologically
inconsistent’.
EMMON BACH, 1929-2014
 Since the late 1970s crosslinguistic variation has been
handled in terms of parameters.
 Parameters have been central to mainstream generative
grammar for several decades (the P & P approach).
 Parametric (and other) variation is ideal for illustrating
how many theoretical options there are for handling a
particular phenomenon.
 There have been several architecturally different
approaches to variation, each entailing a different
division of labour.
‘Even if conditions are language- or rule-particular, there
are limits to the possible diversity of grammar. Thus, such
conditions can be regarded as parameters that have to be
fixed (for the language, or for the particular rules, in the
worst case), in language learning … It has often been
supposed that conditions on applications of rules must be
quite general, even universal, to be significant, but that
need not be the case if establishing a ‘parametric’
condition permits us to reduce substantially the class of
possible rules.’ (Chomsky 1976: 315)
• Why then? Kim (1976) showed that Korean obeys a form of
the Tensed-S-Condition, even though Korean does not
distinguish formally between tensed and non-tensed clauses.
 An inspiration for parameters — the work of
Jacques Monod and François Jacob (Monod 1972
and Jacob 1977).
JACQUES MONOD, 1910-1976
FRANÇOIS JACOB, 1920-2013
 Slight differences in timing and arrangement of
regulatory mechanisms that activate genes could
result in enormous differences:
‘Jacob’s model in turn provided part of the
inspiration for the Principles and Parameters
(P&P) approach to language …’ (Berwick and
Chomsky 2011: 28)
 MY GOALS IN THIS TALK:
 To outline the approaches to parameters since the
late 1970s.
 To discuss their strengths and their weaknesses.
 To conclude that a non-parametric approach is best.
APPROACHES TO PARAMETERS

P1: Macroparametric approaches
P1a: Parameters are properties of universal principles
P1b: Parameters are properties of entire grammars

P2: Microparametric approaches
P2a: Parameters are restricted to the idiosyncratic properties of lexical items
P2b: Parameters are restricted to the inflectional system
P2c: Parameters are associated with functional heads

P3: Parameters are stated at the interfaces
P3a: Parameters are stated at the PF interface
P3b: Parameters are stated at the C-I interface

P4: Parameters are not part of UG. Much of variation is a ‘third factor’ effect
P4a: Grounded approaches
P4b: Reductionist approaches
P4c: Epigenetic approaches

P5: Parameters are not part of UG. Languages differ from each other in terms of
the rules that they possess. Possible rules are constrained by both UG and ‘third
factor’ considerations.
P1: MACROPARAMETRIC APPROACHES
 I’ll call any approach that partitions languages into
broad typological classes ‘macroparametric’.
P1A: PARAMETERS ARE PROPERTIES OF
UNIVERSAL PRINCIPLES
 This is the core idea of the P & P approach.
 Both the principles of UG and the possible parameter settings
are part of our genetic endowment:
‘[W]hat we ‘know innately’ are the principles of the
various subsystems of S0 [= the initial state of the
language faculty — FJN] and the manner of their
interaction, and the parameters associated with these
principles. What we learn are the values of these
parameters and the elements of the periphery …. The
language that we then know is a system of principles with
parameters fixed, along with a periphery of marked
exceptions.’ (Chomsky 1986: 150-151)
 The original idea was that there are a small number
of parameters and a small number of settings.
 This allowed two birds to be killed with one stone.
 Parametric theory could explain the rapidity of
acquisition, given the poor input, and explain the
crosslinguistic distribution of grammatical elements.
 All of the subsystems of principles have been
assumed to be parameterized.
 Many examples from work in GB:
BINDING (Lasnik
1991). Principle C is
parameterized to
allow for sentences of
the form John_i thinks
that John_i is smart in
languages like Thai
and Vietnamese.
HOWARD LASNIK
GOVERNMENT (Manzini and Wexler
1987). The notion ‘Governing
Category’ is defined differently in
different languages.
MARIA RITA MANZINI
KENNETH WEXLER
BOUNDING (Rizzi
1982). In English,
NP and S are
bounding nodes
for Subjacency, in
Italian NP and S’.
LUIGI RIZZI
X-BAR (Stowell 1981). In
English, heads precede their
complements; in Japanese
heads follow their
complements.
TIM STOWELL
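The head-direction contrast just described can be pictured as a single binary switch. The following is only a minimal sketch in Python: the function name `linearize`, the `head_initial` flag, and the example word pairs are illustrative assumptions, not part of Stowell's proposal.

```python
def linearize(head: str, complement: str, head_initial: bool) -> str:
    """Order a head and its complement according to a binary head-direction setting."""
    parts = [head, complement] if head_initial else [complement, head]
    return " ".join(parts)


# English-type setting: the verb precedes its object.
english_vp = linearize("read", "books", head_initial=True)

# Japanese-type setting: the verb follows its object.
japanese_vp = linearize("yomu", "hon-o", head_initial=False)
```

On this view the two languages share one rule schema and differ only in the value passed for `head_initial`.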
CASE and THETA-THEORY (Travis 1989).
Some languages assign
Case and/or Theta-roles to
the left, some to the right.
LISA TRAVIS
 Fewer and fewer parameterized principles have been
proposed in recent years.
 Why? Because there are fewer and fewer principles!
 The thrust of the MP — to reduce the narrow
syntactic component and to reinterpret broad
universal principles as economy effects of efficient
computation.
 And economy principles are
generally assumed not to be
parameterized.
 The ‘Strong Uniformity Thesis’
(Boeckx 2011): Principles of narrow
syntax are not subject to
parameterization; nor are they
affected by lexical parameters.
CEDRIC BOECKX
 Chomsky has explicitly disparaged the search for
universal principles:
‘[T]ake the LCA (Linear Correspondence Axiom)
[Kayne 1994]. If that theory is true, then the phrase
structure is just more complicated. Suppose you find
out that government is really an operative property.
Then the theory is more complicated. If ECP really
works, well, too bad; language is more like the spine
[i.e., poorly designed — FJN] than like a snowflake
[i.e., optimally designed]’ (Chomsky 2002: 136).
• So if there are no UG principles, and hence no parameters
associated with them, where is variation handled, and where are
the parameters that capture it?
• Two diametrically opposed hypotheses:
1. Parameters are properties of entire grammars.
2. The sorts of elements that can be parameterized are very
restricted.
P1B: PARAMETERS ARE PROPERTIES OF
ENTIRE GRAMMARS
 There has always been the sense that there is
something right about the idea that parameters
are broad and global, rather than tied to
particular functional categories.
 Parameters from the beginning have been
proposed that are not tied to specific UG
principles.
 Rather they seem to be a fact about the language
as a whole.
Whether or not V moves to I in a particular language
(Emonds 1978; Pollock 1989)
JOSEPH EMONDS
Whether V moves to C (to derive V2
order) (den Besten 1977/1983)
HANS DEN BESTEN, 1949-2010
Whether N incorporates into V (Baker 1988)
MARK BAKER
• If you hypothesize that parameters are ‘macro’ in
scope and at the same time diminish the number of
UG principles, then it follows that parameters have
to be properties of entire languages.
• The Parameter Hierarchy of Mark Baker’s The Atoms of
Language (Baker 2001) is the best-known example of treating
parameters as properties of entire languages.
Some other recent
macroparameters:
Snyder (2001) Compounding
parameter. The grammar
[disallows*, allows] formation of
endocentric compounds during the
syntactic derivation. [*unmarked
value]
WILLIAM SNYDER
Bošković and Gajewski (2011) NP/DP
macroparameter. NP languages lack articles, permit
left-branch extraction and scrambling, and disallow
NEG-raising; DP languages show the opposite pattern.
ŽELJKO BOŠKOVIĆ
JON GAJEWSKI
Huang (2007) points to many features that
distinguish Chinese-type languages from
English-type languages, including:
a generalized classifier system
no plural morphology
extensive use of light verbs
no agreement, tense, or case
morphology
no wh-movement
radical pro-drop
etc.
JAMES HUANG
P2: MICROPARAMETRIC APPROACHES
 P2a: Parameters are restricted to the
idiosyncratic properties of lexical items
 P2b: Parameters are restricted to the inflectional
system
 P2c: Parameters are associated with functional
heads
 What they have in common is that parametric variation
is quite localized.
 This idea is often referred to now as the ‘Borer-Chomsky
Conjecture’.
 Hagit Borer, in Parametric Syntax (Borer 1984),
made two proposals, which she may or may not have
regarded as variants of each other (P2a and P2b):
HAGIT BORER
P2A: PARAMETERS ARE RESTRICTED TO
THE IDIOSYNCRATIC PROPERTIES OF
LEXICAL ITEMS
‘Interlanguage variation would be restricted to the
idiosyncratic properties of lexical items..’ (Borer
1984: 2-3)
 Borer gave an example of a rule that inserts a preposition
in Lebanese Arabic, that does not exist in Hebrew:
∅ ------> la / [PP … NP]
Manzini and Wexler (1987) pointed to language-particular
anaphors that have to be associated with parameters:
cakicasin and caki in Korean; sig and hann in Icelandic.
MARIA RITA MANZINI
KENNETH WEXLER
 Every language has thousands of lexical items.
 But nobody entertained the possibility that every
lexical item might be a locus for parametric
variation.
 So P2A gave way to P2B: Parameters are tied to the
inflectional system.
P2B: PARAMETERS ARE RESTRICTED TO
THE INFLECTIONAL SYSTEM
‘It is worth concluding this chapter by reiterating the
conceptual advantage that reduced all interlanguage
variation to the properties of the inflectional system.’
(Borer 1984: 29)
 The problem is that not all lexical items are part of the
inflectional system and not all inflections are lexical.
 Here Borer took ‘inflectional’ in a pretty broad sense:
Case and agreement relations, theta-role assignment, etc.
 But even Borer recognized that inflectional parameters could not
handle some of the best known cases of parameters — head-order and
subjacency, for example.
 Another important point: for Borer inflectional rules are not provided
by UG:
‘If all interlanguage variation is attributable to [the inflectional]
system, the burden of learning is placed exactly on that component
of grammar for which there is strong evidence of learning: the
vocabulary and its idiosyncratic properties. We no longer have to
assume that the data to which the child is exposed bear directly on
universal principles …’ (Borer 1984: 29)
 In this view, parametric variation was not only localized, it was
idiosyncratically language-particular.
P2C: PARAMETERS ARE ASSOCIATED
WITH FUNCTIONAL HEADS
 Borer was writing before the
lexical/functional category distinction was
elaborated.
 P2b gradually morphed into P2c:
Parameters are associated with functional
heads.
 Fukui (1988): The Functional Parameterization
Hypothesis posits that only functional elements in the
lexicon (that is, elements such as Complementizer,
Agreement, Tense, etc.) are subject to parametric
variation.
 Fukui himself exempted ordering restrictions from
this hypothesis.
 P2c is not a simple extension of P2b. There have been
countless functional categories proposed that have
nothing to do with inflection, no matter how broadly
‘inflection’ is interpreted: adverbs, topic and focus,
and so on.
 Confusingly — lexical parameterization and functional-category
parameterization are often used interchangeably.
 I’ll adopt the term ‘microparameter’ to cover all of P2 — the idea that
parameterization is localized in functional elements, whether categories or
features.
 Mark Baker has repeatedly stressed that once you adopt the Borer-
Chomsky Conjecture you are led inevitably to microparameters.
 That is, we end up characterizing small points of variation with few if any
global effects.
 This point has been disputed by Roberts and Holmberg, but
microparameters seem to entail few of the massive clustering effects
associated with macroparameters.
 Chomsky has often asserted that microparameters
are necessary in order to solve ‘Plato’s Problem’:
‘Apart from lexicon, [the set of possible human
languages] is a finite set, surprisingly; in fact, a
one-membered set if parameters are in fact
reducible to lexical properties. … How else could
Plato’s problem [the fact that we know so much
about language based on so little direct evidence
— FJN] be resolved?’ (Chomsky 1991: 26)
 There’s a widespread (but not universal!) opinion that
microparameters have both conceptual and empirical
advantages over macroparameters (Kayne 2000; Roberts
and Holmberg 2010; Thornton and Crain 2013):
RICHARD KAYNE
IAN ROBERTS
ANDERS HOLMBERG
ROSALIND THORNTON
STEPHEN CRAIN
ARGUMENTS FOR MICROPARAMETERS:
1. They impose a strong limit on what can vary:
crosslinguistic differences can now be reduced to
differences in features.
2. They restrict learning to the lexicon (which
has to be learned anyway).
3. Microparameters allow ‘experiments’ to be
constructed comparing two closely-related variants,
to pinpoint the possible degree of variation (Kayne
2000).
4. Given the finite number of functional
categories, we can calculate the number of
parameters.
• There are a number of ways that the assumptions of the Minimalist
Program have entailed a rethinking of parameters and the division of
labour among the various components for the handling of variation.
• The take-away line of the MP is well known:
‘We hypothesize that FLN only includes recursion and is the only
uniquely human component of the faculty of language.’ (Hauser,
Chomsky, and Fitch 2002: 1569)
MARC HAUSER
NOAM CHOMSKY
TECUMSEH FITCH
 I assume that by ‘recursion’ HCF mean the Merge
operation, and not sentential embedding.
 Under the strictest interpretation of this quote,
there’s no place for handling variation in the narrow
syntax at all.
 So where would parameterization be in this
conception? At the interfaces:
P3: Parameters are stated at the interfaces
P3a: Parameters are stated at the PF
interface
P3b: Parameters are stated at the C-I
interface
P3: PARAMETERS ARE STATED AT THE
INTERFACES
 Under this perspective, lexical items are subject to a
process of generalized late insertion of semantic,
formal, and morphophonological features after the
syntax. All variation is captured at that point.
 It’s not at all clear how the mechanics of this would
work — in particular taking bare unlabelled and
featureless trees as input to the PF and C-I.
 It’s also not clear that Chomsky’s position is as strong as
Boeckx’s.
 When you get over the shock value of the recursion-only
quote, you realize that for Chomsky there is a lot more to
FLN and/or UG than recursion:
‘UG makes available a set F of features (linguistic
properties) and operations CHL … that access F to
generate expressions.’ (Chomsky 2000: 100)
 That would seem to allow for parametric variation to be
handled in the journey towards the interfaces.
 In fact, Chomsky has continued to attribute much
more to FLN/UG than just recursion:
‘[FLN] includes phonology, formal semantics, the
structure of the lexicon (morphology, words), etc.,
insofar as they are language-specific …’ (Chomsky,
Hauser, and Fitch 2005: np)
 As far as parametric variation in the narrow syntax is
concerned, Holmberg and Roberts (2014) construct
an argument that the differences between answers to
yes-no questions in Finnish vs. English/Swedish are
not a matter of vocabulary choice or morphological
rules, including spell-out rules.
 Instead, the syntactic construction of the universal
structure proceeds quite differently in the two cases,
employing Agree and Internal Merge in different
ways.
 There has also been some debate as to
whether there is parametric variation at
the C-I interface.
• Some have argued that the idea is
inconceivable.
 But for Ramchand and Svenonius (2008) the narrow
syntax provides a ‘basic skeleton’ to C-I, but
languages vary in terms of how much their lexical
items explicitly encode about the reference of
variables like T, Asp, and D.
GILLIAN RAMCHAND
PETER SVENONIUS
THE PRINCIPAL PROBLEMS WITH
PARAMETERS
 No macroparameter has come close to working.
 ‘Microparameter’ is just another word for
‘language-particular rule’.
 There would have to be hundreds, if not
thousands of parameters.
 Nonparametric differences among languages
undercut the entire parametric program.
NO MACROPARAMETER HAS COME
CLOSE TO WORKING
 This fact has fuelled the retreat to microparameters.
 The clustering effects are simply not there:
‘History has not been kind to the Pro-drop Parameter as
originally stated.’ (Baker 2008b: 352)
‘In retrospect, [subjacency effects] turned out to be a
rather peripheral kind of variation. Judgments are
complex, graded, affected by many factors, difficult to
compare across languages, and in fact this kind of
variation is not easily amenable to the general format
of parameters …’ (Rizzi 2014: 16)
‘MICROPARAMETER’ IS JUST ANOTHER
WORD FOR ‘LANGUAGE-PARTICULAR
RULE’
• Let’s say that we observe two Italian dialects, one
with a do-support-like structure and one without.
• We could posit a microparametric difference
between the dialects: perhaps one has an attracting
feature that leads to do-support and the other doesn’t.
• But how does that differ in substance from saying
that one dialect has a rule of do-support that the
other one lacks?
‘Thirty years ago, if some element moved
in one language but not in another, a
movement rule would be added to one
language but not to the other. Today, a
feature ‘I want to move’ (‘EPP’, ‘strength’,
etc.) is added to the elements of one
language but not of the other. In both
cases, variation is expressed by
stipulating it. Instead of a theory, we
have brute force markers.’ (Starke 2014:
140)
MICHAL STARKE
THERE WOULD HAVE TO BE HUNDREDS,
IF NOT THOUSANDS, OF PARAMETERS
 Tying parameters to functional categories was a strong conjecture when it
was proposed, since there were so few functional categories.
 And early quotes reflected that idea:
There are just ‘a few mental switches’ (Pinker 1994: 112)
STEVEN PINKER
There are about 30 to 40 parameters (Lightfoot 1999:
259)
DAVID LIGHTFOOT
‘There are only a few parameters’ (Adger 2003: 16)
DAVID ADGER
There are between 50 and 100 parameters (Roberts
and Holmberg 2005)
 Janet Fodor (2001) observes that ‘it is standardly assumed
that there are fewer parameters than there are possible
rules in a rule based framework; otherwise, it would be less
obvious that the amount of learning to be done is reduced
in a parametric framework’.
JANET D. FODOR
 Are there fewer parameters than rules? I doubt it.
 There has never been a consensus on any small set.
 Hundreds and hundreds of parameters have been
proposed.
 Gianollo, Guardiano, and Longobardi (2008) propose 47
parameters for DP alone on the basis of 24 languages
(only 5 of which are non-IE, representing only 3 families).
 Longobardi and Guardiano (2011) up the total to 63 binary
parameters in DP.
GIUSEPPE LONGOBARDI
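To give a sense of scale for these figures, here is a small arithmetic sketch in Python. It assumes, purely for illustration, that every parameter is binary and that all parameters are fully independent; the cited authors need not make that assumption, and any dependencies among parameters would shrink the space. The function name `grammar_space` is hypothetical.

```python
def grammar_space(n_binary_parameters: int) -> int:
    """Upper bound on distinct settings, assuming independent binary parameters."""
    return 2 ** n_binary_parameters


# Illustrative upper bounds for the DP-internal parameter counts cited above.
dp_settings_47 = grammar_space(47)
dp_settings_63 = grammar_space(63)
```

Under these assumptions each added parameter doubles the bound, so moving from 47 to 63 parameters multiplies it by 2 to the power 16.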
 One way to get around this problem would be to
posit non-parametric differences among languages,
thereby keeping a small number of parameters.
 The problem is that nonparametric differences
undercut the entire parametric program.
NONPARAMETRIC DIFFERENCES AMONG
LANGUAGES UNDERCUT THE ENTIRE
PARAMETRIC PROGRAM
 Are all morphosyntactic differences among languages
due to differences in parameter setting?
 Generally that has been assumed not to be the case.
 Outside of core grammar are:
‘… borrowings, historical residues, inventions, and so
on, which we can hardly expect to — and indeed would
not want to — incorporate within a principled theory of
UG’ (Chomsky 1981: 8-9)
 In other words, from the beginning it has been assumed that
some language-particular features are products of
extraparametric language-particular rules.
 There are many examples (word order in Hixkaryana; picture
noun reflexives and preposition stranding in English; etc.)
 If all syntactic differences were to be handled by a difference
in parameter setting, then, extrapolating to all of the syntactic
distinctions in the world’s languages, there would have to be
thousands — if not millions — of parameters.
 That’s obviously an unacceptable conclusion.
 And then there’s PF syntax:
Some syntactic phenomena that have been
attributed to PF:
a. extraposition and scrambling (Chomsky 1995)
b. object shift (Holmberg 1999; Erteschik-Shir 2005)
c. head movements (Boeckx and Stjepanović 2001)
d. the movement deriving V2 order (Chomsky 2001)
e. linearization (i.e. VO vs. OV) (Chomsky 1995;
Takano 1996; Fukui and Takano 1998; Uriagereka 1999)

f. Wh-movement (Erteschik-Shir 2005)
NOMI ERTESCHIK-SHIR
 Nobody has any clear idea about which
syntactic differences should be considered
parametric and which should not be.
 Needless to say, if learners need to learn
rules anyway, nothing is gained by positing
parameters.
MINIMALISM AND PARAMETERS
 In standard approaches to both macro- and
microparameters, the possible parameters and their
possible settings are part of innate UG.
 That idea is utterly incompatible with the minimalist
program:
‘… if minimalists are right, there cannot be any
parameterized principles, and the notion of
parametric variation must be rethought.’ (Boeckx
2011: 206)
 ‘Minimalism is probably not the best framework to
investigate parameters …’: first, because the theoretical
apparatus has been reduced to a minimum; second,
because the Galilean style has replaced interest in
variation (Gallego 2011: 550).
ANGEL GALLEGO
‘Well, all of this is part of what one might call the
‘Galilean style’: the dedication to finding
understanding, not just coverage. Coverage of
phenomena itself is insignificant’. (Chomsky
2002: 102)
 Accounting for variation was at the heart of GB. Now
it has become ‘insignificant’.
P4: PARAMETERS ARE NOT PART OF UG. MUCH OF
VARIATION IS A ‘THIRD FACTOR’ EFFECT
 Chomsky’s three factors in language design:
1. Genetic endowment
2. Experience
3. Principles not specific to the faculty of language
 The lack of a place for parameters in ‘1. Genetic endowment’ has led
researchers to look to 2. and to 3. as a locus for language variation.
 But 2. has often been taken to mean language-particular and hence
uninteresting.
 So there has been a concerted effort to resituate parameters in principles
not specific to language — least-effort and other principles of efficient
computation, processing principles, and the like.
 One obvious advantage to this is that not all third-
factor effects translate neatly into parameters.
 That is particularly true for the sorts of
crosslinguistic hierarchies proposed mainly in
functionalist work.
• Prepositional Noun Modifier Hierarchy (PrNMH; Hawkins 1983):
If a language allows long elements to intervene between a
preposition and its object, then it allows short elements.
JOHN A. HAWKINS
 It is far from clear how this generalization might be captured by
means of parameters, whether macroparameters or
microparameters.
 The same point could be made for other functionally-motivated
hierarchies.
 So let’s look at three ‘third factor’ approaches to parameters.
 Following the terminology used by Colin Phillips in his work on
island constraints and processing, I’ll call the first two ‘grounded
approaches’ and ‘reductionist approaches’.
 I’ll then move to the third type — epigenetic approaches.
P4A: GROUNDED APPROACHES
 A grounded approach is one in which some principle
of UG is grounded in — that is, ultimately derived
from — some third-factor consideration.
 A long tradition points to a particular constraint,
often an island constraint, and posits that it is a
grammaticalized processing principle.
 Two examples from Fodor (1978):
The XX Extraction Constraint (XXEC): If at some point in
its derivation a sentence contains a sequence of two
constituents of the same formal type, either of which could
be moved or deleted by a transformation, the
transformation may not apply to the first constituent in the
sentence.
The Nested Dependency Constraint (NDC): If there are two
or more filler-gap dependencies in the same sentence, their
scopes may not intersect if either disjoint or nested
dependencies are compatible with the well-formedness
conditions of the language.
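The geometric core of the NDC, the difference between disjoint, nested, and intersecting dependencies, can be sketched in Python as follows. This is an illustration only: each dependency is represented as a hypothetical (filler, gap) pair of distinct word positions, the function names are invented, and the conditional clause of the constraint (that a disjoint or nested alternative must be available in the language) is not modelled.

```python
from itertools import combinations


def relation(dep_a, dep_b):
    """Classify two filler-gap dependencies given as (filler, gap) word positions.

    Assumes all four positions are distinct.
    """
    # Normalize each span to (left, right), then order the spans by left edge.
    (a_left, a_right), (b_left, b_right) = sorted(
        [tuple(sorted(dep_a)), tuple(sorted(dep_b))]
    )
    if a_right < b_left:
        return "disjoint"      # first span ends before the second begins
    if b_right < a_right:
        return "nested"        # second span lies wholly inside the first
    return "intersecting"      # spans cross each other


def has_intersecting_dependencies(dependencies):
    """True if any pair of dependencies in the sentence crosses."""
    return any(
        relation(a, b) == "intersecting"
        for a, b in combinations(dependencies, 2)
    )
```

A sentence whose dependencies are all nested or disjoint passes this check; a single crossing pair is enough to flag it.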
• Another example is the Final Over Final
Constraint (FOFC), proposed in Holmberg
(2000).
• One consequence: There are COMP-TP
languages that are verb-final, but there are
no TP-COMP languages that are verb-initial.
 Holmberg and his colleagues
interpreted this constraint as a
parameterizable UG principle.
 Walkden (2009) points out that the
FOFC is accounted for by Hawkins’
processing theory and proposes to
build FOFC directly into UG.
GEORGE WALKDEN
 In fact, Mobbs (2014) builds
practically all of Hawkins’s parsing
theory into UG.
IAIN MOBBS
 So in this view there are no parameters per se.
 A UG principle is ‘grounded’ in a parsing principle.
 And the UG principle itself is underspecified enough
to allow for the necessary degree of variation.
P4B: REDUCTIONIST APPROACHES
 A reductionist approach totally removes
from UG the burden of accounting for some
phenomenon.
 Rather some ‘third factor’ does all the work.
 Returning to the FOFC, Trotzke, Bader, and Frazier
(2013) provide evidence that a better motivated
account is to remove it entirely from the grammar,
since it can be explained in its entirety by systematic
properties of performance systems.
ANDREAS TROTZKE
MARKUS BADER
LYN FRAZIER
 They also deconstruct the Head-Complement parameter
in a similar fashion.
‘[T]he physics of speech, that is, the nature of the
articulatory and perceptual apparatus require one of
the two logical orders, since pronouncing or perceiving
the head and the complement simultaneously is
impossible. Thus, the head-complement parameter,
according to this approach, is a third-factor effect.’
(Trotzke, Bader, and Frazier 2013: 4)
 Which option is chosen has to be built into the grammar
of individual languages.
 Another example: Kayne (1994) provided an elaborate
UG-based explanation of why rightward movement is so
restricted in language after language.
 Ackema and Neeleman (2002) argue that the apparent
ungrammaticality of certain syntactic structures should
not be accounted for by syntax proper (that is, by the
theory of competence), but rather by the theory of
performance.
 The latter acts as a filter on possible linguistic
representations.
P4C: EPIGENETIC APPROACHES
 In an epigenetic approach to variation, parameters are
not provided by an innate UG.
 Rather, parametric effects arise in the course of the
acquisition process through the interaction of certain
third-factor learning biases and experience.
 UG creates the space for variation by leaving certain
features underspecified.
 One could also call this an ‘emergentist’ view of
parameters.
 There are several proposals along these lines:
Gianollo, Guardiano, and Longobardi (2008)
Boeckx (2011)
Biberauer, Holmberg, Roberts, and Sheehan
(2014) (preceded by many papers by the same
four authors)
 I’ll focus on Biberauer et al.
THERESA BIBERAUER
MICHELLE SHEEHAN
• There are no parameters per se: all variation arises through
the acquisition process, guided by some third-factor
assumptions about learning.
• The child is conservative in the complexity of the formal
features that it assumes are needed (Feature Economy)
and liberal in its preference for particular features to extend
beyond the input (Input Generalization – Superset Bias).
• The idea is that these principles drive acquisition and thus
render parameters unnecessary, while deriving the same
effects.
P4C: EPIGENETIC APPROACHES
94
 Here’s their word order hierarchy:
P4C: EPIGENETIC APPROACHES
95
 Let’s say that a language is consistently head-initial
except in NP, where the noun follows its
complements.
 However, there is a definable class of nouns in this
language that do precede their complements.
 And a few nouns in this language behave
idiosyncratically in terms of the positioning of their
specifiers and complements.
P4C: EPIGENETIC APPROACHES
96
 In their theory, the child will go through the following
stages of acquisition, zeroing in step-by-step on the adult
grammar:
1. First s/he will assume that ALL phrases are head-initial, even noun phrases.
2. Next s/he will assume that ALL NPs are head-final.
3. Next s/he will learn the class of exceptions to 2.
4. Finally, s/he will learn the purely idiosyncratic
exceptions.
 Other hierarchies are more complex — they depend on
many assumptions about the feature content of
particular categories.
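The four stages above can be sketched as a general-to-specific learner. This is only a toy illustration, not Biberauer et al.'s formal system: the `Grammar` and `learn` names, the lexical classes (`plain`, `relational`) and the sample data are all hypothetical, and idiosyncratic behaviour is reduced to head direction alone.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Grammar:
    """Toy grammar: one default order plus increasingly specific overrides."""
    default: str = None
    by_category: dict = field(default_factory=dict)  # e.g. {"NP": "head-final"}
    by_class: dict = field(default_factory=dict)     # keyed by (category, class)
    by_item: dict = field(default_factory=dict)      # keyed by lexical item

    def predict(self, category, lex_class, item):
        # The most specific statement that applies wins.
        for table, key in ((self.by_item, item),
                           (self.by_class, (category, lex_class)),
                           (self.by_category, category)):
            if key in table:
                return table[key]
        return self.default


def majority(orders):
    """Most frequent order in an iterable of observed orders."""
    return Counter(orders).most_common(1)[0][0]


def learn(data):
    """data: list of (category, lexical_class, item, observed_order)."""
    g = Grammar()
    # Stage 1: one order for ALL phrases.
    g.default = majority(o for _, _, _, o in data)
    # Stage 2: whole categories that go against the default.
    for cat in {c for c, _, _, _ in data}:
        m = majority(o for c, _, _, o in data if c == cat)
        if m != g.default:
            g.by_category[cat] = m
    # Stage 3: definable classes that go against their category.
    for key in {(c, k) for c, k, _, _ in data}:
        m = majority(o for c, k, _, o in data if (c, k) == key)
        if m != g.predict(key[0], None, None):
            g.by_class[key] = m
    # Stage 4: purely idiosyncratic items.
    for c, k, i, o in data:
        if o != g.predict(c, k, i):
            g.by_item[i] = o
    return g


# Hypothetical input: head-initial everywhere except NP, with one
# exceptional noun class and one idiosyncratic noun.
DATA = (
    [("VP", "verb", v, "head-initial") for v in ("eat", "see", "give")]
    + [("PP", "prep", p, "head-initial") for p in ("in", "on", "with")]
    + [("CP", "comp", c, "head-initial") for c in ("that", "if")]
    + [("NP", "plain", n, "head-final")
       for n in ("book", "house", "dog", "tree", "river")]
    + [("NP", "relational", n, "head-initial") for n in ("mother", "friend")]
    + [("NP", "plain", "thing", "head-initial")]
)

grammar = learn(DATA)
```

Each stage adds a statement only where the current grammar mispredicts, so the learner never posits more structure than the input forces on it. Whether real acquisition proceeds in this order is exactly the empirical question at issue.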
P4C: EPIGENETIC APPROACHES
98
 There are a lot of questions that one can ask about this
scenario.
 Most importantly — is this really the way language
acquisition works in real life?
 Do children go from the general to the particular, correcting
themselves as they go, gradually zeroing in on the correct
grammar?
 There’s a lot of disagreement among acquisitionists —
many argue for precisely the reverse set of steps.
P4C: EPIGENETIC APPROACHES
99
 Biberauer et al. themselves agree that these steps can
be overridden:
 (a) The early stages may correspond to a
pre-production stage (cf. Wexler’s very early
parameter setting).
 (b) It is possible that acquirers pass through
certain stages very quickly as the counterevidence
is so readily available.
 (c) Frequency effects can distort the smooth
transition through the hierarchies.
P4C: EPIGENETIC APPROACHES
100
 The second question is how much the child has to know in
advance of beginning the entire process.
 For example, the child has to know to ask ‘Are unmarked phi-
features fully specified on some probes?’
 That implies a lot of grammatical knowledge.
 Where does this knowledge come from, and how does the
child match it with the input?
 Still, this is a challenging approach that needs to be taken very
seriously.
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
101
 The microparametric approach to variation has a
kernel of reasonableness to it.
 Even closely related speech varieties can vary from
each other in many details.
 Microparameters capture this fact.
 But why call them ‘microparameters’? Why not just
‘rules’?
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
102
 Of course, I know why there has been resistance to
calling them rules.
 ‘Rules’ brings back the ghosts of pre-generative
structuralism, where it was believed by many that
languages could differ from each other without limit.
 And it brings back the spectre of early
transformational grammar, where grammars were
essentially long lists of rules.
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
103
 But to call a rule a rule is not to imply that anything
can be a rule.
 Possible rules are still constrained by UG.
 What’s in UG?
 Obviously the Merge operation or something
analogous — but there’s a lot more than that.
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
104
 It’s hard for me to see how the broad architecture of the
grammar could be learned inductively.
 For example, there is very little evidence that the syntax
can have access to the segmental phonology.
 There’s no language with a rule that says that
displacement — Internal Merge, if you will — targets only
those elements with front vowels, for example.
 I have no trouble attributing that fact to UG.
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
105
 But the biggest constraint on what can be a rule derives
from processing.
 No language has a rule that lowers a filler exactly two
clauses deep, leaving a gap in initial position.
 In keeping with the distinction between what is possible
and what is probable, I would say that such a rule, while
theoretically possible, is so improbable (for processing
reasons) that it will never occur.
 I first put forward a proposal like this in my 2005 book.
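The division of labour on the last two slides can be sketched as a two-step check on candidate rules. This is a minimal illustration under stated assumptions: the `Rule` fields and the `classify` function are hypothetical names, and the two properties tested are just the two examples mentioned above, not a full theory of UG or of processing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A candidate language-particular rule (hypothetical representation)."""
    name: str
    refers_to_segmental_phonology: bool  # e.g. "move only items with front vowels"
    lowers_filler: bool                  # the filler ends up below its gap


def classify(rule):
    # UG acts as a hard constraint: syntax has no access to segmental phonology.
    if rule.refers_to_segmental_phonology:
        return "impossible (excluded by UG)"
    # Processing makes some UG-compatible rules vanishingly unlikely.
    if rule.lowers_filler:
        return "possible but improbable (processing)"
    return "possible and probable"


CANDIDATES = [
    Rule("front-vowel displacement", True, False),
    Rule("lower filler two clauses deep", False, True),
    Rule("wh-fronting", False, False),
]

for rule in CANDIDATES:
    print(f"{rule.name}: {classify(rule)}")
```

The point of keeping the two checks separate is that only the first one belongs to the grammar; the second reflects what is probable, not what is possible.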
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
106
 A quote from Norbert Hornstein that captures the
essence of my view of variation:
NORBERT HORNSTEIN
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
107
‘There is no upper bound on the ways that languages
might differ though there are still some things that
grammars cannot do. A possible analogy for this
conception of grammar is the variety of geometrical
figures that can be drawn using a straight edge and
compass. There is no upper bound on the number of
possible different figures. However, there are many
figures that cannot be drawn (e.g. there will be no
triangles with 20 degree angles). …
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
108
Similarly, languages may contain arbitrarily many
different kinds of rules depending on the PLD they
are trying to fit. However, none will involve binding
relations in which antecedents are c-commanded by
their anaphoric dependents or where questions are
formed by lowering a Wh-element to a lower CP.
Note that this view is not incompatible with
languages differing from one another in various
ways.’ (Hornstein 2009: 167)
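For readers unfamiliar with the geometry behind Hornstein's analogy, here is the standard argument that a 20 degree angle cannot be constructed with straight edge and compass. It is supplied here as background and is not part of the quoted text. Apply the triple-angle identity with θ = 20° and write x = cos 20°:

```latex
\begin{align*}
\cos 3\theta &= 4\cos^{3}\theta - 3\cos\theta \\
\tfrac{1}{2} = \cos 60^{\circ} &= 4x^{3} - 3x \\
0 &= 8x^{3} - 6x - 1
\end{align*}
```

The rational root test leaves only ±1, ±1/2, ±1/4 and ±1/8 as candidates, and none of them satisfies the equation, so the cubic is irreducible over the rationals. Constructible lengths always have degree a power of 2 over the rationals, and cos 20° has degree 3, so the angle lies outside what the tools allow, however many figures they can otherwise produce.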
P5: PARAMETERS ARE NOT PART OF UG. LANGUAGES DIFFER FROM EACH
OTHER IN TERMS OF THE RULES THAT THEY POSSESS. POSSIBLE RULES
ARE CONSTRAINED BY BOTH UG AND ‘THIRD FACTOR’ CONSIDERATIONS
109
 I’ll be the first to admit that this approach is in some ways a retreat
from the marvellous vision presented in Lectures on Government
and Binding in 1981.
 But we all agree that it’s better to start with the most ambitious
vision possible and to retreat from that when necessary, than to
think small and to stay small.
 Still, the idea that a grammar is composed of language-particular
rules constrained by both UG principles and third-factor principles
hardly represents a defeat.
 I personally find it an appealing vision that promises to inform
research on crosslinguistic variation in the years to come.
110
THANK YOU!