t-orders

advertisement
Draft of May 20, 2006, comments welcome
T-ORDERS
Arto Anttila and Curtis Andrus
Stanford University
Abstract
T-ORDER CONSTRUCTOR
is a Windows program that computes the implicational universals
(t-orders) hidden in optimality-theoretic factorial typologies (Prince and Smolensky
1993/2004). The input to the program is a factorial typology computed by OTSOFT
(Hayes, Tesar, and Zuraw 2003). The output is a Hasse diagram that displays the
implicational universals in visual format that is easy for the linguist to understand. These
diagrams have an important application in the study of quantitative variation: they spell
out the universal limits of quantitative variation permitted by the constraint set. These
limits hold true for several theories of variation, including Multiple Grammars, Partially
Ordered Grammars, and Stochastic Optimality Theory.
1. Linguistic variation
We start by a simple example of linguistic variation. The example comes from English
phonology, but our point is a general one. Similar examples could easily be drawn from
other languages as well as from other subfields of linguistics (morphology, syntax,
semantics).
In many dialects of English, word-final /t,d/ is variably deleted, as shown in (1):
(1)
It cost ~ cos’ five dollars.
It cost ~ cos’ us five dollars.
That’s how much it cost ~ cos’.
(t before a consonant)
(t before a vowel)
(t before a pause)
Whether t,d-deletion applies or not depends on various factors, in particular the quality of
the following segment. It is well known that t,d-deletion is more common before
consonants than before pauses and vowels (see Coetzee 2004: 218 and references there).
Typically, the process applies variably in all three environments depending on the
speaker, speech rate, lexical item, etc., but the quantitative tendency is robust and clear.
This is shown by the cross-dialectal data in (2).
1
(2)
t,d-deletion data from five dialects (Coetzee 2004: 218)
_C
_V
_##
Chicano English (Los Angeles)
(Santa Ana 1991:76, 1996:66)
n
% deleted
3,693 1,574 1,024
62
45
37
Tejano English (San Antonio)
(Bayley 1995:310)
n
% deleted
1,738 974
62
25
564
46
AAE (Washington, DC)
(Fasold 1972:76)
n
% deleted
143
76
37
73
Jamaican mesolect (Kingston)
(Patrick 1991:181)
n
% deleted
1,252 793
85
63
252
71
Trinidadian acrolect
(Kang 1994:157)
n
% deleted
22
81
43
21
16
31
Neu data
(Neu 1980:45)
n
% deleted
814
36
495
16
---
202
29
The quantitative generalization that holds across all five dialects is stated in (3):
(3)
(a)
(b)
In all dialects, the deletion rate is highest in _C.
Deletion rates in _V and _## may occur in either order.
Any satisfactory analysis should thus make the following predictions:
(4)
(a)
(b)
(c)
Exclude dialects where t/d-deletion rate is higher in _{V, ##} than in _C.
Include dialects where t/d-deletion rate is higher in _V than in _##.
Include dialects where t/d-deletion rate is higher in _## than in _V.
If we look at the quantitative facts in one dialect only, we see three different percentages,
e.g. 62-45-37, but cannot tell which differences are deep and which superficial. If we
look at the quantitative facts across several dialects, we learn that the first number is
always higher than the second and the third, and that the latter two can occur in either
order. A satisfactory theory of variation should make a distinction between these two
types of quantitative patterns: (i) QUANTITATIVE UNIVERSALS that do not vary across
dialects, presumably because they are hard-wired in universal grammar; (ii)
QUANTITATIVE PARTICULARS that are subject to cross-dialectal variation. t/d-deletion
provides a simple example of this distinction. We now proceed to illustrate how this
distinction emerges from Optimality Theory.
2
2. An optimality-theoretic analysis of t/d-deletion
Optimality Theory (OT, Prince and Smolensky 1993/2004) is a theory of constraint
interaction in generative grammar. For a clear exposition of the theory, see Kager 1999.
At the heart of Optimality Theory there are three important assumptions: (i) grammars of
natural languages consist of constraints that make potentially conflicting structural
demands; (ii) conflicts among constraints are resolved by strict ranking; (iii) constraints
are universal, rankings are language-specific.
We now outline an optimality-theoretic analysis of t,d-deletion originally
proposed by Kiparsky (1993). For alternative analyses, see Reynolds 1994 and Coetzee
2004:214-329. The analysis builds on five universal constraints:
(5)
Constraints on t,d-deletion (Kiparsky 1993):
*COMPLEX
Avoid consonant clusters within a syllable.
ONSET
Syllables have onsets.
PARSE
Segments belong to syllables.
ALIGN-LEFT-WORD
Syllables cannot straddle word boundaries.
ALIGN-RIGHT-PHRASE
Phrase-final consonants are also syllable-final.
The tableau in (6) shows one possible ranking of the five constraints. This ranking
predicts no deletion in the prevocalic environment. Instead, the t is resyllabified as the
onset of the following word. Deletion is predicted to occur in both preconsonantal and
prepausal environments. The theoretical assumption is that t is deleted if it is not parsed
as part of a syllable. Syllables are marked by square brackets.
(6)
Sample ranking. Winners: cost us (no deletion), cos’ me (deletion), cos’ (deletion)
INPUTS
OUTPUTS
cost us
(a)
[cost][us]
(b) [cos]t[us]
(c)  [cos][tus]
(a)
[cost][me]
(b)  [cos]t[me]
(c)
[cos][tme]
(a)
[cost]
(b)  [cos]t
cost me
cost
*COMPLEX
ONSET
*!
*
*!
ALIGN-L-W
ALIGN-R-P
PARSE
*
*
*!
*
*!
*!
*
*
*
3. What are t-orders?
What kinds of dialects does the analysis predict to be possible in principle? This can be
figured out by computing the FACTORIAL TYPOLOGY of the five constraints with the aid of
OTSOFT (Hayes, Tesar, and Zuraw 2003). The program will consider all the 5! = 120
permutations of the 5 constraints and work out the predicted output pattern in each case.
The result is displayed in (7). t-deletion is highlighted in grey.
3
(7)
Factorial typology
There were 6 different output patterns.
/cost us/:
/cost me/:
/cost/:
Output #1
[cost][us]
[cost][me]
[cost]
Output #2
[cos]t[us]
[cos]t[me]
[cost]
/cost us/:
/cost me/:
/cost/:
Output #5
[cos][tus]
[cos]t[me]
[cost]
Output #6
[cos][tus]
[cos]t[me]
[cos]t
Output #3
[cos]t[us]
[cos]t[me]
[cos]t
Output #4
[cos][tus]
[cost][me]
[cost]
The factorial typology in (7) reveals the following prediction that holds true of all
dialects:
(8)
Universal: t-deletion before a vowel or a pause implies t-deletion before a
consonant, but not vice versa.
This is an implicational universal of the sort familiar from typological literature: if a
pattern p occurs in a language, so will pattern q. This implicational universal correctly
captures the asymmetry between the preconsonantal environment and the other two
environments. However, this is not quite sufficient: all the dialects in (2) are variable and
the implicational universal comes out quantitatively, not categorically. We need
something more in order to talk about variation and quantitative patterns.
Here we adopt a particularly simple theory of variation: the Multiple Grammars
Theory. Its basic assumptions are stated in (9).
(9)
The Multiple Grammars Theory of variation:
(a)
Variation arises from multiple grammars within/across individuals.
(b)
The number of grammars predicting an output is proportional to the
frequency of occurrence of this output.
Assume an individual whose mental grammar consists of three grammars of types 1, 5, 6
in the factorial typology: At the moment of speaking, the individual reaches into the
grammar urn and selects a grammar at random. In the long run, the following quantitative
pattern will emerge: no deletion before vowels, 67% of deletion before consonants, and
33% of deletion before pauses.
(10)
Sample grammar: {#1, #5, #6}
/cost us/:
/cost me/:
/cost/:
Output #1
[cost][us]
[cost][me]
[cost]
Output #5
[cos][tus]
[cos]t[me]
[cost]
Output #6
[cos][tus]
[cos]t[me]
[cos]t
Del. rate
0/3
2/3
1/3
It is now easy to see that a universal quantitative prediction is being made. Given the
implicational universal in (8), the following quantitative universal must also hold:
4
(11)
Quantitative universal: t-deletion rate is at least as high before a consonant than
before a vowel or a pause.
The implicational universal states that if t is deleted anywhere, it is deleted before a
consonant. Since this statement holds for all individual dialects, it will hold quantitatively
for all combinations of dialects. In other words, it is never possible to pick a combination
of grammars that would result in more t,d-deletion before vowels and pauses than before
consonants. This is the empirical observation in (3a). In contrast, no implicational
universal is predicted to hold between the vowel and pause environments. In dialect 2,
t,d-deletion occurs before a vowel, but not before a pause. In dialect 6 the opposite
situation holds. This means that the two environments can occur in either order
quantitatively as well. This is the empirical observation in (3b).
4. What is the T-Order Constructor?
The problem with factorial typologies is that they are hard for humans to understand. The
typology in (7) is small and relatively easy to figure out. However, in any analysis of a
realistic size, factorial typologies tend to be much larger and the number of output
dialects often runs in the hundreds. OTSOFT does a good job computing factorial
typologies, but the result tends to be useless because the typology is too complex for the
linguist to understand. What is needed, then, is a way to make factorial typologies
understandable for humans. In this case, we need a way to figure out all the implicational
universals hidden in a factorial typology. This problem is solved by the T-OrderConstructor:
The T-Order-Constructor takes a factorial typology as input and returns all
the implicational universals implicit in it in a visual format that is easy for
the linguist to understand.
The present version of the program reads factorial typology files produced by OTSOFT
(Hayes, Tesar, and Zuraw 2003). The output is a Hasse diagram that displays the
implicational universals. For the t,d-deletion grammar in (6), T-Order-Constructor
produces the graph in (12). We call such graphs T-ORDERS.
5
(12)
t-order
Every optimality-theoretic grammar has a t-order. The t-order is linguistically interesting
because it describes two kinds of universals:

QUALITATIVE UNIVERSALS:

QUANTITATIVE UNIVERSALS:
The qualitative distribution of variants should never
conflict with the t-order. For example, the t-order excludes dialects with deletion
before a vowel (cos’ again), but not before a consonant (*cos’ me). Such dialects
cannot be derived by any constraint ranking.
The quantitative distribution of variants should
never conflict with the t-order. For example, the t-order excludes dialects where
the rate of deletion before a vowel (cos’ again) exceeds the rate of deletion
before a consonant (cos’ me). Such dialects cannot be derived by any
combination of constraint rankings.
Both the qualitative and quantitative universals are hard-wired in the constraints and thus
ranking-independent. All other quantitative relationships are language-particular,
ranking-dependent, and subject to cross-dialectal variation. These quantitative universals
hold true for several theories of variation, including Multiple Grammars, Partially
Ordered Grammars, and Stochastic Optimality Theory. This is because in all these
theories the factorial typology is the same.
In conclusion, we now know that the constraint set in (5) indeed derives the
empirical universals in (3) and excludes unnatural dialects with more t-deletion in cost us
and cost than in cost me.
6
5. About the program
The T-Order Constructor was programmed by Curtis Andrus in the Python programming
language in the spring of 2006. The program can be downloaded from
http://www.XXXXXX
.
The T-Order Constructor requires Graphvis software to be installed in order to draw torder graphs. More information on Graphvis software can be found at the following
URL:
http://www.research.att.com/sw/tools/graphviz/
To use the program, take the following steps.




Download and unzip the program file.
Run the program by clicking on tordergui.exe.
Open the factorial typology file conventionally named MYFILEDraftOutput.txt.
Select the desired user options (see below) and click on “Generate T-Order”.
The program will give you the option of saving the result into a text file (“Save T-Order”)
or into a graph file (“Save Graph”). The graph files are read with Graphvis software. The
program can output the t-order graph to any supported file format.
The program allows the following user options:
Save pair possibilities: Writes <input, output> pairs and implications into a text file.
Collapse cycles: Removes cycles, i.e. <input,output> pairs that imply each other, and
collapses the nodes into a box. The program computes the transitive reduction (i.e. the torder with all unecessary edges removed) using the property that the t-order is the
transitive closure of a relation (i.e. all edges implied by transitivity are in the t-order).
Split Disjoint Graphs: Splits the t-order graph into disjoint components and output each
component to a separate file. [Recommended, or else the graphs may get too large.]
7
Download