How many languages are there? At least 300 – and that’s just Northern Italy

Raffaella Zanuttini, Georgetown University

The answer to the question “How many languages are there in the world?” is interesting on many different levels – political, sociological, demographic, historical, etc. But how do we answer it?

There is one obvious way of telling languages apart: different languages have different words, so I love you and Je t’aime are sentences expressing the same meaning in two different languages, English and French. But having a distinct set of words, or lexical items, is not a good criterion for identifying a language, either empirically or conceptually. It could lead us to distinguish varieties which may differ only in the use of some words (for example, soda versus pop), while otherwise sharing phonological, morphological and syntactic properties. More importantly, given that words are arbitrary pairings of sound and meaning, this criterion seems to address the question of how many different such pairings can be found in the world – for which the answer seems clear, i.e. an infinite number.

A more insightful way to pursue an answer to our question may be found by building on one of the key ideas of that part of modern linguistic theory generally known as generative linguistics, founded by Noam Chomsky in the late 1950s. The proposal is that, in studying language, one need not necessarily focus on the “external” or extensional notion of language (E-language) – i.e. language as a set of words, as the object that can be heard or read. One could focus instead on an “internal” or intensional notion (I-language) – i.e. language as the abstract knowledge that allows native speakers to produce the object that can be heard or read.

An analogy with bread may be helpful to illustrate these notions.[1] Suppose we wanted to know how many different kinds of bread can be found in the world.
We could either focus on the extensional notion of bread and examine the objects that are actually produced, or focus on a more abstract, intensional definition, and study the recipes for making bread that can be found in the world.

[1] This analogy, as well as the notion of parameter that follows, is discussed with exceptional clarity in Mark Baker’s 2001 book The Atoms of Language.

What is the analog of a recipe in the case of language? The abstract knowledge of that language that a native speaker has, which allows him/her to produce it and understand it – what is called the native speaker’s competence.

How do we study such abstract knowledge, or competence? We observe what constitutes a well formed expression of that language, as well as what counts as an ill formed linguistic form. For example, we observe that, for any native speaker of English, I love you is a well formed sentence, whereas I you love is not. Similarly, in Mary thinks that she’s smart, the pronoun she can refer to Mary; whereas in She thinks that Mary is smart, she cannot refer to Mary. These are facets of the knowledge of a native speaker, and must be represented in the model that we build. How? In the form of a grammar, which can be thought of as a set of instructions which lead the user to build all and only the well formed sentences of his/her language – just as a recipe is a set of instructions that guide one through the process of making bread.

Switching the focus of investigation to grammars, or models of the native speakers’ competence in a given language, provides us with a new way of thinking about how languages differ: we can compare grammatical systems, examine how they differ, and possibly even count how many distinct ones exist. A careful comparison will reveal what is shared across them, thus uncovering what may be seen as the core properties of grammar. It will also reveal in which ways grammars may and may not differ.
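
The idea of a grammar as a set of instructions that builds all and only the well formed sentences of a language can be made concrete with a small sketch. The fragment below is purely illustrative: the mini-lexicon and the function names are assumptions made up for this example, and the single Subject-Verb-Object rule is a deliberately tiny stand-in for a real grammar of English.

```python
# Toy illustration only: a "grammar" as a recipe that builds all and only
# the well formed sentences of a tiny, made-up fragment of English.
# The word lists below are hypothetical and chosen just for this example.
SUBJECTS = {"I", "we", "they"}
VERBS = {"love", "see"}
OBJECTS = {"you", "them"}


def is_well_formed(sentence: str) -> bool:
    """Accept exactly the Subject-Verb-Object strings of the toy fragment."""
    words = sentence.split()
    if len(words) != 3:
        return False
    subject, verb, obj = words
    return subject in SUBJECTS and verb in VERBS and obj in OBJECTS


def generate() -> list[str]:
    """Enumerate every sentence that the toy grammar builds."""
    return [
        f"{s} {v} {o}"
        for s in sorted(SUBJECTS)
        for v in sorted(VERBS)
        for o in sorted(OBJECTS)
    ]


print(is_well_formed("I love you"))   # word order Subject-Verb-Object
print(is_well_formed("I you love"))   # same words, order the rule rejects
```

The point of the sketch is only that the same object can be viewed in two ways: the list returned by generate() corresponds to the E-language, while the rule inside is_well_formed() plays the role of the I-language that produces it.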
“How many languages are there?” can now be interpreted as “How many grammatical systems are there?” – still a difficult question, but one with the potential of revealing how this particular aspect of our knowledge is organized.

Given this perspective, how do languages differ? A very useful notion to express the way grammatical systems differ is that of parameter, which can be seen as a point at which two grammatical systems can depart, i.e. make different choices. Consider the following pair of sentences from Mohawk and English with identical meaning:

(1) Washakotya’tawitsherahtkvhta’se’ (Mohawk)
(2) He made the thing that one puts on one’s body (i.e., the dress) ugly for her

The striking differences between the Mohawk and English E-languages can be seen to derive from a different choice that their grammatical systems make at the following choice point, or parameter:

The Polysynthesis Parameter: Verbs must include some expression of each of the main participants in the event described (the subject, object and indirect object).

Mohawk has the positive setting for this parameter, that is, it expresses on the verb each participant in the event – in this case, the verb has a marker for all of the participants in the event of putting. English, in contrast, adopts the negative setting of the parameter, and so has an unencumbered verb and the participants in the event expressed in relatively fixed positions in the sentence (the subject preceding and the object following the verb).

The notion of parameter is a useful way to express cross-linguistic differences for at least two reasons. Parameters allow us to find systematicity in the way grammatical systems differ, since some parametric choices are prerequisites for others. Moreover, in some cases, it is possible to reduce a cluster of differences in the E-language to a single parametric difference – that is, to derive them from a single choice point in the grammatical system.
For example, Italian and English differ in the requirements imposed on the subjects of their sentences. In English, all sentences with finite verbs must have an overt subject, whereas in Italian the subject does not necessarily have to be overtly realized (cf. She has already called versus Ha già telefonato). In English, subjects must be overtly realized even when they do not have referential content, as in sentences with weather related predicates (cf. It rains), whereas Italian doesn’t have subjects of this kind (cf. Piove). Even when the logical subject is present in the sentence, in postverbal position, English needs an overt subject in pre-verbal position (cf. There have arrived three people), whereas Italian does not (Sono arrivate tre persone). This set of differences in the E-languages can be reduced to a single parametric difference in the I-language:

The Null Subject Parameter: A language may allow the subject of a finite clause to lack phonetic content.

If we assume that Italian has the positive setting for this parameter and English the negative one, this cluster of differences can be reduced to one.

Can we now say how many distinct languages, viewed as distinct grammatical systems, there are? In addition to looking at broad cross-linguistic differences, like those distinguishing Mohawk from English, we also need to look at more fine grained differences, those that distinguish closely related linguistic varieties, for example the Romance variety spoken in Venice, Venetian, from the one spoken in Milan, Milanese, or the variety of English spoken in Washington from the one spoken in the Appalachian area of Eastern Tennessee. What we find is that linguistic variation is both surprisingly restricted and surprisingly rich. Take for example the aspect of grammar concerned with the expression of sentential negation.
The examination of a large number of varieties reveals that only three basic strategies are adopted to make a sentence negative: placing a negative marker in pre-verbal position, placing it in post-verbal position, or using both a pre- and a post-verbal negative marker. This is a surprisingly restricted set of choices, considering all the possibilities that are in principle available (e.g., a negative sentence could be the mirror image of a positive one; or it could have a special negative element after the third word from the left or from the right, etc.). Yet, within this restricted set of choices, the amount of possible variation is surprisingly high: depending on whether or not the pre-verbal negative marker can be the only negative element in a clause, the language will or will not be able to negate imperatives; depending on whether the post-verbal negative marker occurs to the left or to the right of certain adverbs, it will or will not be able to co-occur with another negative element in the sentence. What we see is the existence of micro-parameters: choice points at which grammars can differ only minimally and yet determine subtle but visible differences in the E-languages, even when they share most syntactic properties.

So, how many grammatical systems are there? Surprisingly few – in the sense that variation within grammatical systems is highly restricted; and at the same time surprisingly many, in the sense that a particular grammatical choice can exhibit fine grained distinctions which yield closely related but distinct E-languages. In Northern Italy, it has been argued, one can find over 300 distinct grammatical systems. What about in the United States?