The empirical investigation of mental representations and their composition Joe Thompson∗ Simon Fraser University May 5, 2010 ∗ This work owes a great deal to the suggestions of Ray Jennings and Kate Slaney. Thanks also goes to Trey Boone for suggesting that the evidence presented here might bear directly on the LOT hypothesis. 1 Contents 1 The Language of Thought Hypothesis 3 1.1 Mental Representations . . . . . . . . . . . . . . . . . . . . . 3 1.2 Composition and LOT . . . . . . . . . . . . . . . . . . . . . . 4 2 LOT and Language Science 7 2.1 LOT and Language Change . . . . . . . . . . . . . . . . . . . 7 2.2 Productivity and Systematicity of Language . . . . . . . . . . 10 2.3 Recursion and Hierarchical Complexity . . . . . . . . . . . . . 12 3 Testing the LOT hypothesis 17 3.1 The problem with Recursive, Centre-Embedded, Elements . . 17 3.2 The Strength of the Case Against LOT . . . . . . . . . . . . 20 3.3 Construct-validity Theory . . . . . . . . . . . . . . . . . . . . 22 2 1 The Language of Thought Hypothesis 1.1 Mental Representations The Language of thought hypothesis (or LOT hypothesis) seems implicit in a number of theories of neural architecture. The hypothesis is that structured entities in the brain represent the world. The syntactic composition of these mental representations determine their expressive and causal powers. What a mental representation represents is determined by (a) its syntactic composition and (b) the representational contents of its atomic parts1 (Fodor, 2008). A representation’s causal powers are exhausted by the capacity for mutually supporting computational systems to track syntactic structure (computational systems do not track semantic content directly). It will be difficult to construct a research project with the specific goal of evaluating the LOT hypothesis. This is not to say that claims about compositional mental representations are inherently untestable, or in danger of being unfalsifiable. But the belief in compositional mental representations is so central to cognitive science that it seems, for all practical purposes, insulated from revision. Throwing out the language of thought, at least for those already working under its purview, would be tantamount to a paradigm shift. Denying that there are any mental representations at all would be even more traumatic. The relative insulation of the LOT hypothesis might explain why it tends to be the subject of a priori evaluation. Fodor (1981), for example, argued that compositional mental representations are the ally of belief-desire psychology, because they provide an intuitive explanation for how the content of a belief could cause behaviour. Objections to the coherence of mental representations, along the lines of private language arguments (Wittgenstein, 1958), are similarly a priori. While we need not object to a priori argumentation in science, we should be 1 The atomic components of a language of thought are the components containing no further parts. 3 realistic about the prospects for winning over our theoretical competitors. It is not contentious that the ultimate success or failure of the LOT hypothesis will lie in its capacity to provide explanations, predictions, and direction to research. What is needed, however, is a non-tendentious method for evaluating how the LOT hypothesis fares on these criteria. The best way to evaluate the LOT hypothesis, it seems, would be to formulate research projects whose consequences bear directly on its plausibility. The difficulty is that such research projects seem hard to design. I will argue that such research projects do exist, and it may be that they can be used to reject the LOT hypothesis, or at least revise our views on the composition of mental representations. It should be emphasized that if conceptual attacks on the LOT hypothesis succeed, then the LOT hypothesis must be incoherent, and my work here will be largely irrelevant. I will assume, however, that LOT theorists will be unconvinced by such arguments. I only wish to argue, from within LOT’s philosophical framework, that basal ganglia research poses empirical problems for the LOT hypothesis. Even if I am wrong, it may still turn out that the LOT hypothesis is incoherent. 1.2 Composition and LOT Compositional mental representations allow for an intuitive kind of beliefdesire psychology which I will call the computational-representational theory of mind (CRTM). The view accepts that propositional attitudes (beliefs that P desires that P and so on2 . . . ) are to be understood as relations between organisms and mental representations (Fodor, 2008; Rey, 1997). CRTM, however, takes this relationship to be spelled out in computational terms.3 An agent will have a belief that P iff they have a mental representation of P and this representation plays the right role in a computational system. 2 P is, of course, a poor choice for representing propositional attitude contents, as it is used in propositional logic to represent an atom. LOT theorists will expect propositional attitude contents to have compositional structure. 3 To say that a brain is a computational system is to say that we can describe the brain in terms of physical machine states (in an appropriate Turing machine), and then map these machine states to sentences in a computing language. For a fuller discussion of the notion of computation see Fodor (1975) and Rey (1997). 4 This is supposed to provide an intuitive account of the causal efficacy of propositional attitudes. How my belief that P will influence my behaviour is to be determined by (a) the fact that I believe that P rather than, say, desire that P and (b) the fact that I believe that P rather than that Q. Manipulating the computational role of a representation (insofar as this is possible), and consequently manipulating an individual’s attitude toward a proposition, will have different behavioural consequences from those that involve manipulating the mental representation itself. This respects the intuition that a desire that P should have different behavioural consequences from those of a belief that Q. Compositional mental representations assist in explaining how manipulating the mental representation could influence behaviour because, as the CRTM theorist argues, the mind’s computational architecture is sensitive to the syntactic structure of its mental representations. This notion of composition is central to the LOT hypothesis. The syntactic composition of a formal language is defined without appeal to semantics. Recursive rules generate a set of well formed sentences, and (because the class of atoms is well defined) their compositional structure can be defined with a notion of uniform substitution. Semantic composition is defined along these recursive rules, ensuring that the semantic values of complex sentences are a function of the semantic values of the component parts. Given independent definitions of a ‘proof’ and ‘inference’ we can seek to demonstrate the soundness and completeness of the proof theory with respect to the independent semantic theory of inference, and show that for every proof there is a corresponding valid inference, and vice versa. As a consequence, any machine that reliably generates proofs will also generate valid inferences, even though it has no direct access to our semantic interpretation of the language. Mental representations are supposed to represent the world, but possess causal powers in virtue of their syntactic composition. This is ontologically 5 respectable because the sentences of Mentalese are just physical structures, and their syntactic operations just physical events. Of course, the syntax and the semantics of Mentalese must be suitably related if this account is going to work. Ideally, semantic composition would be definable along the same rules that capture the syntactic composition. This would allow physical systems to track semantic content in the same way that a proof checker can track inference, at least in those cases where organisms are supposed to track content.4 Consider, for instance, the causal/inferential powers that Fodor (2008) bestows upon the content of belief “granny left and auntie stayed.” That the logical syntax of the thought is conjunctive (partially) determines, on the one hand, its truth-conditions and its behavior in inference and, on the other hand, its causal/computational role in mental processes. I think that this bringing of logic and logical syntax together with a theory of mental processes is the foundation of our cognitive science [. . . ] (Fodor, 2008, p. 21). The principle distinction between a theory positing mental representations and the Language of Thought hypothesis, then, lies in the notion of composition.5 The idea is that there could be mental representations that are not compositional. The content of these representations would be determined by the contents of their atoms alone, and not their structural features. A 4 There are a number of cases where systems are not supposed to be able to track what it is that they represent. Imagine a possible world identical to ours, except that on twin-earth there is no H2 0, only the compound xyz. H2 O and xyz have all the same surface properties, but a different underlying physical structure. Suppose that twinearth-English contains the linguistic construction water, and (intuitively) WaterT win − earth means xyz. By our previous supposition, neither human beings, nor any neural system, is capable of distinguishing H2 O from xyz. The twin-earth Mentalese expression representing H2 O could be syntactically identical to the earth representation of H2 O. A computational system will not be sensitive to whether it represents H2 O or xyz. 5 It should be noted that, on the present understanding of the LOT hypothesis, sentences in Mentalese need not be the contents of thought. Thoughts, beliefs, desires, and other psychological states, according to the computational theory of mind, imply certain computational relationships between an organism and a representation. A Mentalese sentence, on its own, is merely a compositionally structured mental representation. 6 radical associationist, for example, might explain behaviour by appealing only to associative pairings between structurally atomic ideas (for a discussion, see Rey, 1997). Such a view could allow for mental representations without taking up the LOT hypothesis. I must also emphasize that there are many ways in which representations could be composed. Consider, for example, a Language of Thought in which atomic representations of perceived objects are “tagged” with location and date representations. Sentences of the language are thus of the form Objectlocation,date and contain sentences such as “bearbackyard,6AM .” The composition of this language is distinct from one which takes mental representations of propositions to be composed of a noun phrases and a verb phrases, as in (the girl)N P (kicked the dog)V P I take it that mental representations of either sort of composition are sufficient for a LOT, even though the former language is presumably too simple to be very helpful. In section 2 I will argue that the LOT theorist, in order to account for linguistic behaviour, will be under empirical pressure to represent natural language and a LOT as having comparable structure. 2 2.1 LOT and Language Science LOT and Language Change Empirical arguments for the existence of mental representations do exist. These arguments tend to proceed by seeking out uncontestable observations that can only be explained by the positing of mental representations. Gallistel (2000, 1990), for instance, argues that the intricacies of conditioned learning require positing mental representations. But theories of conditioning have been hailed as the paradigmatic cases where non-representational 7 explanation is to be had. But, if anti-representationalism cannot even account for conditioning, then it seems we will be forced to buy into a paradigm of mental representations (Carruthers, 2006).6 Unfortunately, this work is outside our present purview and must be considered in more detail elsewhere. It should not be surprising, however, that research into language is also relevant to the Language of Thought hypothesis. In fact, the two share many foundational assumptions. Suppose, following Saussure, we believe that studying language in a disciplined manner requires ignoring much of language change. An absolute state is defined by the absence of changes, and since language changes somewhat in spite of everything, studying a language-state means in practice disregarding changes of little importance. (de Saussure, 1959, p. 102). Natural languages, at least in the conversational sense of English, French and so on, change all the time, perhaps with every linguistic intervention. But the Language of Thought cannot change so haphazardly. Changes to the syntactic structure of Mentalese, as a whole, would probably be catastrophic, unless these changes could first be accommodated by a brain’s neural systems. The claim that a Language of Thought could undergo significant change without mutually supporting changes to the relevant computational mechanisms (and without resulting in an inviable organism) is either incomprehensible or, at the very least, grossly implausible.7 Such a change might be analogous to using Morse code to communicate over a channel, only to have one party spontaneously change the language without notice. 6 Studying the composition of mental representations will be similarly indirect. Theorists may look to various domains (such as vision) and ask what kind of composition is required of the mental representations there. 7 One might wonder whether the composition of Mentalese might change during an individual’s development (perhaps during language acquisition). One problem with allowing individuals to possess different languages of thought is that if two individuals had distinct mental languages, then it might be impossible for their beliefs to share syntactically identical contents. At best, two structurally distinct sentences (of different languages) could possess the same representational content. 8 The Language of Thought hypothesis, therefore, has a friend in Saussure, who does not view language as the kind of thing that changes. Of course, speech or parole, changes frequently. But the good linguist is supposed to distinguish speech and language and much of the variation in speech, on this view, is disregarded as incidental variation in how people produce language. Taken as a whole, speech is many-sided and heterogeneous; [...] we cannot put it into any category of human facts, for we cannot discover its unity. Language, on the contrary, is a self-contained whole and a principle of classification. As soon as we give language first place among the facts of speech, we introduce a natural order into a mass that lends itself to no other classification (p. 9 de Saussure, 1959). To understand linguistic behaviour, the linguist must study language, which is a synchronic, unchanging, entity (de Saussure, 1959, p. 102).8 Change, on this view, is only an incidental feature of natural languages. There is nothing problematic about an unchanging language, such as the LOT. It follows that if we dropped the Saussurean philosophy of language, and instead viewed a language as a complex system of practices, then LOT might lose its status as a language. While the LOT hypothesis takes up this foundational assumption about language, I will argue that the LOT theorist will also be inclined to take up certain assumptions about Mentalese composition in order to explain linguistic behaviour. These assumptions, I shall argue, can be studied with respectable research projects. 8 Theorists recognize, of course, that English changes. Chomsky (1964), for example, introduces a notion of rule-changing novelty, where the grammar of a natural language actually changes. But synchronic (formal) representations of natural-language must treat such changes as, in effect, entirely new languages. 9 2.2 Productivity and Systematicity of Language Even when we abstract away from language change we still find problematic variation in the linguistic data. Individuals seem to be able to produce and apprehend a massive variety of sentences (we are often asked to consider, for example, the odds that a complex sentence will be repeated in a book9 ). Here linguists have emphatically appealed to the notion of composition. The central fact to which any significant linguistic theory must address itself is this: a mature speaker can produce a new sentence of his language on the appropriate occasion, and other speakers can understand it immediately, though it is equally new to them. [...] On the basis of limited experience with the data of speech, each normal human has developed for himself a thorough competence in his native language. This competence can be represented, to an as yet undetermined extent, as a system of rules that we can call the grammar of his language. To each phonetically possible utterance [...], the grammar assigns a certain structural description that specifies the linguistic elements of which it is constituted and their structural relations [...] (Chomsky, 1964, p. 50-51). The set of well formed sentences in a natural language are to be captured recursively by reference to a well defined set of atoms and syntactic composition. This view allows languages to contain a staggering (and potentially infinite) number of never-before-uttered sentences in a language. I will follow Fodor (2008) in calling this the productivity of language.10 The capacity 9 Chomsky and Miller (1963) 10 It should be noted that the notion of composition does more than simply account for the capacity to comprehend a potentially infinite class of sentences. Composition also purchases linguistic systematicity. “[The] ability to produce/understand some of the sentences is intrinsically connected to the ability to produce/understand many of the others” (Fodor, 1987, p.149). If one can understand the sentence dogs are cuter than cats, then one should be able to understand cats are cuter than dogs. Regardless of the number of sentences in the English, then, composition remains important because “the very same combinatorial mechanisms that determine the meaning of any of the sentences determine the meaning of all the rest”(Fodor, 1987, p.150). 10 to produce and comprehend such a broad class of sentences, on this view, is a relatively simple matter of coming to grips with the facts of composition.11 If language is best understood as a synchronic entity and linguistic behaviour is best accounted for by compositionality, then this would be a great boon to the Language of Thought hypothesis. This is because the LOT hypothesis runs into the same problem of productivity that is present for natural languages. If there are a staggering number of permissible English sentences then there will need to be at least as many permissible mental representations. This is because a person’s sensitivity to any phonological, morphological, syntactic, or even semantic feature of an utterance must ultimately be explained, on a LOT architecture, by both the composition of mental representations and the computational systems that are sensitive to them (Fodor, 2008). The result is that many of the explanatory jobs traditionally associated with syntactic and semantic theory must be accomplished by the syntax of mental representations (and their supporting computational systems). This includes issues of syntactic and semantic ambiguity. Each reading for a syntactically ambiguous sentence, for instance, needs to correspond to a uniquely structured mental representation (Fodor, 2008, p. 58).12 11 Of course, for some theorists coming to grips with compositionality will some degree of innate knowledge or innate structure. 12 If each reading of an ambiguous sentence did not correspond to a unique representation, then computational systems could not distinguish readings that some people plainly do. Mentalese must be powerful enough, then, to allow me to disambiguate no trees have fallen over here. (no trees have fallen) (over here). (No trees have fallen over) (here) (Mary Shaw’s example). Each reading of this sentence must correspond to its own mental representation (Fodor, 2008). This is because (once again), computational systems need to distinguish these readings if they are to explain how we treat them differently. What might be surprising is that many of the semantic distinctions present in English must be accounted for by syntactic distinctions in Mentalese. A classic requirement for a good semantic theory, for instance, is the ability to sort out semantic ambiguities (Katz and Fodor, 1964). In the Language of Thought there can be no structural ambiguity between bank qua financial institution and bank qua geographical feature. Each reading of the English sentence I walked to the bank must correspond to a unique mental representation (Fodor, 2008). 11 This leaves the theorist who works on the syntax and semantics of English in a rather awkward position. A natural language is ultimately a feature of a population. It is therefore conceptually possible for us to describe a population’s linguistic behaviour in a theoretical idiom that is independent from our best theories of neural processing. English, for example, could be a feature that is weakly emergent (Bedau, 1997) on the properties of individual brains.13 But while it is conceptually possible to study the syntax and semantics of English without concern for neural processing, this will be a difficult line for LOT theorists to take in practice. It is the burden of the LOT theorist to posit computations defined over structured mental representations in such a way that, sooner or later, all aspects of linguistic behaviour, including productivity and ambiguity, are accounted for. Our best grammatical theories of natural languages are therefore crucial to the LOT theorist because they clarify what abilities must be accounted for. 2.3 Recursion and Hierarchical Complexity While our theory of English grammar is clearly relevant to the LOT theorist, it will still be difficult to formulate a research project that bears on the LOT hypothesis. I need to be more specific about how LOT theorists plan to explain linguistic behaviour. I believe that the LOT theorist will be very sympathetic to claims about the recursive structure of mental representations. My worry, however, is that such claims are problematic on empirical grounds. Another brief detour is 13 I am not concerned with the various notions of emergence that exist in philosophical literature. I am not even committed to the usefulness of Bedau’s (1997) explication of weak emergence. What is crucial to the notion, however, is that weakly emergent properties bear a kind of independence from the “lower level” properties. At the very least, this independence is epistemic. It is perfectly respectable for a science to study weakly emergent properties in a theoretical language of their choosing. They do not first have to provide definitions of their theoretical terms in the theoretical language of “lower level” sciences. In this sense, a majority of psychologists seem comfortable with the idea that the property of mind is weakly emergent. In fact, I am happy to view even weather patterns as emergent properties. 12 therefore worthwhile. One might try to defend claims about the structural complexity of a natural language by arguing that adequate formal representation of a language requires a grammar of certain complexity. Chomsky and Miller (1963) argue, for instance, that natural language exhibits hierarchical complexity, which is supposed to be apparent in relative clauses where the subject is separated from the verb phrase by a center embedded clause. For example, The following conversation, which took place between the two friends in the pump-room one morning, after an acquaintance of eight or nine days, is given as a specimen of their very warm attachment [. . . ] (emphasis added, Austen, 1870). Representation of this class of sentences supposedly requires a formal grammar of a certain complexity (Chomsky and Miller, 1963). One can see how the long-distance dependencies apparent in such clauses pose a sequencing problem for organisms (Hauser, Chomsky, and Fitch, 2002). At the very least, understanding the sentence seems to require sensitivity to the relationship between the main verb and the subject, despite the distance between them. Phrase-structure grammars have traditionally been used to provide structural descriptions for these sentences. Consider a simpler example,14 14 The triangles stand in for a collection of lower branches. 13 Figure 1: A recursively centre-embedded sentence S VP NP T N the S boy who lifted the ball kicked the dog While the LOT theorist will happily accept that English sentences contain noun phrases and verb phrases, it is not obvious that this phrase-structure tree could be read as describing the structure of a mental representation. It is unclear, for instance, whether a formal representation of the syntax of Mentalese would include noun phrases and verb phrases as grammatical categories.15 Fortunately, this stronger claim about the syntax of representations is not important for my argument. Here I am only interested in whether mental representations contain recursive elements. In phrase-structure grammars a grammatical category (or non-terminal symbol such as NP, N, VP, etc.), is a recursive element iff the symbol is dominated by another instance of the category further up the tree. A recursive element is also centre-embedded if it occurs in neither the left-most branch, nor the right-most branch, of the tree (for a full definition, see Chomsky and Miller, 1963). The symbol S (which stands for “sentence”) in figure 1 is a recursive centre-embedded sentence. In what follows I will argue that empirical evidence will press the LOT theorist to accept 1. Some Mentalese sentences contain centre-embedded recursive elements, and perhaps other Mentalese sentences, as constituent parts. 15 It is important to distinguish the claim that some part of a mental representation represents a noun phrase and the claim that noun phrases are part of the structure of mental representations. The latter is a claim about the syntax of LOT, while the former is a claim about semantics. 14 1 should seem plausible to the LOT theorist. This is because long-distance dependencies, so the story goes, present an insurmountable problem for radical associationism, one of the LOT theorist’s most vigorous competitors. Radical associationism is the view that linguistic behaviour can be explained with respect to association-relations between ideas, stimuli, or responses without positing compositional mental representations. The problem with such views, according to Fodor (2008), is that they tend to assume that the associations are formed on the basis of temporal-contiguity. [. . . ] Hume held that ideas became associated as a function of the temporal contiguity of their tokenings. [. . . ] Likewise, according to skinnerian theory, responses become conditioned to stimuli as a function of their temporal contiguity to reinforcers. By contrast, Chomsky argued that the mind is sensitive to relations among interdependent elements of mental or linguistic representations that may be arbitrarily far apart (Fodor, 2008, p. 103). The worry, then, was that the radical associationist could not account for why humans are so good at producing and understanding sentences that exhibit long-distance dependencies (especially if the relationships are very long16 ). I will not discuss this objection in any detail here (for a discussion see, Fodor, 2008). What is important is that these supposed difficulties with radical associationism are often hailed as a chief advantage of the LOT hypothesis. 16 It is important to note here that Fodor (2008) may not have the relatively short ‘long distance’ dependencies of Jane Austen in mind. The productivity of language extends into the domain of long-distance dependencies, and it sometimes seems as if English grammar allows for an arbitrary number of centre-embeddings. The boy, who lifted the ball, kicked the dog. The boy, who lifted the ball, which was lighter than 1 kilogram, kicked the dog. The boy, who lifted the ball, which was lighter than one kilogram, which is lighter than 2 kilograms, kicked the dog. and so on . . . Our linguistic intuitions are supposed to suggest that the sentences picked out here are all grammatical, along with sentences containing any finite number of embeddings. 15 On the one hand, there is no reason why computational relations need to be contiguity-sensitive, so if mental processes are computational they can be sensitive to ‘long-distance’ dependencies between the mental (/linguistic) objects over which they are defined. On the other hand, we know that computations can be implemented by causal mechanisms; that’s what computers do for a living. [. . . ] In short, a serious cognitive science requires a theory of mental processes that is compatible with the productivity of mental representations, and [the computational theory of mind] obliges by providing such a theory (Fodor, 2008, p. 105) Accounting for centre-embedded relative clauses is therefore crucial for the LOT theorist, who wishes to gain ground on the radical associationist (and Skinner). The LOT theorist will do this by explaining the comprehension of long-distance dependencies in English by positing recursive centre-embedded elements in Mentalese sentences. If they do not take up this claim, then it is unclear how the relevant computations can be sensitive to long-distance dependencies. Theorists will, no doubt, be concerned about allowing English sentences to retain a structural complexity over and above that of Mentalese sentences.17 I worry that if the LOT theorist posits recursive elements in Mentalese, then she will also adopt another claim. 2. Production and comprehension of center embedded relative clauses is made possible by computational operations that exploit this syntactic feature of mental representations. 2 seems to be a natural consequence of 1 for the LOT theorist, because comprehending complex sentences will be a matter of composing complex Mentalese expressions to represent the relevant propositions. But any psychological process for the LOT theorist, including understanding, will be a matter of computational transformations of mental representations. 17 For a fuller discussion of the importance of centre-embedded recursive elements to sequence complexity, see Chomsky and Miller (1963). 16 If you think that a mental process - extensionally, as it were - as a sequence of mental states each specified with reference to its intentional content, then mental representations provide a mechanism for the construction of these sequences; and they allow you to get, in a mechanical way, from one such state to the next by performing operations on the representations (Fodor, 1987, p. 145). Our theory of the composition of mental representations, then, constrains our theory of what kinds of computations can be performed on those mental representations. Any posited computational features had better be relevant to computation or there will be no reason for positing the structural feature in the first place. I will argue that viewing the comprehension of a relative clause as the construction of a complex mental representation (with centreembedded recursive elements) places problematic constraints on our theory of sentence processing. 3 3.1 Testing the LOT hypothesis The problem with Recursive, Centre-Embedded, Elements Some theorists have argued that the brain does not employ recursion in order to deal with center-embedded relative clauses. It is an open question as to whether Chomsky’s theories have any biologic validity. Can we really be sure that the relative, prepositional, and adverbial clauses that occur in complex sentences are derived from embedded sentences? [. . .] In short, recursion as defined by Hauser, Chomsky, and Fitch (2002) follows from hypothetical syntactic structures, hypothetical insertions of sentences and phrase nodes, and hypothetical theory-specific rewriting operations on these insertions to form the sentences we actually 17 here or read (Lieberman, 2006, p.359-360). It may not be obvious that Lieberman’s issue here is with a theory of neural processing (although one guesses this is what he means by “biologic validity”). Lieberman’s argument is actually based on a theory of basal ganglia operations or, more specifically, a theory about the operations of segregated circuits that run from various areas of the cortex to segregated regions of the basal ganglia and back. I will postpone evaluating the evidential support for these claims until the discussion of how the LOT theorist should respond to this argument. The argument begins from either of the following premises. I Segregated circuits seem to perform similar computations (see, for example, Alexander and Crutcher, 1990; Delong, 1990) and these segregated circuits perform what can be considered motor, cognitive, and linguistic sequencing18 (Lieberman, 2006). One such circuit is involved in the comprehension of centre-embedded relative clauses. II A single circuit may perform cognitive and linguistic sequencing (see, for example, Lieberman, 2006; Hochstadt et al., 2006). This circuit is involved in the comprehension of centre-embedded relative clauses. The basic problem posed by I is that motor control sequences are often thought to be less complex than linguistic ones. Devlin (2006) assumes, for instance, that motor sequences lack the hierarchical complexity of natural language. But if motor sequences were less complex than linguistic sequences, one would predict that the neural systems underlying complex behavioural sequences would employ different kinds of computations, a proposition that is inconsistent with I. Comprehension of sentences with centre-embedded recursive elements, for the LOT theorist, should require a specific kind of recursion. “[. . . The] recursions [. . . relevant to language] are defined over the constituent structure, [and more specifically the hierarchical constituent structure] of mental representations” (Fodor, 2008, p. 105). If motor control sequences are less 18 for a discussion of what is meant by sequencing see 3.2 18 complex, then the computations allowing for complex motor control should not need to exploit centre-embedded recursive elements in motor representations. One way to avoid this problem is to accept, as Lieberman (2006) does, that motor sequences are hierarchically complex. But even this does not completely dissolve the difficulty. Eventually, the LOT theorist will have to propose a similar computational processes for the two circuits. These computations will have to be defined over centre-embedded recursive elements and usefully accounts for both linguistic and motor sequencing. II is problematic for similar reasons. If II is true, the LOT theorist will have to explain how the same computational system sequences cognitive and linguistic acts. The circuit in question is presumed to be important for the correct apprehension of certain center embedded clauses. For example, patients with Parkinson’s disease have difficulties comprehending a certain class of centre-embedded sentences from structural information alone (Hochstadt et al., 2006). Furthermore, the circuit also seems to be important for cognitive set shifting, a supposed component of executive function (Edabi and Pfeiffer, 2005) If it turned out that a single system underlay the sequencing of both cognitive and linguistic acts, then it would burden the LOT theorist with making a unified account of the circuit’s computations. If the LOT theorist accepts 1, then these computations had better be defined over the hierarchicalconstituent-structure of these representations. LOT theorists must avoid, for instance, modeling the brain as performing cognitive-shifts at relative clauses boundaries (Hochstadt et al., 2006) unless these shifting operations are suitably defined over hierarchical-representations. I propose, then, that I and II are in tension with the claim that (1) the LOT permits recursive, centre-embedded, elements and the claim that (2) the comprehension of relative clauses is made possible by the recursive elements in LOT. 19 3.2 The Strength of the Case Against LOT I am not claiming that the current evidence is sufficient to overthrow the LOT hypothesis. There are, of course, plenty of presuppositions in Lieberman’s argument that a theorist can object to. Note, however, that my case is made if I and II are respectable hypotheses that are in tension with the LOT hypothesis. I have only the burden of showing that I and II can be evaluated by serious research projects. It is an old hypothesis that segregated cortical-subcortical-cortical circuits may employ similar functions. Rather than viewing the basal ganglia as a single functional system taking projections from various cortical locations, evidence suggests that the basal ganglia are part of a number of largely (anatomically) segregated circuits (Alexander et al., 1986). The dorsolateral prefrontal cortex, for instance, seems to comprise a cortical-subcorticalcircuit that is segregated from the motor circuit. Both circuits are composed of relatively segregated neural populations in the striatum, interior globus pallidus, and thalamus (Alexander et al., 1986). It has been suggested that this anatomical division might also be a functional division (Alexander et al., 1986; Alexander and Crutcher, 1990). However, this functional division is to be made on the basis of those circuits’ respective cortical areas. The circuits, themselves, seem to perform fundamentally similar operations. Because of the parallel nature of the basal ganglia-thalamocortical circuits and the apparent uniformity of synaptic organization at corresponding levels of these functionally segregated pathways [. . . ] it would seem likely that similar neuronal operations are performed at comparable stages of each of the five proposed circuits (Alexander et al., 1986, p.361). This poses a serious architectural hypothesis of cortical-subcortical-cortical circuits. What is still needed is a method for empirically testing I and II. We must empirically ascertain whether these circuits have anything to do 20 with language, motor control, or cognition. Two sources of evidence for this comes from pathology and imaging data. i Dysfunction in the Circuits (as seen in Parkinson’s disease and Hypoxia) is associated with a wide array of cognitive, linguistic, and motor difficulties (including a difficulty with center-embedded clauses) (see, for example, Lieberman et al., 1995, 2005; Lieberman, 2006; Hochstadt et al., 2006; Feldman et al., 1990; Lieberman et al., 1992). Furthermore, difficulty in comprehending relative clauses from structural information alone (Hochstadt et al., 2006). ii Circuits are implicated with the performance of similar cognitive (for example, Monchi and Petrides, 2001; Monchi et al., 2004) and motor (Edabi and Pfeiffer, 2005; Doyon et al., 2009, for a review, see) sequencing acts in healthy individuals.19 I will discuss how theorists measure these various kinds of sequencing in a moment. First I will explain why is meant by sequencing. A neural system’s sequencing powers allow an organism to partake of certain complex behavioural sequences. These may be a complex motor sequence, such as that of jumping a hurdle while running, or that of producing a syntactically complex sentence. But even if we were satisfied with motor and linguistic sequencing, the notion of cognitive sequencing remains unclear. We are left to assume that what is meant (or perhaps should be meant) by cognitive sequencing is supposed to be sorted out through the validation of our measures. Lieberman views the Wisconsin Card Sort as measuring a capacity to sequence cognitive acts. This is crucial to his argument, as it ultimately leads him to view a cortical-subcortical-circuit as performing cognitive sequencing. The Wisconsin Card Sort is perhaps the most exhaustively studied measure that Lieberman relies upon, yet it is not obvious that the WCST is a valid measure of cognitive sequencing. It is, however, extremely common 19 I know of no strong evidence for the claim that cortical-subcortical-cortical circuits are involved in the comprehension of centre-embedded relative clauses in healthy individuals. Lieberman (personal communication) does predict, however, that imaging studies should find basal ganglia activation during this kind of linguistic sequencing. 21 to view the WCST as measuring (among other things) a capacity for shifting one’s strategic disposition to behave in response to changing environmental demands (see Flowers and Robertson, 1985, p. 517; Taylor and Saint-Cyr, 1995, p. 283; Edabi and Pfeiffer, 2005, p. 350). Set-shifting might be called upon, for example, when one must shift away from a failing chess strategy. At least some of the measures Lieberman employs are in need of constructvalidity. Hochstadt et al. (2006), for instance, employed the Test of Meaning from Syntax (TMS), which requires extensive further study before validity can be claimed. This test is crucial for his argument as it is presented here since it is taken to measure an individual’s ability to comprehend centreembedded relative clauses from syntactic information alone. Whether the TMS actually does so will, no doubt, be highly contentious, regardless of its early promise (Hochstadt et al., 2006; Pickett, 1998). But if we believe in something like the Chronbach and Meehl (1955) account of constructvalidity, then these difficulties are to be expected. Furthermore, whether I and II can be evaluated by serious research projects will turn on the respectability of construct-validity theory itself. 3.3 Construct-validity Theory Demonstrating the construct-validity of a test is a long and arduous process. As with other forms of test validity, a theorist furthers the construct-validity of a test by demonstrating its relationship with other observables. What makes construct-validity problematic is that it requires theorists to defend the claim that the test measures a hypothetical construct, a theoretical entity that cannot be directly observed (such as aggression). As a result, theorists need to draw upon a number of theoretical propositions in order to determine what kinds of observations should further construct-validity (Chronbach and Meehl, 1955). If a test battery really measures intelligence one might predict that it should predict academic success, though this requires the background assumption that academic performance is related to intelligence. What seems to be the most contentious element of construct-validity is the idea that at the beginning of validation our theoretical constructs may be vague, but further study will make the meaning of our constructs clear. 22 Chronbach and Meehl (1955) relied on a nomological network of scientific laws20 in order to explain how demonstrating construct-validity could influence the meaning of a hypothetical construct. The scientific laws in the nomological network make reference to both hypothetical constructs and observables. For Chronbach and Meehl (1955) hypothetical constructs get their meaning from the role they play in this nomological network. “[Even a] vague, avowedly incomplete network still gives the constructs whatever meaning they do have” (Chronbach and Meehl, 1955, p. 12). A nomological network might, for instance, have laws spelling out the consequences of anger on aggression. It might also have laws capturing the behavioural consequences of both. As we develop the network, our understanding of aggression is supposed to improve.21 Lieberman needs such a view to support his belief that further empirical work will allow him to clarify what is meant by cognitive sequencing. More importantly, construct-validity theory allows theorists to use empirical methods to determine whether the TMS actually measures what it is supposed to. This will amount to defending a nomological network that explains why the TMS tracks the ability to comprehend centre-embedded relative clauses from structural information alone. Of course, one might worry that Chronbach and Meehl (1955) are presupposing a problematic philosophy of language.22 But these concerns are of little relevance to my argument. While some may view construct-validity 20 Chronbach and Meehl (1955) made no assumptions regarding whether the laws were statistical or deterministic. 21 Theorists may want to distinguish possession of a concept and explication of a concept in order to fit construct-validity theory into contemporary semantic theories. If an individual possesses a concept then they will have a mental representation corresponding to the concept. Possessing the concept dog is necessary for Jennifer to believe that “that is a dog.” Possession of a concept does not, however, imply that one can explicate the concept. When asked what dogs are, there is no guarantee that Jennifer can say anything useful. One might, then, view nomological networks as influencing only our ability to explicate concepts. Whether we actually possess a concept in the first place may be a distinct question. 22 According to Wittgenstein, for instance, our psychological concepts are already as clear as they are ever going to be. For the Wittgensteinian, there is no deeper meaning to the word ”reasoning” beyond the conversational practices already associated with it. Since everyday speakers participate in these practices perfectly well, it is unclear how empirical investigation will clear up the meaning of a term. 23 theory as objectionable, this will not be true for LOT theorists. They need some way of measuring mental processes. A theorist cannot simply stare at a brain until a computational theory dawns. Theorists need to develop measures that will track underlying computational processes. They will have no idea what kind of computational relationship will correspond to a belief until the science is well under way. They must rely upon some form of construct-validity theory to justify a methodology of starting from a loose understanding of beliefs as some kind of computational relationship between an organism and its mental representations. In other words, construct-validity theory relieves the scientist of the burden of laying down a computational theory of belief in advance. As theorists fill in the details of the nomological network (and develop a computational theory of mind) they are supposed to come closer to explicating the concept of belief. Assuming that some form of construct-validity theory holds, theorists can continue the process of validating measures of motor, linguistic, and cognitive sequencing. Such measures will allow theorists to make claims about the importance of cortical-subcortical-circuits for various sequencing abilities. Results from such studies will put a burden on the LOT theorist, who hopes to account for complex sequencing behaviours in terms of computations on mental representations. At some point, these theorists must provide accounts of how similar computations will account for structurally complex motor, linguistic, and cognitive behaviour. Furthermore, these computations must make use of the composition of mental representations. If the LOT theorist posits centre-embedded recursive elements to account for the comprehension of relative clauses, then the computations they posit in explaining behaviour had better exploit this composition. If the comprehension of centre-embedded sentences (such as “the boy who was fat kicked the clown”) has more in common with a cognitive set-shifting ability (as tracked by the Wisconsin Card Sort), then the LOT theorist must reconcile such facts with his theory of computation. Reconciling the LOT hypothesis with these facts may still be possible. Theorists can try to explain cognitive set-shifting with computations that are defined over recursive elements. Perhaps cognitive sets, themselves, contain recursive elements.23 Theorists are welcome to respond this way to anoma23 Another option is to argue that Broca’s area forms its own cortical-striatal-cortical circuit (Ullman, 2006) that performs its own unique computations. 24 lous results, but their responses will not be insulated from revision. This evidence may turn out to be a great boon for LOT theorists, as it may allow them to theorize about the composition of mental representations in a disciplined manner. However, it must be emphasized that language is a particularly important domain for the LOT hypothesis. The explanation of linguistic behaviour is supposed to be a task which, though a unsolvable mystery for radical associationism, is a manageable problem for a LOT (Fodor, 2008). But surely if the LOT hypothesis generates serious anomalies in the domain in which it is supposed to be most successful, then the entire hypothesis must come under suspicion. References Alexander, G. and M. Crutcher (1990). Functional architecture of basal ganglia circuits: Neural substrates of parallel processing. Trends in Neuroscience. 13 (7), 266–271. Alexander, G. E., M. R. DeLong, and P. L. Strick (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience 9, 357–381. Austen, J. (1870). Northanger Abbey. London: Spottiswoode and Co. Bedau, M. (1997). Weak emergence. In J. Tomberlin (Ed.), Philosophical Perspectives: Mind, Causation, and World., Volume 11, pp. 375–399. Malden, MA: Blackwell. Carruthers, P. (2006). The Architecture of the Mind. Oxford University Press. Chomsky, N. (1964). Current issues in linguistic theory. In J. A. Fodor and J. J. Katz (Eds.), The Structure of Language: Readings in the Philosophy of Language. Englewood Cliffs, New Jersey: Prentice-Hall, Inc. Chomsky, N. and G. A. Miller (1963). Introduction to the formal analysis of natural languages. In R. D. L. Galanter, R. R. Bush, and E. (Eds.), Handbook of Mathematical Psychology. New York and London: John Wiley and Sons, Inc. 25 Chronbach, L. J. and P. E. Meehl (1955). Construct validity in psychological tests. Psychological Bulletin 52, 287–302. de Saussure, F. (1959). Course In General Linguistics. New York: Philosophical Liberary. Delong, M. (1990). Primate models of movement disorders of basal ganglia origin. Trends in Neuroscience. 13 (7), 281–285. Devlin, J. (2006). Are we dancing apes? Science 314 (10). Doyon, J., P. Bellec, R. Amsel, V. Penhune, O. Monchi, J. Carrier, S. Lehéricy, and H. Benali. (2009). Contributions of the basal ganglia and functionally related brain structures to motor learning. Behavioral Brain Research. 199 (1), 61–75. Edabi, M. S. and R. Pfeiffer (Eds.) (2005). Parkinson’s Disease. CRC press. Feldman, P. L., J. Friedman, and L. S (1990). Syntax comprehension deficits in Parkinson’s disease. Journal of Nervous and Mental Disease 178 (360365). Flowers, K. A. and C. Robertson (1985). The effect of Parkinson’s disease on the ability to maintain a mental set. Journal of Neurology, Neuroscience, and Psychiatry. 48, 517–529. Fodor, H. (2008). LOT 2: The Language of Thought Revisited. Oxford, England: Oxford University Press. Fodor, J. (1975). The Language of Thought. New York: Crowell. Fodor, J. (1981). Introduction. In J. Fodor. (Ed.), Representations: Philosophical essays on the foundations of cognitive science. Hassocks: Harvester. Fodor, J. (1987). Books. Psychosemantics. Cambridge.: MIT press/Bradford Gallistel, R. (1990). The Organization of Learning. MIT Press. Gallistel, R. (2000). The replacement of general-purpose learning models with adaptively specialized learning modules. In M. Gazzaniga (Ed.), The New Cognitive Neurosciences, Second Edition. MIT Press. 26 Hauser, M. D., N. Chomsky, and W. T. Fitch (2002). The Faculty of Language, what is it, who has it, and how did it evolve? Neuroscience 298 (22), 1569–1679. Hochstadt, J., H. Nakano, P. Lieberman, and J. Friedman (2006). The roles of sequencing and verbal working memory in sentence comprehension deficits in Parkinson’s disease. Brain and Language 97, 243–257. Katz, J. and J. Fodor (1964). The structure of a semantic theory. In J. Katz and J. Fodor (Eds.), The Structure of Language: Readings in the Philosophy of Language. Englewood Cliffs, New Jersey.: Prentice-Hall, Inc. Lieberman, P. (2006). Toward an Evolutionary Biology of Language. Cambridge Massachusetts: The Belknap Press of Harvard University Press. Lieberman, P., E. T. Kako, J. Friedman, G. Tajchman, L. S. Feldman, and E. B. Jiminez (1992). Speech production, syntax comprehension, and cognitive deficits in Parkinson’s disease. Brain and Language 43, 169–189. Lieberman, P., B. G. Kanki, and A. Protopappas (1995). Speech production and cognitive decrements on Mount Everest. Aviation, Space and Environmental Medicine 66, 857–864. Lieberman, P., A. Morey, J. Hochstadt, M. Larson, and S. Mather (2005). Mount Everest: A space-analogue for speech monitoring of cognitive deficits and stress. Aviation, Space and environmental Medicine 76, 198– 207. Monchi, O., M. Petrides, J. Doyon, R. Postuma, K. Worsley, and A. Dagher (2004). Neural bases of set-shifting deficits in Parkinson’s disease. The Journal of Neuroscience. 24 (3), 702–710. Monchi, O. and P. Petrides (2001). Wisconsin Card Sorting revisited: Distinct neural circuits participating in different stages of the task identified by event- related functional magnetic resonance imaging. Neuroscience 21, 7733–7741. Pickett, E. (1998). Language and the Cerebellum. Ph. D. thesis, Brown University. Rey, G. (1997). Contemporary Philosophy of Mind: A Contentiously Classical Approach. Blackwell Publishing. 27 Taylor, A. E. and J. A. Saint-Cyr (1995). The neuropsychology of Parkinson’s Disease. Brain and Cognition 28, 281–296. Ullman, M. (2006). Special issue position paper: Is broca’s area part of a basal ganglia thalamocortical circuit? Cortex 42, 480–485. Wittgenstein, L. (1958). Macmillan Company. Philosophical Investigations. 28 New York: The