Chomsky: Syntactic Structures (1957), Chaps. 1-3 (pp. 11-25)
Sept. 2008

Chapter 1. Introduction.
Syntax “has as its goal the construction of a grammar that can be viewed as a device … for producing sentences of the language under analysis.”
“the problem of determining the fundamental underlying properties of successful grammars.”
“ultimate outcome: … a theory of linguistic structure in which the descriptive devices utilized in particular grammars are presented and studied abstractly, with no specific reference to particular languages. … a general method for selecting a grammar for each language, given a corpus of sentences of this language.”
Fundamental notion: ‘linguistic level’ (e.g., phonemics, morphology, phrase structure, …)

Chapter 2.
Language: “a set (finite or infinite) of sentences, each finite in length and constructed from a finite set of elements.”
“each natural language has a finite number of phonemes (or letters) and each sentence is representable as a finite sequence of these phonemes (or letters) though there are infinitely many sentences”
“we assume intuitive knowledge of the grammatical sentences of English and ask what sort of grammar will be able to (produce) these in an effective and illuminating way.”

Explicating “grammatical in English”:
“sufficient to assume a partial knowledge of sentences and nonsentences”
“assume certain sequences are definitely sentences and certain others are definitely nonsentences. In intermediate cases we shall … let the grammar itself decide when the G is set up in the simplest way so that it includes the clear Ss and excludes the clear non-Ss.”
“for a single language, in isolation, this provides only a weak test of adequacy since many different Gs may handle the clear cases properly.
(But) this can be generalized to a very strong condition, however, if we insist that the clear cases be handled properly for each language by grammars all of which are constructed by the same method.”
This is reasonable “since we are interested not only in particular languages but also in the general nature of Language”

How to separate grammatical from ungrammatical? Not by any of the following:
1. Not by simple comparison with the corpus (we must project beyond the observed corpus to the infinite set of grammatical sentences).
2. Not by meaningfulness. See:
   a. Colorless green ideas sleep furiously. (grammatical but meaningless)
   b. Furiously sleep ideas green colorless. (ungrammatical and meaningless)
3. Not by statistical probability.
   a. I saw a fragile whale. (almost zero frequency of occurrence, yet grammatical)
   b. I saw a fragile of. (almost zero frequency of occurrence, and ungrammatical)
So, “grammar is autonomous and independent of meaning and probabilistic models give no particular insight into some of the basic problems of syntactic structure.”

Chapter 3.
Finite state machine: a set of states (including Start and End states) and a set of transitions (each transition produces a word and moves the machine to the next state).
Terminal vs. nonterminal symbols.
A context-free grammar:
   Nonterminal set = {S} (S is the start symbol), Terminal set = {a, b}
   Productions:
   1. S -> aSb
   2. S -> ab
   Generates: ab (by Production 2), aabb (by Production 1 once, then Production 2), aaaabbbb (by Production 1 three times, then Production 2); hence it generates the strings a^n b^n, n ≥ 1 (p. 23).
Should English have limits on how many repetitions of a rule (production) are possible? No: “In general the assumption that languages are infinite is made in order to simplify the description. If a grammar does not have recursive devices (closed loops) it will be prohibitively complex.”
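
To make the two productions of the Chapter 3 grammar (S -> aSb, S -> ab) concrete, here is a minimal Python sketch. It is only an illustration, not anything from Chomsky's text: the function names derive and generates are hypothetical, and the code assumes the grammar has the single nonterminal S and the terminals a and b.

```python
def derive(n):
    """Return the derivation steps for a^n b^n (n >= 1).

    Production 1 (S -> aSb) is applied n - 1 times, then
    Production 2 (S -> ab) ends the derivation.
    """
    if n < 1:
        raise ValueError("this grammar generates no string for n < 1")
    current = "S"
    steps = [current]
    for _ in range(n - 1):
        current = current.replace("S", "aSb")  # Production 1
        steps.append(current)
    current = current.replace("S", "ab")       # Production 2
    steps.append(current)
    return steps


def generates(s):
    """Return True if the grammar generates s, by undoing the productions."""
    if s == "ab":                               # Production 2
        return True
    if len(s) >= 4 and s[0] == "a" and s[-1] == "b":
        return generates(s[1:-1])               # strip one layer of Production 1
    return False
```

For example, derive(3) returns the steps S, aSb, aaSbb, aaabbb, and generates("aabb") is true while generates("abab") is false. Because Production 1 reintroduces S on its right-hand side, nothing in the grammar bounds n; this is the kind of recursive device the closing quotation refers to.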