John P. Broderick
Department of Linguistics
University of South Florida
Tampa, Florida 33620
October, 1974

USAGE VARIATION IN WRITTEN DISCOURSE

Grammarians have long assumed the validity of a distinction between informal and formal language use, but little attempt has been made to verify it empirically or to discover specific characteristics that distinguish the two varieties. Traditional scholars, who have had most to say about usage, have shown little or no inclination to apply empirical techniques to language study. Empirically oriented linguists have until recently considered usage to be beyond the scope of available empirical techniques.

But adequate notions of formal and informal usage varieties are available. John S. Kenyon, in 1948,¹ recounted in some detail the history of the terminology associated with the distinction between formal and informal usage. He separated the notion of level from the notion of function as applied to varieties of English usage, claiming that the term level is a metaphor based on culture-induced value judgments applied to usage, in accord with which usage varieties achieve high or low prestige in a language community. Such value judgments are not applied to functional varieties. Thus, Kenyon notes, formal English is not on a higher level than informal; it is simply appropriate to formal occasions, and informal English is not on a lower level than formal English; it is appropriate to informal occasions.

Martin Joos, in 1962,² called Kenyon's functional varieties styles, and instead of a formal vs. informal dichotomy, he proposed a spectrum of five styles: (1) frozen, (2) formal, (3) consultative, (4) casual, and (5) intimate. All five styles may appear in either spoken or written discourse. In speech, frozen style might show up in a religious sermon, formal style in a legislative speech, consultative style in a university seminar, casual style in a coffee-room conversation, intimate style in a love letter.
In written language, frozen style might appear in legal or diplomatic documents, formal style in a work on literary criticism, consultative style in a business memorandum, casual style in a letter between friends, intimate style in a love letter. Kenyon and Joos provide fairly explicit notions of formal and informal usage. Here, I treat only written English and divide the formal-informal spectrum in two. What I call formal written English corresponds most closely to Joos's formal style, and informal written English most closely to his casual style.

This study is based on samples of writing elicited from 94 college freshmen at a large state university. An elicitation procedure was established to obtain from each subject a sample of informal and another of formal written English. Informal written English was taken to be any and all written performance when no conscious attention is paid to culturally determined aspects of usage and style, such as paragraph structure, sentence variety, and precise diction. Formal written English was taken to be that writing produced when writers are consciously aware that their writing must conform to certain of the above canons of good writing, i.e., when they are precisely aware that they are writing formally (when they are exercising what Fishman has called sociolinguistic communicative competence³).

The subjects heard a tape of a short simulated classroom lecture, and were directed to write down rapidly as much as they could remember immediately after hearing it. Next, they were instructed to simulate an essay test in which a professor insisted not only on the facts (in this case, what they had heard on the tape) but also on logical organization, clarity of expression, appropriate paragraph structure, and good grammar and usage. (The text of the taped lecture appears in the Appendix.) The essays written during the rapid rendition period are the informal essays. Those written for the simulated essay test are the formal essays.
In writing both the informal and the formal essays, the subjects attempted to concentrate on accurately recalling and fully reproducing as much of the taped lecture as they could remember. Since they were so consciously attempting to reproduce the text of the tape, variations from its linguistic structure were taken as highly indicative of each one’s writing competence. And the sets of informal and formal essays they produced were comparable to the extent that one set more or less than another set reproduced some clearly definable linguistic phenomenon from the model taped lecture.

Labov⁴ has pointed out the problem, in studies of language variation, of determining which phenomena in a corpus can most fruitfully be selected for counting and statistical processing, and his comment that syntactic phenomena such as complementation are unsatisfactory⁵ was borne out in this study: no single syntactic process occurred with enough frequency even to calculate a valid mean for a set of essays. However, early analysis did reveal consistent patterns in the lexical structure of the corpus. So it was decided to focus the study on three phenomena: (1) the length in total number of words of the informal vs. the formal essays, (2) the number of verbs appearing in the formal vs. the informal essays, and (3) the subtotal of verbs that were repetitions of verbs the subjects had heard on the model taped lecture.

Although no adequate linguistic definition of the term word exists, it is possible to make use of the graphemic divisions of written discourse for counting words if the purpose of an investigation is comparison of the lengths of paired writing samples.⁶ So, each graphemic element separated from others by a space on the printed page (including symbols, abbreviations, numbers, and hyphenated words) was counted as a word. The taped lecture contained 369 words; the mean for the informal essays was 141 and for the formal essays, 182.
(The formal mean is notably larger than the informal, but both are around half the total number of words in the model essay.)

I use the term verb to include both lexical items ordinarily designated as such and also adjectives. Both traditional grammar and more recent linguistic research provide reasons for doing so. The long accepted division of the sentence into subject and predicate, the latter including both verbs and predicate adjectives, supports this grouping.⁷ And work by transformational linguists like George Lakoff,⁸ who describes many ways in which verbs and adjectives behave alike, also supports this grouping. Thus, in the analysis of the sentences in the taped lecture, items dominated by the category verb or the category predicate adjective in the surface constituent structure were counted as verbs. Counting multiple occurrences, there are a total of 50 such occurrences; there are 40 distinct items. These are listed below (notice that be + predicate nominative, and the expletive there + be, are included in the count):

List of the verbs and the number of times each occurred in the taped lecture

verb          # of occurrences      verb          # of occurrences
ask           1                     enough        2
available     1                     fight         1
BE + NP       2                     find          1
there BE      2                     force         1
become        1                     get           1
cause         1                     have          1
compete       2                     hard          1
cooperate     1                     impossible    1
create        1                     increase      3
decrease      2                     lengthen      1
different     1                     likely        1
eat           1                     limit         1
long          1                     put           1
make          1                     raise         1
need          1                     say           1
obtain        1                     slow down     1
offset        1                     solve         1
predict       2                     stable        1
procure       2                     tend          1
produce       2                     visualize     1

Next, the number of verbs in each of the essays in the corpus was counted and means calculated for the set of informal essays and for the set of formal essays. The mean for the informal set was 20 and for the formal set 24 (compared to fifty on the model essay). Then the verbs repeating the 40 on the model tape were counted in each essay and means calculated for each sub-set; the mean for the informal essays was 12.4, and for the formal essays 12.8 (very close).
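The counting procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory is transcribed from the list of tape verbs, the function names `count_words` and `tally_verbs` are hypothetical, and the identification of verbs in an essay is assumed to have been done by hand, as it was in the study.

```python
# Tape verb inventory transcribed from the list above: item -> occurrences on the tape.
TAPE_VERBS = {
    "ask": 1, "available": 1, "BE + NP": 2, "there BE": 2, "become": 1,
    "cause": 1, "compete": 2, "cooperate": 1, "create": 1, "decrease": 2,
    "different": 1, "eat": 1, "long": 1, "make": 1, "need": 1,
    "obtain": 1, "offset": 1, "predict": 2, "procure": 2, "produce": 2,
    "enough": 2, "fight": 1, "find": 1, "force": 1, "get": 1,
    "have": 1, "hard": 1, "impossible": 1, "increase": 3, "lengthen": 1,
    "likely": 1, "limit": 1, "put": 1, "raise": 1, "say": 1,
    "slow down": 1, "solve": 1, "stable": 1, "tend": 1, "visualize": 1,
}


def count_words(text):
    """Count graphemic elements separated by whitespace (numbers and
    hyphenated words each count once)."""
    return len(text.split())


def tally_verbs(essay_verbs):
    """Return (total verbs, verbs repeated from the tape, non-tape verbs).

    essay_verbs is a hand-annotated list of verb lemmas for one essay;
    finding the verbs in running text is not automated here.
    """
    repeated = sum(1 for verb in essay_verbs if verb in TAPE_VERBS)
    return len(essay_verbs), repeated, len(essay_verbs) - repeated


if __name__ == "__main__":
    # Totals for the model lecture: 50 occurrences of 40 distinct items.
    print(sum(TAPE_VERBS.values()), len(TAPE_VERBS))
    # Hypothetical annotated essay, for illustration only.
    print(tally_verbs(["increase", "decrease", "threaten", "need", "increase"]))
```

Averaging the three values returned by `tally_verbs` over each set of essays would give means of the kind reported above; the essay in the usage lines is invented and is not drawn from the corpus.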
Multivariate statistical tests showed that for the total number of words and the total number of verbs the means for the informal and formal essays were significantly different at the .05 level. But the means for the number of verbs repeated from the tape were not significantly different. This indicates that, when the subjects were directed to revise carefully the informal versions of what they had heard on the tape, and produce a formal version, they used significantly more verbs but not significantly more of the verbs they had heard on the tape.

This significant increase in the number of verbs not contained in the taped lecture would have to be attributable to either one or a combination of the following two factors: (1) the subjects used the same non-tape verbs that they had used in the informal essays, but more often (i.e., they repeated themselves), or (2) the subjects used additional non-tape verbs (verbs not used at all in the informal essays), thus bringing some special lexical resources to bear on the task of writing formally. Careful analysis of all of the non-tape verbs in both sets of essays reveals that the increase is attributable almost entirely to the latter factor: i.e., they were bringing special lexical resources to bear on the task of writing formally. The following display reports on the total number of verbs not from the taped lecture that occurred in the two sets of essays:

Verbs not from the taped lecture that occurred in the corpus: total number of new lexical items, and total number of occurrences of these items.

                                                    Informal essays    Formal essays
Number of new lexical items                         292                411
Total number of occurrences of new lexical items    634                937

The number of distinct new lexical items within these totals is also included in the above display.
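The ratio of total occurrences to distinct items, on which the discussion of this display turns, can be checked directly from the four figures. The sketch below assumes the reading of the display in which 292 items and 634 occurrences belong to the informal essays and 411 items and 937 occurrences to the formal essays; the names `NON_TAPE` and `occurrences_per_item` are hypothetical.

```python
# (distinct non-tape verbs, total occurrences of non-tape verbs), as read from the display.
NON_TAPE = {
    "informal": (292, 634),
    "formal": (411, 937),
}


def occurrences_per_item(distinct, occurrences):
    """Average number of times each distinct non-tape verb was used."""
    return occurrences / distinct


if __name__ == "__main__":
    for label, (distinct, occurrences) in NON_TAPE.items():
        print(label, round(occurrences_per_item(distinct, occurrences), 2))
    # Growth in the inventory of distinct non-tape verbs from informal to formal.
    print(NON_TAPE["formal"][0] - NON_TAPE["informal"][0])
```

On these figures each distinct non-tape verb is used a little over twice on average in both sets of essays, while the inventory of distinct items grows by 119.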
Notice that within the total of 634 non-tape verbs in the informal essays there were 292 distinct lexical items and within the total of 937 non-tape verbs in the formal essays there were 411 distinct lexical items. If the significant increase, in the set of formal essays, of non-tape verbs were due to repetition of items already used in the informal essays, we might expect the ratio of total occurrences of non-tape verbs to the number of distinct lexical items to increase, but this ratio remains essentially unchanged. On the other hand, if the significant increase in the number of non-tape verbs in the formal essays were due to the use of additional items besides those used in the informal essays, then we would expect the above ratio to remain constant, as indeed it does, but the number of distinct non-tape verbs to increase noticeably. The above display shows that this is just what took place: whereas there were 292 distinct non-tape verbs in the set of informal essays, there were 411 in the set of formal essays. The statistically significant increase in the number of non-tape verbs in the formal essays is thus almost entirely attributable to the subjects’ using non-tape verbs in the formal essays which they had not used in the informal essays.

But the display does not reveal that there was a grand total of 371 non-tape verbs in all the essays in the corpus. Many of these verbs were used (by at least one subject) in both informal and formal essays. It would not be surprising if a majority were to appear in both sets. However, this is not the case. Of the total, 156 appear only in the formal essays and 70 appear only in the informal essays. Thus the large increase in the number of non-tape verbs in the formal essays (from 292 to 411) does not just represent adding of verbs to the non-tape verbs already used in the informal essays but also abandoning of many non-tape verbs already used in the informal essays.
(This is especially noteworthy in view of the fact that the subjects had their informal drafts in hand as they wrote the formal drafts.) There are thus good grounds to make the generalization that formal written English differs from informal written English in that it has a larger inventory of verbs than informal English and that many of these verbs tend to occur only in formal texts. But the above findings also indicate that certain verbs seem to be reserved for use in informal texts as well.

How then can we explain the significant increase in overall length (total number of words) in the formal as opposed to the informal essays? In other words, is there anything about the verbs used only in the formal essays which distinguishes them from the verbs used only in the informal essays? To explore this problem, it was decided to compare the list of just those 156 non-tape verbs that occurred only in the formal essays with the list of just those 70 non-tape verbs that occurred only in the informal essays. A look at the subcategorization of the verbs on these lists seemed promising.

First, a few comments about subcategorization. Why, for instance, does John hit the ball seem perfectly acceptable in English but *John hit happy and *John hit that Mary sold the car do not? John disappeared sounds fine, but *John disappeared the ball does not. The lexical information known to users of English in accord with which they would accept and possibly produce the first version I have given in each case, and reject the others, is accounted for in transformational grammars by what Chomsky⁹ has termed strict subcategorizational features. But transformational research on subcategorization is recent and usually highly schematic. There is not available a comprehensive lexicon in the generative tradition. Nor was it feasible to accomplish adequate lexical analyses of the 156 non-tape verbs of the formal essays and the 70 non-tape verbs of the informal essays.
But lexical information provided by A. S. Hornby, E. V. Gatenby, and H. Wakefield in their Advanced Learner’s Dictionary of Current English¹⁰ (henceforth referred to as the ALD) closely approximates Chomsky’s strict subcategorizational features and thus reflects the same characteristics of verbs that would form the basis for analysis of strict subcategorizational features. The introduction to the ALD presents a synopsis of English grammar based on A. S. Hornby¹¹ where English is analyzed according to the following twenty-five verb patterns:

The twenty-five patterns that form the core of Hornby’s grammar as listed on page xv of the ALD (first edition):

Patterns 1 to 19 indicate what are usually called transitive uses of verbs. Patterns 20 to 25 indicate what are usually called intransitive uses. The term conjunctive is used in this list for the interrogative adverbs and pronouns (how, what, where, who, whom, whose, why) and the conjunctions whether and if (when this is used for whether) when they introduce dependent clauses or infinitive phrases.

VP 1     Vb x Direct Object
VP 2     Vb x (not) to x Infinitive, etc.
VP 3     Vb x Noun or Pronoun x (not) to x Infinitive, etc.
VP 4     Vb x Noun or Pronoun x (to be) x Complement
VP 5     Vb x Noun or Pronoun x Infinitive, etc.
VP 6     Vb x Noun or Pronoun x Present Participle
VP 7     Vb x Object x Adjective
VP 8     Vb x Object x Noun
VP 9     Vb x Object x Past Participle
VP 10    Vb x Object x Adverb or Adverbial Phrase, etc.
VP 11    Vb x that-clause
VP 12    Vb x Noun or Pronoun x that-clause
VP 13    Vb x Conjunctive x to x Infinitive, etc.
VP 14    Vb x Noun or Pronoun x Conjunctive x to x Infinitive, etc.
VP 15    Vb x Conjunctive x Clause
VP 16    Vb x Noun or Pronoun x Conjunctive x Clause
VP 17    Vb x Gerund, etc.
VP 18    Vb x Direct Object x Preposition x Prepositional Object
VP 19    Vb x Indirect Object x Direct Object
VP 20    Vb x (for) x Complement of Distance, Time, Price, etc.
VP 21    Vb alone
VP 22    Vb x Predicative
VP 23    Vb x Adverbial Adjunct
VP 24    Vb x Preposition x Prepositional Object
VP 25    Vb x to x Infinitive

It is quite clear from Hornby’s introduction that the above patterns attempt to explicate the same facts of language as Chomsky’s proposals concerning strict subcategorizational features:

    A knowledge of how to put words together is as important as, perhaps more important than, a knowledge of their meanings. The most important patterns are those for the verbs. Unless the learner becomes familiar with these he will be unable to use his vocabulary. He may suppose that because he has heard and seen ‘I intend (want, propose) to come’, he may say or write ‘I suggest to come’, that because he has heard or seen ‘Please tell me the meaning’, ‘Please show me the way’, he can say or write ‘Please explain me this sentence’. Because ‘He began talking about the weather’ means about the same as ‘He began to talk about the weather’, the learner may suppose, wrongly, of course, that ‘He stopped talking about the weather’ means the same as ‘He stopped to talk about the weather’.¹²

The lexical entries in the ALD provide, in addition to pronunciation and meanings, lists of allowed verb patterns (which are analogous to listings of strict subcategorizational features that might accompany lexical entries for verbs in a lexicon of the type Chomsky proposes¹³). In order to use verbs and to structure sentences appropriately around verbs, the user of English must have knowledge of the kinds of facts that Hornby labels verb patterns and Chomsky calls strict subcategorizational features.

The ALD provides verb pattern information only for those verbs traditionally designated as such, and not for adjectives. This phase of the investigation was thus limited to those 115 of the 156 formal essay non-tape verbs that fit the traditional notion of verb, and to those 41 of the 70 informal essay non-tape verbs that fit the traditional notion.
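One way to picture how a list of allowed verb patterns works as lexical information is a small lookup table. The sketch below is an assumption-laden illustration, not the ALD's own format: the names `LEXICON`, `allows`, and `transitivity` are hypothetical, and the pattern lists for the four verbs are the ones this paper's displays give for them.

```python
# ALD patterns 1-19 are transitive uses; patterns 20-25 are intransitive uses.
TRANSITIVE = frozenset(range(1, 20))
INTRANSITIVE = frozenset(range(20, 26))

# Toy lexicon: verb -> set of allowed ALD verb patterns (taken from this paper's displays).
LEXICON = {
    "desire": {1, 2, 3, 11},
    "suggest": {1, 11, 17, 18},
    "avoid": {1, 17},
    "emerge": {21, 24},
}


def allows(verb, pattern):
    """True if the verb's entry lists the given verb pattern."""
    return pattern in LEXICON[verb]


def transitivity(verb):
    """Classify a verb by whether its patterns are transitive, intransitive, or both."""
    patterns = LEXICON[verb]
    has_transitive = bool(patterns & TRANSITIVE)
    has_intransitive = bool(patterns & INTRANSITIVE)
    if has_transitive and has_intransitive:
        return "both"
    return "transitive" if has_transitive else "intransitive"


if __name__ == "__main__":
    # Pattern 2 is Vb x (not) to x Infinitive: 'I desire to come' but not 'I suggest to come'.
    print(allows("desire", 2), allows("suggest", 2))
    # Pattern 17 is Vb x Gerund: 'I suggest coming'.
    print(allows("suggest", 17))
    print(transitivity("emerge"), transitivity("avoid"))
```

The lookup rejects Hornby's example ‘I suggest to come’ because pattern 2 is absent from the entry for suggest, which is the kind of fact a strict subcategorizational feature would also encode.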
The first of the following displays lists the 115 verbs used only in the formal essays and gives the verb patterns they allow. The second of the following displays provides the same information for the 41 verbs used only in the informal essays.

List of 115 traditionally defined verbs not from the taped lecture which were used only in the formal essays. The numbers indicate the ALD verb patterns which each verb allows.

alleviate         1
avert             1,18
avoid             1,17
battle            24
bear              1,2,3,10,17,18,19,21,23,24
benefit           1
calculate         1,10,11,13,15,21,24
catch up          10,23
coexist           1
counteract        1
couple            1,10,18,21
curtail           1
desire            1,2,3,11
discuss           1,13,15
divide            1,10,13,18,21
double            1,10,21,23
drive             1,7,10,13,18,21
ease              1,10,13,18,21
eliminate         1,18
emerge            21,24
enter             1,10,18,21,24
equalize          1
exceed            1
exert             1,10,18
explore           1
fail              1,21,23,24,25
fall              20,21,22,23,24
fill              1,10,18,19,21,23
halt              21,23
handle            1
harness           1
implement         1
intensify         1,21
irrigate          1
lack              1
launch            1,18,23
mount             1,21,23
move              1,10,11,18,21,23
negate            1
observe           1,4,5,6,11,13,15,16,21
oppose            1,18
outgrow           1
outrun            1
owe               1,18,19,21,24
persist           21,24
point out         10
protract          1
range             1,10,23,24
recognize         1,4,11
reconcile         1,18
reenter           1,10,18,21,24
replace           1,18
resort (to)       24
risk              1,17
run out (of)      23
search            1,10,23,24
seek              1,10,24
shift             1,10,18,21,23,24
spread            1,10,18,21,23
tear              1,7,10,21,23,24
threaten          1,2,18,21

accomplish        1
attain            1,24
baffle            1
balance           1,18,21,23
boom              10,21,23
broaden           1,10,21
conquer           1
defeat            1
design            1,2,10,18,21
end               1,10,21,23,24
envision          1
fit               1,10,18,21,23
happen            21,24,25
influence         1
lick (beat)       1
notice            1,5,6,11,13,15,21
participate       21,24
pass              1,10,18,19,21,23,24
perish            1,21,23
plague            1
pose              1,18,21,23,24
present           1,10,18
prove             1,4,11,22,24,25
refuse            1,2,19,21
serve             1,18,19,20,21,23,25
succeed           1,10,21,23
suggest           1,11,17,18
surpass           1,18
turn              1,7,10,18,21,22,23,24,25
wage              1
wait              1,18,20,21,24,25
watch             1,5,6,10,13,15,20,21,22,23,24,25
witness           1,21,24
yield             1,10,18,21,24

accommodate       1,18
achieve           1
amount (to)       24
approach          1,21
attempt           1,2,17
cancel out        1
compare           1,18,21
compel            1,3
conclude          1,2,4,11,18,21,24
cope              21,24
die               21,22,23,24,25
foresee           1,11,15
indicate          1,11
inhibit           1,18
insure            1,18
join              1,10,18,23,24
meet              1,10,21,23,24
outweigh          1
tend (crops)      1
wipe out          1,7,10,18

List of 41 traditionally defined verbs not from the taped lecture which were used only in the informal essays. The numbers indicate the ALD verb patterns which each verb allows.

affect            1
attack            1
band (together)   10,12
cite              1
contend           11,24
discourage        1,18
divide            1,10,18,21,23
drink             1,10,21,24
hamper            1
incur             1
invent            1
lessen            1,21
note              1,11,13,15
outbalance        1
outstrip          1
practice          1,17,18,20,21,24,25
react             21,23,24
stem              1
vary              1,21

acquire           1
assume            1,4,11
base              18
challenge         1,3,18
check             1,10
conflict          21,24
enhance           1
linger            10,21,23
overpower         1
prevail           21,24
right             [missing]
speed up          1,10,21,23
spring            1,18,20,21,23,25
strive            21,24,25

farm              1,21
figure (out)      10
furnish           1,18
suspend           1,18
understand        1,2,11,13,15,17,21
win               1,3,10,18,19,21

overcome          1
worsen            1,21

Analysis of the information contained in the above displays reveals a number of interesting tendencies: (1) of the 115 verbs used only in the formal essays, half allow three or more patterns; of the 41 verbs used only in the informal essays, less than a third allow three or more patterns; (2) of the 115 verbs used only in the formal essays, more than 40% allow both transitive and intransitive verb patterns; but of the 41 verbs used only in the informal essays just slightly over 30% allow both transitive and intransitive patterns; (3) the verbs used in the formal essays show a greater tendency to allow longer patterns, for example patterns 10, 18, and 24.

I have attributed the significant increase in the number of verbs in the formal essays of my corpus to the fact that the formal essays contain more non-tape verbs, and different non-tape verbs, than the informal essays.
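The first two tendencies reported from the displays are simple tallies over the pattern lists, and a minimal sketch of that tally follows. It is illustrative only: the function name `summarize` and the two six-verb samples are assumptions chosen for brevity (the pattern lists themselves come from the displays), so the output does not reproduce the percentages reported for the full lists of 115 and 41 verbs.

```python
# ALD patterns 1-19 are transitive uses; patterns 20-25 are intransitive uses.
TRANSITIVE = frozenset(range(1, 20))
INTRANSITIVE = frozenset(range(20, 26))


def summarize(lexicon):
    """Count verbs allowing three or more patterns, and verbs allowing
    both transitive and intransitive patterns."""
    three_or_more = sum(1 for patterns in lexicon.values() if len(patterns) >= 3)
    both = sum(
        1
        for patterns in lexicon.values()
        if set(patterns) & TRANSITIVE and set(patterns) & INTRANSITIVE
    )
    return {"verbs": len(lexicon), "three_or_more": three_or_more, "both": both}


# Small samples drawn from the two displays, for illustration only.
FORMAL_SAMPLE = {
    "alleviate": [1],
    "avoid": [1, 17],
    "calculate": [1, 10, 11, 13, 15, 21, 24],
    "emerge": [21, 24],
    "fail": [1, 21, 23, 24, 25],
    "suggest": [1, 11, 17, 18],
}
INFORMAL_SAMPLE = {
    "affect": [1],
    "contend": [11, 24],
    "conflict": [21, 24],
    "note": [1, 11, 13, 15],
    "react": [21, 23, 24],
    "vary": [1, 21],
}

if __name__ == "__main__":
    print(summarize(FORMAL_SAMPLE))
    print(summarize(INFORMAL_SAMPLE))
```

Running `summarize` over the complete pattern lists, rather than these samples, would be the way to recompute the proportions given in points (1) and (2).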
I have raised the question of how the non-tape verbs used only in the formal essays might differ from those used only in the informal essays, and I have sought an answer in the subcategorization of the verbs on each list. As a result, I can propose the following as reasonable assertions concerning significant aspects of the difference between informal and formal written English: (1) The inventory of verbs available for use in formal written English is larger than that for informal written English; (2) Certain verbs tend to be reserved for use only in formal and certain in informal written English; (3) Verbs used in formal written English tend to have a greater syntactic potential than verbs used in informal written English: they allow more verb patterns and longer verb patterns than verbs used in informal written English. Furthermore, to the extent that Hornby’s verb patterns are analogous to strict subcategorizational features, we may say that verbs used in formal written texts have more such features and longer features in their lexical entries.

APPENDIX

Text of the taped lecture

To raise the food and other farm products people need, a little more than one and one quarter acres of arable land was available per person for the world as a whole in 1955. If, as some experts now predict, the world population increases by the year 2000 to approximately six billion persons, which is at best a guess, the amount of arable land per person will decrease to just over one-half acre. There will be some new land under cultivation, of course, and productivity will increase through the use of better fertilizers, etc. But advances in medical knowledge and in public health will lengthen the average life span and tend to offset other factors which slow down the rate of increase. As the population increases, therefore, the size of the piece of land from which each obtains his food has to decrease.
The countries of the world have to solve serious problems if their peoples are to get enough food in the future, even though, in some parts, surpluses of a few products create problems that are different. It is not enough to say that men are always competing among themselves for food and that there never was a time when every man had enough to eat. Competition for food or for the land upon which to produce it or for the access for the waters from which to procure it causes many conflicts among men, tribes, and countries. The gigantic task of feeding today’s population or a much larger population, which is likely in 50 or 100 years, is hard to visualize, and we ask ourselves: “How long will it be before limitations on the production of food forces population numbers to become stable?” We can put the dilemma this way: Compete (maybe even fight) for food and the places producing food or cooperate in raising and procuring enough food for everybody. At the same time we also have to find means with which to limit the growth of populations. These are some of the uncertainties which make it impossible to predict the population-food relationships in any one country, or in the world as a whole, in some distant year, or even in the year 2000.

NOTES

1 John S. Kenyon, “Cultural Levels and Functional Varieties of English,” College English, 10 (1948), pp. 31-36.
2 Martin Joos, The Five Clocks (Bloomington: Indiana University Research Center in Anthropology, Folklore and Linguistics, 1962).
3 Joshua A. Fishman, “The Sociology of Language,” in Language and Social Context, ed. Pier Paolo Giglioli (New York: Penguin, 1972), p. 49.
4 William Labov, Sociolinguistic Patterns (Philadelphia: University of Pennsylvania Press, 1972); see especially chapter 8.
5 Labov, 191 and 247.
6 Gustav Herdan, Quantitative Linguistics (London: Butterworths, 1964), pp. 71-76.
7 See, for example, George O. Curme, Principles and Practice of English Grammar (New York: Barnes and Noble, 1947), pp. 105-106.
8 George Lakoff, “On the Nature of Syntactic Irregularity,” in Mathematical Linguistics and Automatic Translation, ed. Anthony Oettinger (Cambridge: Harvard Computation Laboratory, 1965), pp. A1-A19.
9 Noam Chomsky, Aspects of the Theory of Syntax (Cambridge: MIT Press, 1965), pp. 90-127.
10 A. S. Hornby et al., The Advanced Learner’s Dictionary of Current English (London: Oxford University Press, 1963).
11 A. S. Hornby, A Guide to Patterns and Usage in English (London: Oxford University Press, 1954).
12 Hornby et al., p. v.
13 Chomsky, 164-192.