ICAMET (Innsbruck Computer Archive of Machine-Readable English Texts)

Manual of the Innsbruck Middle English Prose Corpus
Version 2.4

Manfred Markus
English Department, University of Innsbruck, Austria

© Manfred Markus, Innsbruck, 2010

Preface 2010

For a table of contents of this file, please open the menu card LAYOUT and then DOCUMENT STRUCTURE in your WORD program. You will then see the chapters and subchapters of this manual in a separate window.

The following manual is an abridged and substantially revised version of the published booklet of 1999 (Manfred Markus, Manual of ICAMET. Innsbrucker Beiträge zur Kulturwissenschaft. Anglistische Reihe 7. Wien: Braumüller Verlag). The new version refers to the Innsbruck Prose Corpus only; the Innsbruck Letter Corpus will be published separately elsewhere.

The sampler provided on the distributable CD ROM comprises only 139 text files of the complete corpus, both in a doc-version without “cocoa” headers and in an rtf-version with the headers. The latter version has been added for users who do not presently work with Windows XP, as well as for users of Raymond Hickey’s program Corpus Presenter (see below), which accepts only rtf-files for analysis. The Innsbruck Prose Corpus as a whole now consists of 159 files, all of which, due to copyright restrictions for some of the texts, are accessible only on a CD ROM at the English Department of the University of Innsbruck. Chapter 5.2. below provides information on which files of the total corpus could not be included in the present sampler.

The earlier sampler of Middle English prose published on the 2nd edition of the ICAME CD ROM in 1999 was in need of revision for two main reasons. The first is that the number of text files was considerably smaller than in the subsequent versions, which have profited from the recent permission of the Early English Text Society to include in the Innsbruck Prose Corpus 23 books published by the EETS and still under copyright protection.
The exact titles are identified in the survey table below (5.2.). I am most obliged to the EETS for this act of benevolence. The other reason why the first version of the Innsbruck Prose Corpus had to be revised is the fact that some of the special characters used in the original DOS files came out modified into “hieroglyphs” in the WinWord version published by the HIT Centre in Bergen in 1999. I apologise for this defect and am now pleased to announce that the problem of the fonts has been solved in the present version.

As regards the copyright barrier, I am still hopeful that Oxford University Press will allow the "fair academic use" and distribution of the digitised files of EETS books still under copyright protection.

As in the past, the Innsbruck Middle English Prose Corpus is here offered to the international community of researchers for scholarly purposes on a non-profit basis. Whoever uses the corpus as a whole or in parts for publications is kindly asked to send me a message of information about this fact.

The rtf-versions of the present sampler are furnished with COCOA headers. These headers allow analysts to filter out files according to any of the 26 parameters offered for this purpose. For the implementation of the headers in the two Innsbruck corpora I am much obliged to Raymond Hickey, whose program Corpus Presenter was published in book form (cf. Hickey 2003). I would also like to thank Hans-Jürgen Diller, who has been one of my most critical users of the corpus over the last few years by referring me to mistakes and incongruities, particularly in the headers. As far as these are concerned, I am also much obliged to one of my postgraduate students in Innsbruck, Andrea Leonie Krapf, for helping me to proofread them. Lack of time and money caused the headers to be anything but perfect and complete.
While there have been experiments with normalising some of the texts of the Innsbruck Prose Corpus, we have not been able to do this systematically for a larger number of texts. This task, which in principle can be solved with the help of recent programs, such as Corpus Presenter (Hickey 2003), has to be left to individual users’ own initiative.

Manfred Markus, University of Innsbruck, Dept. of English, January 2010

Preface (05/1997)

ICAMET, the INNSBRUCK COMPUTER ARCHIVE OF MACHINE-READABLE TEXTS, has three parts:

(1) the INNSBRUCK PROSE CORPUS (1100 to 1500)
(2) the INNSBRUCK LETTER CORPUS (1386 to 1688)
(3) the INNSBRUCK VARIA CORPUS (still in preparation)

The three sub-corpora are fairly unequal. The PROSE CORPUS consists of 159 full-text data bases, usually complete books, of nearly 6 million words altogether. The LETTER CORPUS, considerably smaller, contains 254 letters of a total of 110,307 words (2006: 337 letters of a total of 146,183 words). The VARIA CORPUS is a potpourri of a dozen or so translated, normalised, tagged or alternative versions of the texts, particularly of those in the Prose Corpus; also, some texts ended up in the varia section because they belatedly turned out to be in verse or post-1500. This manual is only concerned with the PROSE CORPUS (ICAMET proper, so to speak).

ICAMET was supported by the Austrian "Forschungsfonds" in its initial stage, namely for two years from 1992 to 1994. While I am most obliged for this support, the reason why the project could not thrive as planned has to do with the reduced amount of funding from the very beginning and the abrupt cancellation in the late summer of 1994. The original schedule for the Innsbruck full-text data base of prose was to compile considerably more files than can now be presented to the international community and also more representative ones. Moreover, we were confident of getting full copyright permission by the EETS for fair academic use.
As things have developed, the Innsbruck Prose Corpus is as yet incomplete and more fragmentary than intended. Some text types are not at all or insufficiently represented, as are the 12th and 13th centuries compared with the 14th and 15th, and the copyright question, concerning about 2/3 of our texts, has not fully been solved yet and has caused the unpleasant delay in the publishing of this corpus. Unlike many commercial distributors of CD-ROMs, we have not deliberately selected old editions only to bypass the copyright problem. As a result, for the time being some of our texts are not freely available, whether on CD-ROM or on the Internet. But all texts of the Innsbruck Prose Corpus, including those that are still under copyright protection, can be used by researchers in Innsbruck itself.

The regularly updated details concerning the availability of the Innsbruck Prose Corpus can be found on our Internet page (address: www2.uibk.ac.at/fakultaeten/c6/c609/projects/icamet).

Innsbruck, May 1997
Manfred Markus

Acknowledgements

While we are still hoping for access to all the books of the Early English Text Society included in the Innsbruck Prose Corpus, we highly appreciate the permission of the Council of EETS to use a subset of the EETS volumes still under copyright protection, namely 23 (see the table under 5.2.); for this I am particularly obliged to Richard Hamer of the EETS. Moreover, the distribution of our corpus texts was graciously licensed by other publishers, in particular, the Universitätsverlag C. Winter, Heidelberg, for their series Middle English Texts (General Editors: Manfred Görlach and O.S. Pickering), and James Hogg for some volumes of the Salzburg Series.
I would also like to thank the following publishers for granting permission to include single texts in our corpus: Almqvist & Wiksell's Boktryckeri, Upsala; Eynar Munsgaard Publishers, Copenhagen; Milford Publishers, Oxford & London; Garland Publishing, New York & London; Martinus Nijhoff, The Hague; The University of Leeds; Oxford University Press; and Cambridge University Press.

I am also very much obliged to many helping hands, without which, needless to say, the project would not have materialised. Some of them were major members of the project, but most of them cooperated on a short-term basis (so-called "trainees" and "tutors"), and they have been too many over the years to be all mentioned individually. My particular thanks, however, go to Roland Benedikter, Andrea Kruckenhauser, Robert McColl Millar, Ulrike Mühlbacher-Nadenik, Paul Perger, Eva Maria Rainer, Elliot Schreiber, Gerda Schütz, Maturot Sinavarat, Aloys Wechselberger, and Claudia Herzog.

I would also like to thank both the Department of English and the Faculty of Letters of the University of Innsbruck for their infrastructural support and the hardware needed in a project like the Innsbruck Middle English Prose Corpus. Moreover, I am very much obliged to the "EDV-Zentrum" and the "Subzentrum" of the University, in particular the late Georg Anker, for software and know-how support. Of the various other people that were of help "warewise" (i.e. in matters of hard- or software), I would like to mention Mario Andriollo and Josef Wallmannsberger, both temporary Innsbruck colleagues of the English Department. Moreover, I am aware of having profited a great deal from several student participants of my classes on computer philology, taught in Innsbruck over the last twenty years; in this dynamic academic field, where teachers can particularly learn from their students, it is not a mere gesture that I express my thanks to them.
As researchers in residence, some of the users of the full version of the Innsbruck Prose Corpus, in particular Hans-Jürgen Diller, have obliged me by giving us feedback concerning mistakes in the corpus; we are most thankful for this kind of cooperation.

Finally, I have often felt inspired and encouraged during my participation in various conferences of the last few years, among these a series of ICAME conferences and, as far as historical linguistics is concerned, several conferences initiated or even convened by Matti Rissanen of the University of Helsinki; so not the least of my thanks go there and to the ICAME organizers. It also has to be acknowledged that The Helsinki Corpus of English Texts, initiated by Matti Rissanen and Merja Kytö, was the first of its kind, thus paving the way for many other historical English electronic corpora to follow.

0. Introduction: organisational

The method of compiling this corpus was basically the following: the selected text was first scanned, either directly from the book or, in some cases of bad quality of pages (e.g. uneven patches), from a xerox copy made for the purpose. The scanner was a Siemens Highscan 400 machine; the programme for font recognition was, after an initial phase (1991) of experimenting with OCR Recognita and Optopus, PROLECTOR. It is not so easy to handle, but flexible as to the special needs in view of "badly" printed old books¹.

The scanned texts were all manually corrected by two people and given a normalised format/layout. Correction always meant reading against the original edition at least once; shorter texts up to 100 pages were read independently twice. With longer texts I functioned as the second corrector by at least checking the degree of reliability of the first correction. The work of correction partly included applying global commands and WORDCRUNCHER index lists (cf. Markus 1994).

While all contributors to the project gave their best, no hundred percent reliability could be reached. Almost all the members of the changing team were non-specialised in Middle English orthography. But then even well-made present-day books are not a hundred percent perfect. All in all, the Innsbruck Middle English Prose Corpus and the Innsbruck Letter Corpus, far from being a mechanical reproduction of edited texts, are reliable enough to be used as bases for scholarly work. Benevolent users are invited to report any mistakes found in the corpus texts to my e-mail address: manfred.markus@uibk.ac.at.

¹ The 2005 standard is represented by FineReader 7.0.

1. Principles of compiling the PROSE CORPUS

The Innsbruck Prose Corpus is a full-text data base, aiming at target groups of users who, unlike those of the Helsinki Corpus of English Texts (cf. Kytö 1993), are not only interested in extracts of texts, but in their complete versions. The corpus thus allows literary, historical and topical analyses of various kinds, particularly studies of cultural history, but it also invites linguists to raise questions, e.g. of style or rhetoric, for which one would want a lengthier piece of text, or its beginning and ending.

The corpus is a selection of Middle English prose. The texts are, therefore, relatively free from poetic stylisation (in spite of the occasional role of the alliterative and formulaic tradition). The language of this prose can be assumed to be closer than that of poetry (if not close) to the way language was really used in speech. It also represents the many varieties, both in speaking and spelling, of Middle English as used in special text types and for special occasions.

1.1. Overall structure

The corpus comprises 159 files, representing 131 texts as found in scholarly editions, widely those published by the Early English Text Society (EETS). In line with editing habits, texts normally have book length, and they are accordingly identified in our corpus by an individual number.
Only in some exceptional cases, as with Caxton's collected prologues ("caxtpro"), have we given one ID-number to a collection of (usually short) texts. On the other hand, many-volume texts (like Pecock's The Donet) which were published separately under different names are stored volume-wise in the corpus and have accordingly been given different names and ID-numbers. Text versions based on different manuscripts and presented synoptically in some editions have been given one number only, which is, however, specified by additional letters a, b, c, etc.

1.2. File names

All the files have names within the 8-letter DOS mode. If the author is known, the first letters of the name of a file usually suggest the author's name before the title of the work, thus myrcseve means "John Myrc, Seven Questions...". In the case of works by Chaucer, however, the question of the manuscript used for an edition has come in. Here the file name usually includes reference to a manuscript or edition. So persske is the name for Chaucer's Parson's Tale in the edition by Skeat.

The more common case is that of anonymous works. Here the file's name suggests either its title only or the title plus the manuscript; thus, ancrenero means "Ancrene Riwle, MS Nero". In the rare cases when author, title of work and manuscript have to be named, the manuscripts are merely suggested by the last letter (a, b, c, etc.). By the same token, texts which were too long for saving on one diskette were split into different parts identified by final running numbers after the proper name. Thus, trevdia1 means "Trevisa, Dialogus inter militem et clericum, Part 1".

The files of the corpus, arranged alphabetically and according to other principles, are listed in chapter 5 below.

1.3. The compiler's dilemma: authenticity vs. retrievability

On the one hand, the specific characteristics of a manuscript as reflected in the used edition are to be presented on the screen as authentically as possible, with all the deficiencies or alleged deficiencies of the manuscript. On the other hand, medieval scribes and editors of medieval texts reveal different policies of encoding, so that the output is anything but consistent. In order to make things retrievable for the computer, some of the practices of both scribes and editors have to be emended and many of the coincidental characteristics of fonts, format and layout have to be given up. This may be illustrated by two examples.

Medieval scribes, and partly the editors with them, did not always pay attention to the unity of words, for example, by cutting them to pieces from one line to the next without hyphenation. While it is not entirely clear whether strings such as there fore are to be considered as one word or two in medieval texts, we have linked obvious constituents of words, marking the intervention by an underscore hyphen in parentheses (_). We have used an underscore hyphen without parentheses in those cases where a word has been split by the editor through the breaking of line and has been syllabified. We were thinking of lexical analyses and crunching programs, which, without our intervention, would index words like house/hus and bond where the text really has or intends husbond.

On the other hand we did not want to give up the line-breaking of editors altogether. Keeping the lines of the editions not only made our task of proofreading much easier, but will also allow future users easily to check texts against the editions used in the given cases. Moreover, sticking to frozen lines (preferably those of the editions) will allow a later semi-automatic production of normalised interlinear lines.
While these are not (yet) provided in the present version of the Prose Corpus, I have occasionally reflected the possibility and necessity of such additional normalised lines for Middle English prose texts (cf., for example, Markus 1997).

A second example of unavoidable emendation is the inconsistent use and representation of initials in manuscripts and editions. While ignoring them would be a real loss in view of their possible function as markers of the beginning of chapters, the editors' various ways of marking, or commenting on, different types and sizes of initials had to be given up for the sake of their being retrievable. If we had reproduced a dozen or so different ways of marking initials, how could the user have been in a position to find them? So we have used just one coded marker for initials, namely <b> (for "boldface") (see below 4.1.).

2. Survey of used codes

When the compilation of the corpus was started, we had to use different modes of producing characters not available on the keyboard or in lower ASCII format. In the meantime, the present versions of WORD have most of the characters needed for ICAMET on the keyboard, or they allow access to them in the INSERT-SYMBOL routine. So most of the problems concerning characters raised in the first edition of this handbook have been solved. This is true of the ash, the thorn and the eth, both upper and lower case, and also the accentuation of all vowels including y is no longer a problem.

Only one character is, for all I know, not generally available yet, namely the yogh (ȝ). As its presentation here shows, the real problem is not its production on screen - I have taken it from the collection of fonts in "Times New Roman Phonetics", which is part of my WORD 2003 package. But if users accidentally change their fonts, for example to Courier New, our yogh comes out wrongly as a z.
So the true problem involved is to use encodings that are equally reproduced in different worlds of fonts and yet are not too non-iconic. In the present case of yogh I decided for a simple solution by using the number 3. I opted against the circumscriptive encoding used by the Helsinki Corpus (+g) because "+" has been used in the Innsbruck Prose Corpus for other purposes; moreover, "+g" for the yogh makes a text difficult to read.

Users outside the Latin (Western) script culture may still find one or other special character misrepresented in their text processing surroundings. The problem is one of the wide lack of international encoding norms. All I can say is that we checked our special characters in WINDOWS 98 and found them correctly transferred when WINDOWS XP emerged.

2.1. Use of lower ASCII

Though modern keyboards and WINDOWS have made ASCII superfluous, Table 1 below seems helpful for the sake of the functions of the signs rather than their mode of production. The lower ASCII codes are internationally coordinated, but unfortunately limited in range. Apart from the characters of the present-day (English) alphabet they comprise the main punctuation marks and diacritical signs. In the present corpus most of these codes represent themselves, but some of them, printed in boldface in Table 1, are used for specially defined coding purposes (for details see 3.1. below).

ASCII  char.  name/main function
--------------------------------------------------------
 20    ¶      paragraph sign²
 33    !      exclamation mark
 34    "      quotation mark/double quote
 38    &      ampersand sign (meaning 'and')
 39    '      apostrophe/single quote/accent aigu³
 40    (      opening parenthesis
 41    )      closing parenthesis
 42    *      asterisk (general text marker)
 43    +      plus/Christian cross
 44    ,      comma
 45    -      minus/hyphen (dash: --)
 46    .      period/full stop
 47    /      slash
 58    :      colon
 59    ;      semicolon (;; for inverted semicolon)
 60    <      opening claret bracket (<-- is used as an arrow)
 61    =      equals/superscript markers
 62    >      closing claret bracket (--> is used as an arrow)
 63    ?      question mark
 91    [      opening bracket
 92    \      back slash
 93    ]      closing bracket
 94    ^      circumflex
 95    _      underline
 96    `      accent grave
123    {      opening brace
124    |      vertical bar
125    }      closing brace
126    ~      tilde

Table 1: Lower ASCII characters

² Not to be merged with the usual return sign at the end of paragraphs, which unfortunately looks the same on the screen, but cannot, of course, be produced in the middle of lines filled with text.
³ When an accent sign, which looks rather like an accent aigu, is not above a character, but has to follow it (like in the case of <y> in the DOS version of the corpus), the computer produces a vertical stroke, which is identical with the apostrophe sign and the quotation mark.

2.2. Use of ASCII 128 to 255

In case any higher ASCII encodings were overlooked in our revision (which, we hope, is not the case), Table 2 may turn out to be helpful:

ASCII  char.  name/main function
------------------------------------------------------
145    æ      l.c.⁴ ash
146    Æ      u.c.⁵ ash
156    £      pound sign
164    ñ      n with superscript tilde
168    ¿      l.c. yogh (makeshift)
191    ¿      u.c. yogh (makeshift)
226           u.c. eth (makeshift)
232           l.c. thorn (makeshift)
233           u.c. thorn (makeshift)
235           l.c. eth (makeshift)

Table 2: Higher ASCII characters originally used in the Innsbruck Prose Corpus

⁴ Abbreviation for "lower case" (i.e. small).
⁵ Abbreviation for "upper case" (i.e. capital).

The makeshift characters are, at least in the majority of cases, somewhat suggestive of the characters really intended.

3. Special issues of encoding

3.1. Symbols and contractions

Medieval and Middle English manuscripts are full of abbreviating symbols and contractions. In the editions these have been either expanded or printed literally, almost as a matter of taste.
In the present corpus, we have generally followed the editors one way or another. In the few cases where we have not done so this happened for special reasons, e.g. when the editor was inconsistent; usually we have made this clear in marked comments.

Whether contractions, wherever they occur, have to be strictly marked as such is another question. The use of symbolic signs and word contractions instead of (complete) words, though generally frequent in the Middle Ages⁶, in individual texts only concerns a limited set of special cases. In Middle English prose we are mainly talking about word symbols in the context of common everyday matters (like money units) and with the naturally frequent function words (like and); for the rest, the tendency to contractions focusses on frequent words or phrases like quoth, that, what, etcetera, frequent affixes like the suffix -our or nasal phonemes. Table 3 gives a selective survey.

⁶ Cf. the computer program Abbreviationes by Olav Pluta; http://www.ruhr-uni-bochum.de/philosophy/projects/abbrev.htm of 19 September 2010.

(a) symbols

character   name        function
---------------------------------------------
&/7         ampersand   and/ond/et
+           plus sign   relig.: the cross

(b) words and phrases:

qt          =  quoth
wt          =  with
þ           =  that
etc./& c.   =  et cetera

(c) affixes/phonemes

oþ⁷         =  oþer
pvoked⁸     =  provoked
linguã      =  linguam
frō         =  from
primᵒ       =  primo⁹

Note: Superscript tilde and stroke above vowels mark the nasal quality of a vowel or a nasal after the vowel.

Table 3: Symbols and abbreviations in the Innsbruck Prose Corpus

⁷ Often the upper part of the thorn is crossed by a small horizontal stroke.
⁸ The lower part of the <p> is frequently marked by a small horizontal stroke.
⁹ The same with other ordinal numbers.

In the face of the limited number of the symbols and abbreviations in a given text, corpus users will not find it difficult to identify symbols and contractions, and the markers applied for these by editors are in the main dispensable.
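The note to Table 3 (a tilde or stroke above a vowel stands for a following nasal) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the manual: the function name expand_nasal is hypothetical, the input is taken to be Unicode text as printed in Table 3 (the corpus files themselves encode such marks as x~, see 3.5.), and the choice between m and n is left to the caller, since the mark alone does not settle it.

```python
import unicodedata

TILDE = "\u0303"   # combining tilde, as in "linguã"
STROKE = "\u0304"  # combining macron (horizontal stroke), as in "frō"


def expand_nasal(word, nasal="m"):
    """Replace a tilde or stroke above a letter by the nasal it stands for.

    Hypothetical helper: the nasal (m or n) must be supplied by the
    caller, because the mark itself does not say which one is meant.
    """
    # NFD separates each base letter from its combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    expanded = decomposed.replace(TILDE, nasal).replace(STROKE, nasal)
    # Recompose whatever other accents remain in the word.
    return unicodedata.normalize("NFC", expanded)


for form in ("linguã", "frō"):
    print(form, "=", expand_nasal(form))
```

Since a sketch of this kind cannot tell m from n, it is better suited to listing candidate expansions for manual checking than to silent normalisation.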
We of ICAMET have therefore omitted flourishes and have normalised italics, no matter whether flourishes and italics were used by editors for marking expansions or not; in some editions they have, in fact, a merely decorative function. At any rate it would have been extremely difficult for us to encode the special signs. Moreover, flourishes have been ignored in many editions anyway. As to superscript tildes and strokes, they are sometimes also used for the purpose of abbreviation, but they are far less frequent than flourishes and italics; we have therefore abided by any editor's policy and encoded the superscripts (cf. below 3.5.).

3.2. Punctuation

Punctuation marks are text level markers. Medieval scribes mostly used them scantily and/or -- from a modern point of view -- inconsistently. Whatever marks we find in the editions, they are widely due to the editors. Unlike in present-day usage, punctuation marks sometimes do not have a syntactic, but a prosodic function (cf. 3.3. below). The following punctuation marks do have a syntactic function:

.     full stop/period
,     comma
:     colon
;     semicolon
?     question mark
!     mark of exclamation
-     hyphen
--    dash
/     slash (sometimes functioning as a full stop)

Table 4: Syntactic punctuation marks in the Innsbruck Prose Corpus

3.3. Prosodic signs

The variable degrees of spokenness, as reflected, among other things, in the prosodic markers of some texts, may be an issue in future research on Middle English prose. We have, therefore, tried to keep prosodic markers separate from syntactic ones as much as possible¹⁰, though this has not been the case in many of our source editions. Our general policy has been to use double signs for the prosodic function in those cases where the single sign is occupied syntactically.

First, the middle (midline) dot, obviously a marker of speech intervals, had to be kept apart from the period (full stop).
We have marked the dot, wherever we spotted it, by a double full stop (..), unless it was the only dot-like marker in the text, in which case we have left the simple full stop of the editor unaltered.

Slashes are very common text markers. They obviously often have the same function as middle dots, namely to mark the pause between tone groups. But some editions use slashes (or vertical bars) as slot markers for the folio or page numbers, with the figures added in the margin. To avoid confusion here, we have uniformly used the same marker in all cases of folio marking, namely brackets, e.g. [fol.19a]. For the details, cf. 4.2. below.

Another frequent text marker used in the middle of lines to indicate the beginning of new paragraphs is the section sign (¶). It used to be produced by ALT 20 (now equally encoded in both Courier New and Times New Roman symbols) and functions just like any normal character; i.e., unlike the RETURN sign, which looks the same on the screen, it does not have a control function and can be positioned anywhere in the line.

Another specially medieval text marker is the inverted semicolon, the so-called punctus elevatus, which often has a clearly prosodic function, suggesting a raising of voice¹¹. This sign has been encoded in the corpus by a double semicolon (;;).

All other signs with a probably or possibly prosodic function are used only rarely in our source editions. Within the Innsbruck Middle English Prose Corpus, they are usually referred to in an intervention of ours, such as {dots}, and they are also mentioned in the parameters. We are here talking about markers such as

● accent signs which seem to mark sentence accent rather than word stress (')
● arrangements of dots, such as ::. in Caxton's Prologues ("caxtpro1")
● half-line slashes, either high or low (as again in "caxtpro1")
● marked spaces in the lines (perhaps meant as markers of speech pauses) (cf. "caxtpro1").

¹⁰ I do not claim complete consistency in this, since keeping the two types of punctuation marks apart is sometimes a matter of subtle interpretation and, therefore, difficult to achieve.
¹¹ Cf. Tolkien 1962, XIII.

3.4. Different text levels (e.g. editor's emendations)

In view of the complexity of editing Middle English texts the ICAMET team members abstained from interpreting the text level codes that are found in the editions, such as italics (used for the marking mainly of emendations, but also of foreign words, quotations, expansions etc.). Other such text level codes are: boldface, parentheses, brackets, claret brackets, italics within parentheses or brackets, subpunctuation, underlined or crossed-out letters or words, letters or words in different sizes or colours (as often explained in the apparatus), etc. Particularly in critical editions many format markers, including footnoted gaps, only make sense if the apparatus is accessible, which is, of course, something a corpus cannot offer.

We therefore decided not to claim or pretend to provide full transparency in this particular respect. Marked distinctions of different editorial interventions have been given up and all the interventions of the editor(s) are equally identified by brackets. Italics, however, had to be normalised, no matter what their function was (cf. below). The reasons for this were pragmatic ones: (a) italics, particularly of single characters, are hard to recognize, both by machine scanners and human proofreaders; (b) they are mostly used for routine emendations, such as expansions of contractions; (c) they cannot be displayed in the ASCII format so that they would have to be encoded (which would make the words concerned hard to read).

Another mode of text which no doubt deserves marking is that of our own (ICAMET's) interventions. These are usually comments, hardly ever emendations.
Such passages are generally marked by opening/closing braces {}, but only to the extent that they are not subject to routine modification, as when words are, for example, broken at the end of lines (cf. 3.6.). Since the editor's or our comments are not part of the text proper, they are, in addition to the brackets and braces, marked by vertical bars, so-called "bybes" (|), formerly produced by ALT 124, but now available within both the Courier New and Times New Roman symbol routine. The scope of these markers is the subsequent word, so that the spaces between several words in a comment have to be bridged by an underscore hyphen. Both the page/line numbers and the headings, i.e. the "title lines", are also marked by initial bybes and internal underscore hyphens between words. These "title lines", which contain the main bibliographical data (cf. 6.1. below), are not identical with, though they often come close to, the editor's or author's introductory titles as given on the first page of the editions.

One more text level marker is used in the Innsbruck Prose Corpus (in imitation of HTML tags), namely clarets <...>. They are used only for very specific purposes of formatting (as in the case of boldface letters, cf. 3.5. below).

Parentheses are not used as markers in the Corpus. Wherever they occur in the text, they have been taken over from the editor or author uncritically.

In sum, we have used the following text level markers in the Innsbruck Prose Corpus:

|[...]    for the editor's interventions
|{...}    for interventions done by ICAMET
|...      for headings, page and line numbers, and remarks
|<...>    for very special tags (cf. 3.5.)

Table 5: Text level markers in the Innsbruck Prose Corpus

In all these cases, spaces between words have been bridged by underscore hyphens (_).

3.5. Character formatting

Although, for the reasons mentioned in 3.4., we have been unable to use specific character formats in the Innsbruck Middle English Prose Corpus, a few formats have still been found indispensable and have therefore been encoded:

|<b>        for marking initials (of various kinds)¹²
=x=         for a raised letter x,¹³ thus prim=o= for primᵒ
            Note: Complete words or numerical units in superscript format are likewise marked by =...=, inserted in the line where they belong.¹⁴
x~          for any letter <x> with a tilde or horizontal stroke above it; however, <ñ> was formerly available in higher ASCII (ALT 164), so that we have reproduced it in the present corpus version with the help of Courier New or Times New Roman symbols.
y'          for the letter <ý>, with an accent aigu. The letter y did not allow superscript accents on the keyboard when the compilation of the corpus was started. Thus we have also used <y`, y^> for <ỳ, ŷ>.
e<trema>    for <ë>; likewise, other vowels with trema and all vowels with other rare signs are tagged in the HTML way with explanatory terms in clarets.

The character formats that had to be considered negligible¹⁵ in the Innsbruck Middle English Prose Corpus are:

● different colours of fonts;
● different sizes and types of fonts, including simple boldfaced letters, but not initials;
● spaced letters;
● italics, mainly used for marking the editor's expansions of contractions;
● flourished letters (used by the scribes mainly for marking contractions).

¹² Normal boldface letters had to be normalised, I am afraid.
¹³ We have followed the coding practice of the Helsinki Corpus here; cf. Kytö 1993, 27.
¹⁴ But the superscripts r and v of the folio references, and likewise a and b used for the same purpose, are not particularly marked that way, since the folio references are the editor's addition and have in toto been identified by brackets.
¹⁵ I am not saying that they really are. On the contrary, we should go back to the manuscripts to reconsider them, after their generally eclectic and inconsistent treatment by editors.

3.6. Line, paragraph and page formatting

The line division of our source editions has strictly been preserved. This is achieved by the use of the RETURN key at the end of the lines. We have not bothered about overlong lines. Whenever a line is longer than 80 characters, which occurs very rarely, it will automatically run on into the next line on the screen. To avoid unnecessarily reduced length and thus overflow of lines, corpus users should set the page margins of their word processing programs to zero. One may also think of reducing the font size.

When a word is broken by the change of line (hus- plus bondman), the second element is taken over to the end of the first line and the intervention is marked by an underscore hyphen (_), giving hus_bondman. This marker and method are also used when the line break is marked otherwise in an edition (e.g. by a short vertical bar added by the editor) or when a word is broken by the change of line without a hyphen (which is often the case in the manuscripts). In the latter case, we are talking about instances of our own intervention, so that the underscore hyphens had to be marked by braces {...}.

In all editions the lines, whenever counted at all, are numbered in groups of five or, sometimes, four, normally pagewise, but occasionally chapterwise. The line numbering of the edition is always provided in the Innsbruck Prose Corpus on the left margin of the computer screen. To keep it distinct from the text proper, it is marked by an initial vertical bar ("bybe").

Paragraphs have been marked in the corpus in accordance with the layout of the source text (by an additional RETURN).

Page numbers, like line numbers, are marked by initial bybes and strictly follow the source edition.
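The marker conventions of 3.4. and 3.6. lend themselves to simple machine processing. The following Python lines are a minimal sketch of this, not a tool belonging to the corpus: the function names, the category labels and the sample line are our own assumptions, and the sketch presupposes that every marked unit is a single whitespace-delimited token beginning with a bybe, as described above.

```python
# Marker prefixes from Table 5. The category labels are hypothetical names
# chosen for this sketch; the more specific prefixes must be tested first.
MARKER_KINDS = [
    ("|[", "editor"),     # |[...]  editor's interventions
    ("|{", "icamet"),     # |{...}  interventions done by ICAMET
    ("|<", "tag"),        # |<...>  very special tags such as |<b>
    ("|", "reference"),   # |...    headings, page and line numbers, remarks
]


def classify(token):
    """Return (kind, content) for one whitespace-delimited token."""
    for prefix, kind in MARKER_KINDS:
        if token.startswith(prefix):
            # Drop the bybe; underscore hyphens bridge the spaces inside a unit.
            return kind, token[1:].replace("_", " ")
    return "text", token


def join_broken_words(token):
    """Remove the join markers of 3.6.: mar_tir and mar{_}tir both give martir."""
    return token.replace("{_}", "").replace("_", "")


def parse_line(line):
    """Split a corpus line into its running text and a list of marked units."""
    words, marked = [], []
    for token in line.split():
        kind, content = classify(token)
        if kind == "text":
            words.append(join_broken_words(token))
        else:
            marked.append((kind, content))
    return " ".join(words), marked


# An invented sample line, not taken from any corpus file.
sample = "|5 |<b> He was a mar_tir |[and] hus{_}bondman |{picture_omitted}"
text, marked = parse_line(sample)
```

For the invented sample, `text` is the running text without line number, tag and comments, with both kinds of word joins removed, and `marked` lists the four marked units in order. Whether the editor's bracketed interventions should be kept in the running text rather than set aside, as is done here, depends on the purpose of the analysis.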
Occasionally texts were arranged columnwise in an edition; we then had to come back to a previous page after a number of columns, so that pagination is then not fully sequential.¹⁶

3.7. Document formatting

Letters, words, lines, paragraphs, pages and chapters: these are the traditional units of segmentation in a prose text. Given the different versions of manuscripts and, as a result, editions of some Middle English prose, the texts are not comparable with one another unless they are divided into equal blocks; if the lines as basic units do not correspond to each other, their numbering is not really helpful for collation, so that we need other, analogous blocks instead. Collating programs, such as COLLATE2,¹⁷ are based on the definition of such blocks. To give an example: the various versions of the Ancrene Riwle are so different from one another that an early editor and translator, J. Morton (1853), found it reasonable to tag them with a numerus currens "M.1", "M.2", etc. throughout the whole text in a synoptic way. These tags or anchor spots, which mark content-based units, have been taken over by most editors in order to make the relationship between the text versions transparent.

¹⁶ Cf., e.g., some of the Caxton editions.
¹⁷ Cf. Robinson 1994.

Segmenting parallel text versions into blocks and tagging or marking the blocks would seem an advisable strategy for some texts of the Innsbruck Middle English Prose Corpus. So far, however, we have only experimented with such reference codes, implementing them outside the official Corpus in versions of the different editions of Ancrene Riwle. Such pilot versions are intended for computer analysis and are therefore, in principle, open to all kinds of further tags implemented for specific purposes.

4. Permitted and omitted types of format

As has been mentioned earlier in this manual, some textual features of the source text had to be ignored or modified in the corpus.
This may sound like an offence against philological good manners, but it is unavoidable and, moreover, fully in line with editorial habits. Editors of Middle English texts have widely disagreed on which features of the source manuscript(s) to abide by and which to drop; and they all and sundry were bound in some way or another to abstract from the physical appearance of the manuscripts. Corpus compilers for their part have to adapt their source texts, in our case the editions, to their own medium. To make things transparent to the user of this corpus (even at the cost of occasional repetition), the source features omitted or modified in the Innsbruck Prose Corpus are singly discussed in the following survey of subchapters 4.1. to 4.9.

4.1. Different initials

There are different sizes, types and colours of initials in Middle English manuscripts. They range from "Four-line ornamental initial R in blue with red pen-flourishes, continued down left margin" (as commented by the editor in view of the beginning of the Cleopatra MS of Ancrene Riwle¹⁸) to an "Unusually large and prominent black capital" in the middle of a line a few pages later.¹⁹ Moreover, initials may be lower or upper case letters, really written and painted or merely "virtual", i.e. planned and suggested by the free space in the manuscript and/or by a normal letter which may or may not be fully visible. Faced with such differences and sometimes undecided about the function of initials beyond that of decoration, most editors, unlike Dobson in the case just mentioned, have not marked or described the differences of initials to the full or consistently. Accordingly, whoever wants to study the initials in detail is advised to return to the manuscripts and to work out a complete typology based on these manuscripts. This could not be achieved on the basis of editions and, thus, within the present corpus project.
The only aim has been to mark all the initials as such, including the minor, often black ones in the middle of lines, and to suggest the decorative size of the major ones by giving them space on the screen, for example, a four-line margin in the case of the Ancrene Riwle. For users to find the initials, these are all coded by |<b> (for "boldface"), with a space before the letter concerned.

¹⁸ Cf. Dobson 1972, 1, footnote 1.
¹⁹ Cf. Dobson 1972, 10, footnote 22.

4.2. References to manuscripts, folios and pages

The references to folios or pages of a manuscript/printed book and also the references to the manuscript itself are all given in brackets [...] and, in addition, marked by |r (r for ‘remark’). We have regularised and shortened "folio", "fol." and other versions to "f." and "page" or "pag." to "p.", likewise "recto" to "r" and "verso" to "v". Moreover, we have never marked the superscript quality of r or v or of any other such letter. When the recto pages have zero markings so that "f.8" means "f.8r", we have left them as they are, as long as the opposition to the verso pages was clear. We have also tolerated "a" or "b" for the recto and verso pages respectively. Unlike our source editors, we have always inserted the folio or page reference numbers within the lines where they belong, not in the margin. If a word is split by the change of folio or page, we have moved the reference to the end of the word concerned and marked the exact spot within the word by an asterisk * (even if editors have used other symbols). References of a similar kind to the folio numbers, for example, references to pages or cross references to previous editions of a text, have been treated in an analogous way, likewise dates, proper names or other criteria of text arrangement;²⁰ "column" has always been abbreviated to "col."
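The regularisation of references described in 4.2. can be illustrated by a short Python sketch. It is an illustration only and not the procedure actually used by the ICAMET team: the function names, the regular expressions and the sample strings are assumptions of ours, the spacing after "col." is a guess, and the additional |r marking is left out.

```python
import re

# Spelling variants named in 4.2. and their regularised forms.
_RULES = [
    (re.compile(r"\b(?:folio\b|fol\.)\s*", re.IGNORECASE), "f."),
    (re.compile(r"\b(?:page\b|pag\.)\s*", re.IGNORECASE), "p."),
    (re.compile(r"\bcolumn\b", re.IGNORECASE), "col."),
    # "recto"/"verso" are attached to the preceding number as plain r/v.
    (re.compile(r"(?<=\d)\s*recto\b", re.IGNORECASE), "r"),
    (re.compile(r"(?<=\d)\s*verso\b", re.IGNORECASE), "v"),
]

# A bracketed reference standing inside a word, e.g. mar[f.8v]tir.
_SPLIT_WORD = re.compile(r"(\w+)(\[[^\]]*\])(\w+)")


def normalise_reference(ref):
    """Shorten folio, page and column references to the forms used in the corpus."""
    for pattern, replacement in _RULES:
        ref = pattern.sub(replacement, ref)
    return ref


def move_reference(text):
    """Move a reference that splits a word to the end of that word.

    The original position inside the word is marked by an asterisk.
    """
    return _SPLIT_WORD.sub(r"\1*\3 \2", text)
```

With these hypothetical helpers, `normalise_reference("folio 8 verso")` gives "f.8v", and `move_reference("mar[f.8v]tir")` gives "mar*tir [f.8v]".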
Other manuscript markers of this kind referring to shelf-marks, lines ("sign."/"l.") or the like are so rare in our source editions that we have not made a point of regularising them, but have aimed at following the editor instead.

²⁰ Cf. the collections of charters, letters or (Caxton's) prefaces.

4.3. Layout

As regards layout, the source texts have generally been imitated. This goes as far as blank spots in the text (for example due to holes or erasures etc. in the vellum). The numbers of pagination and lines are, however, always on the left margin, at a regularised distance irrespective of the idiosyncrasy of the source edition. Our pedantic imitation of the lines, pages and other layout features of the source editions is based on our opinion that the present corpus cannot be a substitute for scholarly editions, but only their complement, and that the way back to the editions should be kept as viable as possible, also in order to allow easy scholarly control of the corpus texts in relation to their sources.

Head- or footnotes, however, like all marginal comments of the editor(s), have been omitted. Marginal comments of the original scribe or printer have, of course, been treated differently and preserved.

Occasionally, multi-column lists and tables have created exceptional layout problems on the screen. We have tried to solve them as best we could, steering clear of tabs and other word processing commands of format or deviant scripts.

4.4. Allographic letters (wynn, ſ)

Wynn <ƿ> is hardly ever, long s <ſ> rarely used in editions of Middle English texts. Since they seem to be allographic, i.e. do not have any specific function (in spite of positional preferences in the case of long s), they have tacitly been normalised to <w> and <s> respectively in those extremely rare cases where editors have still applied them.

4.5. Punctuation marks and signs

Some editions (for example Ancrene Riwle, ed. Day 1952) use vertical strokes within words, thus marking the fact that the words concerned are divided in the manuscripts between two lines and have no syllabification marker; other editions use other markers for the same purpose. We have not kept this case apart from the similar one when the editor/printer has syllabified a word and marked the change of line by a hyphen. In either case the word-unifying intervention is marked in the Innsbruck Prose Corpus by an underscore hyphen, thus mar_tir. In some other cases we have ourselves linked the isolated segments of a word split by the line break both in the original text and in the source edition. We have then marked our intervention, according to our rule, by an underscore hyphen in braces, thus mar{_}tir.

Slashes and dashes are, for the sake of uniformity, always typographically kept apart by a space from their surroundings; so are inverted semicolons and midline dots (cf. 3.3.). In the case of other punctuation marks (semicolons etc.), we have followed our source texts and not done any normalising.

Tildes and superscript strokes, when used to mark a contraction, are not kept apart; the two symbols are both represented by a tilde after the letter concerned, thus x~, except for the letter <ñ>, which could originally be produced by ALT 164.²¹

Iconic material (pictures, diagrams, various signs) has generally been omitted, but the omission is made evident in a comment in braces. The cross sign, graphically suggested in many a religious text of the corpus, has, however, been represented by a plus sign (+); this is always kept apart from its neighbourhood by a space.

²¹ It is now represented by a Times New Roman symbol.

4.6. Contractions (curled letters)

The flourishes of curled letters have been ignored and omitted; so have the now meaningless accents above <i>'s and strokes above nasals, for example in Ayenbite of Inwyt and in Mirk's Festial.²² These signs may have had a graphical function in Middle English manuscripts, for example in view of the minim problem;²³ but in modern printed editions letters are, of course, sufficiently identified. In either case normalisation seems legitimate, since only a very small set of everyday words is concerned (cf. 3.1., Table 3 above), or only special letters, such as <łł> and <í>.

²² After a first analysis, Festial (EETS ES 90) was finally excluded from the corpus, since it provides only extracts.
²³ Cf. Markus 1990, 35.

4.7. Degrees of (il)legibility

There are different types and degrees of legibility or illegibility in the manuscripts. Some editors tried to keep some of these degrees distinct (for example Dobson 1972 in his edition of the English Text of Ancrene Riwle). But without reviewing the manuscripts, corpus users are not in a position to judge for themselves and have to take the editors' decisions and interventions for granted. We have therefore normalised the various kinds of editorial interventions by marking them all alike with brackets [...], even when, as in the case of Dobson's edition, various markers and categories of legibility have been used. Corpus compilers and, I daresay, corpus users do not want to be concerned with questions of worm holes or erasures caused by water or light.

4.8. Traces of the scribe's self-correction

To give full evidence of the editorial history of a text cannot be within the range of a machine-readable corpus based on editions. Whatever editors, sometimes with a great deal of ambition, have done in this respect (Dobson 1972 is a good example), such distinctions in the present project have been considered to be subject to normalising.
Underlined or "subpuncted" segments of texts, thus marked to give evidence of the scribe's or another scribe's corrections, are therefore normalised in format. Crossed-out segments are marked by square brackets to indicate that the passages have been subject to the editor's interpretative and interpretable intervention.

4.9. Foreignness

It would be a helpful service, particularly for future lexicological work, to tag the foreignness of words as opposed to Middle English words proper. This would be in line with the Helsinki Corpus policy (cf. Kytö 1993, 30f.). But mixture of languages is so common, even typical, in Middle English prose texts (more than in Old English and Modern English) that, in line with most of our source editions, we had to abstain from marking language codes. Tagging foreignness would not only have been a great deal of additional work; in parts it would have been an impossible task, since lexical foreignness cannot always reliably be identified. Is Middle English courage a "foreign" word? It certainly was in its initial stage of loaning.

French, Latin or other foreign words or passages have therefore been integrated unmarked up to a point, namely up to the length of five lines (occasional verse lines have also been taken in up to that length). Foreign and verse texts beyond that length have been omitted and referred to in braces.

This policy of not marking a foreign language code may seem regrettable, particularly in view of the considerable role of macaronic language mixing in Middle English prose.²⁴ But then the Innsbruck Prose Corpus would profit from tags not only of the language used, but of all kinds, from syntax to semantics and pragmatics, to mention only these three fields. Any initiative and help in this task of tagging the Innsbruck Middle English Prose Corpus would be welcome.

²⁴ The researchers who have dealt with this topic include Laura Wright; cf. her Sources of London English: Medieval Thames Vocabulary. Oxford: Clarendon Press (1996).

5. Corpus texts (INNSBRUCK MIDDLE ENGLISH PROSE CORPUS)

In the following, the files implemented in the Innsbruck Prose Corpus are listed in two different ways: first, alphabetically, with the complete bibliographical data; second, arranged alphabetically by the short names of the files. The second list provides information about the exact sizes and copyright conditions/restrictions of the files. Those texts which could not (yet) be included in the sampler due to copyright restriction are highlighted by bold-faced letters. The numerus currens, added to each title, does not fully correlate with alphabetical order, which is due to the fact that some texts were added to the corpus as follow-up files after the first attribution of numbers.

5.1. Alphabetical list of the texts

Abbey of the Holy Ghost. In Religious pieces in prose and verse. Ed. from Robert Thornton's Manuscript (c. 1400) in the Lincoln Cathedral Library by George G. Perry. EETS OS 26 (1867/1914), pp. 51-62.
abbey 1

Agnus Castus. A Middle English Herbal. Ed. S. B. Liljegren. Upsala: Almqvist & Wiksells Boktryckeri AB, 1950, pp. 119-205.
agnus 121

Alphabet 1: "An Alphabet of Tales." An English 15th Century Translation of the "Alphabetum Narrationum" of Etienne de Bascon. Part I: A-H. From Additional MS. 25,719 of the British Museum. Ed. Mrs. Mary Macleod Banks. EETS OS 126 (1904). Part I: A-H.
alpha1 2a

Alphabet 2: "An Alphabet of Tales." An English 15th Century Translation of the "Alphabetum Narrationum" once attributed to Etienne de Bascon. From Additional MS. 25,719 of the British Museum. Ed. Mrs. Mary Macleod Banks. EETS OS 126 (1904). Part II: I-Z.
alpha2 2b

Ancrene 1: The English Text of the Ancrene Riwle: Ancrene Wisse, Corpus Christi College Cambridge MS 402, ed. J.R.R. Tolkien and N.R. Ker. EETS OS 249 (1962 for 1960).
anccor 3

Ancrene 2: The English Text of the Ancrene Riwle, ed. from Gonville and Caius College MS. 234/120, by R.M. Wilson. EETS OS 229 (1954).
ancgon 4

Ancrene 3: The English Text of the Ancrene Riwle, ed. from Cotton MS. Nero A. XIV, by Mabel Day. EETS OS 225 (1952 for 1946).
ancnero 5

Ancrene 4: Ancrene Riwle, ed. from Magdalene College, Cambridge, MS. Pepys 2498, by A. Zettersten. EETS OS 274 (1976).
ancpepys 6

Ancrene 5: The English Text of the Ancrene Riwle. Ed. from Cotton MS. Titus D. XVIII by Francis M. Mack, together with the Lanhydrock Fragment, Bodleian MS. Eng. th. c. 70, ed. by A. Zettersten. EETS OS 252 (1963 for 1962).
anctit 7

Angels' Song, Of. Ed. Toshiyuki Takamiya, in Two Minor Works of Walter Hilton. Tokyo: Privately Printed, 1980, pp. 9-15.
hiltang 109

Art of Hunting, The. William Twiti. Ed. Bror Danielsson. Stockholm: Almqvist & Wiksell International, 1977, pp. 40-58.
arthunt 112

Boke of Sygnes from the St. Paul's Cathedral Library MS. In The Rewyll of Seynt Sauioure. Ed. James Hogg. Vol. 3: The Syon Additions for the Brethren and The Boke of Sygnes. Salzburg: Institut für Anglistik und Amerikanistik der Universität, 1980, pp. 134-144.
boke 8

Book of Quintessence, The. Or the fifth being; that is to say, man's heaven. Ed. from British Museum MS. Sloane 73 about 1460-70 A.D. by Frederick J. Furnivall. EETS OS 16 (1866; rev. ed. 1889) [Sloane MS 73, fol. 10-25b].
bookqe 9

Book of the Foundation of St. Bartholomew's Church in London, The. Ed. Norman Moore. BM Cotton MS. Vespasian B ix. EETS OS 163 (1923), pp. 1-63.
barthol 103

Brut1: The Brut, or The Chronicles of England, ed. from MS Rawlinson B 171, Bodl. Library by Friedrich W.D. Brie. Part I. EETS OS 131 (1906, repr. 1960).
brut1 116

Brut2: The Brut, or The Chronicles of England, ed. from MS Rawlinson B 171, Bodl. Library by Friedrich W.D. Brie. Part II. EETS OS 136 (1908, repr. 1987).
brut2 117

Capgrave: John Capgrave's Abbreuiacion of Chronicles (1462,3). Ed. Peter J. Lucas. MS Gg.4.12, Cambr. U. Libr. EETS 285 (1983).
capgrave 10

Caxton 1: Blanchardyn and Eglantine (c. 1489), ed. Leon Kellner. EETS ES 58 (1890).
caxtblan 16

Caxton 2: The Curial by maystere Alain Charretier. Translated thus in Englyssh by William Caxton 1484. Ed. Frederick J. Furnivall. EETS OS 54 (1888).
caxtcur 17

Caxton 3: Caxton, William, Dialogues in French and English, ed. Henry Bradley. EETS ES 79 (1900).
caxtdial 18

Caxton 4: Doctrinal of Sapience, printed by William Caxton, 1489. Ed. Joseph Gallagher. Middle English Texts (Heidelberg: Winter, 1993).
caxtdoc 19

Caxton 5: Eneydos, 1490. Englisht from the French Liure des Eneydes, 1483, ed. W.T. Culley, F.J. Furnivall. EETS ES 57 (1890; repr. 1962).
caxteney 20

Caxton 6: Foure Sonnes of Aymon, The Right Plesaunt and Goodly Historie of the. Englisht from the French by William Caxton and Printed by him about 1489. Part I. Ed. Octavia Richardson. EETS ES 44 (1884), repr. 1975.
caxtaym1 11

Caxton 7: Foure Sonnes of Aymon, The Right Plesaunt and Goodly Historie of the. Englisht from the French by William Caxton and Printed by him about 1489. Part II. Ed. Octavia Richardson. EETS ES 45 (1885), repr. 1975.
caxtaym2 15

Caxton 8: Knight of La Tour-Laundry, The Book of the. Compiled for the instruction of his daughters; from the unique manuscript in the British Museum, Harl. 1764, and Caxton's print, A.D. 1484, tr. by Thomas Wright. EETS OS 33 (1868).
caxtkni 14

Caxton 9: Paris and Vienne. Tr. from the French by William Caxton. Ed. MacEdward Leach. EETS OS 234 (1957; repr. 1970).
caxtpar 21

Caxton 10: The Prologues and Epilogues of William Caxton, ed. W.J.B. Crotch. EETS OS 176 (1928 for 1927).
1st edition = caxtpro1 22a
2nd edition = caxtpro2 22b

Caxton 11: William Caxton, Quattuor Sermones, ed. N.F. Blake. Middle English Texts. Heidelberg 1975, pp. 19-89, 95-108.
caxtquat 12

Caxton 12: Thomas of Canterbury. In The Golden Legend or Lives of the Saints as Englished by William Caxton, transl. by Caxton (1483), ed. F.S. Ellis. London: Dent, 1900. 2: 182-97; 4: 56-60.
caxttho 13

Caxton 13: Caxton, William, Tulle of Olde Age. Textuntersuchung mit literarischer Einführung, ed. Heinz Susebach. Halle (Saale): Max Niemeyer Verlag, 1933, pp. 1-95.
caxtulle 119

Cely Letters 1472-1488, The. Ed. Alison Hanham. EETS OS 273 (1975).
cely 23

Chaucer’s prose:

Chaucer 1, Treatise on the Astrolabe (ca. 1380), in The Complete Works of Geoffrey Chaucer, ed. Walter W. Skeat. 2nd ed. Oxford: Clarendon Press, 1899, vol. 4.
astske 26

Chaucer 2, Boethius' Consolatio Philosophiae (ca. 1380), ed. Skeat, 1899, vol. 2.
boeske 26b

Chaucer 3, “The Tale of Melibeus”, in The Canterbury Tales by Geoffrey Chaucer, ed. from the Hengwrt Manuscript by Norman Blake. London: Arnold, 1980.
melbla 26c

Chaucer 4, Canterbury Tales, “Melibeus Tale”, ed. Skeat: 2nd ed., vol. 4, 1900.
melske 26d

Chaucer 5, “The Parson's Tale”, in The Canterbury Tales, edited from the Hengwrt Manuscript, ed. Norman Blake. London: Arnold, 1980.
persbla 26e

Chaucer 6, Canterbury Tales, “Parson's Tale”, ed. Skeat: 2nd ed., vol. 4, 1900.
persske 26f

Chronicle, see Peterborough Chronicle.

Cloud of Unknowing and the Book of Privy Counselling, The. Ed. Phyllis Hodgson. EETS OS 218 (1944 for 1943; repr. 1958).
cloudunk 27

Cookery Books: Two Fifteenth-Century Cookery-Books. Harl. Ms. 279 (from 1430), & Harl. 4016 (from 1450). Ed. Thomas Austin. EETS OS 91 (1888).
cookery 28

Courtesy: A Fifteenth-Century Courtesy Book. In A Fifteenth-Century Courtesy Book and Two Fifteenth-Century Franciscan Rules, eds. R.W. Chambers and W.W. Seton. EETS OS 148 (1914), pp. 11-17.
courtesy 29

Craft of Dying, The. (c. 1450) In Ratis Raving, and Other Moral and Religious Pieces in Prose and Verse. Ed. from the Cambridge University MS., KK. 1.5. London: EETS OS 43 (1870), pp. 1-8.
craftdye 30

Curye on Inglysch. English Culinary Manuscripts of the Fourteenth Century including the Forme of Cury. Ed. Constance B. Hieatt and Sharon Butler. EETS SS 8 (1985), pp. 39-156.
curying 56

Dan Jon Gaytryge's Sermon. In Religious pieces in prose and verse. Ed. from Robert Thornton's Manuscript (c. 1400) in the Lincoln Cathedral Library by George G. Perry. EETS OS 26 (1867/1914), pp. 1-15.
gaytryge 31

Dan Michel's Ayenbite of Inwyt, or, Remorse of Conscience. In the Kentish dialect, 1340 A.D. Vol. 1: Ed. R. Morris. EETS OS 23 (1866; reprint 1965).
danayen 32

Deonise Hid Diuinite and other treatises on contemplative prayer related to The Cloud of Unknowing, ed. Phyllis Hodgson, MS Harley 674. EETS OS 231 (1955 for 1949).
deonise 33

Dicts and Sayings of the Philosophers, The. Ed. Curt F. Bühler, ICAMET text: the Scrope translation (1450). EETS OS 211 (1941 for 1939), repr. 1961.
dicts 34

Dorothea, Life of Saint. In Prosalegenden, ed. Carl Horstmann 1880 (Anglia), pp. 325-328.
doroth 35

Eight Chapters on Perfection, ed. Fumio Kuriyagawa. Tokyo: Privately Printed by F. Kuriyagawa and Toshiyuki Takamiya, Dept. of English, Keio University, Mita, Minatoku, Tokyo 108 Japan, 1980, pp. 14-32.
hiltperf 108

English Conquest of Ireland, The. A.D. 1166-1185. MS. Trin. Coll. Dublin, E. 2.31. Ed. Frederick J. Furnivall. EETS OS 107 (1896), pp. 2-150.
conquest 111

Equatorie of the Planetis, The. From the Peterhouse MS. 75.I., ed. Derek J. Price. Cambridge: C. U. P., 1955, pp. 19-44.
equat 122

Familiar Dialogue of the Friend and the Fellow, A. A translation of Alain Chartier's Dialogus Familiaris Amici et Sodalis, ed. Margaret S. Blayney. EETS OS 295 (1989).
famdial 36

Fistula in Ano, Treatise of. Haemorrhoids and Clysters. Ed. D'Arcy Power. EETS OS 139 (1910; repr. 1968).
fistula 37

Gesta Romanorum: The Early English Versions of the Gesta Romanorum. Ed. Sir Frederick Madden, re-ed. by Sidney J.H. Herrtage. EETS ES 33 (1879, repr. 1898, 1932, 1962).
gestarom 38

Gild of St Mary, Lichfield, ed. F.J. Furnivall. EETS ES 114 (1920).
gildmary 39

Gilds: English Gilds. The original ordinances of more than one hundred early English gilds, ed. Toulmin Smith. EETS OS 40 (1870; repr. 1963).
gilds 40

Gilte Legende, Three Lives from the. In Middle English religious prose, ed. N. Blake. London: Arnold, 1972, pp. 151-173.
giltele 41

Govern, see Secreta Secretorum.

Hali Meidenhad. An alliterative homily of the thirteenth century from Ms. Bodley 34, Oxford, and Cotton Ms. Titus D.18, BM. Ed. F.J. Furnivall. EETS OS 18 (1922).
Bodley: halibod 42a
Titus: halitit 42b

Hali Meidhad, ed. Bella Millett. EETS OS 284 (1982) (critical edition).
halicrit 43

Hieronymus, The Life of St, in Prosalegenden, ed. Carl Horstmann 1880 (Anglia), pp. 328-360.
hieron 44

History of Reynard the Fox, in Early prose romances, ed. Henry Morley (London, etc.: Routledge 1889).
histreyn 45

History of the Holy Rood Tree. Ed. A.S. Napier. EETS OS 103 (1894).
ME: roodme 46

Homilies 1: Twelfth-Century Homilies in MS Bodley 343, ed. A.O. Belfour. EETS OS 137 (1909, repr. 1962, 1988).
homilbod 47

Homilies 2: Early English Homilies from the Twelfth Century, MS Vespasian D. XIV, ed. Rubie D.-N. Warner. EETS OS 152 (1918; repr. 1971).
homilves 48

Homilies 3: Old English Homilies of the Twelfth Century, from the unique Ms. B. 14.52. in the library of Trinity College, Cambridge. Second series. Ed. R. Morris, 1873. EETS OS 53 (1873), pp. 3-219.
oehom 49

Imitatione Christi: Middle English translations of De Imitatione Christi, from a ms. in the library of Trinity College, Dublin (tr. 15th cent.), ed. John K. Ingram. EETS ES 63 (1893; repr. 1987).
imita 50

John Capgrave's Lives of St. Augustine and St. Gilbert of Sempringham, and a Sermon. Ed. J.J. Munro. EETS OS 140 (1910).
caplives 100

Julian of Norwich's Revelation of Divine Love. The Shorter Version. Ed. from B.L. Add. MS 37790 by Frances Beer. Middle English Texts 8 (Heidelberg: Winter, 1978).
julian 51

Juliana: Iuliene, The Liflade ant te Passiun of Seinte (c1200), ed. S.T.R.O. d'Ardenne (Liège, 1936; repr. EETS OS 248 (1961)).
MS Bodley juliabod 52a
MS Royal juliaroy 52b

Katherine: Seinte Katerine, The Life of. From the Royal Ms. 17 A xxvii, &c. Ed. Eugen Einenkel. EETS OS 80 (1884).
Royal, ed. Einenkel: kathroy 53

Kentish Sermons, Old. In Old English Miscellany. Ed. Richard Morris. EETS OS 49 (1872, repr. 1927), pp. 26-36.
kentserm 54

Kings: The Three Kings of Cologne. An early English translation of the "Historia Trium Regum" by John of Hildesheim. Ed. by C. Horstmann. London. EETS OS 85 (1886), pp. 2-157.
Cambridge: kingscam 55a
Royal: kingsroy 55b

Lanterne of Li3t, The. Ed. Lilian M. Swinburn. EETS OS 151 (1917 for 1915).
lantlit 57

Lapidaries, English Mediaeval. Ed. Joan Evans/Mary S. Serjeantson. EETS 190 (1933).
lapidari 58

Late Middle English Treatise on Horses, A. Ed. from BL Ms Sloane ff. 102-117b by Anne Charlotte Svinhufvud. Stockholm: Almqvist & Wiksell International, 1978, pp. 85-147.
horses 107

Letters concerning Christchurch. James Cornwallis, chief baron of the exchequer of Ireland, the prior of Christ Church (circa 1430).
lettchri 59

Liber de Diversis Medicinis, Thornton MS Lincoln Cathedral A. 5.2, ed. Margaret Sinclair Ogden. EETS OS 207 (1938).
liber 60

Lincoln Diocese Documents 1450-1544, ed. Andrew Clark. EETS OS 149 (1914), repr. 1971.
lincdoc 61

Litil tretys on the Seven Deadly Sins, A. Richard Lavynham, O. Carm, ed. Dr. J.P.W.M. Van Zutphen. Rome: Institutum Carmelitanum, 1956, pp. 1-25.
treatise 113

Lollard Sermons, ed. Gloria Cigman. EETS OS 294 (1989).
lollard 62

Love, Nicholas. Mirror of the Blessed Life of Jesus Christ. A critical edition based on Cambridge University Library Additional MSS 6578 and 6686. Ed. with introduction, notes and glossary by Michael G. Sargent. Garland Medieval Texts, 18 (New York, London: Garland Publishing, 1992).
mirbles 63

Malory, Syr Thomas, Le Morte D’Arthur. Original edition of William Caxton, ed. H. Oskar Sommer (London: David Nutt, 1889), 2 vols., vol. 1, pp. 1-406.
malory1 129

Mandeville, John: The Bodley Version of Mandeville's Travels. From Bodleian Ms. E Musæo 116 with parallel extracts from the Latin text of British Museum Ms. Royal 13 E. IX, ed. M.C. Seymour. EETS OS 253 (1963).
mandevil 64

Margaret 1: Seinte Marherete, Þe Meiden ant Martyr. Ed. Frances M. Mack (Oxford/London: Milford, 1934), pp. 2-54. MS. Bodley 34.
margabod 65a

Margaret 2: Seinte Marherete, Þe Meiden ant Martyr. Ed. Frances M. Mack (Oxford/London: Milford, 1934), pp. 3-55. MS. Royal 17
margaroy 65b

Maria, Life of Saint, in Prosalegenden, ed. Carl Horstmann 1880 (Anglia).
maria 66

Melusine. Compiled (1382-1394) by Jean d'Arras, Englisht about 1500. Ed. A. K. Donald. EETS ES 68 (1895).
melusine 67

Merlin or the early history of King Arthur: a prose romance. Ed. Henry B. Wheatley. EETS OS 10, 21, 36 (1865-1869), with an introduction by William Edward Mead 1899 (repr. 1969).
1: merlin1 68,1
2: merlin2 68,2
3: merlin3 68,3

Metham, John. John Metham’s Prose Works. Ed. Hardin Craig. EETS OS 132 (1916 for 1906), pp. 83-158.
Palmistry, Garret Ms., pp. 84, 86, etc. to 116. metpa1 69a
Palmistry, All Souls’ Ms., pp. 85, 87, etc. to 117. metpa2 69b
Physiologus, All Souls’ Ms., pp. 118-145. metphys 69c
Christmas Day, Garret Ms., pp. 146-7. metchri1 69d
Christmas Day, All Souls’ Ms., pp. 157-8. metchri2 69e
Days of the Moon, Garret Ms., pp. 148-156. metmoon 69f

Middle English Prose Complaint of Our Lady and Gospel of Nicodemus, The. From Cambridge, Magdalene College, MS Pepys 2498, ed. C. William Marx and Jeanne F. Drennan. Heidelberg: Carl Winter Universitätsverlag, 1987, pp. 73-129.
compl 123

Middle English Translation of Macer Floridus de Viribus Herbarum, A. Ed. Gösta Frisk. The English Institute in the University of Upsala, Copenhagen: Ejnar Munksgaard, 1949, pp. 57-202.
herbarum 115

Mirror of St Edmund, The. In Religious pieces in prose and verse. Ed. from Robert Thornton's Manuscript (c. 1400) in the Lincoln Cathedral Library by George G. Perry. EETS OS 26 (1867/1914), pp. 16-50.
mirredm 71

Misyn, Richard, tr. The Fire of Love and the Mending of Life or The Rule of Living by Richard Rolle, ed. Ralph Harvey. EETS OS 106 (1896).
misfire 72a
mismend 72b

Mittelenglische Originalurkunden (1405-1430), ed. Hermann Flasdieck. Heidelberg: Carl Winter's Universitätsbuchhandlung, 1926.
urkundfl 24

Mittelenglische Originalurkunden, ed. Lorenz Morsbach. Heidelberg: Winter (1923).
urkundmo 25

Myracles of Our Lady, The. From Wynkyn de Worde’s edition, ed. Peter Whiteford. Heidelberg: Carl Winter Universitätsverlag, 1990, pp. 41-73.
myracles 124

Myrc, John, Seven questions to be asked of a dying man. In Instructions for Parish Priests. Ed. Edward Peacock, F.S.A. From Cotton Ms. Claudius A.II. EETS OS 31 (1868).
myrcseve 70

Myrour to Lewde Men and Wymmen, A. A prose version of the Speculum Vitae, ed. from B.L. MS Harley 45 by Venetia Nelson. Middle English Texts 14 (Heidelberg: Winter, 1981).
mirror 73

Order: The Thirde Order of Seynt Franceys for the Brethren and Susters of the Order of Penitentis, in A Fifteenth-Century Courtesy Book and Two Fifteenth-Century Franciscan Rules, edited from a XV Century MS. Formerly in the Pennant Collection, ed. R.W. Chambers, EETS OS 148 (1914), pp. 43-55.
order 74a

Oseney: The English Register of Oseney Abbey, by Oxford. Written about 1400. Ed. Andrew Clark. EETS OS 133 (1907).
oseney 75

Paston Letters, The. Ed. James Gairdner. Library Edition, 6 vols. 1904. Repr. Gloucester: Alan Sutton Publishing, 1986, vols. II to VI (I = Introd.)
paston2 to paston6 76,1-5

Pater Noster of Richard Ermyte, The. A Late Middle English Exposition of the Lord’s Prayer, ed. F. G. A. M. Aarts, from Westminster School Library MS. 3. The Hague: Martinus Nijhoff, 1967, pp. 3-56.
pater 118

Pecock 1: Pecock, Reginald. The Donet. EETS OS 156 (1921 for 1918), repr. 1971, pp. 1-214, Ms. Bodley 216.
pecdon1 77

Pecock 2: Pecock, Reginald. The Folewer to the Donet. EETS OS 164 (1924 for 1923), repr. 1971, pp. 1-227, Ms. Royal 17 D.
pecdon2 78

Pepysian Gospel Harmony, The. Ed. Margery Goates, Ms. Pepys 2498. EETS OS 157 (1922 for 1919).
pepys 102

Peterborough Chronicle 1070-1154, The. Ed. Cecily Clark. From Ms. Bodley Laud Misc. 636 (London: Oxford UP, 1958; 2nd ed. 1970), pp. 2-60.
peterbor 91

Prynces, see Secreta Secretorum.

Prose Life of Alexander, The. From the Thornton Ms. Ed. J.S. Westlake. EETS OS 143 (1913 for 1911), pp. 7-50, 52-115 (= original ME text). [pp. 1-7, 50-52 (= ModE translation) stored in Innsbruck Varia]
lifealex 104

Register of Godstow Nunnery near Oxford, The English. Written about 1450. Rawlinson Ms. B. 408 (c.1450). Ed. Andrew Clark. EETS OS 129 (1905; repr. 1971), part I.
reggod1 79

Register of Godstow Nunnery near Oxford, The English. Written about 1450. Rawlinson Ms. B. 408 (c.1450). Ed. Andrew Clark. EETS OS 129 (1905; repr. 1971), part II.
reggod2 80

Revelations of St. Birgitta, The. Ed. from the fifteenth-century ms. in the Garrett Collection in the Library of Princeton University by William Patterson Cumming. EETS OS 178 (1929 for 1928), repr. 1987.
birgitta 81

Rolle 1: Yorkshire Writers. Richard Rolle of Hampole. An English father of the Church and his followers. Ed. C. Horstmann (London, New York: Swan Sonnenschein, Macmillan, 1895), vol. I.
rollhor1 82

Rolle 2: Richard Rolle of Hampole and his followers. Ed. C. Horstmann (London/New York: Swan Sonnenschein, Macmillan, 1896), vol. II.
rollhor2a 83a
rollhor2b 83b

Rolle 3: Richard Rolle and the Holy Boke Gratia Dei. An edition with commentary by Sister Mary Lutz Arntz, S.N.D. Ed. James Hogg (Salzburg: Institut für Anglistik und Amerikanistik der Universität, 1981).
rollebok 84

Rolle 4: English Prose Treatises of Richard Rolle de Hampole, ed. George G. Perry. EETS OS 20 (1866, 1921).
rollpros 85

Rolle 5: Richard Rolle of Hampole and His Followers, ed. C. Horstmann. London: Swan Sonnenschein; New York: Macmillan, 1895, pp.
1-182 rollplus 105 The Rewle of Sustres Menouresses Enclosid, in A Fifteenth-Century Courtesy Book and Two Fifteenth-Century Franciscan Rules, edited from a XV century MS. in the Bodleian Library by R.W. Chambers, Bodl. 585. EETS 148 (1914), 81-116. rule 74b 45 Saint Bartholomew. In Three Lives from the Gilte Legende, from MS B. L. Egerton 876, ed. Richard Hamer. Heidelberg: Carl Winter Universitätsverlag, 1978, pp. 7587. stbarth 127 Saint George. In Three Lives from the Gilte Legende, from MS B. L. Egerton 876, ed. Richard Hamer. Heidelberg: Carl Winter Universitätsverlag, 1978, pp. 65-74. george 126 Saint Nicholas. In Three Lives from the Gilte Legende., from MS B. L. Egerton 876, ed. Richard Hamer. Heidelberg: Carl Winter Universitätsverlag, 1978, pp.5164. nichol 125 Sawles Warde: In Old English Homilies and Homiletic Treatises. First series, parts I & II. Ed. Richard Morris. From manuscripts in the British Museum, Lambeth, and Bodleian Libraries. EETS OS 29 & 34 (1867/8), pp. 245-267. MS Bodley 34: sawleswd 86 Secrete see Secreta Secretorum Secreta Secretorum, Three Prose Versions of. Ed. Robert Steele. vol. I: Text and Glossary. EETS ES 74 (1898), repr. 1973. secrete 87a govern 87b prynces 87c Speculum Christiani, A Middle English Religious Treatise of the 14th Century, ed. Gustav Holmstedt, MS. Harley 6580. Oxford: Oxford University Press, 1930, pp.2240. specchri 114 Speculum Sacerdotale, ed. Edward H. Weatherly. BM MS Add. 36791. EETS OS 200 (1936 for 1935), pp.1-253. speculum 101 Spheres and Planets. In The Book of Quinte Essence or the Fifth Being; that is to say, Man's Heaven. Ed. from British Museum MS. Sloane 73 about 1460-70 A.D. by Frederick J. Furnivall. EETS OS 16 (1866; rev. ed. 1889), p. 26. spheres 88 Syon Additions for the Brethren, The, and The Boke of Sygnes from St Paul's Cathedral Library MS. Salzburg: Institut für Anglistik und Amerikanistik der Universität 1980, pp. 12-133. 
syon 89 46 Syon Additions for the Sisters, The, from St Paul's Cathedral Library MS. Salzburg; ed. James Hogg. Institut für Anglistik und Amerikanistik der Universität 1980, pp. 1-206. syonsist 90 Testament of Love, The. In Walter W. Skeat, ed. Chauceriana and Other Pieces, London: Oxford University Press, 1897, pp. 1-145. testlove 106 Three Kings’ Sons. The. Ed. F. J. Furnivall. MS, Harleian 326, EETS Extra Series 67 (1895), pp. 1-207. threekin 110 Three Middle English Sermons from the Worcester Chapter Manuscript F. 10, ed. D. M. Grisdale. Leeds: Titus Wilson of Kendal, 1939, pp. 1-80. sermworc 120 Treatyse of Loue, The. Ed. John H. Fisher. EETS OS 203 (1951; repr. 1970). tretlove 92 Trevisa, John (?), Methodius: `The Bygynnyng of the World and the Ende of Worldes'. Ed. Aaron Jenkins Perry. EETS OS 167 (1925 for 1924), part III, pp. 94-112. trevmeth 93b trevmead (Northern version of BM MS Add. 37049) 93c Trevisa, John, Dialogus inter militem et clericum, ed. Aaron Jenkins Perry. EETS OS 167 (1925 for 1924). trevdia 93a Vices and Virtues. Being a Soul's Confession of its Sins, with Reason's Description of the Virtues. A Middle-English dialogue of about 1200 A.D. Ed. Ferdinand Holthausen. From Stowe Ms. 240 of the British Museum. EETS OS 89 (1888). vices 94 Wenefreda: Prosalegenden. Legende der heiligen Wenefreda. Ed. C. Horstmann. From Lambeth 306, fol. 188 in Caxton's print 1484. Anglia 3 (1880), 295-320. wenefr 95 Wheatley Manuscript from British Museum Additional Manuscript 39574, The. Ed. Mabel Day. EETS OS 155 (1921) pp. 76-99: Life of Adam and Eve; p. 100: A Prayer at the Elevation. wheat 96 47 Wills: Fifty Earliest English Wills in the Court of Probate, London. A.D. 13871439; with a Priest's of 1454. Copied and edited from the Original Registers in Somerset House, ed. Frederick J. Furnivall, London: Oxford University Press, 1964, pp. 1-134. wills 128 Wisdom of Solomon, The. 
In Ratis Raving and Other Moral and Religious Pieces in Prose and Verse. Edited from The Cambridge University MS. KK. 1.5 by J. Rawson Lumby. EETS OS 43 (1870), pp. 11-25.  solomon 97
Wohunge of Ure Lauerd, The (c1210). In Old English Homilies and Homiletic Treatises. Series I, parts 1 & 2. Ed. Richard Morris. EETS OS 29 & 34 (1867-1868), pp. 269-287.  wohunge 98
Wycliff, The English Works of. Ed. F.D. Matthew. 2nd rev. ed. EETS OS 74 (1880; 1902); in ICAMET split into parts I and II.  wyclif1 99,1  wyclif2 99,2

5.2. List of files arranged by short names

Note: The files in boldface are those that are available only in the full version of the Innsbruck Middle English Prose Corpus and, thus, not included in the Sampler on CD-ROM.

name       no.    words     signs     published  ed.                                    permission
--------------------------------------------------------------------------------------------------
abbey      1      4,571     19,889    1867       Perry, George G.                       yes
agnus      121    27,412    140,063   1950       Liljegren, S.B.                        yes Harvard UP
alpha1     2a     90,250    360,034   1904       Macleod Banks, Mary                    yes
alpha2     2b     90,663    366,800   1904       Macleod Banks, Mary                    yes
anccor     3      75,185    313,810   1962       J.R.R. Tolkien and N.R. Ker            EETS no
ancgon     4      30,591    132,254   1954       R.M. Wilson                            EETS no
ancnero    5      75,407    321,524   1952       Mabel Day                              EETS no
ancpepys   6      77,272    421,913   1976       A. Zettersten                          EETS no
anctit     7      62,713    349,306   1963       Frances M. Mack                        EETS no
arthunt    112    3,733     26,624    1977       Bror Danielsson                        yes Almqvist & W.
astske     26a    16,838    98,677    1912       Walter W. Skeat                        yes
barthol    103    24,837    103,873   1923/1996  Norman Moore                           EETS yes
birgitta   81     51,949    292,829   1929/1987  William Patterson Cumming              EETS yes
boeske     26b    53,076    324,003   1912       Walter W. Skeat                        yes
boke       8      1,880     10,557    1980       James Hogg                             yes Salzburg
bookqe     9      9,830     54,860    1889       Frederick J. Furnivall                 yes EETS
brut1      116    105,947   608,766   1906       Friedrich W.D. Brie                    EETS no
brut2      117    116,492   688,109   1908       Friedrich W.D. Brie                    EETS no
capgrave   10     87,590    499,101   1983       Peter J. Lucas                         EETS no
caplives   100    58,585    243,691   1910       J.J. Munro                             yes
caxtaym1   11     43,459    599,336   1884       Octavia Richardson                     yes
caxtaym2   15     94,932    376,038   1885       Octavia Richardson                     yes
caxtblan   16     65,337    370,274   1890       Leon Kellner                           yes
caxtcur    17     5,325     28,599    1888       Frederick J. Furnivall                 yes
caxtdial   18     9,388     59,338    1900       Henry Bradley                          yes
caxtdoc    19     72,683    306,335   1993       Joseph Gallagher                       yes Winter
caxteney   20     56,266    329,376   1890       W.T. Culley, F.J. Furnivall            yes
caxtkni    14     80,078    458,728   1868       Joseph Rawson Lumby                    yes
caxtpar    21     34,484    178,839   1970       MacEdward Leach                        EETS yes
caxtpro1   22a    29,619    161,103   1928/1973  J.B. Crotch                            EETS no
caxtpro2   22b    6,087     32,819    1928/1973  J.B. Crotch                            EETS no
caxtquat   12     26,222    153,367   1975       N.F. Blake                             yes Winter
caxttho    13     4,920     26,359    1900       F.S. Ellis                             yes
caxttulle  119    35,995    192,557   1933       Heinz Susebach                         yes Niemeyer
cely       23     90,411    402,332   1975       Alison Hanham                          EETS no
cloudunk   27     51,339    277,961   1958       Phyllis Hodgson                        EETS yes
compl      123    13,836    70,063    1987       C. William Marx and Jeanne F. Drennan  yes Winter
conquest   111    29,794    217,088   1896       Frederick J. Furnivall                 yes
cookery    28     48,007    256,456   1888       Thomas Austin                          yes
courtesy   29     3,199     17,325    1963       R.W. Chambers and W.W. Seton           EETS yes
craftdye   30     3,390     18,458    1870       J. R. Lumby                            yes
curyeing   56     29,704    161,380   1985       Constance B. Hieatt and Sharon Butler  EETS no
danayen    32     104,128   396,256   1866       R. Morris                              yes
deonise    33     22,633    125,348   1955       Phyllis Hodgson                        EETS no
dicts      34     58,775    331,945   1961       Curt F. Bühler                         EETS no
doroth     35     1,554     9,314     1880       Carl Horstmann                         yes
equat      122    7,522     37,515    1955       Derek J. Price                         yes CUP
famdial    36     11,530    72,007    1989       Margaret S. Blayney                    EETS yes
fistula    37     40,066    225,271   1910       D'Arcy Power                           EETS no
gaytryge   31     5,446     30,869    1867       George G. Perry                        yes
george     126    3,028     17,527    1978       Richard Hamer                          yes Winter
gestarom   38     134,713   564,032   1863       Sir Frederick Madden                   yes
gildmary   39     4,155     25,773    1920       F.J. Furnivall                         EETS yes
gilds      40     83,317    384,882   1870       Toulmin Smith                          yes
giltele    41     7,396     42,430    1972       N. Blake                               Arnold no
govern     87b    32,911    189,089   1898       Robert Steele                          yes
halibod    42a    9,193     52,290    1922/1973  F.J. Furnivall                         EETS yes
halicrit   43     9,200     52,669    1982       Bella Millett                          EETS no
halitit    42b    9,238     52,817    1973       F.J. Furnivall                         EETS yes
herbarum   115    37,677    281,600   1949       Gösta Frisk                            yes Munksgaard
hieron     44     16,809    96,322    1880       Carl Horstmann                         yes
hiltang    109    2,413     13,757    1980       Toshiyuki Takamiya                     yes Jap. publ.
hiltperf   108    5,130     31,885    1980       Toshiyuki Takamiya                     yes Jap. publ.
histreyn   45     47,766    250,324   1889       Henry Morley                           yes
homilbod   47     27,517    153,980   1909       A.O. Belfour                           yes
homilves   48     60,982    360,442   1918       Rubie D.-N. Warner                     yes
horses     107    9,512     36,218    1978       Anne Charlotte Svinhufvud              yes Almqvist & W.
imita      50     49,382    278,147   1893       John K. Ingram                         yes
juliabod   52a    7,576     41,176    1961       S.T.R.O. d'Ardenne                     EETS yes
julian     51     15,151    85,489    1978       Frances Beer                           yes Winter
juliaroy   52b    7,002     38,570    1961       S.T.R.O. d'Ardenne                     EETS yes
kathroy    53     11,804    79,851    1884       Eugen Einenkel                         yes
kentserm   54     3,996     20,042    1872/1927  Richard Morris                         yes
kingscam   55a    25,096    142,344   1886       C. Horstmann                           yes
kingsroy   55b    24,414    138,258   1886       C. Horstmann                           yes
lantlit    57     50,862    286,093   1917       Lilian M. Swinburn                     yes
lapidari   58     36,315    184,263   1933/1990  Joan Evans/Mary S. Serjeantson         EETS yes
lettchri   59     19,041    103,184   1877       J. B. Sheppard, M.B.C.S.               yes
liber      60     34,969    185,663   1969       Margaret Sinclair Ogden                EETS no
lifealex   104    45,269    193,022   1971       J.S. Westlake                          EETS yes
lincdoc    61     19,334    114,521   1914       Andrew Clark                           yes
lollard    62     96,484    415,844   1989       Gloria Cigman                          EETS no
malory1    129    182,071   703,952   1889       H. Oskar Sommer                        yes
mandevil   64     25,393    144,639   1963       M.C. Seymour                           EETS yes
margabod   65a    8,877     50,110    1934       Frances M. Mack                        yes Milford
margaroy   65b    8,818     50,502    1934       Frances M. Mack                        yes Milford
maria      66     2,468     13,915    1880       Carl Horstmann                         yes
melbla     26c    17,065    76,631    1980       Norman F. Blake                        no
melske     26d    17,837    97,271    1912       Walter W. Skeat                        yes
melusine   67     132,369   550,157   1895       A. K. Donald                           yes
merlin1    68,1   77,431    328,742   1865-69    Henry B. Wheatley                      yes
merlin2    68,2   41,925    584,880   1865-69    Henry B. Wheatley                      yes
merlin3    68,3   101,279   435,276   1865-69    Henry B. Wheatley                      yes
metchri1   69a    592       3,577     1916/1973  Hardin Craig                           EETS yes
metchri2   69b    353       2,207     1916/1973  Hardin Craig                           EETS yes
metmoon    69c    2,981     17,122    1916/1973  Hardin Craig                           EETS yes
metpa1     69d    5,633     33,775    1916/1973  Hardin Craig                           EETS yes
metpa2     69e    5,374     32,122    1916/1973  Hardin Craig                           EETS yes
metphys    69f    9,144     56,465    1916/1973  Hardin Craig                           EETS yes
mirbles    63     40,480    592,413   1992       Michael G. Sargent                     yes Garland
mirredm    71     14,395    80,496    1867       George G. Perry                        yes
mirror     73     24,258    491,126   1981       Venetia Nelson                         yes Winter
misfire    72a    51,169    278,673   1896       Ralph Harvey                           yes
mismend    72b    12,668    69,071    1896       Ralph Harvey                           EETS yes
myracles   124    14,195    82,828    1990       Peter Whiteford                        yes Winter
myrcseve   70     662       3,785     1868       Edward Peacock, F.S.A.                 yes
nichol     125    4,097     23,220    1978       Richard Hamer                          yes Winter
oehom      49     42,304    235,778   1873       R. Morris                              yes
order      74     19,503    110,775   1914/1963  R.W. Chambers                          EETS yes
oseney     75     72,770    322,550   1907       Andrew Clark                           yes
paston2    76,1   85,325    362,350   1904/1986  James Gairdner                         yes
paston3    76,2   21,927    503,240   1904/1986  James Gairdner                         yes
paston4    76,3   21,453    483,701   1904/1986  James Gairdner                         yes
paston5    76,4   99,648    402,209   1904/1986  James Gairdner                         yes
paston6    76,5   49,601    285,888   1904/1986  James Gairdner                         yes
pater      118    28,855    153,118   1967       F. G. A. M. Aarts                      yes Nijhoff
pecdon1    77     76,542    334,997   1921       Pecock, Reginald                       EETS no
pecdon2    78     30,283    541,149   1921       Pecock, Reginald                       EETS no
pepys      102    40,333    163,367   1922       Margery Goates                         EETS no
persbla    26e    30,300    134,479   1980       Norman F. Blake                        no
persske    26d    31,707    174,447   1912       Walter W. Skeat                        yes
peterbor   91     21,955    127,835   1958/1970  Cecily Clark                           OUP yes
prynces    87c    52,699    318,733   1898       Robert Steele                          yes
reggod1    79     62,489    769,129   1905       Andrew Clark                           EETS yes
reggod2    80     108,679   495,079   1905       Andrew Clark                           EETS yes
rollebok   84     35,787    196,956   1981       James Hogg                             yes Salzburg
rollho2a   83a    66,790    272,801   1895       C. Horstmann                           yes
rollho2b   83b    55,195    245,802   1895       C. Horstmann                           yes
rollhor1   82     137,288   558,249   1895       C. Horstmann                           yes
rollplus   105    26,627    140,271   1895       C. Horstman(n)                         yes
rollpros   85     18,275    106,725   1866       George G. Perry                        yes
roodme     46     7,456     38,301    1894       A.S. Napier                            yes
rule       74b    15,990    67,978    1914       R.W. Chambers                          EETS yes
sawlesd    86     4,937     27,147    1867       Richard Morris                         yes
secrete    87a    16,441    95,628    1898       Robert Steele                          yes
sermworc   120    33,054    182,940   1939       D. M. Grisdale                         yes ULeeds
solomon    97     6,636     36,136    1870       Rawson Lumby                           yes
specchri   114    31,427    250,880   1930       Gustav Holmstedt                       yes OUP
speculum   101    110,513   478,275   1936/1988  Edward H. Weatherly                    EETS no
spheres    88     320       1,915     1866       Frederick J. Furnivall                 yes
stbarth    127    3,693     20,940    1978       Richard Hamer                          yes Winter
syon       89     23,701    142,979   1980       James Hogg                             yes Salzburg
syonsist   90     47,297    291,692   1980       James Hogg                             yes Salzburg
testlove   106    57,037    345,934   1897       Walter W. Skeat                        yes
threekin   110    107,650   707,072   1895       F. J. Furnivall                        yes
treatise   113    12,119    67,399    1956       J.P.W.M. Van Zutphen                   yes Inst., Rome
tretlove   92     44,117    253,748   1970       John H. Fisher                         EETS no
trevdia    93a    6,535     36,382    1925/1987  Aaron Jenkins Perry                    EETS yes
trevmead   93c    3,476     18,625    1925/1987  Aaron Jenkins Perry                    EETS yes
trevmeth   93b    3,674     10,417    1925/1987  Aaron Jenkins Perry                    EETS yes
urkundfl   24     9,922     58,533    1926       Hermann Flasdieck                      yes
urkundmo   25     8,792     54,199    1923       Lorenz Morsbach                        yes
vices      94a    28,569    167,169   1888       Ferdinand Holthausen                   yes
wenefr     95     13,402    72,667    1880       C. Horstmann                           yes
wheat      96     9,058     50,451    1921/1971  Mabel Day                              EETS yes
wills      128    41,532    248,113   1964       Frederick J. Furnivall                 yes OUP
wohunge    98     4,090     20,694    1867       Richard Morris                         EETS yes
wyclif1    99,1   79,281    342,347   1880       F.D. Matthew                           yes
wyclif2    99,2   82,245    444,720   1880       F.D. Matthew                           yes
--------------------------------------------------------------------------------------------------
Total             7,874,508 32,987,707

As can be gathered from this survey, the number of files still covered by copyright and thus not available in this sampler is anything but negligible, the more so since the unlicensed texts are generally the longer ones. The files amount to 20 (1,071,438 words) and are the following:

name       no.    words     signs     published  permission
-----------------------------------------------------------
anccor     3      75,185    313,810   1962       EETS no
ancgon     4      30,591    132,254   1954       EETS no
ancnero    5      75,407    321,524   1952       EETS no
ancpepys   6      77,272    421,913   1976       EETS no
anctit     7      62,713    349,306   1963       EETS no
capgrave   10     87,590    499,101   1983       EETS no
caxtpro1   22a    29,619    161,103   1973       EETS no
caxtpro2   22b    6,087     32,819    1973       EETS no
cely       23     90,411    402,332   1975       EETS no
curyeing   56     29,704    161,380   1985       EETS no
deonise    33     22,633    125,348   1955       EETS no
dicts      34     58,775    331,945   1961       EETS no
giltele    41     7,396     42,430    1972       Arnold no
halicrit   43     9,200     52,669    1982       EETS no
liber      60     34,969    185,663   1969       EETS no
lollard    62     96,484    415,844   1989       EETS no
melbla     26c    17,065    76,631    1980       Arnold no
persbla    26e    30,300    134,479   1980       Arnold no
speculum   101    110,513   478,275   1936/1988  EETS no
tretlove   92     44,117    253,748   1970       EETS no
-----------------------------------------------------------
sum               1,071,438 5,214,098

6. Identification and description of files

6.1. The title heading

The main data of the title page of a source book are given before the text, irrespective of whatever title is presented on the first page of the edition used. We have kept our title headings short. Each of their lines is enclosed in braces {...} and preceded by an initial “|b” (the vertical bar is ALT 124); in addition, the words are joined by underscores.
The headlines contain the usual bibliographical data including the main manuscript (if mentioned by the editor on the front page), but over-lengthy titles are cut short and the layout of the title page in the source edition (mid-line justification, different scripts, etc.) is ignored in favour of a normalised, left-justified format. Here is an example:

|b{Dan_Michel's_Ayenbite_of_Inwyt,}
|b{or,_Remorse_of_Conscience.}
|b{In_the_Kentish_Dialect,_1340_A.D.}
|b{Ed._Richard_Morris.}
|b{EETS_23_(1866)}
|b{pp.1-271}

6.2. The parameters (survey)

We have tried to abide by the COCOA header of the Helsinki Corpus as much as possible, but some of the Helsinki parameters seemed inappropriate for Middle English prose [25], and a few others nicely add to the description of an edition's profile. The overall number of parameters is, however, again 26.

 1 <B>  Book: title and main bibliographical data
 2 <Q>  Quid: short file name and number
 3 <N>  Name of main MS/print
 4 <A>  Author: surname + first name/anon/several
 5 <I>  Initiating background text: author and work
 6 <E>  Extension: size in rounded thousands of words (10k = '9,500-10,499')
 7 <C>  Century (accord. to MS): 12/13/14/15
 8 <M>  MS or print date (1100+ = '1100-1149') [26]
 9 <O>  Original date of work (marked as in 8)
10 <K>  Contemporariness of MS and original: contemp/-
11 <D>  Dialect: N/EM/WM/South/K/London [27]
12 <V>  Variation of dialect: var/-
13 <T>  Text type [28]
14 <G>  “Genuine” or translation: transl/-
15 <F>  Foreign precursor (language): Lat/French/Scand/Dutch
16 <W>  Written or meant to be spoken: written/spoken
17 <X>  Author's sex: male/female/var
18 <Y>  Young (author's age): -40/40+
19 <Z>  Diachronic prototype: expository / instruction, religious / instruction, secular / narrative, imaginative / narrative, non-imaginative / statutory
20 <J>  Interactive: interact/-
21 <H>  Hand (scribe): name/several
22 <S>  Secondary MSS/prints: name/several
23 <U>  Unusual amount of French, Latin, Scand or Dutch
24 <P>  Prosodic markers (mainly accents): prosod/-
25 <L>  Length (abbreviations): abbrev/expanded/-
26 <R>  Record (sth. for the): open list

[25] The features dropped are: "verse or prose", "social rank of author", "audience description", "participant relationship" and "page". Instead, we have added a new parameter: "Hand (scribe): name/several".
[26] Either var, or in blocks of 50 years, from 1100 on to 1500; ‘1100-1149’ is expressed by 1100+, etc. ‘Around 1400’ (i.e. ‘1375-1424’) is expressed by -1400+.
[27] Options: Northern (NL, NO); East Midland (EML, EMO); West Midland (WML, WMO); Southern (SL, SO); Kentish (KL, KO); London. The additional L stands for LALME as source, O for other sources, usually the edition of the text at issue.
[28] Options: Bible; biography of saints; courtesy books; documents/wills/statutes; dream books; educational fiction; handbook, astronomy; handbook, cooking; handbook, craft of dressing; handbook, craft of dying; handbook, craft of hunting; handbook, language; handbook, medicine; handbook, visiting of the sick; handbook, other; history; law; letters private/official; pamphlets; philosophy; political allegory; preface/prologue/epilogue; religious, mysticism; religious, treatise; romance; rules; science, medicine; science, other; sermon (= homily); travelogue; varia (petition, proclamation).

These 26 parameters, to be used for the "Cocoa Headers" (cf Hickey 2003, 76f.), are similar to, but not identical with, those of the Helsinki Corpus of English Texts (cf Kytö 1991, 43f.). The reason for this departure from what has by now become some kind of a standard (cf Hickey's use of the parameters for other corpora as well, such as the Corpus of Early Medical Writing 1375-1750) is simply that some of the parameters of the Helsinki Corpus do not make much sense for Middle English texts, whereas others not used in the Helsinki Corpus were considered by us to be most informative (e.g. those concerning the manuscripts).
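The parameter survey above lends itself to a small lookup table. The following Python sketch is an illustration only and is not part of the corpus or of Corpus Presenter: it assumes that each parameter stands on a line of its own in the form <LETTER value> (as in <A Chaucer>), and the function names parse_header and matches as well as the sample values are hypothetical.

```python
import re

# Letters and meanings transcribed from the parameter survey in 6.2.
PARAMETERS = {
    "B": "Book: title and main bibliographical data",
    "Q": "Quid: short file name and number",
    "N": "Name of main MS/print",
    "A": "Author",
    "I": "Initiating background text",
    "E": "Extension (size in thousands of words)",
    "C": "Century (according to MS)",
    "M": "MS or print date",
    "O": "Original date of work",
    "K": "Contemporariness of MS and original",
    "D": "Dialect",
    "V": "Variation of dialect",
    "T": "Text type",
    "G": "Genuine or translation",
    "F": "Foreign precursor (language)",
    "W": "Written or meant to be spoken",
    "X": "Author's sex",
    "Y": "Young (author's age)",
    "Z": "Diachronic prototype",
    "J": "Interactive",
    "H": "Hand (scribe)",
    "S": "Secondary MSS/prints",
    "U": "Unusual amount of French, Latin, Scand or Dutch",
    "P": "Prosodic markers",
    "L": "Length (abbreviations)",
    "R": "Record",
}

# Assumed line shape: one parameter per line, written as <LETTER value>.
HEADER_LINE = re.compile(r"^<([A-Z])\s+(.*)>$")


def parse_header(lines):
    """Collect {letter: value} from lines that look like header parameters."""
    header = {}
    for line in lines:
        match = HEADER_LINE.match(line.strip())
        if match and match.group(1) in PARAMETERS:
            header[match.group(1)] = match.group(2).strip()
    return header


def matches(header, **wanted):
    """True if the header has every requested letter with the given value."""
    return all(header.get(letter) == value for letter, value in wanted.items())


# Invented sample values, for illustration only.
sample = ["<A Chaucer>", "<C 14>", "<G transl>", "some running text"]
header = parse_header(sample)
print(header)
print(matches(header, C="14", G="transl"))
```

A filter of this kind mirrors the purpose of the headers, namely to select files by any combination of the 26 parameters; how Corpus Presenter itself evaluates them is not shown here.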
In Hickey's program “Corpus Presenter” (cf Hickey 2003) it does not matter what encodings are used for the parameters as long as the syntax is observed and the encodings of the parameters added before each text file are applied consistently. The syntax for parameter 4, for example, is <A name>; so, given that Chaucer is the author, the Cocoa header is <A Chaucer>. In some cases the letter markers of the parameters are suggestive of what they stand for, but the principle could not be adhered to consistently since we had to stick to the Helsinki letters. Note that the ICAMET Cocoa Headers use the capital letters of the Helsinki Corpus in a different order and, in some cases, with a different sense (see below).

6.3. The reference codes in detail

(a) ID
The first four parameters identify the file: by the short name of the file (up to eight letters), by the numerus currens of the Innsbruck Prose Corpus, by the main manuscript that the source edition is based on, and finally by the author(s). Authors' names are given first by surname, then by first name. Two or more authors' names are separated by a semicolon. When unknown, the author's name is given as "anon" (for anonymous). "several" means that there are two or more anonymous authors or more than two known authors.

(b) Intertextuality and size
Parameter 5 takes account of the well-known fact that medieval works are usually based on previous works, with formal variation and skill being more important than topical originality. In the terms of recent theory this could be called "intertextuality". Parameter 6 gives the number of words rounded to the nearest thousand (represented by "k").

(c) Time
In parameter 7 the optional figures are 12 to 15 (for the 12th to 15th century). Parameter 8 refers to the exact manuscript or print date of the source text if known, otherwise to the 50-year period concerned, from 1100-1149 on to 1450-1500.
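The size and date codes of parameters 6 and 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the manual itself gives only "10k = 9,500-10,499", "1100+ = 1100-1149" and "-1400+ = 1375-1424", so the labels of the later 50-year blocks (1150+, 1200+ and so on) and the function names are assumptions made here.

```python
def extension_code(word_count):
    """Size in thousands of words, rounded to the nearest thousand (10k = 9,500-10,499)."""
    return f"{(word_count + 500) // 1000}k"


def period_code(year):
    """Label of the 50-year block containing `year`, e.g. 1100-1149 -> '1100+'."""
    if not 1100 <= year <= 1500:
        raise ValueError("the scheme covers 1100 to 1500 only")
    # The last block is given as 1450-1500, so the year 1500 is folded into it.
    start = min(1100 + 50 * ((year - 1100) // 50), 1450)
    return f"{start}+"


def around_code(year):
    """'Around' a given year: -1400+ stands for 1375-1424."""
    return f"-{year}+", year - 25, year + 24


print(extension_code(4571))   # size code for a file of 4,571 words
print(period_code(1340))      # block label for a manuscript dated 1340
print(around_code(1400))      # code and year span for 'around 1400'
```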
Parameter 9 records the assumed time of the creation of a work, with time again measured in 50-year phases. Parameter 10 compares parameters 8 and 9. In view of the vagueness of many medieval dates, contemporariness is defined with a tolerance of 50 years.

(c) Dialect
Parameters 11 and 12 refer to dialect. The dialect is defined, largely in line with the dominant conventions and the abbreviations used and explained by Kytö (1993, 50), as one of the following options:

dialect         abbreviation
------------------------------------
East Midland    EML/EMO
West Midland    WML/WMO
Northern        NL/NO
Southern        SL/SO
London          LondonL/LondonO
Kentish         KL/KO
unknown         x

Table 14: Dialect abbreviations in the Innsbruck Prose Corpus

In the preceding abbreviations the final "L" stands for LALME, "O" for sources other than LALME. LALME is the acronym for A Linguistic Atlas of Late Mediaeval English by Angus McIntosh, M.L. Samuels and Michael Benskin (Aberdeen: Aberdeen University Press, 1986). Parameter 12 records traces of additional dialects beyond the dominant one.

(d) Text type and foreign background
The fourth group of parameters (13-16) first refers to text type.
We have distinguished the following types (in alphabetical order):

    Bible
    biography of saints
    courtesy book
    documents/wills/statutes
    dream book
    educational fiction
    handbook, astronomy
    handbook, cooking
    handbook, craft of dressing
    handbook, craft of dying
    handbook, craft of hunting
    handbook, language
    handbook, medicine
    handbook, visiting of the sick
    handbook, other
    history
    law
    letters private/official
    pamphlets
    philosophy
    political allegory
    preface/prologue/epilogue
    religious, mysticism
    religious, treatise
    romance
    rules
    science, medicine
    science, other
    sermon (homily)
    travelogue
    varia: petition, proclamation

Table 15: Text types in the Innsbruck Prose Corpus

Parameter 14 records whether the source text was originally written in Middle English or is a translation. The option "transl" also includes fairly autonomous transfers from a foreign language into Middle English. If the text is a non-translated original, this is expressed by "-". In the next parameter (15) reference can also be made to several languages, for example: Latin; French; Scand; Dutch. Parameter 16 records whether the source text has an exclusively written style or shows traces of a spoken style. As a third option, the two terms can also be combined.

(e) Author's specifics (17 and 18)
The options are: for sex: male, female or var (i.e. variant, when several authors are involved); for age: -40, 40+ and var (if several authors are involved).

(f) Characteristics of texts (19-21)
The diachronic prototype (19) is defined as one of the following options:

    expository
    instruction, religious
    instruction, secular
    narrative, imaginative
    narrative, non-imaginative
    statutory

The last parameter of the group (21: "Hand") identifies the scribe if there is only one, or otherwise the main scribe.
If there are two or more scribes and ranking them is not possible, this is marked by “several”.

(g) Additional descriptive features (22-26)
On the basis of what has been said so far in this subchapter (6.3.), the additional parameters 22-26 are in the main self-explanatory, except perhaps for two points:
(1) An "unusual amount" of Latin or French (or other languages) in parameter 23 is defined as a passage of six or more continuous lines in the foreign language occurring at least once.
(2) The other parameters of this group are based on occasional observations and merely meant to encourage further research. Of course, the answers given within these parameters do not aim or claim to be conclusive.

In the case of parameter 25, which raises the question of the use of abbreviations or contractions, I am aware that the Innsbruck Prose Corpus, like many of the edited texts that it is based on, is an unreliable source, since abbreviated spellings have often been normalised. The parameters “abbrev” and “expanded” are meant to encourage scholars in the cases concerned to go back to the manuscripts rather than to rely fully on the mixed practices of the editions.

Note: In all parameters, "don't know" is marked by "x". Vague or hypothetical information is signalled by an added question mark.

7. Final remarks concerning accessibility of the corpus

While I naturally regret that the Innsbruck Prose Corpus could not be made available to the international community of researchers on the Internet, I am happy now to present what has been allowed for distribution on this CD-ROM. I would, however, like to inform researchers that the full corpus is readily accessible "to colleagues at Innsbruck". Whoever is interested in using it to the full is cordially invited to get in touch.
For details about contact and the conditions of staying in Innsbruck as a researcher in residence of the Department of English Studies, see the website of the ICAMET projects under ICAMET/availability: www2.uibk.ac.at/fakultaeten/c6/c609/projects/.

References

Abbreviationes n.y. (ca. 1995), ed. Olaf Pluta, Institut für Philosophie, Ruhr-Universität Bochum, Universitätsstraße 150, D-44801 Bochum.
Day, Mabel ed. 1952. The English Text of the Ancrene Riwle, ed. from Cotton Nero A. XIV, EETS OS 225.
Dobson, E.J. ed. 1972. The English text of the Ancrene Riwle, ed. from B.M. Cotton MS. Cleopatra C.VI, EETS OS 267.
Hickey, Raymond 2003. Corpus Presenter. Software for Language Analysis, with a manual and A Corpus of Irish English as sample data. Amsterdam: John Benjamins Publishing Company.
Kytö, Merja 1993. Manual to the diachronic part of The Helsinki Corpus of English Texts. Coding conventions and lists of source texts. 2nd edition. Helsinki: Department of English, University of Helsinki.
Kytö, Merja, Matti Rissanen, and Susan Wright eds. 1994. Corpora across the centuries. Proceedings of the First International Colloquium on English Diachronic Corpora. St Catharine's College Cambridge, 25-27 March 1993. Atlanta: Rodopi.
LALME: McIntosh, Angus; M.L. Samuels; Michael Benskin eds. 1986. LALME: A Linguistic Atlas of Late Mediaeval English, 4 vols. Aberdeen: Aberdeen University Press.
Markus, Manfred 1990. Mittelenglisches Studienbuch. UTB Große Reihe. Tübingen: Francke.
Markus, Manfred 1994. "The concept of ICAMET (Innsbruck Computer Archive of Middle English Texts)." In Kytö et al. eds. 1994. Corpora across the centuries. 41-52.
Markus, Manfred 1997. “Normalisation of Middle English prose: possibilities and limits”. In Corpus-Based Studies in English. Papers from the Seventeenth International Conference on English Language Research on Computerized Corpora (ICAME 17), Stockholm, May 15-19, 1996. Ed. Magnus Ljung. Amsterdam/Atlanta, GA: Rodopi. pp. 211-226.
Markus, Manfred 1999. Manual of ICAMET (Innsbruck Computer Archive of Machine-Readable English Texts). Innsbrucker Beiträge zur Kulturwissenschaft, Anglistische Reihe, vol 7. Innsbruck: Leopold-Franzens-Universität.
Morton, James ed. 1853. The Ancrene Riwle, Camden Society, 57.
Rainer, Eva Maria 1989. Das Perfekt im Spätmittel- und Frühneuenglischen: eine Frequenz- und Funktionsanalyse anhand von Brieftexten. Innsbrucker Beiträge zur Kulturwissenschaft, Anglistische Reihe 2. Innsbruck: Institut für Anglistik.
Robinson, Peter 1990. "COLLATE: A Program for Interactive Collation of Manuscripts." Old English Newsletter 24:27-31.
Robinson, Peter 1994. Collate 2: A user guide. Oxford: Oxford University Computing Services.
Tolkien, J.R.R., ed. 1962. The English text of the Ancrene Riwle, ed. from MS. Corpus Christi College Cambridge 402, intr. by N.R. Ker, EETS 249 (1962 for 1960).
Wright, Laura 1996. Sources of London English Medieval Thames Vocabulary. Oxford: Clarendon Press.

End of file