D. Reject. There is nothing in the paper which is either new or interesting.

Introduction

In the following, I'll call the author of this paper A. A's main purpose is to describe a project that he managed within the IBM company. This is not correct. His reasons for doing so are that 1) he feels that his project has been ignored This is not correct. and 2) he feels that there are claims that he has committed fraud in his IBM morphology project of 20-30 years ago. Not exactly: In my paper, I document allegations of fraud. (Feelings are of minor importance in this connection.) In fact, these allegations are the only reason why I wrote this paper. These two points will be treated separately below. I will also look at some (other) main problems of the paper: unclear and undocumented claims, a whining tone (also with undocumented claims), What is this supposed to mean? ignorance of other people's work, as well as some smaller points, before I conclude.

1. On A's claim that the project has been ignored

On p.6, A says: "Unfortunately, there are signs today, more than twenty years after the discontinuation of the Norwegian IBM project, that such an approach [i.e. A's documentation in reports, reviewer's comment] is not sufficient. The project is hardly ever mentioned in the relevant literature [...]." First, I want to contest this claim: The IBM material is mentioned in several places, here are some that I found by a quick search:
- on the homepage of the Oslo-Bergen tagger: http://tekstlab.uio.no/obt-ny/english/read.html
- on the homepage of the Norwegian Word Bank: http://www.edd.uio.no/prosjekt/ordbanken/
This is not correct. There was no information about the IBM material in these pages, nor was there any mention of it in the pages of Norsk språkbank at the time I wrote my paper. (See below.) In contrast to other sources, IBM’s basic contribution has even been omitted from oral presentations of Norsk språkbank at conferences and presentations, e.g.
the one at the Nordic conference on language and technology, 7-8.- October 2013. - in Ruth Vatvedt Fjeld's talk on the lexicographic database at the MONS 8 conference in Trondheim 1999. http://www.hum.uit.no/mons8/sammendrag/teknologi.html The information provided in the presentation - one single sentence is misleading. - in Christian Emil Ore's talk on a text corpus at a corpus seminar in 1998: http://www.iet.ntnu.no/~torbjorn/korpus/Hell_referat.html Correct. So, “hardly ever” seems to be an appropriate characteristic. Be that as it may. Stating that "Unfortunately, there are signs today, more than twenty years after the discontinuation of the Norwegian IBM project, that such an approach is not sufficient.” etc. the “signs” are contentions that a) essentially the project didn’t exist and that b) what came out of it was all plagiarism. See below. Second, even if it had been true, A should not be surprised that the public are ignorant about his work: Private invitations to IBM's offices are hardly a way to get the public to know about one's work (cf. p. 20: " Later, linguists from the University of Oslo were invited on several occasions, individually and in groups, to see the products at IBM premises (Kolbotn). " First of all, this contention is rather peculiar, given the fact that the reviewer holds, at various instances, that IBM’s contribution is wellknown. Cf. above. Secondly, this is a clear misunderstanding based on the reviewers own idea as to what my paper is all about. The point is that linguists have contended that the research and development did not take place, or that it only consisted in the copying of other sources, in casu Bokmålsordboka and Nynorskordboka. My information about visiting linguists was intended to show that there were even other linguists not involved in the projects - who were able to see with their own eyes that the project was actually carried out. There are witnesses outside of the project staff. 
Further, several projects that have used resources that have incorporated the IBM material have been documented extensively in papers and conferences (see the long list of references at the end of this review). As for “the long list of references”, hardly any of the papers listed mentions the IBM material – and for a good reason: The projects they report didn’t have anything to do with the IBM material. (See below.) As for the few papers actually reporting projects partly based on IBM material, IBM’s contribution is usually not mentioned when relevant, although users were obliged to mention it according to IBM’s contract with the University of Oslo, under which the material was sold for a symbolic sum for scientific use only. A has had ample opportunities to comment or rectify wrong claims, as indeed he says has been done in two recent papers from 2009 and 2011 (see my conclusion). This is not correct. In fact, here the reviewer is led by a strange conception of time and space. Many years after its publication, I found out, by chance, that Fjeld 2000, a paper read at the EURALEX meeting that year, contained allegations of fraud. (The EURALEX conferences are the biennial conferences of the European Association for Lexicography.) How could that possibly be rectified – if not by a personal communication and a paper on the subject in another forum in 2007 (later to be published as Engh 2009)? Neither of these had any effect, though. Cf. Johannessen and Fjeld 2008 and Fjeld and Henriksen 2012. (Engh 2011 has a different subject and is irrelevant in this connection.)

2. A's feeling that he has been accused of fraud

Throughout the paper, A hints at the allegations that are later revealed at p. 23: “Revealed”? “Documented” is the appropriate term. The documentation starts at p. 22, not on p. 23.
p. 6: " even allegations of fraud have been levelled against it. " [i.e. against the IBM lexical material, reviewer's comment]
p.
6: " In order to refute allegations about their origin and their very nature based on what is - to adopt a benevolent interpretation - a selective misinterpretation of earlier documentation attempts. "
p. 7: "Furthermore, the existence of the Bokmål project is the one that has been most seriously contested afterwards. "
On p. 23, A finally presents the actual document that he feels is an accusation of fraud on his part. This is not correct. There are several documents, and two of them are duly presented at page 22. A third document is presented at page 23, the one the reviewer chooses to comment on. The document is an unpublished PowerPoint presentation of 10 slides that was presented at an internal project meeting five years ago. This project meeting was important for two reasons: It was a meeting of the local Norwegian CLARIN project (“CLARIN is one of the Research Infrastructures that were selected for the European Research Infrastructures Roadmap by ESFRI, the European Strategy Forum on Research Infrastructures.” 1), focusing on “Common Language Resources and Technology Infrastructure Norway” 2. I assume that there were foreign researchers present and/or that the report was made known to the European organization. (It is written in English, not in Norwegian.) This means that the audience consisted, directly or indirectly, of the “establishment” of this branch of applied computational linguistics in Norway as well as leading representatives of the corresponding European environment. At any rate, it was impossible for me to rectify the misinformation. I discovered this presentation by chance, by the way.
1 http://www.clarin.eu/content/general-information [5 October 2014]
2 https://clarin.b.uib.no/2008/12/15/m%C3%B8te-solstrand-15-16-des-2008/

The actual citation turns out to be: " IBM to develop their own lexicon" mentioned together with six other named and some unnamed projects, under a headline of what projects have used two UiO (University of Oslo)-developed dictionaries. The main title of the page is the key to my interpretation of the sentence mentioned: “research org. as developers, commercial org. as users”. There can be no doubt about the intention, especially given the earlier and later statements of one of the authors/speaker. The citation does not say in what way the dictionaries have been used Indeed, it does. By stating that “research org.” are “developers”, and by mentioning IBM on a par with the TROLL project, which, according to a paper in the reviewer’s list of relevant morphology R&D for Norwegian, was 100% based on one of the dictionaries mentioned, Bokmålsordboka. (which indeed would probably be very different for each of the projects on that list), and it does not quantify this. As I see it, the citation would be false if it turned out that the IBM project had not used these dictionaries at all. However, A admits that he has used them. “Admits”? A strange allegation from a person who supposedly, as a reviewer on behalf of Language Resources and Evaluation, is an expert on lexicography. Nothing is “admitted”. We are talking about a declaration of lexicographic sources, which is something totally different. As for the way the printed dictionary was used, see comments below. Below, I cite his paper:
p. 9: "Native language user linguists supplemented this material to the best of their linguistic competence, adding new words while consulting printed dictionaries when necessary - by looking up single words, one by one. " The reviewer appears to have rather vague ideas about what is a legitimate use of a printed dictionary in contrast to plagiarism.
p. 16: " The following written sources were consulted: [...] Landrø, Marit Ingebjørg, Boye Wangensteen et al.: 1986, Bokmålsordboka. Bergen: Universitetsforlaget [...] Here "consulted" means that the printed dictionaries were used the way their authors and publishers had intended: Words were looked up in order to verify the linguist's competence when necessary." The reviewer also has vague ideas about lexicographical common practice: declaring other dictionaries that have been lawfully consulted during the compilation. The reason why I made this declaration – common use among all competent and decent lexicographers – is exactly to point out that no copying, i.e. plagiarism, took place. The keywords here are “consulted” and “when necessary”. Let me add that Bokmålsordboka was not at all extensively consulted, quite the contrary. p.19: "Landrø, Marit Ingebjørg, Boye Wangensteen et al.: 1986, Bokmålsordboka. Bergen: Universitetsforlaget These titles were later made available for IBM internal use only as electronic dictionaries. This meant that IBM employees at the Norwegian headquarter could look up entry words on their terminals and PCs. " As stated, these titles were later made available for IBM internal use only. Later. This is spelled out clearly. At that time, the morphology projects were already accomplished. There is no log information available as to the actual use of these electronic versions. However, one has every reason to believe that they were used by translators working for IBM at IBM premises (as vendors) and possibly by staff members of the sales department with a special interest in the Norwegian language. For the linguistics development group, the electronic version was more or less irrelevant. To repeat my point from the reviewed paper: There is a crucial difference between normal use of a printed dictionary (consultation), which is what a dictionary is intended for, and copying of machinereadable material (plagiarism/theft). On p. 
23, A finally tells us what he thinks is the only possible way of reading the citation: " In the light of the heading, emphasising the research organisations' role as developers and the commercial organisations' role as users, "Used by: IBM to develop their own lexicon" can only be construed to mean that 'IBM made their own lexicon on the basis of Bokmålsordboka and Nynorskordboka' - the old groundless assertion in new disguise. Together, these quotations contain one contention and one insinuation: On the one hand, IBM did not develop its own lexicon and morphology from scratch, but contented itself with reformatting a digital version of a published dictionary instead. On the other, IBM made illicit use of somebody else's intellectual property, since IBM did not have the right to commercial use of the published dictionary. Both allegations are false." I am unable to see that A's interpretation is the only one, or even the most likely one. The citation is so short and rudimentary that almost any interpretation is possible. It would have been interesting to see what a written version of the power point presentation would look like, Why? but none exists. Alas, it is impossible to push hard on one interpretation, based on six words that don't even contain a finite verb. Nonsense. This is a clear distortion, cf. above. Again, these words have to be interpreted in the light of the main title of the page. In fact, even the interpretation of this single page is unambiguous. And my interpretation is, unfortunately, confirmed by the prior allegations of one of the authors. 3. Unclear and undocumented claims p.1: "the major part of the language information accessible has been converted from printed sources - or simply created by individual linguists, drawing on their competences as native language users." What does A mean by language information? Be specific! Nonsense. This is perfectly clear from the context. 
Any further specification at this point would have been felt as an exaggerated precision and bad style. The claim is that the major part comes from two very different sources. What are the possible alternatives? And where does this information come from? I fail to grasp the meaning of this. Is this relevant? p.2: "In fact, existing digital language resources are the result of a process where all the approaches are involved, although to a varying degree. Generally, this process has been poorly documented. " - This is not true. Be specific! I still fail to find any documentation of such a basic broad-scale process. Cf. below. At the end of this review I present a list of 11 talks and papers on the creation of lexical resources from other resources, and these are only for Norwegian. Just a few of these talks and papers are about the very creation of lexical and morphological resources, and hardly any of them about how they are created from other resources the way I have documented it in connection with the IBM project. In fact, these talks and papers are of a rather different nature, so this is no argument. (See below.) In fact, they support my initial claim. I could have found more. A in addition cites 10 papers by Jan Engh and 3 by Ruth Fjeld, further showing that the situation is not bleak. Again, it seems that the reviewer has not read the paper properly. It would not have been necessary to write my paper if the documentation I have provided myself had been read and understood, let alone respected. One who has not understood these papers, if she has cared to read them at all, is Ruth Vatvedt Fjeld. And that is exactly the point of my paper. I want to rectify her allegations – once and for all - and after having tried repeatedly and by means of several papers to do so … Further, I raise the question of how to document such a project, so that similar allegations cannot be made again. 
It follows that the lament on the situation, filling most of this page, including a reference to one publication A mentions as an exception, ? is unjustified. As I have already explained above, this is completely beyond the point. 4. Whining tone with undocumented claims The whole paper has a whining and bitter tone that is not appropriate in a scientific paper. The tone is one of polemics and sometimes sarcastic. With the preconceived ideas of the reviewer, though, ‘whining’ is a natural interpretation, however, not the correct one. More about the tone of the review and the reviewer below. The state of affairs A alludes to is not at all documented, and from what I know of the field, neither can it be, since A is wrong in his claims. Highly debatable … As far as I can see, the last contention is not documented. For example, it cannot be true that computational morphology has not been regarded as interesting. This is no exact rendering nor any abstract of what I contend in my paper. On the contrary, I state that “descriptive morphology itself and, especially lexicography, which represents the context of the morphology development, are not particularly trendy parts of linguistics” and “The low academic status of lexical and morphological resources creation”. The latter is based on personal experience, which should not be unfamiliar to the reviewer. (See below.) The former is based on observations of all accessible publications within linguistics during the last 20 years. Nothing less. In my capacity as academic librarian. (Cf. even Hovdhaugen et al. 2000, 516f. and Norsk lingvistikk. En evaluering av forskningen ved fem universitetsinstitutter [Norwegian linguistics. An evaluation of the research at the departments of linguistics at five Norwegian universities], p. 68. Available at http://evalueringsportalen.no/evaluering/norsk-lingvistikk-enevaluering-av-forskningen-ved-fem-universitetsinstitutter) I would very much appreciate to see documentation of the opposite. 
And, again, it is not the same as contending that “computational morphology has not been regarded as interesting”. “Computational morphology” is not mentioned in my paper. On the contrary, keywords are “descriptive morphology” and “lexical and morphological resources creation”, which is something utterly different. The topic of my paper is how the actual descriptive work – the task of connecting the basic linguistic terrain to the computational morphological models - is carried out. In practice, step by step at a micro level. It should not be necessary to state that this is something totally different from the ‘computational morphology’ the reviewer accuses me of having ignored. In 1991 a book called simply Computational Morphology was published at the prestigious MIT Press. Further, the important organisation ACL has a special interest group: ACL Special Interest Group on Computational Morphology and Phonology. On Norwegian, a book appeared on morphological analysis and synthesis in 1990 (Johannessen, Janne Bondi: Automatisk morfologisk analyse og syntese. Novus Forlag, Oslo 1990). This book investigates and makes many of the observations that A mentions on p. 13-14 (e.g. removing a final -e before other suffixes). As already mentioned, sifting through the totality of linguistic publications during the last decades, there is no doubt in my mind that a far greater number of titles have been devoted to syntax and semantics than to (theoretical) morphology. Apart from that, it is interesting that the two titles mentioned date from the time when the IBM morphology project was already history. Also, the fact of stating that an –e ought to be removed in a typical case is not identical to carrying out and implementing a complete identification process for the total vocabulary, for all possible inflected forms. Which is exactly one of the important points of my paper. 
A monograph about how one particular theoretical model could be adapted for Norwegian, illustrated by a limited selection of examples, is simply different from a full implementation (of a different model) of “all” words with their respective inflected forms. In sum: Broad coverage. All lemmas, all forms – even those nobody has thought of/commented on/normalised, thereby even charting the consistency and the adequacy of the linguistic standardisation. Especially this last task turned out to be rather time-consuming pioneer work. There simply does not exist any documentation that this kind of R & D has been carried out consistently and on a broad scale for Norwegian by anyone prior to the IBM project. In fact, Johannessen 1990 is an example of the type of exposition that I find insufficient from a descriptive linguistics point of view. Referring to this title simply represents an illustration of one of my points. Clearly, the reviewer does not understand what is discussed and described in my paper… Here are some examples:
p.2: most of the page. Be specific.
p.4: " Now, normative linguistics is generally not well seen by theoretical (descriptive) linguists. " - Says who? Claim should be documented. Is it relevant? This is obvious to the extent that there is no need for specific documentation. Demanding concrete documentation is the proof of superficial knowledge of theoretical linguistics – and the reason why the need was felt to constitute theoretical linguistics as a separate discipline in the first place. The second question – “Is it relevant?” – begs another one: Why didn’t the reviewer bother to try to understand my paper and the reason why it was written? What is sound review practice?
p.4: " descriptive morphology itself and, especially lexicography, which represents the context of the morphology development, are not particularly trendy parts of linguistics. " - Says who? Claim should be documented. Is it relevant? (See also the introduction to this section.)
I say so. Based on a survey of all linguistic publications received at one university library during 20 years (see above). It is, however, also based on a different kind of personal experience and facts from local academia: 1) In the evaluation of candidates for the position as the director of the University of Oslo’s empirical/descriptive computational linguistics programme, Tekstlaboratoriet, in 1996, computational and plain lexicography was not considered interesting as a linguistic field, and competence in the area was not considered of any value. 2) At the outset, the new Norwegian national system for registration and subsequent use of the results as a basis for research funding excluded lexicographic work of any kind. 3) Ironically, the Linguistics department of the University of Oslo (ILN) has recently decided to scrap what is left of the former National institute of lexicography and its collections, old-fashioned analogue or highly sophisticated digital ones, since they are considered to be without sufficient scientific/linguistic merit, according to the head of the department. Cf. http://www.uniforum.uio.no/nyheter/2014/06/tar-ikkje-ansvaretfor-spraksamlingane.html I am sorry, the proofs are all around us, especially in the local, Norwegian academic context.
p.4: " Enrichment, on the other hand, seems to be slightly more appealing, perhaps because of its closer relationship to semantics and syntax, which have been the more fashionable parts of linguistics since the 1950s. " - Says who? Claim should be documented. Is it relevant? The TROLL project, mentioned repeatedly in the reviewer’s bibliography, is part of the proof - and duly mentioned. Cf. comments above.
p.6: "Unfortunately, there are signs today, more than twenty years after the discontinuation of the Norwegian IBM project, that such an approach [i.e. documentation in reports, rev. comment] is not sufficient. The project is hardly ever mentioned in the relevant literature [...]." - Is this documented?
In addition to my comments above: I would like to know how one can reasonably document something which is not there. The “counterexamples” (?) mentioned by the reviewer in the bibliography are void. Cf. below.
p.6: "The low academic status of lexical and morphological resources creation is in strong opposition to its importance and to the quality required. " - Says who? Claim should be documented. Is it relevant? (See also the introduction to this section.) As for the low academic status, see comments above. As for its importance: Does the reviewer really object to the fact that high-quality lexicographical and morphological resources are important for other branches of linguistics, e.g. computational syntax? If so, this is a clear indication of incompetence in the area.
p.14: " Moreover, the creation of the complete morphology on the basis of discontinuous and often inconsistent morphological information from printed sources, among other things, was far from trivial. " - Who says it's trivial? Claim should be documented. Is it relevant? (See also the introduction to this section.) In the first place, because this used to be a general comment on IBM’s Nordic language projects at that time. Secondly, this is a natural implication of the fact that this type of linguistic activity is hardly mentioned in linguistics literature, as shown for instance by the papers listed by the reviewer. Irrelevant? When this activity is constantly ignored – even explicitly considered to be of no scientific/linguistic value? Cf. above.
p.19: " Although probably unheard of in Norway at that time, this type of student internship was common practice in IBM internationally " - The badly hidden criticism of the Norwegian society is totally irrelevant in this context. Nonsense. The result of this particular student internship (converting Bokmålsordboka text files into a database) probably represents the only clue for the one who presented the fraud allegations in the first place.
Exactly because she apparently did not understand its status in relationship to the corporation’s R&D activities. How this can be interpreted as a “hidden criticism of the Norwegian society” is beyond my comprehension. p.21: " For unknown reasons, Academia never showed any interest in the "enrichment" part of IBM Norway's lexicographical products: information about a variety of semantic and syntactic properties of words, to which one could also add information about word compounding and hyphenation. " - If it's true, maybe the relevant researchers don't know about it? A strange contention, given that the reviewer has spent a great effort in maintaining that, in fact, the IBM project was well known. Cf. above. Public presentations of the IBM project were given at the Nordic conference of lexicography in 1991 (cf. Engh 1992a) as well as at the biannual national conference for Norwegian linguistics, MONS, in 1991 (cf. Engh 1992b) etc. This is one of the points where university linguists’ visits to the project come in. Again, the point of my paper is that although the IBM project was made known to the Norwegian linguistics community through regular channels, it was “ignored”. p. 21: "Thus IBM Norway's lexica and morphologies constitute an important part of the base of today's electronic infrastructure for the Norwegian language,64 unfortunately not generally acknowledged as such. " - This is untrue, given that the information is on the relevant web sites. Unfortunately: No, it isn’t … Nowhere on the pages of Norsk språkbank there was any mentioning of IBM Norway’s lexica and morphology at the time my paper was written. In fact, leading member of the board of Norsk språkbank, Marit Hovdnak, sent me, unsolicited, an e-mail dated 16. September this year (i.e. more than one month after I received this review) informing me that a reference to the IBM material had been added to the web pages of Norsk språkbank on her initiative after personal communication. 5. 
The author ignores other people's work

In his eagerness to show that he has been wronged, A has ignored both web sites and papers that are relevant to his paper. I refer to the references at the bottom of this review, with 11 papers on Norwegian computational lexicography and the two on computational morphology more generally. In a paper whose main purpose is to complain about other people's ignorance and even accusations of fraud, this is an inexcusable oversight. Did the reviewer read the papers listed? I have my doubts. To the extent that I have succeeded in finding/reading them, my clear impression is that they are about something else. Moreover, to the extent that subjects related to those of my paper are discussed, the problems are identified as general problems. No extensive, let alone complete and detailed, analyses are given or discussed. These papers generally focus on different topics, and/or they have different angles of attack. A says: " However, while there is a flourishing literature on the more formal aspects and the technical innovation part of natural language processing, documentation on how the basic language resources were and partly still are established is scarce, and existing documentation may be ignored. " (Abstract, p.1) The papers in my list at the bottom of the review are exactly examples of what the author claims does not exist. This is not correct. Cf. remarks above.

6. Other things

- Naming and enumerating all the staff that have worked on parts of A's dictionary, as A does in the appendix, is not something that belongs in a scientific journal. Either they should be co-authors or thanked in a footnote. When there are more than five or ten people, the group can safely be thanked as a group, not individually. Indeed a strange contention, especially since documentation has been demanded for quite a few well-known truths above. I am perfectly aware of normal usage as far as scientific publishing is concerned.
However, and as a countermeasure against the fraud allegations, all the persons involved in the project are mentioned – so that they can be asked about the work they actually carried out. - p.21: "Norsk ordbanken". It's called Norsk ordbank. p.21: " Norsk ordbanken a service from the University of Oslo, incorporated in the newly established Norsk språkbank under the auspices of Språkrådet. " - A is misinformed about the structure of these institutions. The University of Oslo and Språkrådet together are responsible for the development and maintenance of Norsk ordbank. It is available on the web site of the UiO as well as on that of Språkbanken, which is institutionally part of the National Library. The information was based on personal communication from a person involved in the hosting of Norsk ordbank and the current web pages of Norsk språkbank. I may have misunderstood parts of the information provided. However, that should not be of great importance for the purpose of my paper. In a final, printed version, any misunderstanding would have easily been corrected. LREV List of criteria o Significance of results The paper reports on very old results that have been published before. “Very old results”? That is beyond the point. My paper discusses how to document a certain type of basic computational linguistics activity. In general. The “very old results” serve as a case under discussion – exactly because it belongs to the past and that there has continuously and, indeed, recently been made allegations concerning it. o Technical quality OK, but nothing new. If it is OK, it is OK. “but nothing new” is irrelevant in this connection. o Appropriateness and soundness of the methodology OK O Evaluation of results None Debatable... o Knowledge of field Poor, as regards the field following the decades after A did his work. This is not correct, cf. comments above. o Rigor of arguments Poor (as regarding claims about other work and status of the field) This is not correct, cf. 
comments above. The paper may be seen as controversial. That is something quite different. o Originality Methods described are not original. Really? I have never seen any published article or book chapter about how to document such projects and how to prevent fraud allegation when sifting through most linguistics literature published in Europe or the US during the last 20 years, monographs and articles of journals or anthologies. Also, the reviewer fails to provide proofs to the contrary. o Clarity of presentation Poor. Lots of unargued claims. Nonsense. Cf above comments. o Acknowledgement of limitations None. What is this supposed to mean? o Organization o References to other work: Poor Nonsense. Cf above comments. o Relevance to the Language Resources and Evaluation audience Not relevant. Method would have been relevant some decades earlier. Nonsense. This is based on a fundamental misunderstanding as to what this entire paper is all about. Cf above comments. Conclusion The author believes that his work at IBM has been forgotten, and even that claims about the IBM material may look like fraud allegations. Having studied the arguments carefully, Hardly. Cf above comments. I don't think that A is right in that he has been forgotten (as discussed above), and I don't think the claims he refers to are allegations of fraud (as discussed above). Further, since A also volunteers the information that he put things straight in two conference papers: Engh 2009 (printed in a volume in History of Nordic computing, at Springer, and Engh 2011 (printed in another volume of History of Nordic Computing, also at Springer), the present paper seems unnecessary. Studying the arguments carefully, the reviewer would have understood that this is exactly the point of the paper: 1) The form of documentation, usual at the time of the project, turned out to be insufficient. 
2) Later documentation efforts to put things straight after the first series of fraud allegations were not sufficient either. 3) The current paper represents one last correction attempt – and in doing so, I discuss the documentation of this type of project in general. (As already mentioned, Engh 2011 has a different topic, and is totally irrelevant for the matter at issue.)
The description of the work that was put into the IBM morphology is something that any computational morphologist will recognize. I have compiled a list of work on Norwegian computational morphology that comes in addition to A's own work referred to in the paper.
As already mentioned, these papers may be on aspects of Norwegian morphology or have some relationship to it (e.g. how to harmonise the stems and the inflections of a phrase according to standardization level, “conservative”, “radical” etc. in a linguistic software function). However, their topic is different. They are simply irrelevant in the present context.
There is nothing new that the present paper brings along that makes it worth publishing. For the future, though, I advise A to stop accusing others of their lack of interest (cf. the section on whining above),
Again: The paper does not contain any accusation of others for lack of interest. It tries to repudiate the repeated allegations of fraud, which are based on a fundamental lack of knowledge as to how the original projects were carried out. In this connection, general aspects of project documentation are discussed.
and instead get on with it himself.
What is this supposed to mean? Norway is a small country, and the anonymous reviewer knows that I am earning a living as a librarian. And as the reviewer also knows perfectly well, I was prevented from continuing my activities as a linguist exactly because of preconceptions of the type I am referring to in my paper.
It is practically impossible to initiate and carry out any major descriptive project within computational linguistics alone, without any support apparatus and in one’s leisure hours. And, I would like to add, in an institutional setting with the type of moral standards exhibited by the reviewer.
Some talks and papers on the creation of lexical digital resources for Norwegian
(The list below has been compiled only to counter A's claim that there is nothing on the linguistic questions regarding the development of lexical resources. These are in addition to the ten papers A lists by Jan Engh, and the three by Ruth Fjeld.)
De Smedt, Koenraad; Rosén, Victoria. 2000. Automatic proofreading for Norwegian: The challenges of lexical and grammatical variation. In: NODALIDA '99. Trondheim: NTNU 2000, pp. 206-215
An interesting paper about something else: The handling of the Norwegian variability in phrases in order to ensure uniformity as to the level of the linguistic standard – “radical” vs. “conservative” Bokmål forms etc. in phrases and compounds. Irrelevant in this context.
Hagen, Kristin, Johannessen, Janne Bondi and Kristoffersen, Kristian Emil. 1997. Problemer ved bruk av andres lister til taggerformål. Talk at Møter om norsk språk 7, Universitetet i Trondheim, 20-22 November.
Not published in the conference report: Jan Terje Faarlund, Brit Mæhlum, Torbjørn Nordgård (eds.): 1998, MONS 7. Utvalde artiklar frå det 7. Møtet om norsk språk i Trondheim 1997. Oslo: Novus. Cf. http://www.nb.no/nbsok/nb/fefffc98e091b61d16b89c184e62d257.nbdigital?lang=no#3 [accessible in Norway only]. Unpublished paper according to http://www.tekstlab.uio.no/norsk/bokmaal/english.html [accessed 18 September 2014]
Hellan, Lars; Nordgård, Torbjørn. 1997. The NorKompLex and TROLL lexical systems. Workshop on the Encoding of Verb Constructions
Unpublished?
Johannessen, Janne Bondi. 1998. Elektroniske hjelpemidler leksikografisk fornying. Norskrift 1998; Vol. 97, pp. 43-68
Focusing on various ways of using information technology while editing dictionaries. I.e. hardly relevant in the present context.
Losnegaard, Gyri Smørdal; Samdal, Gunn Inger Lyse; Thunes, Martha; Rosén, Victoria; De Smedt, Koenraad; Dyvik, Helge J. Jakhelln; Meurer, Paul. 2012. What we have learned from Sofie: Extending lexical and grammatical coverage in an LFG parsebank. In: META-RESEARCH Workshop on Advanced Treebanking at LREC2012. European Language Resources Association 2012.
Fairly irrelevant. It states the need for information of the kind that apparently was of no interest to academic research when IBM produced it, cf. p. 17f. of my paper.
Nordgård, Torbjørn. 1997. Argument structure in NorKomples. Workshop on the Representation and Encoding of Verb Constructions and Verbs
Unpublished?
Nordgård, Torbjørn. 1998. Norwegian Computational Lexicon (NorKompLeks). Proceedings of the 11th Nordic Conference on Computational Linguistics 1998, pp. 34-44
Not in the stocks of Norwegian university or research libraries according to the national union catalogue, BIBSYS. One pointer found on the Internet turned out to be inactive. [18 September 2014]
Nordgård, Torbjørn. 2000. NORKOMPLEKS - A Norwegian Computational Lexicon. COMLEX 2000. Workshop on Lexicography and Multimedia Dictionaries;
A typo for COMPLEX 2000. Not in the national union catalogue, not on the Internet. [18 September 2014]
Nordgård, Torbjørn. 1999. From NorKompLeks to HPSG. HPSG-dager i Trondheim; 1999
Unpublished?
Rosén, Victoria. 2002. Fra Bokmålsutboka via NorKompLeks til et LFG-leksikon for norsk. MONS 9 Det niende møtet om norsk språk; 2002-11-22 - 2002-11-24
On how to cope with stylistic variation in phrases and compounds. Irrelevant in this context.
Rosén, Victoria; De Smedt, Koenraad. 2000. *Er korrekturlesningsevnen di god? Resultater fra SCARRIE. In: Artikler fra 8. møte om norsk språk (MONS 8). Tromsø: Universitetet i Tromsø 2000, pp. 214-228
Another interesting paper about something else: The handling of the Norwegian variability in phrases in order to ensure uniformity as to the level of the linguistic standard – “radical” vs. “conservative” Bokmål forms etc. in phrases and compounds. Irrelevant in this context. In fact, this is a necessary step after the type of work described in my paper. Similar efforts were also made at IBM – without leading to any product. However, this is irrelevant to the matter discussed in my paper and, consequently, was not mentioned there.
Also mentioned in this review:
Black, Alan, Stephen Guy Pulman, Graeme Donald Ritchie and Graham Russell. 1991. Computational Morphology. MIT Press. ACL-MIT Series in Natural Language Processing.
I could have helped the reviewer find more titles about computational and plain morphology, but the existence of such titles is simply beside the point. It does not alter the fact that syntax and semantics have been subject to more interest than morphology since the late 1950s.
Johannessen, Janne Bondi. 1990. Automatisk morfologisk analyse og syntese. Novus Forlag, Oslo.
Cf. comments above.
*
In sum: Partly unpublished and inaccessible conference papers. Partly titles published in places known only to the happy few. Every one of the titles actually published was published many years after the finalisation of the IBM project. Still, no paper discusses in detail the broad-scale, meticulous work needed to interpret the highly defective official norms for the two variants of written Norwegian, nor registration, completion etc. As proof of my ignoring “other people's work”, this list of titles is simply void. One natural question is whether the reviewer ever read these papers.
The report of the second reviewer, who for unknown reasons is referred to as Reviewer #3:
The paper gives a detailed history of the '80s IBM project for producing lexica of Norwegian, and disputes the later claims that the project simply reformatted existing resources to produce the lexica. This history and the dispute are interesting mainly in the context of (the history of) Norwegian natural language processing, but much less to an international audience. The paper says very little about "How to document the creation of digital language resources" esp. how this should be done in 2013 - this topic would definitely be interesting for readers, whereas the history of a decades past project is much less so. The history of the project has also been published before, in several publications, also in English. The paper is very long - for project notes, the limit is 10 pages.
*
Concluding remarks
There is, in fact, little to add. Basically, the “argumentation” of the reviewer tells its own tale. Cf. my annotations. Summing up: Contrary to what the reviewer contends, the main purpose of my paper is neither self-promotion nor promotion of a project of historical interest only. After the documentation listed and one additional final overview, commissioned by the editor of Maal og minne in 2013 (Engh 2014), there would have been no point in doing so anyway. My paper was written as a reaction to allegations of fraud seemingly impossible to refute. And since I had noticed that descriptions of the very craft of creating digital language resources for a language with a fairly complex – and different – morphology, and on top of everything a deficiently standardised one, were scarce, I thought I might as well address the general problem of project documentation in the light of my own experience. This is clearly spelled out in my paper for anyone who knows how to read – and/or has no personal involvement in the case. So, the reviewer is completely missing the point - intentionally or not.
On the whole, the reviewer’s analysis is poor, to say the least, and so is the rigor of argument. Clearly, the conclusion came first; the “arguments” were added to support it. Quite a few dubious assertions are made, especially as far as the context of my paper is concerned. (My chances to refute the misinformation etc.) The reviewer’s reading is generally tendentious and includes strange interpretations of isolated sentences, numerous minor misreadings, and erroneous references, bordering on bluff documentation. Further, the review is spiced up by a few odd remarks on irrelevant matters (cf. comment to page 19).
A common characteristic throughout the review is professional incompetence at various levels. Some keywords:
• morphology literature compared to linguistics literature in its entirety
• language resources creation
• lexicography
• prescriptive vs. descriptive linguistics and the motive for establishing descriptive linguistics as a discipline
• linguistics community culture/folklore: linguists’ attitudes and professional prejudices
• a surprising lack of insight as far as contemporary Norwegian language processing literature is concerned
From an isolated editing point of view, the review leaves a curious impression as well, as it is
• characterised by contradictions and bewilderment. E.g. the reviewer alternates between contending that the IBM project was well known - and unknown
• contains defective documentation
• as well as a constant insistence on documentation - for matters that need no documentation
• replete with dirty rhetorical tricks
• focusing on one sentence, misinterpreting it in isolation in order to arrive at faulty conclusions
• displaying polemics against an imagined opponent, attribution of incorrect intentions etc.
• use of certain loaded words and phrases: “allegations that are later revealed at p. 23”, “A finally presents ...”, “A’s feeling that he has been accused of fraud”
• other tendentious characterisations, e.g. the repeated reference to a “whining tone” 3
The overall tone of the reviewer is aggressive, even malicious, and the author of the paper is characterised in derogatory terms and “told” what to do in the future… All in all, the reviewer is lacking in judgement as well as in competence in the relevant fields of linguistics. The tone even makes one wonder whether she is disqualified, being a party in the case or acting on someone’s behalf. 4
In fact, this raises the question: How did Language resources and evaluation select this reviewer? However, the review is not only interesting with regard to this particular journal. It is even revealing as far as today’s academic publishing in general, and in particular its peer review institution, is concerned. There simply isn’t any “peer” involved, neither with regard to social position nor, unfortunately, scientific insight. It is rather the result of revamped professorial rule. However, not the open, fairly transparent professorial rule of former times, but a pernicious, incompetent version protected by anonymity.
The review brings to the surface the reviewer’s intense need to defend what one may call the national linguistic establishment and local academic circles - and its members’ right to adapt reality to their advantage in international fora of their peers. Thus the review proves beyond any doubt why it was necessary for me to write my paper in the first place. Unfortunately, it even demonstrates why this type of paper will never be properly published – which implies that practices similar to the one criticised in my paper will most probably go on without being disturbed by “inappropriate” critics.
When someone with an academic title presents incorrect information, even allegations of fraud, in an international forum (EURALEX, CLARIN) without the knowledge of the person(s) involved, the latter will be unable to protest in a similar forum (for instance in Language resources and evaluation) once aware of the allegations. This is the reason why I, in the end, have to publish my paper in this way. In a “private”, remote, and unnoticed channel. Under the sign of sidelined polemics. In the long run, this situation will be detrimental to linguistics.
Additional bibliography
Engh, Jan: 2014, “IBMs leksikografiske prosjekt for norsk 1984–1991”. Maal og minne 106/1, 67-101.
Hovdhaugen, Even et al.: 2000, The History of linguistics in the Nordic countries. Helsingfors: Societas Scientiarum Fennica.
3 This expression merits a comment. “Whining tone” is a rather infrequent expression, at least in UK English. (A quick Google search for the phrase in the UK domain, excepting all electronic dictionary entries, produced 44 hits only.) Interestingly, the corresponding Norwegian verb SYTE is quite fashionable, especially among women managers, as a way to dismiss unwanted professional proposals or criticism. SYTE implies that the criticism is immature and/or unjustified and as such morally debatable.
4 Oddly enough, at least one linguistic detail points in this direction: “The IBM company” is not a usual denomination of the ‘IBM corporation’. Incidentally, this phrase also appears in Fjeld 2000.