Reflective essay_1_Ling115
Jinxiao Song

Discussion on the first week: Introduction (Barlow, 2011; Norvig, 2011)

We discussed two articles in the first class. One is a presentation paper by Peter Norvig from 2011; the other is Michael Barlow's paper 'Corpus linguistics and theoretical linguistics'. Both articles prompted interesting discussion about the debate over, and the relationship between, corpus linguistics and classical linguistics.

Corpus Linguistics

Corpus linguistics is the study of language as expressed in samples of real-world text. It presents a digestive approach to deriving a set of abstract rules by which a natural language is governed or relates to another language. [1]

[1] http://en.wikipedia.org/wiki/Corpus_linguistics

What research question does the article address?

These two articles address a debate between data-based linguistic research and the classical, theory-based approach.

1) Barlow introduces some basic issues concerning the treatment of theory and data in corpus linguistics and outlines three broad areas in which corpus linguistics has made a significant contribution to our understanding of language: the provision of frequency information, the highlighting of the importance of collocations, and the description of variation and text types. The article also points out the drawbacks of the traditional categories of linguistic representation and suggests that corpus linguistics may serve as a handy tool for uncovering differences inherent in language using data-based approaches. The paper illustrates such differences with an overview of the British and American linguistic traditions, focusing on their connection with findings from corpus analysis. In addition, Barlow points out several current theoretical issues: abstractions and generalizations, patterns in the data and patterns in the mind, and the cognitive and social dimensions of language.

2) Norvig's presentation article is more interesting.
His essay discusses what Chomsky said, speculates on what he might have meant, and tries to determine the truth and importance of his claims. Chomsky strongly disagrees with the idea of using algorithms as a tool for linguistic research. He argues that statistical modeling is unscientific as a method. Peter Norvig, in response to Noam Chomsky's criticism of statistical/probabilistic NLP, articulates well his perspective on issues relating to the philosophy of science and the relationship between science and engineering.

Why do you find the research question interesting?

Corpus linguistics makes use of programs and machine-readable data collections to help people analyze texts more conveniently. Various corpora have proved helpful to linguistic research. For example, we may need to examine how a word's meaning has changed: when the word first came into use, what contexts it appears in, and whether there is statistical evidence that shows its historical changes clearly. I realized the importance of corpora when I was learning to use COCA, the Corpus of Contemporary American English. I wanted to know how the meaning of the word 'entrench' has changed over time. I only need to search [entrench], and the corpus returns a summarized list of word forms such as 'entrenched' and 'entrenching', as well as the token frequency for the years I choose, charts that show clearly which genres are more likely to use the word, and so on. This is very useful, convenient, and fast. Besides semantics, corpora can be used to study language acquisition, spelling conventions, and syntactic change. It is hard to imagine how long it would take a person to collect the same set of data and generate the same report.

However, at the same time, corpus data rest, from the very first stage, on human understanding of linguistics.
Chomsky's theoretical approach has definitely contributed to our understanding of the nature of language, but that does not mean a statistical approach cannot help us understand language more deeply. Without our intuitive judgment of what comes out of a statistical approach, corpus results can be misleading. As the example in Norvig's article points out, "no matter how many repetitions of 'ever' you insert, two sentences are grammatical, two are not. A probabilistic Markov-chain model cannot handle all of English."

What conclusion about the question does the article draw?

Barlow concludes that the relationship between corpus linguistics and theoretical linguistics is multifaceted, and hence numerous ways of approaching the topic present themselves. Researchers will have differing views on particular aspects of the current issues and on the relationship between corpus data and linguistic theory. We will see progress within the field of corpus linguistics as the size and range of corpora increase and the theoretical frameworks become more sophisticated.

Norvig criticizes Chomsky's view in a polite manner. He compares Chomsky's viewpoint to that of a Platonist, a rationalist, and perhaps a mystic. What Chomsky argues for is ideal and abstract. That is why Chomsky is not interested in language performance. Norvig's point of view, by contrast, is more empirical and more useful in the real world. We can see that probabilistic/statistical approaches have achieved a great deal: Google, corpora, speech recognition, dictionaries, sociolinguistics… We should not deny the good side of probabilistic/statistical approaches because of some errors and misleading results that most of us can still notice.

Do you find the authors' approach satisfactory? If not, how else would you do it?

I do not find Chomsky's argument convincing.
My own perspective on these things is that the resources and methods of statistical speech and language processing, rather than being some sort of alternative or competitor or replacement for the scientific study of speech, language, and communication, instead give us wonderful new tools for doing science in this area. Tools certainly have drawbacks: even with a knife we might cut our fingers. But should we deny the wonderful artworks, foods, and sculptures we have created with a knife? What matters more is training ourselves to handle these tools in more experienced, controlled ways.