COCA Teacher’s Guide

COCA (CORPUS OF CONTEMPORARY AMERICAN ENGLISH) BASICS

1. Explain COCA
COCA is an online concordancer that searches the Corpus of Contemporary American English.
Corpus: an electronic database of written texts or transcriptions of speech. It is similar to text files on the internet, except that it contains only words, not images or videos. Simply put, it is a large collection of texts. COCA is a corpus of texts in contemporary American English.
Concordancer: a tool for searching through and analyzing a corpus. It is similar to the Google search engine or the “Ctrl + F” function on your computer, but a concordancer is much more sophisticated than Google.
The most common use of a concordancer for ESL learners is to check whether certain words (or strings of words) are actually used, and how frequently.

2. Getting started
Go to http://www.americancorpus.org. You need to register first (it is free!).

3. Basic functions

A. Display Options
- LIST: Shows a list of words or word combinations, ranked by frequency.
  E.g) Spring break (653 times)
- CHART: Shows a chart comparing the frequency of a word across genres or time periods.
  E.g) “cool” (not frequently used in academic writing) vs. “suffrage” (frequently used in academic writing, but rarely in speech)
- KWIC: Shows the key word(s), i.e. the search word(s), in context.
  E.g) There is no limit to ~ing
- COMPARE: Compares two words by frequency (either overall or with a certain collocate).
  E.g) hot debate vs. heated debate

B. Types of queries (Search string)
- WORD: a search word or phrase
- COLLOCATES: a word (not a phrase) that occurs within up to 10 words before or after the search word(s). You can choose the collocation range using the two small boxes next to the COLLOCATES box.
- POS LIST: a list of “parts of speech”. Use this function when you don’t know the exact word (or a collocate of the word) you are looking for, but you know which part of speech (noun, verb, adjective, etc.) you want.
  This is also called a “wildcard” search. Place the cursor in the WORD or COLLOCATES box, click the drop-down arrow, and select the part of speech you are looking for.
  For example, if you don’t know which preposition to use in a sentence like “I am going to New York ___ Spring Break”, you may try:
  [WORD]: Spring break
  [COLLOCATE]: [i*] (This appears when you click the “prep ALL” option in the POS LIST)
  From the search results, you can conclude that “on” is the preposition most frequently used with “Spring break”.

Created by Jin Kim, 2011

COCA (CORPUS OF CONTEMPORARY AMERICAN ENGLISH) EXERCISES

A. Learning collocations with COCA

Step 1: Look at the example sentence below and answer questions a) - c). Then circle the word that you should use in your paper.
I am fully / totally aware of the problem.
a) In which genre is “totally” most frequently used? Spoken
b) In which genre is “fully” most frequently used? Academic
c) So, which word would you use in your paper? “fully”
HINT) You should use the CHART display option.

Step 2: Using COCA, find the better (more frequently used) collocate for the word “technology” in the sentence below.
I’m studying utilization / application of modern technology in classes.
Answer: application
HINT) You should use the COMPARE display option with a COLLOCATE search string.

Step 3: Look at the following sentences. Which one is correct? Use the KWIC option in COCA to find the answer.
1) I am looking forward to meeting you in class.
2) I am looking forward to meet you in class.
Answer: 1)
HINT) You should use the KWIC display option.

Step 4: Look at the example sentences below. The underlined word in each sentence is an awkward collocate of the word in bold. Using COCA, find better collocates and revise the sentences. (You should keep the original meaning of each sentence. Decide on your own which function to use. If you can’t think of which one to use, look at the hint below.)
1) I hope to succeed the goal.
   Answer: achieve, accomplish, reach
2) There has been a hot debate over the issue.
   Answer: heated, intense
3) He firmly recommended this place.
   Answer: strongly
HINT) You should use the LIST display option with a wildcard (v*, adj*, adv*) COLLOCATE or a synonym COLLOCATE search string.

Detailed Teacher’s Guide

I. How to solve the problem in Step 1
1) Try a CHART inquiry for “totally”, as below. (The result is displayed on the right.)
- Select CHART.
- Type in the word.
- Click!
Result: “totally” is most frequently used in speech!
2) Do the same for the word “fully”. The result shows that it is most frequently used in the academic genre. (Note that next to the “genre” chart there is also a “time” chart, which some people might be interested in looking at.)

II. How to solve the problem in Step 2
I’m studying utilization / application of modern technology in classes.
1) Try a COMPARE inquiry, following the steps below.
- Select COMPARE.
- Type the words to compare (one word only in each box!).
- Type in the collocating word.
- The collocation range is set to 0 before and 4 after. This means the collocate should occur within 0 words (0 word slots) before the search word(s) and within 4 words (four word slots) after the search word(s).
- Click!
2) You will see the result as below.
The word “utilization” (W1) was used only 21 times, while “application” (W2) was used 120 times. So “application” is the better choice.
Note: For some reason, you cannot click the numbers to see the context (although it says so at the top. Weird, huh?). So, if you want the context of each case, you may try a KWIC search of each word separately. Also, don’t be too concerned with other functions like W1/W2 or SCORE for now.

III. How to solve the problem in Step 3
1) I am looking forward to meeting you in class.
2) I am looking forward to meet you in class.
1) Try a KWIC inquiry, following the steps below.
- KWIC: “Show me the word(s) below in context…”
- Type in the word(s).
- Click search!
2) You will see the result as below. From the result, you can see what kinds of words follow the given phrase. Here, “to” is used as a preposition, since it is followed by nouns or noun phrases. That is why sentence 1), with the -ing form “meeting”, is the correct one. (Words of the same part of speech are marked with the same color.)

IV. How to solve the problems in Step 4
1) I hope to succeed the goal.
1) Try a wildcard collocate LIST inquiry, following the steps below.
- Select LIST.
- Set the collocation range. Here it means the collocate should occur within 4 words (four word slots) before the search word(s).
- Select the part of speech you are looking for. If you select one, its abbreviation will automatically appear in the collocates box above.
- Click!
2) Look at the search result. Can you find the word you want to use in the list? (“Aha! This word is what I’m looking for!”) Click the word if you want to see its context.
3) Still not satisfied? Do you want to see ONLY the synonymous verbs? Then try a synonym collocate LIST inquiry, as below.
- Select LIST.
- The search string means: synonyms of “succeed” that collocate with the word “goal”.
- Click!
(“Aha! This word again is the best collocate!”)
Examples 2) and 3) can be solved in the same way.

E. Be careful!
1. Always check the CONTEXT and GENRE! It is often risky to look only at the frequency count when deciding which word to use. Similar frequency counts do not always mean that both words are possible in a given context. One limitation of COCA and some other corpora, however, is that we can usually see only “one line” of context, which is sometimes not enough.
2. You may have multiple words in the [WORD] slot, but you cannot have strings of two or more words in the [COLLOCATES] box. Try reformulating your query so that the multiple words are in the [WORD] slot and the single word is in the [COLLOCATES] box.
3. If few or no results appear, it is for one of the following reasons:
- One of the words may be misspelled or ungrammatical.
- The word combination is impossible or rare.
4. Currently, you cannot search for collocates of words that occur more than 3,000,000 times in the corpus, or of words that are both very frequent (sum of all forms) and have many different forms.
5. COCA cannot run COMPARE on one- or two-letter words, since this creates problems with the script.
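
Optional, for teachers who want to show what a concordancer does behind the scenes: the two core ideas of this guide, a collocation range counted in word slots and KWIC lines of context, can be sketched in a few lines of Python. This is only a rough illustration on a made-up four-sentence corpus. It is not how COCA itself is implemented, and every name in it (CORPUS, tokenize, find_phrase, collocates, kwic) is hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy corpus for illustration only (not COCA data).
CORPUS = [
    "I am going to New York on spring break this year.",
    "We went to Florida on spring break.",
    "She stayed home during spring break.",
    "I am looking forward to meeting you in class.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def find_phrase(tokens, phrase_tokens):
    """Return the start index of every occurrence of the phrase."""
    n = len(phrase_tokens)
    return [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == phrase_tokens]


def collocates(corpus, phrase, before=4, after=4):
    """Count single words within `before`/`after` word slots of the phrase."""
    phrase_tokens = tokenize(phrase)
    counts = Counter()
    for sentence in corpus:
        tokens = tokenize(sentence)
        for start in find_phrase(tokens, phrase_tokens):
            end = start + len(phrase_tokens)
            left = tokens[max(0, start - before):start]   # slots before
            right = tokens[end:end + after]               # slots after
            counts.update(left + right)
    return counts


def kwic(corpus, phrase, width=3):
    """Return one (left context, key phrase, right context) tuple per hit."""
    phrase_tokens = tokenize(phrase)
    lines = []
    for sentence in corpus:
        tokens = tokenize(sentence)
        for start in find_phrase(tokens, phrase_tokens):
            end = start + len(phrase_tokens)
            left = " ".join(tokens[max(0, start - width):start])
            right = " ".join(tokens[end:end + width])
            lines.append((left, " ".join(phrase_tokens), right))
    return lines


if __name__ == "__main__":
    # Which single word appears in the one slot right before "spring break"?
    print(collocates(CORPUS, "spring break", before=1, after=0).most_common())
    # What follows "looking forward to"?
    for left, key, right in kwic(CORPUS, "looking forward to", width=2):
        print(f"{left:>15} | {key} | {right}")
```

On this toy data, the most frequent word in the slot directly before “spring break” is “on” (2 hits), which mirrors the preposition example in the POS LIST section. Note that the sketch accepts a multi-word search phrase but counts only single-word collocates, the same restriction described in “Be careful!” item 2.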