collocations – introduction

COCA Teacher’s Guide
COCA (CORPUS OF CONTEMPORARY AMERICAN ENGLISH) BASICS
1. Explain COCA: COCA is an online concordancer that searches the Corpus of Contemporary American English.
Corpus: an electronic database of written texts or transcriptions of speech (similar to text files on the internet; the difference is that it contains only words, not images or videos). Simply put, it is a large collection of texts. COCA is a corpus of texts in contemporary American English.
Concordancer: a tool for searching through and analyzing a corpus (similar to the Google search engine or the “Ctrl + F” function on your computer, but much more sophisticated). The most common use of a concordancer for ESL learners is to check whether certain words (or strings of words) are actually used, and how frequently.
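To make the idea of "checking how frequently a string of words is used" concrete, here is a minimal sketch in Python. It is not how COCA works internally; the three sample sentences and the function names `tokenize` and `count_phrase` are made up for illustration, and a real corpus is vastly larger.

```python
import re

# Tiny made-up sample standing in for a corpus (an assumption for this sketch).
SAMPLE_CORPUS = [
    "We are going to Florida on spring break.",
    "Spring break starts next week.",
    "She spent her spring break at home.",
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_phrase(corpus, phrase):
    """Count how many times a word or string of words occurs in the corpus."""
    target = tokenize(phrase)
    if not target:
        return 0  # an empty query matches nothing
    n = len(target)
    total = 0
    for text in corpus:
        tokens = tokenize(text)
        # Slide a window of n tokens across the text and compare it to the query.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                total += 1
    return total

print(count_phrase(SAMPLE_CORPUS, "spring break"))
print(count_phrase(SAMPLE_CORPUS, "spring vacation"))
```

A count of zero for a phrase in a large corpus suggests that the combination is rare or not used, which is the kind of check the guide recommends for learners.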
2. Getting started
Go to: http://www.americancorpus.org. You need to register (for free!) first.
3. Basic functions
A. Display Options
- LIST: Shows a list of words or word combinations, ranked by frequency.
E.g.) Spring break (653 times)
- CHART: Shows a chart comparing the frequency of a word across genres or time periods.
E.g.) cool (not frequently used in academic writing) vs. suffrage (frequently used in academic writing, but rarely in speech)
- KWIC: Shows the key word(s), i.e. the search word(s), in context.
E.g.) There is no limit to ~ing
- COMPARE: Compares two words by frequency (either overall or with a certain collocate).
E.g.) hot debate vs. heated debate
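As a rough illustration of what a KWIC ("key word in context") display does, the Python sketch below prints each occurrence of a search word with a few words of context on either side. The sample sentences, the function name `kwic`, and the three-word context width are assumptions for this example, not part of COCA.

```python
import re

# Made-up sample sentences (an assumption for this sketch).
SAMPLE = [
    "There is no limit to learning.",
    "There is no limit to what you can do.",
    "The speed limit is low here.",
]

def kwic(sentences, keyword, width=3):
    """Return each hit of keyword with up to `width` words of context per side."""
    lines = []
    for sentence in sentences:
        tokens = re.findall(r"[A-Za-z']+", sentence)
        for i, tok in enumerate(tokens):
            if tok.lower() == keyword.lower():
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                # Mark the key word with brackets so it stands out in the line.
                lines.append(f"{left} [{tok}] {right}".strip())
    return lines

for line in kwic(SAMPLE, "limit"):
    print(line)
```

Reading the lines side by side makes patterns visible, such as which words tend to follow the search word.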
B. Types of queries (Search string)
- WORD: a search word or phrase
- COLLOCATES: a word (not a phrase) that occurs within up to 10 words before or after the search word(s). You can choose the collocation range using the two small boxes next to the COLLOCATES box.
- POS LIST: List of “parts of speech”
This function is used when you don’t know the exact word (or a collocate of the word) you are looking for, but you know which part of speech (noun, verb, adjective, etc.) you want. This is also called a “wildcard” search. Place the cursor in the WORD or COLLOCATES box, click the drop-down arrow, and select the part of speech you are looking for.
For example, if you don’t know which preposition to use in a sentence like “I am going to New York ___ Spring Break”, you may try:
[WORD]: Spring break
[COLLOCATES]: [i*] (This will appear if you click the “prep ALL” option in the POS LIST)
From the search results, you can conclude that “on” is the preposition most frequently used with “Spring break”.
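The preposition search above can be sketched in Python as a collocate count within a fixed window. This is only an approximation of the idea: the hand-written `PREPOSITIONS` set stands in for COCA's part-of-speech tag, and the sample sentences and the function name `collocates_before` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stand-in for a part-of-speech filter: a small preposition list.
PREPOSITIONS = {"on", "in", "at", "for", "during", "over", "to", "of", "with"}

# Made-up sample sentences (an assumption for this sketch).
SAMPLE = [
    "I am going to New York on spring break.",
    "They went home for spring break.",
    "We met on spring break last year.",
]

def collocates_before(sentences, phrase, allowed, span=1):
    """Count allowed words found within `span` word slots before each hit."""
    target = phrase.lower().split()
    if not target:
        return Counter()
    n = len(target)
    counts = Counter()
    for sentence in sentences:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                # Look only at the slots immediately before the search phrase.
                for word in tokens[max(0, i - span):i]:
                    if word in allowed:
                        counts[word] += 1
    return counts

print(collocates_before(SAMPLE, "spring break", PREPOSITIONS).most_common())
```

Widening `span` corresponds to choosing a larger collocation range in the boxes next to the COLLOCATES box.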
COCA (CORPUS OF CONTEMPORARY AMERICAN ENGLISH) EXERCISES
A. Learning collocations with COCA
Step 1: Look at the example sentence below. Answer questions a) - c). Then, circle the word
that you should use in your paper.
I am fully / totally aware of the problem.
a) In which genre is “totally” most frequently used? Spoken
b) In which genre is “fully” most frequently used? Academic
c) So, which word would you use in your paper? “fully”
HINT) You should use the CHART display option.
Step 2: Using COCA, find a better (more frequently used) collocate for the word “technology” in the
sentence below.
I’m studying utilization / application of modern technology in classes.
Answer: application
HINT) You should use the COMPARE display option with a COLLOCATES search string.
Step 3: Look at the following sentences. Which one is correct? Use the KWIC option in COCA to find the answer.
1) I am looking forward to meeting you in class.
2) I am looking forward to meet you in class.
Answer: 1)
HINT) You should use the KWIC display option.
Step 4: Look at the example sentences below. The underlined word in each sentence is an awkward collocate of the word in bold. Using COCA, find better collocates and revise the sentences. (You should keep the original meaning of each sentence.) Decide on your own which function to use. (If you can’t think of which one to use, look at the hint below.)
1) I hope to succeed the goal. Answer: achieve, accomplish, reach
2) There has been a hot debate over the issue. Answer: heated, intense
3) He firmly recommended this place. Answer: strongly
HINT) You should use the LIST display option with a wildcard (v*, adj*, adv*) COLLOCATES search string or a synonym COLLOCATES search string.
Detailed Teacher’s Guide
I. How to solve problems in Step 1
1) Try a CHART inquiry for “totally”, following the steps below. (The result is displayed on the right.)
- Select CHART.
- Type in the word.
- Click!
Result: “totally” is most frequently used in speaking!
2) Do the same thing for the word “fully”. You will find in the result that it is most frequently used in the academic genre. (Note that next to the “genre” chart there is also a “time” chart, which some people might be interested in looking at.)
II. How to solve problems in Step 2
I’m studying utilization / application of modern technology in classes.
1) Try a COMPARE inquiry, following the steps below.
- Select COMPARE.
- Type the words to compare (one word only in each box!).
- Type in the collocating word.
- Set the collocation range to 0 and 4. The 0 means the collocate should occur within 0 words (0 word slots) before the search word(s); the 4 means the collocate should occur within 4 words (four word slots) after the search word(s).
- Click!
2) You will see the result as below.
The word “utilization” (W1) was used only 21 times, while “application” (W2) was used 120 times. So, “application” is the better choice.
Note: For some reason, you cannot click the numbers to see the context (although it says so at the top. Weird, huh?). So, if you want the context of each case, you may try a KWIC search of each word separately. Also, don’t be too concerned with other functions like W1/W2 or SCORE for now.
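The comparison above comes down to simple arithmetic on the two counts. The Python sketch below picks the more frequent word and reports how many times more frequent it is. The function name `compare_counts` is made up, and this plain ratio is only an illustration; it is not claimed to be how COCA computes its W1/W2 or SCORE columns.

```python
def compare_counts(counts):
    """Return (most frequent word, ratio of its count to the runner-up's)."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    (best, best_n), (_, second_n) = ranked[0], ranked[1]
    # Guard against dividing by zero when the runner-up never occurs.
    ratio = best_n / second_n if second_n else float("inf")
    return best, ratio

# Counts taken from the guide: collocating with "technology".
best, ratio = compare_counts({"utilization": 21, "application": 120})
print(f"{best} is about {ratio:.1f} times as frequent")
```

A large ratio like this one is a reasonable signal, but the context of each hit should still be checked before choosing a word.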
III. How to solve problems in Step 3
1) I am looking forward to meeting you in class.
2) I am looking forward to meet you in class.
1) Try a KWIC inquiry, following the steps below.
- Select KWIC: “Show me the word(s) below in context…”
- Type in the word(s).
- Click search!
2) You will see the result as below. From the result, you can see what kinds of words follow the given phrase. Here, “to” is used as a preposition, since it is followed by nouns (or noun phrases), so the -ing form “meeting” in sentence 1) is correct. (Words of the same part of speech are marked with the same color.)
IV. How to solve problems in Step 4
1) I hope to succeed the goal.
1) Try a wildcard collocate LIST inquiry, following the steps below.
- Select LIST.
- Set the collocation range so that the collocate occurs within 4 words (four word slots) before the search word(s).
- Select the part of speech you are looking for. When you select one, its abbreviation will automatically appear in the COLLOCATES box above.
- Click!
2) Look at the search result. Can you find the word you want to use in the list? (Aha! This word is what I’m looking for!) Click the word if you want to see the context.
3) Still not satisfied? Do you want to see ONLY the synonymous verbs? Then try a synonym collocate LIST inquiry, as below.
- Select LIST.
- Enter the synonym query. This means: synonyms of “succeed” that collocate with the word “goal”.
- Click!
Aha! This word is again the best collocate!
Examples 2) and 3) can be solved in the same way.
E. Be careful!
1. Always check the CONTEXT and GENRE! It is often dangerous to look at only the frequency
count and decide which one to use. Having similar frequency counts does not always mean
both words are possible in a given context. One limitation of COCA or some other corpora,
however, is that we can usually only see “one line” of the context, which sometimes is not
enough.
2. You may have multiple words in the [WORD] slot, but you cannot have strings of two or more
words in the [COLLOCATES] box. Try reformulating your query so that the multiple words are in
the [WORD] slot, and the single word is in the [COLLOCATES] box.
3. If few or no results appear, it is for one of the following reasons:
- One of the words may be misspelled, or may not be a valid word form.
- The word combination is impossible or rare.
4. Currently, you cannot search for collocates of words that occur more than 3,000,000 times in the corpus, or of words that are both very frequent (sum of all forms) and have many different forms.
5. COCA cannot run COMPARE on one- or two-letter words, since this causes problems with the script.
Created by Jin Kim, 2011