This is the author’s postprint. Copyright is now held by Edinburgh University Press. Final version is available at http://www.euppublishing.com/doi/abs/10.3366/cor.2014.0049.

A, an and the environments in Spoken Korean English

Glenn Hadikin
School of Languages and Area Studies
University of Portsmouth
Park Building
King Henry 1 Street
Portsmouth PO1 2DZ
Glenn.hadikin@port.ac.uk

This paper comprises an analysis of small corpora of spoken Korean English: a burgeoning New English that is rarely discussed in published articles. With a theoretical framework based on Hoey’s Theory of Lexical Priming (Hoey 2005), the lexical environment surrounding the items a, an and the in two Korean corpora (one comprising Korean English speakers in Liverpool, England and the other, speakers in Seoul, Korea) is compared with two British comparator corpora. The results show a balance of differences and similarities between the Korean corpora, which may suggest that, while Korean English is distinct from British varieties, recent priming effects and the L1 are interacting in complex ways that give each corpus a unique identity.

1 Introduction

The Republic of Korea, or South Korea, is a small country situated between China and Japan and has a population of just under 50 million [1]. The people speak Korean, considered to be either a language isolate or part of the Altaic group that includes Turkic and Japonic languages (Lee and Ramsey 2000), and, as Porter (2011) reports, ‘English mania’ has now become so widespread that people have even had surgery on their tongue in the hope that it will improve pronunciation.

[1] Population data taken from http://www.worldatlas.com/aatlas/populations/ctypopls.htm on 24/3/12
The search term “English in Korea” gets notably more hits on Google than the term “English in China” (1.6 million and 618 000 respectively [2]). English has been taught to all middle and high school students since 1945 and in all elementary schools since 1997 (Tollefson 2002) but until very recently it was rarely listed as a World English in reference books such as Crystal (2003:111). Crystal reproduces a ‘circle of World Englishes’ from McArthur (1987) that includes 51 regional varieties, including Appalachian and Inuit English as well as Chinese and Japanese English, but it does not mention the English used in South Korea at all (in this paper the terms Korea and South Korea are both used to refer to the Korean republic). The large number of teaching positions being advertised online at the time of writing, however, is a reflection of the level of interest in English [3], and the extent to which it is used in Korea in the 21st century is a reflection of Korea’s developing multiculturalism (Mehlsen 2011 provides a useful summary), where English is used more and more outside of the classroom, albeit typically in groups where at least one speaker does not speak Korean. Korean English has more recently been discussed as part of a World Englishes model (see Kachru and Nelson 2006) and the variety will be discussed in this light throughout this paper; that is, although Korean speakers often refer to external norms, it is a developing form of English in its own right and its unique features can be described and discussed without necessarily being seen as erroneous. This is in contrast to discussions of Konglish (a disparaging term used to describe a mixture of Korean and English). Korean Learner English, comprising the features described in Lee (2001), is a more positive construct that highlights certain cases where the L1 affects Korean people’s English, but it still carries the implicit message that features of English unique to Korea or East Asia are problematic.
[2] Google searches conducted on 24/3/12
[3] See travelandteachrecruiting.com and teachkoreans.com as examples

Previous Korean English studies have tended to focus on pronunciation (see Yeni-Komshian, Flege and Liu 2000 for example) or ideological and pedagogical issues of teaching English, such as Park’s (2009) study, which claims three underlying ideologies in Korean English: necessitation is the idea that English is a necessary tool for success in a global economy; externalisation suggests that English is often still seen as the language of the other and can conflict with a Korean identity; and, finally, there is a shared ideology of self-deprecation, that is, that no matter what they do Koreans see themselves as poor at English (Park 2009). A study that explicitly argues that Korea now has a form of codified English that is taught in schools is Shim (1999). Shim highlights a variety of usages that are found in Korean English textbooks, ranging from lexico-semantic differences (on life used as a synonym for alive, e.g. gardens come on life again) to grammatical differences such as her claim that the simple present tense is not differentiated from the progressive form; as an example Shim reports that the following exchange would be acceptable in codified Korean English:

Q What happens to the grass and trees when spring comes?
A The grass is turning green and trees are budding with fresh leaves.
(Shim 1999: 253)

Shim (1999) discusses articles twice under a heading of morpho-syntactic differences (between Korean and American English): the first is her suggestion that students are taught that a noun phrase must be preceded by the definite article when the noun phrase contains a relative clause, so he is the man who can help other people (Shim’s example) must be used rather than he is a man who can help other people.
Shim’s second point regarding articles is that Korean English allows for more variation in terms of count/noncount nouns; she gives the example although it is a hard work, I enjoy it as an acceptable structure. For the purposes of this paper I accept Shim’s claims of codification as evidence that Korean English has begun to separate from related varieties. Note, however, that this study is now over twelve years old and there has not, to my knowledge, been a corpus-driven study published that highlights Korean English as it is actually used in the 21st century. With this lack of corpus-driven studies in mind I collected and transcribed two corpora of Korean spoken English for my PhD. The motivation for creating two corpora was to allow me to explore similarities and differences between Korean English as spoken by volunteers in Korea itself and that of comparable Korean volunteers speaking English in the UK. The theoretical basis for this paper is Hoey’s Lexical Priming, which postulates that:

As a word is acquired through encounters with it in speech and writing, it becomes cumulatively loaded with the contexts and co-texts in which it is encountered, and our knowledge of it includes the fact that it co-occurs with certain other words in certain kinds of context. (Hoey 2005:8)

Lexical Priming repositions the related phenomena of collocation and colligation at the very heart of language so that even traditional grammar is seen as a secondary output. The following ten priming hypotheses are posited:

1. Every word is primed to occur with particular other words; these are its collocates.
2. Every word is primed to occur with particular semantic sets; these are its semantic associations.
3. Every word is primed to occur in association with particular pragmatic functions; these are its pragmatic associations.
4. Every word is primed to occur in (or avoid) certain grammatical positions, and to occur in (or avoid) certain grammatical functions; these are its colligations.
5. Co-hyponyms and synonyms differ with respect to their collocations, semantic associations and colligations.
6. When a word is polysemous, the collocations, semantic associations and colligations of one sense of the word differ from those of its other senses.
7. Every word is primed for use in one or more grammatical roles; these are its grammatical categories.
8. Every word is primed to participate in, or avoid, particular types of cohesive relation in a discourse; these are its textual collocations.
9. Every word is primed to occur in particular semantic relations in the discourse; these are its textual semantic associations.
10. Every word is primed to occur in, or avoid, certain positions within the discourse; these are its textual colligations.
Reproduced from Hoey (2012)

Hoey (2005) argues that cultures harmonise their primings in three key ways: formal education, shared literary and religious traditions and the mass media. If, then, we are primed by television, radio, adverts, our friends, teachers, neighbours and family members (indeed by every single instance of language we are exposed to), it would be reasonable to expect measurable differences in the language used by two communities in two different countries even when they share a first language and cultural background. This study was developed to test such a hypothesis as well as to explore the level of similarity between the corpora. The following research questions have guided the study:

1 What are the key similarities between the two Korean English corpora in terms of the lexical environment around the articles a, an and the?
2 What are the key differences between the Korean corpora and two British reference corpora in terms of the lexical environment around articles?
3 To what extent can Lexical Priming theory account for any observed variation?

Hoey (2005) makes use of selected 2-grams and 3-grams (amongst others) in his own analysis.
On the face of it an analysis of function words may seem surprising (not least when one notes Hoey’s claim that every word has pragmatic and semantic associations) but Hoey provides a convincing argument that in the winter has different semantic primings from in winter in his corpus of news articles from The Guardian: the former tends to occur with material process verbs whereas the latter tends to occur with relational process verbs, which highlights the potential priming effects of the. The focus on function words is also likely to minimise variation depending on the topics discussed during data collection. Note Hoey’s (2005) caution, shared by the author, that concordance lines cannot be taken as direct evidence that speakers are primed (psychologically) to associate certain words and strings with other words and strings; the concordance lines and corpus data are to be seen as an indication of strings that could reasonably be seen as being primed. In a similar vein I share Hoey’s rejection of lemmatisation for the purposes of this study, simply because one cannot assume different forms of a lexical item will share primings; get a job, for example, will be treated independently from got a job unless the data provide a reason to discuss common primings. Hoey (2005) also suggests that a speaker’s L2 primings will be superimposed on their L1 primings; this is clearly a complex relationship that requires further research but it will be touched upon in this paper and partially explains the choice of articles as the focus of this paper. Articles were chosen partially because there is no article system in the Korean language (and thus less obvious L1 interference) and, perhaps more so, because many of my students and respondents reported that it was the most challenging aspect of learning and using English. Lee (2001) only refers to articles by stating that the Korean language does not have them while Ko et al.
(2012) highlight a number of problems surrounding article use in Korean English, but their study is based on analysis of very specific tasks in which the volunteers responded to prompts such as draw circles around the books during written and picture-based tasks rather than speech. Chuang and Nesi (2006) use a corpus-driven method to study article use in Chinese Written English and show that up to 29.7% (including problems with the zero article) of the learners’ errors are article related. It also seems reasonable to suggest that articles would be subject to subconscious priming effects (rather than the more conscious effects of education) while speakers are focussing on the (more salient) content of their speech. I begin with a brief summary of the two Korean corpora and two British comparator corpora that were used.

2 Four corpora

The two spoken Korean English corpora were collected in Liverpool and Seoul in 2008 and are named SK (for Seoul Koreans) and LK (for Liverpool Koreans). For each recording the Korean informant and I sat in a small room and I began by asking questions about, for example, their reasons for studying English, hobbies and career ambitions; I aimed to keep the conversations as informal as possible and was keen to find a subject that would ‘get them talking’ freely without focussing on form. Table 1 shows that SK and LK were well matched for age and years of learning English but that SK has a more notable female bias. Ideally the number of speakers and gender balance would be better matched but difficulties finding respondents who were willing to be recorded speaking English prevented this. (Recall Park’s (2009) suggestion that Korean speakers tend to feel that they are poor at English and thus may be hesitant to volunteer for such studies.)
                                  SK                         LK
Number of respondents             39                         28
Average age                       25                         27
Gender                            29f (78%), 8m (22%) [4]    16f (57%), 12m (43%)
Average years learning English    9.7                        12.2
Table 1: Korean informants
[4] Two respondents in Seoul chose not to complete the demographic data sheet

The respondents in Liverpool had spent an average of two years living in the UK. My own utterances were removed from the main Korean corpora and not used in subsequent frequency counts but all audio files and complete transcripts were kept for reference. The total number of word tokens in each of the four corpora used in this study is shown in Table 2.

Liverpool Korean corpus (LK)                   83 446
Seoul Korean corpus (SK)                      112 621
Scouse corpus (SCO)                           106 562
Demographic section of spoken BNC (BNC)     3 945 881
Table 2: Corpus sizes

As I was directly involved in preparing the Korean corpora, great care was taken to keep them as comparable as possible; the comparator corpora, however, were not prepared specifically for this study so certain differences must be noted and taken into account. A corpus of native Liverpool spoken English or ‘scouse’ (SCO) was developed by a colleague between 2001 and 2004 (Pace-Sigge 2010), so I used this as a comparator corpus because of its similar size and to allow me to account for any possible influence of the local primings on the Liverpool (LK) volunteers. Note, however, that SCO has a total of 51 speakers and a larger proportion of males at 54%, that the volunteers are slightly older than the Korean speakers with an average age of 33, and that there is also a large number of group discussions compared with the one-to-one arrangement used for the Korean corpora. Finally, the much larger demographic spoken section of the British National Corpus (BNC) was used. This is a large reference corpus with data collected by 124 volunteers in 38 UK locations in the section used for this study (What is the BNC?
2012), but as its recordings are notably older (collected in 1991 and 1992) one has to be cautious about any language structures that may have changed over this timescale; the difference between the one-to-one, interview-like data collection in SK and LK and the freer recording used in SCO and the BNC (including groups) must also be noted. All analysis was carried out using WordSmith Tools version 5 (Scott 2011); a simple orthographic transcription was used for all the data, and small differences such as the transcription of filled pauses (er and um) between the Korean data and SCO were not judged as being problematic for the current study.

3 The a environment

Table 3 shows the frequency of the article a in each of the four corpora alongside the most frequent R1 collocates (the items immediately to the right of the article). Items in the position immediately to the left of the article would also be affected by the speakers’ primings, but this paper focuses only on R1 items in order to explore the relationship between the article and the other components of a noun phrase that follow. Note that while many R1 items would traditionally be seen as specific components of a noun phrase, for the purposes of this study I see this simply as colligation (a, for example, appears to colligate with quantifiers such as lot and little in the data shown in Table 3) and I wish to avoid further assumptions about structure unless they come out of the data. The column labelled dispersion shows how many files the string occurs in and shows that the strings under observation are evenly spread through the files rather than being clustered in one or two (clustering which, particularly for SK and LK, would suggest that only one or two speakers are using the form).

Table 3: Frequency details for a and most frequent R1 collocates in four corpora

3.1 a lot versus a bit

The most frequent 2-gram in the three smaller corpora is a lot, which is discussed in some detail in Hadikin (2011a) and Hadikin (2011b), so that discussion is not repeated other than to mention two key points.
The first is that a high frequency 5-gram there are a lot of seems to be driving the very high frequency of 170 occurrences of a lot (or 1509 per million) seen in SK; SK contains 14 occurrences of there are a lot of compared to just 17 in the BNC, which is 35 times larger (p < 0.0001 with two-tailed Fisher’s exact test (FE)). The second is the possible influence of the speakers’ first language form 많은 (manun), which is often used without any variation and glossed as there are/is a lot of in translation dictionaries. The LK data seem more heavily influenced by the string quite a lot, which more closely reflects the percentage values in the British corpora; eight percent of the usage of a lot in LK consists of the string quite a lot compared with four percent in SCO and six percent in the BNC (Table 4). When LK is compared with the BNC for quite a lot versus OTHER a lot, p = 0.25, so the difference is not statistically significant, but p = 0.0009 (two-tailed FE) when SK is compared with the BNC; this is clearly below the oft-cited cut-off point of 0.05 for statistical significance (see Gries 2009 for example).

Table 4: Frequency detail for a lot and quite a lot in four corpora

Rather than the string a lot, this highest frequency position is occupied by a bit in the BNC, and an important factor affecting the frequency of this 2-gram is the use of the string a bit of (Concordance 1). From the concordance this structure appears to be used for a number of functions ranging from the expression of large amounts (quite a bit of work) to somewhat fixed expressions (a bit of a pain).

Concordance 1: Sample concordance of a bit of from BNC

The normalised frequency of a bit of in the BNC is 264pm compared with 197pm in SCO and 9pm and 48pm in SK and LK respectively; this suggests that the form may not be established in Korean English (two-tailed χ2 with Yates’ correction is 22.5 when a bit of is compared with OTHER of in SK and the BNC, df=1, p < 0.0001).
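The per-million (pm) figures quoted in this section follow directly from raw counts and the corpus sizes in Table 2. The sketch below is only an illustration of that arithmetic: the function name per_million and the dictionary CORPUS_SIZES are hypothetical, and the figures in the paper themselves were obtained with WordSmith Tools rather than with this code.

```python
# Corpus sizes in word tokens, as given in Table 2.
CORPUS_SIZES = {"LK": 83_446, "SK": 112_621, "SCO": 106_562, "BNC": 3_945_881}


def per_million(count, corpus):
    """Normalised frequency: occurrences per million word tokens."""
    return count / CORPUS_SIZES[corpus] * 1_000_000


# 170 occurrences of "a lot" in SK
print(round(per_million(170, "SK")))  # 1509

# Size of the BNC section relative to SK
print(round(CORPUS_SIZES["BNC"] / CORPUS_SIZES["SK"]))  # 35
```

Normalising in this way is what allows a raw count from a corpus of roughly 100 000 tokens to be set beside a count from the near four million token BNC section.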
Four occurrences in LK suggest that the form may be developing: there is one occurrence of a bit of a passion, there are two closely related strings relating to studies from two speakers (I studied a bit of sociology and I studied a bit of infectious disease), and there is what appears to be a reformulation as a speaker produces we had a bit of we had a few complaints, which suggests the speaker was primed to refer to small amounts as a bit of but then became aware of the problematic string we had a bit of complaints and reformulated.

3.2 a little bit

Returning to Table 3, it is interesting to note that the string a lot takes up 15% of all uses of a in both Korean corpora while the corresponding figure is 4% in both British corpora. The variation of the second most frequent 2-gram, a little, in both Korean corpora appears to be less striking, with 8% of SK, 6% of LK and 2% of the British corpora, but these numbers hide a surprising point. The 3-gram a little bit takes up 78% of all a little strings in SK and 79% in LK but just 31% and 32% in SCO and the BNC (see Table 5; p < 0.0001 when a little bit is compared with a little OTHER for SK and the BNC, two-tailed FE). After the highest frequency of 70 occurrences of a little bit in SK the second most frequent a little * string is a little different, but with just three occurrences there is a notable drop in frequency that is shared with LK. It seems reasonable to suggest that speakers of Korean English are primed to use the string a little bit and that this may partially explain the smaller frequencies of the string a bit seen in Table 3. The phrase a little bit appears as an adjective modifier in most cases, with examples ranging from a little bit different and a little bit better to a little bit free and a little bit hard; this could reflect the L1 primings for the item 조금 (chokum).
Table 5: Frequency details for a little and high frequency R1 items in four corpora

3.3 get a job

This final section, before moving on to the an environment, centres on the third most frequent a * 2-gram in SK: a job. It stands out amongst the other top four frequent 2-grams in that it is a psychologically complete unit that does not refer to an amount (cf. a lot, a little, a very, a few, a bit and a good). This should not come as a complete surprise considering that many of the informants were students and were likely to be practising English to improve their job prospects but, as is often the case, the lexical environment that this particular 2-gram carries with it tells a story about the primings of the informants that may mark Korean English as subtly different from other varieties.

Table 6: Frequency and percentage chart for the 3-gram get a job (based on Biber 2009)

In this case it appears that the Korean English speakers are primed to use the string get a job and I will use a technique inspired by Biber (2009) to highlight the ways in which this string is used differently in the corpora. Table 6 shows three main data columns: the first showing data for the frequency of the string get a job compared with the total frequency of * a job strings, the second column compares get a job with all get * job strings and the final column compares get a job with all get a * data. As an example, the highlighted parts of Table 6 show that there are 14 occurrences of get a job out of a total 25 occurrences of all * a job strings in SK and, in the lower part of the chart, that this is equivalent to 56%. Recall that the dispersion column shows the number of files in which the string occurs so get a job, for example, occurs in seven different files in SK and this corresponds with seven different speakers.
This 56% value for SK combined with a figure of 77% for LK shows a clear level of relative fixedness when compared with the equivalent figures in the British corpora: 14% in SCO and 13% in the BNC (p < 0.0001 when get a job is compared with OTHER a job in SK and the BNC, two-tailed FE). Concordance 2 shows the range of contexts in which the 2-gram a job occurs in the BNC (in a sample of data) while the Korean data consist largely of strings such as I want to get a job and any chance to get a job, which could easily be influenced by the fact that the data were collected in a university setting and many informants were involved in looking for a job. The speakers would, however, in terms of traditional grammar, have had the option to say I want a job or any chance of a job in this context, so we are left with a mixed picture. Korean speakers may be primed to use the string get a job more strongly than British speakers in a range of contexts but the data sets are not comparable enough for a strong claim at this point. A study based on a comparison between a corpus of Korean English and a purpose-built, directly comparable corpus of spoken British English would be useful to further explore this area. Note, however, that while the same limitation might be expected to suggest primings for the string got a job, a rather different picture emerges. In the Korean data the item got shows a much weaker attraction to a job when compared to get a job (just 16% of * a job strings are filled with the item got in SK compared with 8% in LK, 7% in SCO and 14% in the BNC). One’s attention is drawn more to the third element when conducting a got a job analysis, with 22% of SK showing the item job in the got a * frame compared with just 8% of LK, and 1% in each of SCO and the BNC, thus highlighting the need for caution when it comes to lemmatisation.
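The percentages in a chart such as Table 6 rest on a simple calculation: for each slot of a 3-gram, the target string is counted against every string that matches it in the other two slots. The following is a minimal sketch of that calculation, assuming a corpus that has already been tokenised into a flat list of lower-case words; the function name slot_fixedness and the toy token list are hypothetical and are not taken from the corpora, and the sketch ignores utterance boundaries for simplicity.

```python
from collections import Counter


def slot_fixedness(tokens, target):
    """For each slot of `target`, return (hits, frame_total, percent).

    `hits` is the frequency of the full target n-gram; `frame_total` is the
    frequency of all n-grams that match the target in every other slot.
    """
    n = len(target)
    ngrams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    hits = ngrams[tuple(target)]
    results = []
    for slot in range(n):
        # Count every n-gram that agrees with the target outside this slot.
        frame_total = sum(
            count for gram, count in ngrams.items()
            if all(gram[j] == target[j] for j in range(n) if j != slot)
        )
        percent = 100 * hits / frame_total if frame_total else 0.0
        results.append((hits, frame_total, percent))
    return results


# Invented toy data, for illustration only.
tokens = "i want to get a job and i got a job but you get a car".split()
print(slot_fixedness(tokens, ("get", "a", "job")))
# [(1, 2, 50.0), (1, 1, 100.0), (1, 2, 50.0)]
```

The three tuples correspond to the * a job, get * job and get a * frames respectively; on the SK figures reported above, the first frame would give 14 out of 25, or 56%.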
Concordance 2: Sample concordance of a job in the BNC

The fully fixed status of the article in the string get a job is clear from the two figures of 100% in Table 6: SK and LK show no variety at all. SCO and the BNC have figures of 67% and 61% respectively, though it should be noted that there are only three occurrences of get * job in SCO; the 33% comes from a single use of get your job back. The variety in the BNC is interesting in that 24% of the get * job frame makes use of the article the, which could reasonably have been expected to appear in the 24 lines of Korean data if speakers shared similar primings (p = 0.004 when get a job is compared with all other get * job strings for SK and the BNC, two-tailed FE). The third column in Table 6 shows some of the greatest differences between the Korean corpora and the British corpora; the item job completes the get a * frame in just 3% of cases in the BNC, for example, compared with a figure more than 10 times larger, 37%, in SK (p < 0.0001, two-tailed FE).

SCO         BNC
1 few       1 bit
2 car       2 lot
2 grant
2 lot
2 bit
2 big
2 job
Table 7: Items occupying R1 position following get a in British corpora

The apparent British primings for the colligation a QUANTIFIER influence these data, with get a few forming the most frequent get a * 3-gram in SCO and the items car, grant, lot, bit and big occurring at the same frequency as job with two occurrences each (see Table 7). The strings get a bit and get a lot are the most frequent get a * strings in the BNC but get a job is clearly the most frequent in both Korean corpora. The Korean corpora have no occurrences of the BNC’s most frequent get a * 3-gram get a bit, though there are two occurrences in SCO (19pm) and 104 occurrences in the BNC (26pm). SCO’s most frequent get a * 3-gram get a few is also completely absent from both Korean corpora (there are three occurrences in the similar sized SCO (28pm) and 40 occurrences in the BNC (10pm)).
This suggests that get a QUANTIFIER strings are used rather differently in Korean English but the potential combined primings for delexicalised verbs (see Chi, Wong and Wong 1994 for a study of learners in Hong Kong for example), articles and quantifiers give a complex picture that would be beyond the scope of this study.

4 The an environment

The lexical environment surrounding the item an in the four corpora is clearly rather different from that of a, as Table 8 shows.

Table 8: Frequency details for an and most frequent R1 collocates in four corpora

There are very few obvious similarities in the most frequent ten items that follow an (though note that the Korean data shown contain a number of single occurrences). The rather low normalised frequencies of an (222pm in SK and 467pm in LK) are quite striking, however, compared with the British corpora (1398pm in SCO and 1303pm in the BNC); ratios of the frequency of a to an are 45-1 in SK, 21-1 in LK, 11-1 in SCO and 15-1 in the BNC, which clearly positions LK between SK and the comparator corpora and may suggest primings are shifting to a British level. LK stands out amongst the four corpora, however, in that the string an hour is not the most frequent an * 2-gram. It is possible that these speakers are primed to say thirty minutes rather than half an hour (the most frequent * an hour string in both SCO and the BNC); this would reflect my experience of teaching time phrases in Korea and, indeed, LK has the highest normalised frequency of the string thirty minutes at 48pm. SK has 18pm, the BNC has just 3pm and there are no occurrences at all in SCO.
Clearly, a Korean preference for the thirty minutes form does not explain why SK differs from LK, but note that SK only has 25 occurrences of an so is particularly susceptible to statistical variation and that the most influential * an hour 3-gram in SK is for an hour rather than half an hour; it is perhaps something of an illusion that SK is more closely aligned with the British corpora.

Concordance 3: Complete concordance of an * student in LK

The item hour is not only absent from the top of the an * frequency list in LK, it ranks sixth below international, English, exchange, interview and essay. These items suggest a possible priming based on the concept of international student activities, which would be understandable considering the demographics of the informants. Such an explanation should not take away from the fact that this is a difference between SK and LK (potentially) based on recent primings (note, however, that the difference between an hour and other an * strings in SK and LK is not statistically significant with p=0.199, two-tailed FE). The an * student frame shown in Concordance 3 is, in fact, more frequent than the most frequent 2-gram an international in LK and lends support to this idea.

5 The the environment

In this final analysis-based part of the paper I will be discussing areas of the the environment as it is shown in Table 9. The frequencies of the most frequent the * 2-grams are much lower than a * 2-grams, particularly in the Korean corpora, but it is notable that there is a clear division between the Korean data and the British data: the Korean corpora make greatest use of the string the first while the British corpora make greater use of the string the other. For this reason, as well as the fact that both strings are available for the formation of longer noun phrases, I will take these forms as my starting point.
(Recall that I am cautious about simply assuming that structures such as noun phrases have an important role in corpus-driven studies but Korean learners are taught phrase structure from an early age so it is reasonable to think about the priming effects of such an education.)

Table 9: Frequency details for the and most frequent R1 collocates in four corpora

5.1 the first time

The first is the most frequent the * 2-gram in the Korean data with normalised frequencies of 611pm and 462pm in LK and SK respectively; by comparison the normalised frequencies for SCO and the BNC are 197pm and 270pm. The 3-gram the first time is the most frequent the first * 3-gram in each of the corpora so I selected this string to produce the frequency chart shown in Table 10.

Table 10: Frequency and percentage chart for the 3-gram the first time (based on Biber 2009)

Compared to the chart shown in Table 6, and other charts in Hadikin (2011a) and Hadikin (2011b), Table 10 is quite unusual because the Korean corpora show the most flexibility in the first slot while the British corpora appear somewhat fixed.

Concordance 4: Sample concordance of first time in SK

Concordance 4 shows a sample concordance from SK where the following strings can be seen:

1 when I look back first time
2 when I met him first time
3 in Korea at first time

This suggests that many of the speakers are weakly, if at all, primed to insert the item the before the 2-gram and may be primed to use the string first time in similar ways to the way a British speaker simply uses first or at first (p < 0.0001 when the first time is compared with other * first time strings in SK and the BNC, two-tailed FE). Indeed, LK has seven occurrences of at first time, which highlights this potential example of mixed primings (there are three occurrences in SK and none in either SCO or the BNC).
The second column of Table 10 returns to a more familiar pattern of the string in the Korean data appearing more fixed, with 100% of the * time strings taking the form the first time in LK, 58% in SK but just 25% in both SCO and the BNC. It is a curious point that to a large extent (12/34 occurrences or 35% of all the * time occurrences) the variation comes from the use of the same time in SK, which is completely absent in LK; there is a single occurrence of same time as a 2-gram. With 5/16 occurrences (31%) SCO actually has the only time as its most frequent the * time phrase and the BNC has the first time as its most frequent (138/724 or 19%) followed by the same time (129/724 or 18%) and the next time (45/724 or 6%). The percentage of the first * that takes the form the first time shows that there is a certain amount of flexibility in all four corpora; LK shows the most fixedness but with just 45% of the concordance in the form the first time there is actually a large set of other strings such as the first thing, the first floor and the first one. It seems that the numbers are most notably affected by the high normalised frequencies of the first time in the Korean corpora: 178pm in SK and 276pm in LK compared to 38pm in SCO and 45pm in the BNC. This appears to be the result of a more general use that compares and overlaps with the meaning of at first in British English (p=0.55 when the first time is compared with other the first * strings in LK and SK but 0.0003 between SK and the BNC, two-tailed FE).

5.2 the other

This most frequent the * 2-gram in both SCO and the BNC does not lend itself as readily to analysis by a frequency/percentage chart as it tends to form quite different longer strings depending on whether one is looking at the Korean corpora or the British. In both SK and LK it has a tendency to form and the other but in SCO one mostly finds the other side and in the BNC the most notable 3-gram is the other one.
SK stands alone in its high relative use of and the other compared to the total number of * the other strings, with 14/38 (37%) compared to 5/35 (14%) in LK, 2/48 (4%) in SCO and 202/2918 (7%) in the BNC (p = 0.035 when SK is compared with LK, two-tailed FE).
Concordance 5: Complete concordance of and the other in SK (above) and LK (below)
The use of and the other shown in Concordance 5 indicates that the string is being used mostly for comparisons or for the addition of a new point in the conversation in both Korean corpora. The colligation and the other thing BE is noteworthy: it is present in both corpora shown in Concordance 5 but absent from SCO, and it occurs only six times in the nearly four-million-word section of the BNC (cf. four times in the much smaller SK), always in the same form and the other thing is. The Korean informants appear to be more strongly primed to use this string/colligation to add information without necessarily having specified earlier in the discourse that they were going to make two or more points, as one might expect (p = 0.002 when and the other thing is compared with other and the other * strings in SK and the BNC, two-tailed FE).
The most frequent * the other/the other * string in SCO is the other side, which makes it appear somewhat different from both the Korean corpora and the BNC. 11/48 (23%) occurrences of the other form this 3-gram in SCO and are mostly used to refer to physical or geographic areas (the other side of the Liverbuilding, the other side of Wigan etc.), though it is important to note that four of the occurrences were produced by the researcher himself. There are no occurrences of this string in the Korean corpora, which may suggest different primings; the corresponding figure for the other side in the BNC is 279/2919 (10%), and it appears to be used with a similar function of describing physical locations.
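As a hedged illustration of how the proportions and the two-tailed FE value quoted above for and the other in SK and LK (14/38 against 5/35) could be checked, the following Python sketch computes the two percentages and a two-tailed Fisher's exact test from the raw counts. It assumes that FE stands for Fisher's exact test and that the two-tailed value sums all tables no more probable than the observed one; the function names share and fisher_exact_two_tailed are illustrative and are not taken from the software used in the study.

```python
from math import comb


def share(count, total):
    """Percentage of a string's occurrences that take a given form."""
    return 100 * count / total


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumption: the two-tailed p-value is the summed probability of every
    table with the same margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(k):
        # Unnormalised hypergeometric weight of a table whose top-left cell is k.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest = max(0, col1 - row2)   # smallest possible top-left cell
    highest = min(row1, col1)      # largest possible top-left cell
    # Integer arithmetic keeps the "no more likely than observed" check exact.
    extreme = sum(w for w in map(weight, range(lowest, highest + 1)) if w <= observed)
    return extreme / comb(n, col1)


# Counts quoted in the text: 'and the other' against all other '* the other' strings.
sk_hits, sk_total = 14, 38
lk_hits, lk_total = 5, 35

p = fisher_exact_two_tailed(sk_hits, sk_total - sk_hits, lk_hits, lk_total - lk_hits)
print(f"SK {share(sk_hits, sk_total):.0f}%, LK {share(lk_hits, lk_total):.0f}%, p = {p:.3f}")
```

Under these assumptions the sketch gives shares of 37% and 14% and a p-value of roughly 0.035, in line with the figures reported above. The test is computed with integer binomial coefficients rather than a statistics library so that the example stays self-contained.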
The most frequent 3-gram in the BNC’s the other environment is the other one, with 480/2919 (16%) of all the other * strings taking this form compared with 3/48 (6%) in SCO, 3/38 (8%) in SK and 1/35 (3%) in LK. The right side of the string the other in the Korean corpora appears to be more flexible than in the British corpora; recall that 23% of the other * in SCO takes the form the other side and 16% of the other * in the BNC takes the form the other one. The most frequent the other * in SK is the other thing with 4/38 (11%) of occurrences and the most frequent the other * 3-gram in LK is the other countries with 3/35 (9%) in this form. Despite this apparent R1 flexibility, time expressions such as the other day, the other week and the other night are conspicuously absent from the 38 lines of the other in SK, whereas these forms make up 23% of the BNC’s the other * occurrences, 21% of SCO’s and 9% of LK’s (a single occurrence in SK would have represented approximately 3%). This suggests that the informants in SK are weakly primed or, possibly, not primed to use the other in time expressions but that the LK informants may have begun a shift to British primings during their time in the UK (Concordance 6 illustrates the lack of time expressions in SK; p=0.105 when LK is compared with SK for time expressions but p < 0.0001 when SK is compared with the BNC).
Concordance 6: Sample concordance of the other in SK showing R1 flexibility but a lack of time expressions
6 Conclusion
In this paper I have tried to show some of the variety as well as the consistency of the English spoken by Korean adults by discussing three lexical environments: the phraseology and lexis surrounding the items a, an and the in two small corpora of Korean English and, as comparator corpora, a small ‘scouse’ corpus and the spoken demographic section of the BNC.
Some of the similarities between the Seoul Korean corpus (SK) and the Liverpool Korean corpus (LK) include the following:
• The percentage of a * that takes the form a lot is consistent at 15% and this is notably higher than in the British corpora.
• The percentage of a little * that takes the form a little bit is very similar at 78% (SK) and 79% (LK); this is also much higher than in the British corpora.
• The string get a job shows a comparable level of fixedness in the Korean corpora, with 100% of get * job taking the form get a job, for example, compared with 67% and 61% in SCO and the BNC.
• The item an occurs with very low normalised frequency in the Korean corpora: 222pm and 467pm in SK and LK compared with 1398pm and 1303pm in SCO and the BNC.
• The string the other has a strong tendency to form the 3-gram and the other in the Korean corpora compared to SCO and the BNC, which tend to form R1-based the other * strings.
Differences between SK and LK include the following:
• LK has a greater frequency of quite a lot than SK despite being approximately 25% smaller.
• LK stands out amongst the four corpora because it does not have an hour as the highest frequency an * 2-gram.
• LK has a higher frequency of at first time than SK.
• LK uses the other to form time expressions such as the other day and the other night but there are no occurrences in SK.
This kind of variation suggests that while priming effects may separate LK from SK in certain areas, other strings are being used with great consistency. Similarities such as the high-frequency use of a lot, for example, are arguably influenced by a comparable L1 form and consistent translation across pedagogic materials: Korean learners may be using a lot with a comparable frequency to the Korean equivalent 많은 (manun), and their primings are then reinforced by high exposure to a lot when reading Korean English texts. As language researchers, however, we might then ask how and why the speakers are primed to use this English form as part of longer utterances.
The SK speakers appear primed to produce there are a lot of while LK speakers appear more likely to say quite a lot. This arguably reflects a recent change to a British priming as the LK speakers rely less on there are a lot of (possibly stored as a formulaic chunk for many speakers and an exact translation of its Korean equivalent) and begin to include the hedging term quite that they will have been exposed to in the UK. It is also interesting to note that a colligation appears to have crossed over from the L1: for each article under consideration in this paper the normalised frequencies are lower in the Korean data than in the British data. The article a occurs in SK at a normalised rate of 10 176pm but at 19 621pm in the BNC, for example, and this pattern is consistent across the data for an and the, suggesting that noun phrases are weakly primed for colligation with articles in Korean English. In at least one case British-like primings appear to have come together to create a uniquely Korean result: the primings for at first and first time appear quite unexceptional but then combine to give a high frequency of at first time, with its own function of referring to the first time something is done or experienced.
Lexical Priming is arguably unique in that its focus on the primings of each word (which is actually shorthand for the primings of the language user’s or users’ use of that word) allows for a detailed consideration of how and why a string appears to be changing form. I hope that this paper has highlighted some of the areas of spoken language which might be expected to show priming effects, whereas until now any suggestions would have been merely theoretical (Hoey 2005 was largely based on written texts).
The issues discussed here may also suggest which areas of spoken English are the first to change when individuals move into a new geographical area and which parts of one’s idiolect are relatively fixed or slower to change; this could have exciting implications for both pedagogy and language evolution. Many of these changes of language forms could, of course, be discussed without reference to Lexical Priming, but the alternative model would need to account for communities and individual speakers, on both a psychological and a sociological level, changing their collocational behaviour in a short space of time. Linguistic models such as those proposed in Sinclair (1991) and Wray (2002) are suitable but are also generally compatible with Lexical Priming, as discussed in Hadikin (2011a) and Hadikin (2011b). Wray (2002), however, argues that learners tend to break down chunks into their parts based on meaning, thus leaving a fuzzy picture when it comes to function words; it is not clear how the details of Wray’s model could explain how LK speakers have manipulated the strings at first and first time to create a new form, for example.
The work also raises research questions such as ‘how and why do individuals vary in their use of a lot?’ and pedagogic questions such as how alternatives could be taught or, indeed, whether it is actually beneficial to try to reproduce the primings of native speakers. The need for very carefully chosen and prepared comparator corpora is a further issue raised, because the differences between interview-type data in the Korean corpora and the wider contexts recorded in the comparator corpora will interfere with and exaggerate any ‘true’ differences between the language varieties.
There are, however, very few papers published about Korean spoken English, so I hope this one can add to the developing picture of corpus-based language variation and act as a starting point for further research. I also hope it provides Korean learners, language users and teachers with what some may see as points to notice (in the sense of Schmidt 1990), such as an overall tendency to drop articles; to others, these are simply differences between two equally valid World Englishes, and in many cases the speaker’s meaning would be unhindered.
References
Biber, D. 2009. A corpus-driven approach to formulaic language in English: multi-word patterns in speech and writing. Presentation given at Corpus Linguistics 2009 on 23rd July 2009.
Chi, A.M., Wong, K.P. and Wong, M.C. 1994. ‘Collocational problems amongst ESL learners: A corpus-based study’ in L. Flowerdew and A.K.K. Tong, Entering text. Hong Kong: Language Centre, Hong Kong University of Science and Technology, and Department of English, Guangzhou Institute of Foreign Languages, pp. 157-165.
Chuang, F. and Nesi, H. 2006. ‘An analysis of formal errors in a corpus of L2 English produced by Chinese students’, Corpora 1 (2), pp. 251-271.
Crystal, D. 2003. The Cambridge Encyclopaedia of the English Language (2nd Edition). Cambridge: Cambridge University Press.
Gries, S. 2009. Quantitative Corpus Linguistics with R. London: Routledge.
Hadikin, G. S. 2011a. Corpus, Concordance, Koreans: a corpus-driven study of an emerging New English. Manuscript submitted for publication.
Hadikin, G. S. 2011b. Corpus, Concordance, Koreans: a comparison of the spoken English of two Korean communities. Unpublished PhD thesis, University of Liverpool.
Hoey, M. 2005. Lexical Priming: a New Theory of Words and Language. London: Routledge.
Hoey, M. 2012. Priming hypotheses. Retrieved from http://lexicalpriming.org/priminghypotheses/.
Ionin, T., Baek, S., Kim, E., Ko, H. and Wexler, K. 2012. ‘That’s not so different from the: definite and demonstrative descriptions in second language acquisition’, Second Language Research 28, pp. 69-101.
Kachru, B. and Nelson, C. 2006. World Englishes in Asian contexts. Hong Kong: Hong Kong University Press.
Lee, I. and Ramsey, R. 2000. The Korean Language. Albany, NY: SUNY Press.
Lee, J. 2001. ‘Korean speakers’ in Swan, M. and Smith, B. (eds.) Learner English: a teacher’s guide to interference and other problems. Cambridge: Cambridge University Press.
McArthur, T. 1987. ‘The English Languages?’, English Today 11, pp. 9-13.
Mehlsen, C. 2011. The Rise of Multiculturalism in Korea. Retrieved from http://www.dpu.dk/fileadmin/www.dpu.dk/ialeimagazine/multiculturaleducation/IALEI_Magazine_18-20.pdf.
Pace-Sigge, M. T. L. 2010. Evidence of lexical priming in spoken Liverpool English. Unpublished PhD thesis, University of Liverpool.
Park, J. S. 2009. The Local Construction of a Global Language: Ideologies of English in South Korea. Berlin: Mouton de Gruyter.
Porter, C. 2011. ‘Review of ‘The Local Construction of a Global Language: Ideologies of English in South Korea’’, TESL-EJ 14 (4).
Schmidt, R. 1990. ‘The role of consciousness in second language learning’, Applied Linguistics 11, pp. 17-46.
Scott, M. 2011. Wordsmith tools. Retrieved from http://www.lexically.net/wordsmith/index.html.
Shim, R. J. 1999. ‘Codified Korean English’, World Englishes 18 (2), pp. 247-259.
Sinclair, J.McH. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Tollefson, J. 2002. Language Policies in Education: Critical Issues. London: Routledge.
Wray, A. 2002. Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.
Yeni-Komshian, G., Flege, J. and Liu, S. 2000. ‘Pronunciation Proficiency in the First and Second Languages of Korean-English Bilinguals’, Bilingualism: Language and Cognition 3 (2), pp. 131-49.
What is the BNC? 2012. Retrieved from http://www.natcorp.ox.ac.uk/corpus/index.xml?ID=intro.