Discovering corpora:

British National Corpus (BNC)
  o Simple Search http://www.natcorp.ox.ac.uk/ (on the home page, click “more” to find the number of words and the date of the corpus)
  o VIEW interface http://view.byu.edu/
Business Letter Corpus http://www.someya-net.com/concordancer/
British National Corpus (BNC)
  o VIEW interface http://view.byu.edu/ (under “sections”, choose “spoken”)
COMPARA http://193.136.2.104/COMPARA/psimples.php?language=en
  Click the menu on the left to learn more about the texts included in the corpus.
EuroParl http://opus.lingfil.uu.se/cwb/Europarl7/frames-cqp.html
  Type the word in quotation marks and click “run query”. On the home page, click “home” to find the number of words and the date of the corpus.

Word type          Search word    BNC    BNC spoken    EuroParl EN    COMPARA EN **    Business Letter Corpus
obsolete           counterpane
new                MP3
common             with
rare               epicure
typically spoken   yeah
literary           amiable
technical          pelagic
regional           lass
sentimental        darling
religious          rosary
political          coalition
foreign            rapporteur

a. Which is the only corpus that has “counterpane”? Does it surprise you? Think of some more old-fashioned words and check in which corpora they appear.
b. Which is the only corpus with “MP3”? Do the dates of the texts included in the corpus give you any explanation of why this might be so?
c. Why do you think “with” appears in all five (sub-)corpora? Why do you think it is more frequent in some than in others?
d. Which is the only corpus that has “epicure”? Think of some more rare words and check in which corpora they appear. How big do you think a corpus has to be for you to find rare words in it?
e. Which two corpora do not have “yeah” in them? Why do you think it does not appear?
f. In which two corpora is “amiable” more frequent? Can you think of an explanation for this?
g. Which two corpora have the word “pelagic”? Why is a technical word like this unlikely to be found in the other three (sub-)corpora?
h. Which two corpora have the word “lass”? In which two corpora are regionally marked words like this least likely to be found?
i. Which corpus does not have “darling”, and which corpus has only one occurrence of this word? Why do you think this is so?
j. Which two corpora have the word “rosary”? In which of them is the word comparatively more frequent? Why could this be?
k. “Coalition” appears in all five corpora. In which is it comparatively more frequent? Why?
l. The foreign word “rapporteur” does not appear in three of the English-language corpora, but in one it is exceptionally frequent. Why?

Read the description of the Business Letter Corpus. Given this information, decide which of the words and expressions below are likely to be very frequent in the corpus, and which are unlikely to be found in it. When you finish, use the corpus to test your predictions.

Cheerio
Thank you for
I am pleased to
very funny
I love you
We regret
looking forward to
Who’s there?
soup
Yours sincerely

a. Which of the above words and expressions is the most frequent in the corpus?
b. Which four search terms cannot be found in the corpus?
c. Were all your predictions right? If not, which results surprised you? Why?

Look for “work” and then for “works”, “working”, “worked” in the BNC online. What can you say about the query results?

Look up the following strings of words in the BNC and write down their frequencies. What can you conclude from your results?

It
It was
It was okay
It was okay as
It was okay as far
It was okay as far as
It was okay as far as I
It was okay as far as I could
It was okay as far as I could see
was okay as far as I could see
okay as far as I could see
as far as I could see
far as I could see
as I could see
I could see
could see
see

Below are sequential three-word clusters taken from the sentence “As a rule of thumb you need a litre of paint to every 12 square metres of wall.” Which clusters are likely to turn up in the BNC? Which are unlikely to be found?
Can you guess which will be the most frequent? Test your predictions and then discuss your results.

as a rule
a rule of
rule of thumb
of thumb you
thumb you need
you need a
need a litre
a litre of
litre of paint
of paint to
paint to every
to every 12
every 12 square
12 square metres
square metres of
metres of wall

Look up the following pairs in the BNC: *payed / paid, *pronounciation / pronunciation, *accomodation / accommodation.

Using the BYU online interface of the BNC, look up a word like “honestly” in the sub-corpus of speech and then in the sub-corpus of fiction. What statistical conclusions can you draw?

Look up a word like “congratulations” followed by a preposition in the Collins Wordbanks Online corpus. Which prepositions can follow “congratulations”? Can they be used interchangeably?
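
The query lists for the two string exercises (the three-word clusters and the growing and shrinking “It was okay ...” strings) follow a regular pattern, so they can be generated rather than typed by hand. Here is a minimal Python sketch of that pattern. The function names word_clusters and growing_and_shrinking are illustrative only and do not belong to any corpus tool. The script only builds the query strings; each one still has to be entered in the BNC interface manually.

```python
def word_clusters(sentence, n=3):
    """Return every run of n consecutive words in the sentence, in order."""
    words = sentence.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def growing_and_shrinking(sentence):
    """Return the strings that grow by one word from the left edge,
    followed by the strings that shrink by one word from the left edge."""
    words = sentence.split()
    prefixes = [" ".join(words[:i]) for i in range(1, len(words) + 1)]
    suffixes = [" ".join(words[i:]) for i in range(1, len(words))]
    return prefixes + suffixes


# Sentence from the worksheet, without the final full stop.
rule = ("As a rule of thumb you need a litre of paint "
        "to every 12 square metres of wall")

# Lower-cased so that the first cluster reads "as a rule", as on the worksheet.
clusters = word_clusters(rule.lower())
for query in clusters:
    print(query)

strings = growing_and_shrinking("It was okay as far as I could see")
for query in strings:
    print(query)
```

Changing n in word_clusters gives two-word or four-word clusters, which can be useful for checking how quickly frequencies drop as the searched string gets longer.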