LIN3098 Corpus Linguistics
Practical Task VIII
Albert Gatt

1. Introduction

This tutorial will focus on the use of corpora for comparing the distributions of syntactic constructions across genres and registers.

Note: If you do not finish this tutorial within the allotted time, you are strongly encouraged to continue on your own and submit your work for correction and feedback.

2. The corpus

We will be using the SketchEngine interface to the British National Corpus (BNC). Log in to the SketchEngine at http://the.sketchengine.co.uk/auth . You will be provided with a user name and password.

Do the following:

1. Once you’re logged in, select the British National Corpus from the list of available corpora.
2. Click on the Concordance tab in the top menu. From the left menu, click Query Type. Select CQL.
3. For the following exercise, you will find it handy to have open the web page that lists the POS tags for the BNC. There is a direct link to it from the SketchEngine’s BNC Concordance interface (the link says Tagset Summary and is found beneath the CQL field).

3. Comparing the distribution of clauses across text types

We will focus on extraposed to-clauses such as It’s hard to see what’s wrong with this construction. The extraposed variety that we will concentrate on consists of the following parts:

a. the pronoun it
b. the verb be in one of its forms (is, was, etc.)
c. an adjective or a verb in the past participle form (e.g. held)
d. the to-complementiser

We will be interested in whether the distributions of extraposed to-clauses (i.e. the first variety) differ across genres and registers. In a study of these kinds of constructions, Biber, Conrad and Reppen (1998) found that extraposed to-clauses are more frequent in academic writing. They argue that this is because of the following two characteristics of extraposed clauses:

a. They present a proposition “anonymously” (i.e. without attributing it to someone or something).
b. They use adjectival or participial predicates, which are “static”, in the sense that they denote conditions rather than events.

For example, it is hard to resolve this equation is much less personal and much more “static” than We found it hard to resolve this equation.

3.1 Constructing queries

1. Construct a CQL query for extraposed to-clauses and write it down below (NB: the BNC has a special tag for infinitival to, namely TO0).

   Extraposed to-clause query: ____________________________________________________

2. Run the CQL query separately using the Concordance interface to the BNC. Make sure you’ve selected CQL as your query type. Eyeball the results. If your query overgenerates (returns matches which are not what you need), try to refine it.

   a. Populate the table below with examples of the clause. Do your results confirm Biber et al.’s intuitions that these tend to be non-dynamic and generally do not contain direct attribution to a person?

      Extraposed Clause Examples
      ____________________________________________________
      ____________________________________________________
      ____________________________________________________

   b. Compare the frequency of the clause in the two registers you’re investigating. The SketchEngine allows you to do this: in the left menu, under Frequencies, select text type.

   c. Study the frequency results. For more directly comparable results, you need to look at proportions rather than absolute frequencies (see the column headed Rel [%]). Notice especially the sections entitled text type (whether written or spoken) and domain (roughly: the subject). Answer the following:

      1. Are extraposed clauses more frequent in written text than in spoken?
      2. Now, look at the various sub-domains of written text. What further conclusions can you draw?
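
To see why step c asks for proportions rather than absolute frequencies, the following minimal Python sketch normalises raw hit counts by the size of each subcorpus. It is only an illustration: the counts and subcorpus sizes are invented (they are not BNC figures), the function name per_million is hypothetical, and a per-million-token rate is one common way of normalising, not necessarily the formula behind the SketchEngine's Rel [%] column.

```python
def per_million(hits: int, subcorpus_tokens: int) -> float:
    """Return the number of hits per million tokens of a subcorpus."""
    if subcorpus_tokens <= 0:
        raise ValueError("subcorpus_tokens must be positive")
    return hits * 1_000_000 / subcorpus_tokens


# Hypothetical counts, invented purely for illustration (NOT real BNC figures).
counts = {
    "written": {"hits": 9_000, "tokens": 90_000_000},
    "spoken": {"hits": 500, "tokens": 10_000_000},
}

# Normalise each register's raw count by the size of that register's subcorpus.
rates = {
    register: per_million(c["hits"], c["tokens"])
    for register, c in counts.items()
}

for register, rate in rates.items():
    print(f"{register}: {rate:.1f} hits per million tokens")

# Compare the raw counts and the normalised rates.
raw_ratio = counts["written"]["hits"] / counts["spoken"]["hits"]
rate_ratio = rates["written"] / rates["spoken"]
print(f"raw count ratio (written/spoken): {raw_ratio:.1f}")
print(f"normalised ratio (written/spoken): {rate_ratio:.1f}")
```

With these made-up numbers the raw counts differ far more than the normalised rates do, simply because the hypothetical written subcorpus is much larger than the spoken one. The same reasoning applies when you compare the sub-domains of written text, which also differ in size.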