LIN3098 Corpus Linguistics

Practical Task VIII
Albert Gatt
1. Introduction
This tutorial will focus on the use of corpora for comparing the distributions of
syntactic constructions across genres and registers.
Note: If you do not finish this tutorial within the allotted time, you are strongly
encouraged to continue on your own and submit your work for correction and
feedback.
2. The corpus
We will be using the SketchEngine interface to the British National Corpus. Log in to
the SketchEngine at http://the.sketchengine.co.uk/auth . You will be provided with a
user name and password.
Do the following:
1. Once you’re logged in, select the British National Corpus from among the list
of available corpora.
2. Click on the Concordance tab in the top menu. From the left menu, click
Query Type. Select CQL.
3. For the following exercise, you will find it handy to have open the web page
that lists the POS tags for the BNC. You have a direct link to it from the
SketchEngine’s BNC Concordance interface (the link says Tagset Summary
and is found beneath the CQL field).
3. Comparing the distribution of clauses across text types
We will focus on extraposed to-clauses such as It’s hard to see what’s wrong with this
construction. The extraposed variety that we will concentrate on consists of the
following parts:
a. the pronoun it
b. the verb be in one of its forms (is, was, etc.)
c. an adjective or a verb in the past participle form (e.g. held)
d. the to- complementiser
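To make the four-part pattern concrete, here is a minimal Python sketch that looks for it in a list of (word, tag) pairs. It is not CQL and is not a substitute for the query you will build in section 3.1. It assumes the BNC's CLAWS5 tags (PNP for personal pronouns, VB* for forms of be, AJ0/AJC/AJS for adjectives, VVN for past participles, TO0 for infinitival to). The function name and the toy sentence are made up for illustration, and the sentence was tagged by hand.

```python
import re

# Assumed CLAWS5 (BNC) tags: VBB, VBD, VBG, VBI, VBN, VBZ are forms of "be";
# AJ0, AJC, AJS are adjectives; VVN is the past participle of a lexical verb.
BE_TAG = re.compile(r"VB[BDGINZ]")
PREDICATE_TAG = re.compile(r"AJ[0CS]|VVN")


def find_extraposed_to(tagged):
    """Return start indices where four adjacent tokens match:
    it + form of be + adjective/past participle + infinitival to."""
    hits = []
    for i in range(len(tagged) - 3):
        (w1, t1), (_, t2), (_, t3), (_, t4) = tagged[i:i + 4]
        if (w1.lower() == "it" and t1 == "PNP"
                and BE_TAG.fullmatch(t2)
                and PREDICATE_TAG.fullmatch(t3)
                and t4 == "TO0"):
            hits.append(i)
    return hits


# Toy sentence, tagged by hand for illustration (not taken from the BNC).
sentence = [("It", "PNP"), ("is", "VBZ"), ("hard", "AJ0"), ("to", "TO0"),
            ("resolve", "VVI"), ("this", "DT0"), ("equation", "NN1")]
print(find_extraposed_to(sentence))
```

Because the sketch requires the four parts to be strictly adjacent, it would miss cases such as it is very hard to see, where an adverb intervenes. Keep the same trade-off between precision and coverage in mind when you refine your own query.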
We will be interested in whether the distribution of extraposed to-clauses (i.e. the
variety described above) differs across genres and registers.
In a study of these kinds of constructions, Biber, Conrad and Reppen (1998) found
that extraposed to-clauses are more frequent in academic writing. They argue that this
is because of the following two characteristics of extraposed clauses:
a. They present a proposition “anonymously” (i.e. without attributing it to
someone or something).
b. They use adjectival or participial predicates, which are “static”, in the sense
that they denote conditions rather than events.
For example, it is hard to resolve this equation is much less personal and much more
“static” than We found it hard to resolve this equation.
3.1 Constructing queries
1. Construct a CQL query for extraposed to-clauses and write it down below
(NB: the BNC has a special tag for infinitival to, namely TO0)
Extraposed to-clause query:
____________________________________________________
2. Run the CQL query separately using the Concordance interface to the BNC.
Make sure you’ve selected CQL as your query type.
Eyeball the results. If your query overgenerates (returns matches which are not
what you need), try and refine it.
a. Populate the table below with examples of the clause. Do your results
confirm Biber et al.’s intuitions that these tend to be non-dynamic and
generally do not contain direct attribution to a person?
Extraposed Clause Examples
b. Compare the frequency of the clause in the two registers you’re
investigating. The SketchEngine allows you to do this: in the left menu,
under Frequencies, select text type.
c. Study the frequency results. For more directly comparable
results, you need to look at proportions rather than absolute
frequencies (see the column headed Rel [%]). Notice especially the
sections entitled text type (whether written or spoken) and domain
(roughly: the subject).
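The reason for preferring proportions can be sketched in a few lines of Python. The written part of the BNC is roughly nine times the size of the spoken part, so raw hit counts are not directly comparable. The component sizes below are approximations, the hit counts are invented for illustration, and the per-million normalisation is a generic one that is not necessarily how the SketchEngine computes its Rel [%] column.

```python
def per_million(hits, corpus_size):
    """Normalised frequency: occurrences per million tokens."""
    return hits * 1_000_000 / corpus_size


# Approximate sizes of the two BNC components (about 90% written, 10% spoken).
sizes = {"written": 90_000_000, "spoken": 10_000_000}
# Invented hit counts, for illustration only; substitute your own query results.
hits = {"written": 18_000, "spoken": 1_000}

rates = {reg: per_million(hits[reg], sizes[reg]) for reg in sizes}
for reg, rate in rates.items():
    print(f"{reg}: {hits[reg]} hits, {rate:.1f} per million words")

print(f"raw count ratio: {hits['written'] / hits['spoken']:.1f}")
print(f"normalised ratio: {rates['written'] / rates['spoken']:.1f}")
```

With these invented figures, the written component has eighteen times as many hits as the spoken one, but once corpus size is taken into account the construction is only twice as frequent in writing.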
Answer the following:
1. Are extraposed clauses more frequent in written text than in spoken?
2. Now, look at the various sub-domains of written text. What further
conclusions can you draw?