TESOL 2013 POWERPOINT: Formulaic Language

Formulaic Language
in Academic Study
Norbert Schmitt
Single Words vs. Multi-word Units
• Most discussion of vocabulary (including
academic vocabulary) has been
conceptualized in terms of single words or
word families
How Much Vocabulary is Needed in English?
• Nation (CMLR, 2006)
– 6,000 - 7,000 word families for spoken discourse
– 8,000 - 9,000 word families for written discourse
Frequency and Coverage
Levels             Approximate written coverage (%)   Approximate spoken coverage (%)
1st 1,000          78–81                              81–84
2nd 1,000          8–9                                5–6
3rd 1,000          3–5                                2–3
4th–5th 1,000      3                                  1.5–3
6th–9th 1,000      2                                  0.75–1
10th–14th 1,000    <1                                 0.5
Proper nouns       2–4                                1–1.5
Not in the lists   1–3                                1
Nation (2006)
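The coverage figures can be added up band by band to see roughly how much text a learner covers at a given vocabulary size. The following is a minimal sketch, assuming the ranges in the table are simply summed at their low and high ends; the function name and data layout are illustrative and do not come from Nation (2006). Because the source figures are approximate, the sums are loose bounds and not exact coverage values.

```python
# Coverage ranges (%) copied from the table above; each entry is (low, high).
WRITTEN = {
    "1st 1,000": (78, 81),
    "2nd 1,000": (8, 9),
    "3rd 1,000": (3, 5),
    "4th-5th 1,000": (3, 3),
    "6th-9th 1,000": (2, 2),
}
SPOKEN = {
    "1st 1,000": (81, 84),
    "2nd 1,000": (5, 6),
    "3rd 1,000": (2, 3),
    "4th-5th 1,000": (1.5, 3),
    "6th-9th 1,000": (0.75, 1),
}


def cumulative_coverage(bands, upto):
    """Sum the low and high ends of each band's range, in order, through `upto`."""
    low = high = 0
    for name, (band_low, band_high) in bands.items():
        low += band_low
        high += band_high
        if name == upto:
            return low, high
    raise KeyError(upto)


# Loose cumulative bounds for written text at 3,000 and 9,000 word families.
print(cumulative_coverage(WRITTEN, "3rd 1,000"))
print(cumulative_coverage(WRITTEN, "6th-9th 1,000"))
```

Proper nouns and the lower-frequency bands are left out of the sketch, so the bounds describe the listed bands only.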
AWL (Coxhead, TQ 2000)
capacity
assistance
abstract
brief
focus
hierarchy
hypothesis
incentive
minimum
diverse
cooperate
funding
enormous
investigation
circumstance
offset
rational
publication
evidence
maintain
invoke
integrity
reverse
manual
sum
scope
entity
item
purchase
revise
spherical
successive
release
Academic Vocabulary
• Successive comes with its own typical
phraseology
• What words collocate with successive?
COCA Results
• each successive
• successive generations
• successive governments
• successive administrations
• successive waves
• successive layers
• successive stages
Typical Collocations
• Each successive president chose entanglements
and evasion over transparency, legality, and
independence.
• Turning schools around could help save
successive generations of kids who quit and
often end up jobless.
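A collocation search of this kind can be approximated on any plain text by counting the words that appear next to a node word. The sketch below is a simplified illustration and not the COCA interface: the function name `neighbours` is hypothetical, tokenization is a bare regular expression, and sentence boundaries are ignored. The sample text is the two example sentences above.

```python
import re
from collections import Counter


def neighbours(text, node, offset):
    """Count the words found `offset` positions from each occurrence of `node`.

    offset=1 gives the next word, offset=-1 the previous word.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        j = i + offset
        if token == node and 0 <= j < len(tokens):
            counts[tokens[j]] += 1
    return counts


sample = (
    "Each successive president chose entanglements and evasion over "
    "transparency, legality, and independence. Turning schools around could "
    "help save successive generations of kids who quit and often end up jobless."
)

right = neighbours(sample, "successive", 1)   # words following the node
left = neighbours(sample, "successive", -1)   # words preceding the node
print(right.most_common())
print(left.most_common())
```

On a real corpus, raw counts like these would normally be combined with an association measure, since very frequent words turn up next to almost everything.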
Phraseology in Language
• There is a great deal of recurrent
phraseology in language (including academic
language)
• This ‘formulaic language’ is crucial for
accurate, appropriate, and fluent language
use
What is Formulaic Language?
• Recurrent multi-word lexical items that have a single
meaning or function (Schmitt, 2010)
• It is an umbrella term for a number of formulaic
categories
– Idioms
– Collocations
– Phrasal verbs
– Lexical bundles
– Lexical phrases
– Phrasal expressions
– etc.
What is Formulaic Language?
• multi-word units, multiword chunks, fixed expressions,
frozen phrases, phrasal vocabulary, routine formulas,
chunks, prefabricated routines …
• Individual phrasal items will be referred to as formulaic
sequences
Why is Formulaic Language Important?
• Formulaic language is one of the most
important components of language
overall
• The reasons for this are numerous:
Why is Formulaic Language Important?
• Formulaic language is ubiquitous in
language use
• Meanings and functions are often
realized by formulaic language
• Formulaic language is necessary for
appropriate functional language use
• Formulaic language has processing
advantages
Why is Formulaic Language Important?
• Formulaic language is an important
component of language acquisition
• Formulaic language is a feature of many
languages
• The use of formulaic language helps
speakers be fluent
• Phraseology is a main feature that
distinguishes different synonyms
Ubiquitous in Language Use
• 52-58% (Erman and Warren, 2000)
• 32% (Foster, 2001)
• 48-80% (M=66%) (Oppenheim, 2000)
• once every five words (Sorhus, 1977)
• 21%, 30% (Biber, et al., 1999)
• 31% - 40% (Howarth, 1998)
• 15% (Rayson, 2008)
• Figures depend on the method of measurement and on
whether the discourse is spoken or written
Meanings and Functions
• The more recurrent a language need is (e.g.
need to apologize, make a request, explain a
particular idea), the more likely there will be a
conventionalized expression (i.e. formulaic
language) to express it
Meanings and Functions
• Expressing a concept: (get out of Dodge [City]
= get out of town quickly, usually in
uncomfortable circumstances)
• Stating a commonly believed truth or advice:
(Too many cooks spoil the soup = it is difficult
to get a number of people to work well
together)
• Providing phatic expressions which facilitate
social interaction: (Nice weather today is a
nonintrusive way to open a conversation)
• Signposting discourse organization: (on the
other hand signals an alternative viewpoint)
Meanings and Functions
• Providing technical phraseology which can
transact information in a precise and efficient
manner: (2-mile final is a specific location in an
aircraft landing pattern)
• Maintaining conversations: (How are you?, See
you later)
• Realizing the topics necessary in daily
conversations: (When is X? (time), How far is
X? (location))
• Expressing functions: (I'm (very) sorry to hear
about ___ to express sympathy)
Appropriate Language Use
• Formulaic language is expected by the
speech community, and so word
combinations which do not conform to the
norm sound ‘unnatural’
Appropriate Language Use
• gap: Native speaker or learner?
– Betty very skillfully stopped the gap of the
mailbox so that birds could not get in.
– … but to bridge the gap between existing …
Appropriate Language Use
• Betty very skillfully stopped the gap of the
mailbox so that birds could not get in.
– Meaningful but awkward
• … but to bridge the gap between existing …
– Appropriate word (collocation) choice
Appropriate Language Use
• Schmitt (ELIA, 2005-2006)
• Define border
• How is it used?
Appropriate Language Use
Word form    BNC frequency   Figurative sense
border       8,011           89 (1%)
borders      2,539           84 (3%)
bordering    367             177 (48%)
bordered     356             99 (28%)
X + on: 71%, 75%
Appropriate Language Use
• His passion for self-improvement bordered on the
pathological.
• But his approach is unconscionable, bordering on criminal.
• Some other words which occur to the right of
bordered/ing on:
a slump
a sulk
alcoholic poisoning
antagonism
apathy
arrogance
austerity
bad taste
blackmail
carelessness
chaos
conspiracy
contempt
cruelty
cynicism
Appropriate Language Use
SOMETHING (is/are) bordered/bordering on SOMETHING UNPLEASANT
Processing Advantages
Pawley and Syder (1983)
• Formulaic sequences offer processing efficiency
because single memorized units, even if made up of a
sequence of words, are processed more quickly and
easily than the same sequences of words which are
generated creatively.
• The mind uses an abundant resource (long term
memory) to store a number of prefabricated chunks of
language that can be used ‘ready made’ in language
production.
• This compensates for a limited resource (working
memory), which can potentially be overloaded when
generating language on-line from individual lexical
items and syntactic/discourse rules.
Processing Advantages
Figurative
Personally, I think you can have the highest degree from the best
university in the world, but at the end of the day it’s your
contribution to the society that matters, and not the name of the
university you went to at all.
Literal
However, I still had to carry most of my stuff in small boxes from
my old room to the new one. I had to make at least 50 trips so
at the end of the day I was absolutely exhausted.
Novel
I know that at the end of the war he went on to teach students at
the Military Academy.
Processing Advantages
Siyanova, Conklin, and Schmitt (SLR, 2011)
First Pass Reading Time = 3 + 4 (early)
Total Reading Time = 3 + 4 + 6 (late)
Fixation Count = 3 + 4 + 6 (late)
Processing Advantages
Siyanova, Conklin, and Schmitt
Measure                        Figurative       Literal       Novel
First Pass Reading Time (ms)   447         =    454      =    497
Total Reading Time (ms)        514         =    507      <    628
Fixation Count                 2.8         =    2.7      <    3.2
Language Acquisition
• Peters (1983) suggests that formulaic
sequences may be decomposed and the
individual components extracted through a
process of segmentation, to give insights
into vocabulary and grammar:
An hour ago, a year ago, a month ago

A(n) _____ ago + hour, year, month
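The segmentation step can be pictured as pulling the variable words out of a fixed frame. The sketch below is only an illustration of that idea, not a model of how learners actually segment input: the function name `frame_fillers` is hypothetical, and the regular expression covers just the single frame shown above.

```python
import re


def frame_fillers(text):
    """Return the words that fill the open slot in the frame 'a(n) ___ ago'."""
    matches = re.findall(r"\ban? (\w+) ago\b", text, flags=re.IGNORECASE)
    return [word.lower() for word in matches]


print(frame_fillers("An hour ago, a year ago, a month ago"))
```

The fixed part of the pattern stands in for the frame and the captured group for the slot, which mirrors the split between the memorized sequence and the vocabulary items extracted from it.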
Occurs in a Range of Languages
• Formulaic language has been found in a
range of languages:
English, Russian, French, Spanish, Italian,
German, Swedish, Polish, Arabic, Hebrew,
Turkish, Greek, and Chinese
• Is it a universal trait of all languages?
Helps Speakers be Fluent
• The largest unit of novel discourse that native
speakers are able to process is a single clause
of 8-10 words
• When speaking, proficient speakers will speed
up and become fluent during these clauses
• But they will then slow down or even pause at
the end of these clauses
• Native speakers seldom pause in the middle of a clause,
or at least not for long
Helps Speakers be Fluent
• But proficient speakers can fluently say multi-clause
utterances:
- You can lead a horse to water, but you can’t make him
drink.
• Kuiper (2004) shows that speakers who operate under
severe time constraints (play-by-play sports announcers,
auctioneers) use a great deal of formulaic language in
their speech
• So, formulaic language helps speakers be more fluent
Distinguishes Synonyms (Stubbs, 1994)
How are the following (near) synonyms used?
• WORK
• JOB
• CAREER
• LABOR
• EMPLOYMENT
Distinguishes Synonyms (Stubbs, 1994)
WORK:
workaholic, workforce, workload, workplace
aid worker, factory worker, office worker, social worker
neutral? (frequent word = many contexts)
Distinguishes Synonyms (Stubbs, 1994)
JOB:
botched, crummy, bad, hatchet, menial
negative?
Distinguishes Synonyms (Stubbs, 1994)
CAREER:
brilliant, distinguished, glittering, acting,
director, film, international, literary
positive?
Distinguishes Synonyms (Stubbs, 1994)
LABOR:
casual, cheap, deskilling, manual,
unproductive
negative?
Distinguishes Synonyms (Stubbs, 1994)
EMPLOYMENT:
conditions, contract, discrimination, rights
legal?
Learner Use of Formulaic Language
• Learners don’t use many idioms
• Learners do use many high-frequency
collocations (nice day)
• Learners don’t use many lower-frequency but
tightly-bound collocations (preconceived notions)
Learner Use of Formulaic Language
• But learners often do not use the collocations
they know appropriately
• Inappropriate collocations are a leading problem
in learner language
• Learners often use words with their correct
meanings, but do not understand the correct
context of use (collocation, register, frequency)
Learner Use of Formulaic Language
• Learners consistently overestimate their
comprehension of reading texts that contain
formulaic sequences that they either fail to
identify or misunderstand, even at high levels
of proficiency (Martinez and Murphy, TQ 2011)
Learner Acquisition of Formulaic Language
• Boers & Lindstromberg (ARAL 2012) reviewed
acquisition research:
– Learning from exposure requires repetition
(frequency)
– Intentional learning produced better results
– Raising awareness of formulaic language is not a
powerful accelerator of learning
– Knowing the component words makes learning a
formulaic sequence easier
– Providing learning strategies (dictionaries,
concordance lines) produced mixed results
Learner Acquisition of Formulaic Language
• Does learner use of formulaic language
(e.g. collocations) improve just from
studying in an academic environment?
• Incidental acquisition
• Li and Schmitt (JSLW, 2009)
Learner Acquisition of Formulaic Language
• We followed a Chinese MA student at
Nottingham over one academic year and
compiled a learner corpus from all of her
essays and dissertation
• We then analyzed all of her assignments
and dissertation for formulaic language
Learner Acquisition of Formulaic Language
• Would the student produce more
formulaic language over the year?
• Would the student produce better
formulaic language over the year?
• Would the student become more
confident in producing formulaic
language over the year?
Amount Produced
[Figure: Average Tokens Per 700 Words (axis 5 to 35) for Essay 1 to Essay 8 and the Dissertation]
Appropriateness
[Figure: Appropriateness Rate, Less Appropriateness Rate, and Inappropriateness Rate (0% to 100%) for Essay 1 to Essay 8 and the Dissertation]
Confidence
[Figure: Less Confident, Confident, and Very Confident (0.0% to 100.0%) for Essay 1 to Essay 8 and the Dissertation]
Learner Acquisition of Formulaic Language
• Does learner use of formulaic language
(e.g. collocations) improve from explicit
teaching?
• Focused instruction
• Jones and Haywood (2004, In Schmitt
(Ed.) Formulaic Sequences)
Learner Acquisition of Formulaic Language
• Learners had better awareness of formulaic language
after 10 weeks and could identify a greater number of
sequences in a text
• Some learners made some progress in producing
more formulaic sequences in a C-test:
(He suspected that too much of th__ ki__ o__
chemical might encourage the immune system…)
• Most learners made no noticeable improvement in the
number of formulaic sequences produced in their
essays over 2 weeks
Necessity of Formulaic Language
Cowie (1992:10) goes so far as to say:
“It is impossible to perform at a level acceptable to
native users, in writing or in speech, without
controlling an appropriate range of multiword
units.”
Pedagogical Implications
• Meunier review (ARAL, 2012)
• If formulaic sequences are so important:
• They need to be included in teaching
syllabuses and materials
• We can’t assume they will just be learned
from exposure
• They need to be incorporated into language
tests to a greater extent
Pedagogical Implications
• But what formulaic sequences?
• Vincent (JEAP, 2013) proposes a 6-stage process for
identifying academic phraseology
• Martinez (ELTJ, 2013) suggests a selection framework
based on frequency and transparency
• In order to incorporate formulaic sequences into their
teaching and testing, most practitioners need a list of
formulaic sequences to address
An Academic Formulas List
(1-24)
Simpson-Vlach & Ellis (AL, 2010)
• in terms of
• at the same time
• from the point of view
• in order to
• as well as
• part of the
• the fact that
• in other words
• the point of view of
• there is a
• as a result of
• this is a
• on the basis of
• a number of
• there is no
• point of view
• the number of
• the extent to which
• as a result
• in the case of
• whether or not
• the same time
• with respect to
• point of view of
An Academic Formulas List
(1-24)
• The table showed the first 24 formulas on the
core list (written and spoken), ranked by a
combination of frequency and MI scores
• All component words of these formulas come
from the 1st 1,000 frequency band
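MI (mutual information) compares how often words actually occur together with how often they would co-occur by chance. The following is a generic two-word sketch of that measure, offered as an assumption about the general idea only: it is not the exact computation or weighting used to rank the AFL, and the function name and toy counts are illustrative.

```python
import math


def mutual_information(pair_count, first_count, second_count, corpus_size):
    """Pointwise mutual information (base 2) for a two-word sequence.

    Compares the observed probability of the pair with the probability
    expected if the two words occurred independently of each other.
    """
    observed = pair_count / corpus_size
    expected = (first_count / corpus_size) * (second_count / corpus_size)
    return math.log2(observed / expected)


# Toy counts (hypothetical): the pair occurs 8 times in a 1,024-word corpus,
# the first word 16 times and the second word 32 times.
score = mutual_information(8, 16, 32, 1024)
print(score)
```

A high score marks a tightly bound pair even when it is rare, which is why a ranking can combine MI with raw frequency instead of relying on either one alone.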
An Academic Formulas List
Written 176-199
•even though the
•this does not
•was based on
•the nature of the
•in the course of
•degree to which
•be argued that
•in terms of a
•for this reason
•are based on
•in a number of
•two types of
•the total number
•is more likely
•which can be
•are able to
•be considered as
•be used to
•b and c
•depend on the
•is that it is
•is affected by (AWL)
•should also be
•if they are
An Academic Formulas List
• Top 200 from written texts
• 1st 1,000: 127 different words
• 2nd 1,000: 2 different words
• AWL: 16 different words
An Academic Formulas List
• To learn formulas from the AFL, learners must
either:
– Know the high frequency component words already
– This makes the learning easier
• Or
– Learn the AFL formulas as wholes even if some
component words are not known
– Less efficient
• Knowing AWL words would not help much
• Knowing the 1st 1,000 words is key
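A band profile of this kind can be sketched by splitting formulas into their component words and checking each word against frequency-band lists. The example below is a toy under stated assumptions: the two word sets contain only the handful of words needed for four formulas from the written list, the function name `band_profile` is hypothetical, and real band lists work with word families rather than surface forms.

```python
# Toy band lists: only the words needed for this example, not the real lists.
FIRST_1000 = {"is", "by", "that", "it", "should", "also", "be", "if", "they", "are"}
AWL = {"affected"}  # surface form from the AWL family headed by "affect"


def band_profile(formulas):
    """Count the distinct component words of the formulas in each band."""
    words = {word for formula in formulas for word in formula.split()}
    profile = {"1st 1,000": 0, "AWL": 0, "other": 0}
    for word in words:
        if word in FIRST_1000:
            profile["1st 1,000"] += 1
        elif word in AWL:
            profile["AWL"] += 1
        else:
            profile["other"] += 1
    return profile


formulas = ["is that it is", "is affected by", "should also be", "if they are"]
print(band_profile(formulas))
```

Counting distinct words, as here, matches the "different words" wording of the slide; counting every occurrence would weight high-frequency function words even more heavily.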
An Academic Formulas List
• Many of the AFL formulas are structural
components of meaningful sentences, but
may not carry a clear meaning in
their own right:
• is that it is
• is affected by
• should also be
• if they are
An Academic Formulas List
• The AFL is based around functions:
• Framing attributes
– the idea that
– the change in
• Quantity specification
– a series of
• Identification and focus
– different types of
– such as a
An Academic Formulas List
• Identification and focus
– exactly the same
– (the) difference between (the)
• Locatives
– in the real world
• Vagueness markers
– and so forth
• Hedges
– to some extent
An Academic Formulas List
• Obligation and directive
– I want you to
• Expressions of ability and possibility
– allows us to
– are able to
• Evaluation
– an important role in
– is consistent with
• Discourse markers
– even though the
– in conjunction with
Formulaic Framework (Martinez, ELTJ, 2013)
Frequency, from Infrequent to Frequent:
take credit (27), take issue (121), take time (910), take place (10,556)
Transparency, from Transparent to Opaque:
take credit, take time, take issue, take place
Formulaic Framework (Martinez, ELTJ, 2013)
              Transparent        Opaque
Frequent      take time (2)      take place (1)
Infrequent    take credit (4)    take issue (3)
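The selection priorities in the framework can be expressed as a small ranking rule. This is a minimal sketch, assuming the four priority numbers shown on the slide; the frequency cutoff of 500 is a hypothetical value chosen only so that the four example phrases fall into the quadrants shown, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    text: str
    frequency: int   # corpus frequency as given on the slide
    opaque: bool     # True if the meaning is not guessable from the parts


FREQUENT_CUTOFF = 500  # hypothetical threshold, not taken from the source


def priority(phrase, cutoff=FREQUENT_CUTOFF):
    """Return 1 (teach first) to 4 (teach last) for a phrase."""
    frequent = phrase.frequency >= cutoff
    if frequent and phrase.opaque:
        return 1
    if frequent:
        return 2
    if phrase.opaque:
        return 3
    return 4


phrases = [
    Phrase("take credit", 27, opaque=False),
    Phrase("take issue", 121, opaque=True),
    Phrase("take time", 910, opaque=False),
    Phrase("take place", 10556, opaque=True),
]
ranked = sorted(phrases, key=priority)
print([(p.text, priority(p)) for p in ranked])
```

Treating transparency as a yes/no flag is a simplification; the slide presents both frequency and transparency as scales, so any cutoff is a judgment call.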
PHRASE List (Martinez & Schmitt, AL, 2012)
• PHRASE List (PHRASal Expressions)
• Some formulaic sequences are very
frequent
• 500 phrasal expressions within 5,000
BNC frequency level
• Based on same frequency as individual
BNC words
• Phrases which are opaque and not
easily guessable
PHRASE List
• LEAD TO (CAUSE) 13,555 (1st 1,000
frequency level)
Excessive smoking can lead to heart
disease.
• HAVE GOT TO (must) 12,270 (2nd 1,000
frequency level)
You have got to try this salad.
• BY THE TIME (when) 3,607 (3rd 1,000
frequency level)
By the time dinner started there were none
left.
PHRASE List
Integrated   Phrase              Frequency           Spoken    Written   Written    Example
List Rank                        (per 100 million)   general   general   academic
107          HAVE TO             83,092              ***       **        *          I exercise because I have to.
463          GOING TO (FUTURE)   28,259              ***       **        x          I’m going to think about it.
894          WAS TO              14,366              x         ***       **         The message was to be transmitted worldwide.
PHRASE List
Integrated   Phrase               Frequency           Spoken    Written   Written    Example
List Rank                         (per 100 million)   general   general   academic
5502         MAKE UP ONE’S MIND   788                 ***       **        x          You’d better make up your mind.
5503         AT WORK              787                 x         ***       ***        There were strange forces at work.
Experimental PHRASE Test
Inclusion in the Vocabulary Levels Test
1 take place
2 have got to
3 seek to
4 find out
5 make sure
6 carry out
_____ do
_____ try
_____ must
Experimental PHRASE Test
1 take place
2 have got to
3 seek to
4 find out
5 make sure
6 carry out
__6__ do
__3__ try
__2__ must
X Didn’t work well – learners needed
context to make sense of many phrasal
expressions
Experimental PHRASE Test
turn out: It turned out different.
a. started
b. seemed
c. became
d. did not look
Experimental PHRASE Test
at least: At least it is warm.
a. other things may be bad, but
b. many days have passed and now
c. I cannot believe that
d. the least important thing is
Experimental PHRASE Test
• Seems to work much better
• Still in piloting
• Ron Martinez
(San Francisco State University)
Vocabulary Website Resource
Most Norbert Schmitt (& co-author) publications
and other vocabulary resources can be accessed
at his personal website:
www.norbertschmitt.co.uk
• This PowerPoint presentation is available
• The PHRASE List is available
• Link to COCA Corpus BYU web site