Eisenbeiss (2009): Elicitation Experiments in Language Acquisition

Elicitation Experiments in
Language Acquisition
Sonja Eisenbeiss (University of Essex)
seisen@essex.ac.uk
Overview
I: Elicitation and other Types of Production Data
II: Semi-Structured Elicitation and Stimulus Materials
III: Production Experiments
See Eisenbeiss (2009)
Part I: Elicitation and
other Types of Production Data
• naturalistic data
• semi-structured elicitation
• production experiments
• transcription and data analysis
Naturalistic Data
• other term: spontaneous speech data
• recording of ongoing communicative
events (free play, dinner table
conversation,…)
Advantages
• age-independent
• no special task-demands, thus high
ecological validity
• frequency information available
• input-analysis possible
• analysable for different phenomena
Problems
• low comparability
• underestimation of productivity due to
recurrent situations which require similar
linguistic encoding
• lack of data for low-frequency phenomena
(morphemes, constructions,…)
NPs in German Child Language
(Eisenbeiss 1994, 2003)
child   age         files (w. elicit.)   utterances
AND     2;1         1                    1,450
ANN     2;4-2;9     6                    1,977
CAR     3;6         1                    1,795
HAN     2;0-2;8     8                    1,399
LEO     1;11-2;11   15 (all)             4,383 (4,383)
MAT     2;3-3;6     18                   1,978
SVE     2;9-3;3     15 (10)              3,811 (2,814)
total   1;11-3;6    64 (20)              16,793 (7,197)
(Clahsen 1982, Wagner 1985, Clahsen et al. 1990)
NPs in Spontaneous Speech
Noun phrase with       number   % of         correlation with mean
context for                     utterances   length of utterance
article                2,646    28           0.489; sign.
adjective + article    249      3            0.274; n.s.
possessive ’s          19       <1           -0.097; n.s.
Problems for Naturalistic Corpus
Studies on NPs
• Some NP types are comparatively frequent
and become more frequent with increasing
utterance length (MLU).
• BUT: Some NP types (e.g. those with contexts
for adjective+article or possessive ’s):
  • are rare
  • occur in some files, but not in others
  • do not become more frequent over time
  (though children’s utterances get longer)
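Whether an NP type becomes more frequent as utterances get longer can be checked per recording with a correlation coefficient. The following is a minimal sketch, assuming a Pearson correlation (the slides do not say which coefficient was used). The function name and the per-file values are invented for illustration and are not taken from the corpus.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-file values (NOT from the corpus): the MLU of each recording
# and the percentage of its utterances that contain a context for an article.
mlu_per_file = [1.5, 2.0, 2.5, 3.0, 3.5]
pct_article_contexts = [10, 18, 22, 30, 35]

r = pearson_r(mlu_per_file, pct_article_contexts)
print(f"r = {r:.3f}")
```

A coefficient alone does not show significance; a real analysis would also need a significance test and enough files per child.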
Semi-Structured Elicitation
• Encouraging speech production in a
naturalistic (often game-like) setting.
• e.g. eliciting complete sentences with the
verb to give in an "animal feeding game":
participants have to feed toy animals and
explain which food items they would like to
give to which animals (Eisenbeiss 1994)
• often used as supplements to naturalistic
data or experiments
Advantages
• appropriate for very young learners (<2)
• high comparability
• no underestimation of productivity due to
recurrent situations which require similar
linguistic encoding
• no underestimation of productivity due to
high task demands
• data for low-frequency phenomena
(morphemes, constructions,…)
• analysable for different phenomena
NPs with Adjective+Article
(Eisenbeiss 2003)
For instance:
• picture-matching game:
putting picture cards on a board with
pictures (red balloon, blue balloon,...)
What is this? This is ... (NOM).
What goes onto this picture here? (NOM)
What do you want to put here? (ACC)
NPs with Adjective+Article
[Figure: % of analysable utterances (y-axis, 0-25) by MLU (x-axis, 1.0-4.5), one symbol type per child (and, ann, car, han, leo, mat, sve); black symbols: files with elicitation]
NPs with possessive –s
(Eisenbeiss 2003)
For instance:
• possession-matching game:
assigning possessions (depicted on cards)
to people (depicted on board)
Whose bicycle is this? This is ...
NPs with possessive -s
[Figure: % of analysable utterances (y-axis, 0-6) by MLU (x-axis, 1.0-4.5), one symbol type per child (and, ann, car, han, leo, mat, sve); black symbols: files with elicitation]
Problems
• age-dependent (usually at least 1;6 years)
• no frequency information available
• no input-analysis possible
Experiments
• Systematic control of variables
(properties of participants and
stimulus materials)
• Standardized procedures
• Limited range of response options
Advantages
• high comparability
• data for low-frequency phenomena
(morphemes, constructions,…)
• no underestimation of productivity due
to recurrent situations which require
similar linguistic encoding
Problems
• age-dependent (usually at least 3 years)
• underestimation of productivity due to
comparatively high task-demands
• no frequency information available
• no input-analysis possible
• analysis restricted to target phenomenon
Transcription and Data Analysis
• Naturalistic Data and Semi-Structured
Elicitation Games require Transcription.
• The most common format is the CHAT-format,
developed for the largest child language data
deposit: CHILDES
(http://childes.psy.cmu.edu/).
• Digital CHAT-files can be searched using the
CLAN tools provided by CHILDES.
• All three data types require “error”-analysis,
i.e. classifying deviations from the target.
Transcription: CHAT-Format
• Transcripts are written in a text editor and
stored as unformatted ASCII files (text only or
plain text).
• All lines are ended by a carriage return
(ENTER).
• Every transcript must begin with the line
@Begin and end with the line @End.
• Between @Begin and @End:
– headers with information about the
transcript (obligatory: @Participants)
– main tier for transcription
– dependent tiers for further annotations
CHAT-Format: Basic Structure
    @Begin
    @Participants
    [other headers]
    *JOE: [spoken material]
    %mor: [morpho-syntactic coding]
    *INT: [spoken material]
    %mor: [morpho-syntactic coding]
    @End
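The basic structure above can be read back into a simple data structure. The following is a minimal sketch, not the CLAN implementation: it assumes one physical line per tier (no continuation lines), and the names `Utterance`, `parse_chat` and `SAMPLE`, as well as the content of the sample `%com` tier, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    speaker: str   # three-letter participant code, e.g. "JOE"
    text: str      # content of the main tier
    tiers: dict = field(default_factory=dict)  # dependent tiers, e.g. {"com": "..."}


def parse_chat(transcript):
    """Split a CHAT-style transcript into headers and utterances.

    Simplified: assumes every tier fits on one physical line.
    """
    headers = {}
    utterances = []
    for line in transcript.splitlines():
        line = line.rstrip()
        if not line:
            continue
        if line.startswith("@"):
            # Header: "@Name:<tab>value", or a bare marker such as "@Begin" / "@End".
            name, _, value = line[1:].partition(":")
            headers[name.strip()] = value.strip()
        elif line.startswith("*"):
            # Main tier: "*" + speaker code + ":" + tab + utterance.
            speaker, _, text = line[1:].partition(":")
            utterances.append(Utterance(speaker.strip(), text.strip()))
        elif line.startswith("%") and utterances:
            # Dependent tier: attached to the most recent main tier.
            name, _, value = line[1:].partition(":")
            utterances[-1].tiers[name.strip()] = value.strip()
    return headers, utterances


# Illustrative transcript; the %com content is made up.
SAMPLE = (
    "@Begin\n"
    "@Participants:\tJOE Joe child, INT Interviewer\n"
    "*INT:\twhat is this?\n"
    "*JOE:\tthe boy put the leash on the cat.\n"
    "%com:\tJOE points at the picture\n"
    "@End\n"
)

headers, utterances = parse_chat(SAMPLE)
for utt in utterances:
    print(utt.speaker, "|", utt.text, "|", utt.tiers)
```

Attaching each dependent tier to the preceding main tier mirrors the layout of the transcript, where annotations follow the utterance they describe.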
CHAT-Format: Headers
• three letters followed by a colon and a tab
• obligatory: @Participants, on the second
line of the transcript; e.g.:
@Participants: JOE Joe child, INT Interviewer
• optional; e.g.:
– @Birth of Learner: …
– @Age of Learner: …
– @Date: …
– @Language: …
– @Transcriber: …
CHAT-Format: Main Tiers
• what was actually said, one utterance per tier
• introduced by "*", the three-letter code for the
participant and a tab; e.g.:
*JOE: the boy put the leash on the cat.
• orthographical transcription in lower case Latin
letters; except for proper nouns (e.g. John)
and "I"
• numbers spelled out (ten, not 10)
• normalisation of phonetically deviant forms
(phonetic information about forms can be
presented on a %pho dependent tier)
Main Tiers: Markers
• unfilled pauses: #
• filled pauses: eh@fp
• interruption: +/.
• self-interruption: +//.
• repetition w/o correction: [/]
• repetition with correction: [//]
• unintelligible speech: xxx
• material coded on phonol. tier: yyy
• doubtful material: [?] or [=? text]
• omitted parts of words: ()
• to refer to more than one word: < >
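Before words are counted, these markers usually have to be separated from the words themselves. The sketch below shows one possible clean-up, under assumptions that are not stated on the slides: repeated or corrected material before [/] and [//] is dropped, xxx and yyy are not counted, and a word with an omitted part such as (be)cause counts as one full word. The function name `countable_words` is hypothetical, and CLAN's own handling is richer than this.

```python
import re


def countable_words(main_tier):
    """Return the words of a main tier after removing the markers listed above."""
    text = main_tier
    # Drop retraced material: "<several words> [/]" or "word [//]".
    text = re.sub(r"<[^<>]*>\s*\[/{1,2}\]", " ", text)
    text = re.sub(r"\S+\s*\[/{1,2}\]", " ", text)
    # Drop remaining bracketed codes such as [?] or [=? text], then scope brackets.
    text = re.sub(r"\[[^\]]*\]", " ", text)
    text = text.replace("<", " ").replace(">", " ")
    # Keep the full word when omitted parts are given in parentheses.
    text = text.replace("(", "").replace(")", "")
    words = []
    for token in text.split():
        token = token.strip(".?!,")
        if not token or token.startswith("+"):
            continue  # punctuation only, or an interruption marker such as +/. or +//.
        if token == "#" or token.endswith("@fp") or token in {"xxx", "yyy"}:
            continue  # pauses and unintelligible or phonologically coded material
        words.append(token)
    return words


example = "the [/] the boy # eh@fp put (th)e leash on <the dog> [//] the cat."
print(countable_words(example))
```

Whether retraced material should be excluded depends on the research question, so this choice should be documented alongside any counts.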
CHAT: Dependent Tiers
further annotations, e.g.
• %mor [morphosyntactic coding]
• %pho [phonological coding]
• %syn [syntactic coding]
• %err [errors]
• %com [comments]
• %spa [speech acts]
CLAN: Windows
• the commands window where you specify the
folders, files, and commands you want to use
• the CLAN output window, where you will see
the results of your searches. If you have not
specified an output file, your results will be
displayed in this window. If you have saved
your outputs into a file (as you will be asked to
do for this exercise), you will not be able to
see it in the output window, but the name and
location of the output file will be displayed in
the output window.
CLAN: Command Window
CLAN: Steps
• specify your working and lib directory, where
the files you will be working with are stored
• specify your output directory, where any
output files will be stored
• select a command
• select one or more transcription files for
analysis
• optionally use some so-called switches to
modify the commands.
CLAN: Core Commands
• FREQ:
will provide you with type and
token frequency information
• COMBO:
will find utterances matching a
given set of criteria
• MLU:
will calculate the MLU
(mean length of utterance)
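To make clear what FREQ and MLU report, the following sketch computes comparable figures by hand. It is an illustration, not a reimplementation of CLAN: it assumes the utterances are already split into words, lower-cases them before counting, and measures utterance length in words, whereas MLU is often calculated in morphemes. The function names and the sample utterances are hypothetical.

```python
from collections import Counter


def type_token_counts(utterances):
    """Return (frequency list, number of types, number of tokens)."""
    counts = Counter(word.lower() for utt in utterances for word in utt)
    return counts, len(counts), sum(counts.values())


def mlu_in_words(utterances):
    """Mean length of utterance, counted in words; empty utterances are ignored."""
    non_empty = [utt for utt in utterances if utt]
    if not non_empty:
        return 0.0
    return sum(len(utt) for utt in non_empty) / len(non_empty)


# Illustrative utterances, already split into words.
sample = [
    ["the", "boy", "put", "the", "leash", "on", "the", "cat"],
    ["the", "cat"],
    ["more"],
]

counts, n_types, n_tokens = type_token_counts(sample)
print(counts.most_common(3))
print(n_types, n_tokens, round(mlu_in_words(sample), 2))
```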
CLAN: Useful Switches
+f  saves output to file. For each transcription that
    you have chosen to analyse, an output file will be
    generated. By default, this output file will have
    the name of the transcription file and an
    extension that will show you which command was
    used to create the output (e.g. frq, mlu or cmb).
+s  searches for a string in a file.
+t  restricts the search to a particular tier, e.g.
    the tier of a particular speaker.
+u  treats all files together.
+o  orders FREQ lists according to token frequency.
+w  -w1 and +w1 provide one preceding/following
    line, -w2 and +w2 will provide two
    preceding/following lines, etc.
CLAN: Search Strings
^     immediately followed by
+     inclusive OR
!     logical NOT
*     “joker” (wildcard)
“”    strings including blanks, etc.
      should be put in quotes
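The way these operators combine can be illustrated with a small matcher over the words of one utterance. This is a rough sketch of the logic only, not CLAN's search engine: it assumes that "+" separates alternatives within one word position, that "!" negates a single alternative, and it relies on Python's `fnmatchcase` for the wildcard, which also gives special meaning to "?" and "[...]". The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def slot_matches(word, slot):
    """Check one word against one position of the search string.

    Alternatives are joined by '+'; a leading '!' negates an alternative;
    '*' stands for any sequence of characters.
    """
    for alternative in slot.split("+"):
        if alternative.startswith("!"):
            if not fnmatchcase(word, alternative[1:]):
                return True
        elif fnmatchcase(word, alternative):
            return True
    return False


def find_sequence(words, pattern):
    """Return the start positions where the '^'-separated pattern matches."""
    slots = pattern.split("^")
    last_start = len(words) - len(slots)
    return [
        start
        for start in range(last_start + 1)
        if all(slot_matches(words[start + k], slot) for k, slot in enumerate(slots))
    ]


words = ["the", "boy", "put", "the", "leash", "on", "the", "cat"]
print(find_sequence(words, "the^boy+cat"))  # "the" immediately followed by "boy" or "cat"
print(find_sequence(words, "the^!boy"))     # "the" immediately followed by anything but "boy"
print(find_sequence(words, "l*^on"))        # a word starting with "l", then "on"
```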
Error Analysis
• % suppliance in obligatory contexts (e.g. %
–ed in past tense contexts) vs. % correct use
of a particular form (e.g. % correct use of all
present tense forms)
• errors of omission (e.g. two mouse) vs. errors
of commission (two mouses)
• overregularisations (e.g. sing-singed) vs.
overirregularisations (e.g. bite-bote)
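The two percentages and the omission/commission distinction can be made concrete with a short calculation. This is a sketch with invented counts; the function names are hypothetical, and the classification assumes that any wrong form other than the bare stem counts as an error of commission.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100 * part / whole


def classify_form(produced, target, stem):
    """Classify a produced form relative to the target form.

    'omission'   : the bare stem was produced (two mouse)
    'commission' : a wrong form other than the bare stem was produced (two mouses)
    """
    if produced == target:
        return "correct"
    if produced == stem:
        return "omission"
    return "commission"


# Hypothetical counts (invented for illustration).
past_tense_contexts = 40   # obligatory contexts for -ed
ed_supplied = 30           # contexts in which -ed was actually produced
suppliance = percent(ed_supplied, past_tense_contexts)

present_forms_used = 50    # all present tense forms the learner produced
present_forms_correct = 45
correct_use = percent(present_forms_correct, present_forms_used)

labels = [classify_form(p, target="mice", stem="mouse")
          for p in ["mice", "mouse", "mouses"]]
print(suppliance, correct_use, labels)
```

The two percentages answer different questions: suppliance starts from the contexts that require a form, correct use starts from the forms the learner actually produced.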
Part II: Semi-Structured Elicitation
and Stimulus Materials
• Interactional Setting
• Target Type
• The Puzzle Task
• Stimulus Type
• Transcription and
Analysis
Interactional Setting
• Director/matcher (or “confederate description”):
A “director” describes a scene/object etc. and a “matcher”
who is not able to see this scene/object, has to recreate it.
E.g.: The matcher has to build a toy house identical to the
one created by the director who is hidden behind a screen.
• Speaker/Listener:
A speaker provides information for someone who does not
have access to this information. E.g.: The speaker retells a
story (s)he heard/read while the listener was not in the room.
• Co-Players:
All participants are involved in a game and provide each other
with information to co-ordinate their actions.
E.g.: The players are involved in a construction or puzzle
game where not everyone has access to all pieces.
Target Type
• broad-spectrum (generally encouraging
participants to speak)
• form-focused: the use of a particular form
or construction
• meaning-focused: the linguistic encoding of
a particular function or meaning (which can
be encoded in different ways)
Broad-Spectrum
• Frog story: a picture book without words used to
elicit narratives (Berman/Slobin 1994)
• Bag task: a bag for blocks and
animals of different sizes and colours. The
bag has pockets that match the animals in
colour and have coloured buttons, ties, etc.;
and children frequently refer to colours,
sizes and locations when they ask other
players to help them hide or find animals in
the pockets (Eisenbeiss 2009b)
Form-focused
• Picture-matching game: aimed at the
elicitation of noun phrases with adjectives in
different case contexts (Eisenbeiss 1994),
see part I
• Possession-matching games: aimed at the
elicitation of adnominal possessive
constructions (Eisenbeiss 1994)
Meaning-focused
• “circle of dirt”: a picture book without words
used to elicit descriptions of part-whole
relationships and actions affecting (body)
parts (Eisenbeiss and McGregor 1999)
• “cut-and-break”: video stimulus created for
cross-linguistic studies of “separation and
material destruction” events (Bohnemeyer,
Bowerman and Brown 2001).
The Puzzle Task (Eisenbeiss 2009b)
• a task with co-players: child describes
contrasting pictures on a puzzle board, adult
finds the matching pieces, child puts them
into the correct cut-out
• exchangeable pictures and puzzle pieces
• can be used to elicit particular forms or to
elicit the linguistic encoding of particular
meanings
Eisenbeiss/Matsuo (2005)
Eisenbeiss et al. (2009)
• German:
39 recordings
(picture descriptions and asking for pieces)
1286 utterances with/without V
21 children (3;7-6;6)
• Japanese:
67 recordings (asking for pieces)
421 utterances with V
16 children (2;11-6;5)
Elicitation Material: give
Elicitation Material: bite
Elicitation Material: wash
Elicitation Material: put on
German: The Use of Verbs (%)
          GAME
Verb      GIVE   BITE   WASH   PUT
GIVE      60     0      <1     <1
BITE      0      62     0      0
WASH      0      0      70     0
PUT       0      0      0      48
other     25     16     14     27
no V      14     22     15     25
Error Types
• More contexts for errors but no qualitative differences
in error types observed so far
• For instance, PPs instead of IOs:
• naturalistic (Carsten 3;6):
für'n papa sollste aber den schenken
for the daddy shall PART this-one give
• elicited (Jannik 6;4):
da gibt das baby fuer das schaf ehm den gras.
there gives the baby for the sheep ehm the grass
Overt Arguments in Japanese (%)
[Figure: % of overt subjects and objects (y-axis, 0-100) for Jun (3;8), Puzzle (≤3;8) and Puzzle (all); Jun: naturalistic data]
Stimulus Type
• pictures
• photos
• computer animations
• videos
• toys
• real objects
Pictures
• better for descriptions of static objects/properties
than for event descriptions and verb elicitation
• require knowledge of artistic conventions (e.g.
lines for movement, shading etc.)
• can be used for “unrealistic” events (e.g. animals
in different colours or positions)
• comparatively easy to create with clip art and
standard software
• comparatively easy to modify for minimal
variations (e.g. colour)
Photos
• better for descriptions of static objects/properties
than for event descriptions and verb elicitation
• do not require knowledge of artistic conventions
and are comparatively “natural”
• problematic for “unrealistic” events (e.g. animals in
different colours or positions)
• comparatively easy to create and manipulate with
standard photo equipment and Photoshop or
similar software
• cannot be as easily modified for minimal variations
(e.g. colour) as pictures, but possible in principle
Computer Animations
• better for descriptions of dynamic events and
verb elicitation than for descriptions of static
objects/properties
• not very naturalistic, in particular when it comes
to natural movements of people and animals
• can be used for “unrealistic” events (e.g. funny
movements of animals)
• difficult and time-consuming to create
• good control for minimal variations (e.g.
direction or manner of motion)
Videos
• better for descriptions of dynamic events and verb
elicitation than for descriptions of static
objects/properties
• comparatively natural
• cannot easily be used for “unrealistic” events
• comparatively simple to create with standard digital
video equipment and editing software (Adobe etc.)
• cannot be as easily modified for minimal variations as
computer animations (e.g. direction or manner of
motion) because “actors” tend to introduce
unwanted modifications
Toys
• e.g. stuffed animals, cars, blocks
• appropriate for descriptions of dynamic events and verb
elicitation as well as for descriptions of static
objects/properties
• not completely naturalistic, in particular when it comes to
natural movements of people and animals
• often very culture-dependent
• can be used for “unrealistic” events (e.g. funny
movements of animals)
• usually easily obtainable
• object properties can be fairly well controlled, but for
dynamic events, actions of toy “actors” tend to introduce
unwanted modifications
Real Objects
• e.g. tools, household items like pots, dishes
• appropriate for descriptions of dynamic events and verb
elicitation as well as for descriptions of static objects/properties
• very naturalistic
• basic objects are often less culture-dependent than toys
• can be used for realistic and “unrealistic” events (e.g. funny
movements of pots)
• usually easily obtainable
• object properties are a bit harder to control than for toys and
modifications might reduce naturalness (e.g. atypical colours
for household items)
• for dynamic events, manipulations of the objects tend to
introduce unwanted modifications
Part III: Production Experiments
• elicited imitation
• elicited production
• speeded production
• syntactic priming
• input/feedback manipulations
• eye-tracking and speech production
Elicited Imitation
Participants are asked to imitate spoken sentences.
Thus, it is clear what learners should say; and
when stimuli are sufficiently long and complex,
participants cannot simply memorise them as a
whole, but have to employ their own grammar to
recreate them. Thus, a comparison of the target
utterance and the learner’s actual production can
shed light on the grammatical knowledge that
learners employ to express a given meaning.
Elicited Imitation: Pro & Con
Advantages
• easy to carry out
• clear target for comparisons
Problems:
• good performance might be due to simple
memorisation.
• errors might be due to a lack of vocabulary
knowledge, memory limitations, etc.
Elicited Production
Participants receive a prompt to produce a form
(e.g. This is a door. These are two …?).
Alternatively, learners can be instructed to
turn a given sentence into a question, a
negated sentence, etc. (e.g. I'll say
something and then you say the opposite).
Elicited Production: Pro & Con
Advantages
independence of memorised models
Problems:
• requires reliable prompts
• unclear influence of participants’ earlier
experience - unless novel words are used
(This is a wug. These are two …?, Berko
1958).
• performance errors due to task difficulties
(especially with novel words)
Berko (1958)
Speeded Production
The frequency of stimulus items is manipulated to
study storage and computation in learners’ developing
mental lexicon: If a morphologically complex form
such as walk-ed is stored as a whole, then
high-frequency forms should have stronger memory traces,
due to additional exposure. Hence, they should be
retrieved and produced faster than low-frequency
forms. In contrast, if morphologically complex forms are
computed from stems and affixes, production
latencies should only be affected by the frequency of
these components (e.g. walk and -ed), not by the
frequency of the complex form (e.g. walk-ed).
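The prediction can be expressed as a simple comparison of mean production latencies. The sketch below uses invented latencies in milliseconds purely to show the calculation; the data layout and the function name `frequency_effect` are hypothetical, and a real study would need many items per condition and inferential statistics.

```python
from statistics import mean

# Hypothetical production latencies in milliseconds (invented for illustration),
# keyed by (inflection class, frequency of the complex form).
latencies = {
    ("irregular", "high"): [600, 620],
    ("irregular", "low"): [700, 720],
    ("regular", "high"): [650, 670],
    ("regular", "low"): [655, 665],
}


def frequency_effect(data, inflection_class):
    """Mean latency for low-frequency forms minus mean latency for high-frequency forms."""
    return mean(data[(inflection_class, "low")]) - mean(data[(inflection_class, "high")])


effects = {cls: frequency_effect(latencies, cls) for cls in ("irregular", "regular")}
print(effects)
```

On the reasoning above, a clearly positive effect for a class would be compatible with whole-form storage, while an effect near zero would be compatible with computation from stem and affix.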
Speeded Production: Pro & Con
Advantages:
can provide information about storage and
computation
Problems:
• requires highly reliable prompts
• requires frequency information for the words in
the variety the learner is exposed to (e.g. child
directed German,…)
• requires reaction-time equipment
Clahsen et al. (2004)
Elicitation of German participles with a
computer game involving an alien:
overgeneralisations of regular participle
inflection and a frequency effect
for irregulars only, which suggests that
irregular word forms are stored as wholes,
while regularly inflected word forms are
computed.
Syntactic Priming
Speakers tend to repeat syntactic structure
across otherwise unrelated utterances (Branigan
2007). For instance, speakers are more likely to
use a passive after hearing or producing a
passive sentence than after an active sentence.
If learners show this effect, this indicates that
they have acquired a grammatical representation
that can be activated by priming.
Syntactic Priming: Variants
• presentation of primes:
• between groups
• in blocks
• alternating
• required participant response to primes:
• listening only
• listening and repeating
• lexical overlap between prime and target:
• no lexical overlap
• same verb or head noun
Syntactic Priming: Pro & Con
Advantages:
provides insights into representations
Problems:
requires pairs of alternative structures that
speakers could use (e.g. active/passive)
Input/Feedback Manipulation
The effect of different types of input (e.g.
models) or feedback (explicit corrections,
reformulations, etc.) on speakers’ (elicited)
production is studied.
Input/Feedback Manipulation:
Pro & Con
Advantages:
can provide information about the role of
input/feedback
Problems:
dependency of corrective feedback on the
occurrence of errors
Eye-Tracking and
Speech Production
Eye-movements are monitored while
participants describe pictures and videos. This
allows us to investigate the amount of visual
information required to start speaking and the
planning processes involved in speech
production (pre-viewing scenes, aligning
speech and vision, etc.).
Dobel et al. (2009)
Eye-Tracking and Speech
Production: Pro & Con
Advantages:
can provide information about the role of visual
information and speech planning processes
Problems:
• requires specialised equipment
• requires good knowledge of visual processing
Conclusion
Converging Evidence
References I
Berko, J. 1958. The child's learning of English morphology. Word
14, 150-177.
Berman, R. A., & Slobin, D. I. (1994). Relating Events in Narrative:
A Crosslinguistic Developmental Study. Hillsdale, NJ: Lawrence
Erlbaum.
Bohnemeyer, J., Brown, P. & Bowerman, M. 2001. Cut and Break
Clips. In ‘Manual’ for the field season 2001, Levinson, S.C. &
Enfield, N.J. (eds), 90-96. Nijmegen: Max Planck Institute for
Psycholinguistics.
Branigan, H. 2007. Syntactic Priming.
Language and Linguistics Compass 1, 1-16.
Clahsen, H. 1982. Spracherwerb in der Kindheit. Eine
Untersuchung zur Entwicklung der Syntax bei Kleinkindern.
Tübingen: Narr.
References II
Clahsen, H., Hadler, M. & Weyerts, H. 2004. Speeded production
of inflected words in children and adults. Journal of Child
Language 31: 683-712.
Clahsen, H., Vainikka, A. & Young-Scholten, M. 1990.
Lernbarkeitstheorie und Lexikalisches Lernen. Eine kurze
Darstellung des LEXLERN-Projekts. Linguistische Berichte 130,
466-477.
Dobel, C., Glanemann, R., Kreysa, H., Zwitserlood, P., Eisenbeiss,
S. (2009) Visual encoding of meaningful and meaningless
events. Submitted.
References III
Eisenbeiss, S. 1994. Elizitation von Nominalphrasen und
Kasusmarkierungen. In: Sonja Eisenbeiß, Susanne Bartke,
Helga Weyerts & Harald Clahsen (Eds.), Elizitationsverfahren in
der Spracherwerbsforschung: Nominalphrasen, Kasus, Plural,
Partizipien. (Arbeiten des Sonderforschungsbereichs 282, Nr.
57). Düsseldorf: Heinrich-Heine-Universität, 1-38.
Eisenbeiss, S. 2003: Merkmalsgesteuerter Grammatikerwerb: Eine
Untersuchung zum Erwerb der Struktur und Flexion von
Nominalphrasen. Dissertation; University of Duesseldorf.
http://diss.ub.uni-duesseldorf.de/ebib/diss/file?dissid=1185
Eisenbeiss, S. 2009a: Production Methods. Ms. University of Essex
Eisenbeiss, S. 2009b: Semi-Structured Elicitation Tasks. Ms.
University of Essex (to appear in Essex Research Reports in
Linguistics: http://www.essex.ac.uk/linguistics/errl/)
References IV
Eisenbeiss, S., Matsuo, A. 2003. External and Internal Possession -
A Comparative Study of German and Japanese Child Language.
Paper presented at the 28th Annual Boston University
Conference on Language Development, Boston University
Wagner, Klaus R. (1985). How much do children say in a day?
Journal of Child Language 12, 475-487.
Eisenbeiss, S. & Matsuo, A. 2005. Eliciting Language Production
Data from Young Children. Presentation at the Xth International
Congress for the Study of Child Language, Berlin, Germany
Eisenbeiss, S., Matsuo, A. & Sonnenstuhl, I. (2009): Learning to
Encode Possession. Submitted.
Eisenbeiss, S. & McGregor, W.B. 1999. The circle of dirt. Ms.
Max-Planck-Institute for Psycholinguistics, Nijmegen.
Further Readings I
Breakwell, G.M., Hammond, S., Fife-Schaw, C. (eds.) 2003.
Research Methods in Psychology. London: Sage
Publications.
Field, A., Hole, G. 2003. How to Design and Report
Experiments. Sage Publications.
McDaniel, D., McKee, C., Smith Cairns, H. (eds.) 1996.
Methods for Assessing Children's Syntax.
Cambridge, MA: MIT Press, 3-22.
Menn, L., Bernstein Ratner, N. (eds.) 2000. Methods for
Studying Language Production. Mahwah, N.J.:
Lawrence Erlbaum Associates.
Further Readings II
Sekerina, I.A., Fernandez, E.M., Clahsen, H. (eds) 2008.
Developmental Psycholinguistics: On-Line Methods in
Children’s Language Processing. Amsterdam:
Benjamins.
Wei, Li, Moyer, M. (eds) 2008. The Blackwell Guide to
Research Methods in Bilingualism and Multilingualism.
London: Blackwell.
Wray, A., Bloomer, A. 2006. Projects in Linguistics. A
Practical Guide to Researching Language. London:
Arnold.
-s/’s and von/of Possessives
English: ‘s for animate, topical, prototypical poss. N(P)
German: -s restricted to unmodified N (name, kinship)
• Search in CLAN for strings: FREQ, COMBO (with context)
• Search in CLAN for codes: FREQ, COMBO (with
context)
• How would you collect spontaneous speech
(participants, setting, activities)?
• How would you create:
• a broad-spectrum elicitation game
• a meaning-focused elicitation game
• a form-focused elicitation game
• an elicitation/comprehension experiment for
recursive -s: the boy’s father’s mother’s balloon
Nikola Koch:
Elicitation/Comprehension