Direkt Profil

Direkt Profil: an automatic analyzer of texts
written in French as a second language
Jonas Granfeldt(1), Pierre Nugues(2), Suzanne Schlyter(1),
Malin Ågren (1), Edin Kukovic (1), Emil Persson (1), Jonas
Thulin (2), Lisa Persson (2), Fabian Kostadinov (3)
(1) Lund University, Centre for languages and literature, French
(2) Lund Institute of Technology, Department of Computer Science
(3) University of Zürich, Department of Computer Science
http://profil.sol.lu.se
Jonas.Granfeldt@rom.lu.se
OUTLINE
• Introduction
– The idea
– Rationale
– The knowledge bases
– Demo
• Theoretical background
– Developmental sequences and developmental stages in L2 French
• Method
– CEFLE - The development corpus
– The Direkt Profil system
• Overview of the system
• Annotation
• Defining profiles/stages with machine learning
• Results
• Annotation
• Defining profiles/stages
• Example of an applied study with Direkt Profil
– Direkt Profil and teachers’ assessments: a correlation study
• Conclusion
– Problems
– Future work
INTRODUCTION
• The idea was…
– To provide researchers, teachers and learners
with an easy-to-use tool for overall diagnostic
assessment of developmental stage.
– To base the assessment on current research
on second language acquisition.
– To automatically provide feedback to teachers
and learners on language level and central
target features of the language.
– To use learners’ free written production as the
basis of assessment (rather than cloze tests)
INTRODUCTION
Rationale
• Language acquisition is a process which
follows a specific and definable order.
• Learners and teachers want to know about
the progress the learners make.
• Instruction is probably most effective if it
is adapted to the learners’ present
developmental level (cf. the Teachability
Hypothesis, Pienemann, 1985)
INTRODUCTION
The knowledge bases for the project
• Second Language Research
• Linguistics (French)
• Natural Language Processing
• Engineering
INTRODUCTION
An example: a learner text from the corpus
<CORPUS>
<SAMPLE SUBJECT_ID="XXXX">
<TEXT>C'est deux personne, une fille et sa mère. La fille est grand et elle a une robe blue. Sa
mère est petite mais grosse et elle a une robe vert. Elles va à L'Italie dans ses vacances. La
fille pense à les garcons italien et sa mere pense du soleil. Elles sont derière un table avec
une map. Elles boire des café. Leur voiture est vert. La voiture est trés petite est la bagage
n'est pas fit. Maintenant elles à destination D'Italie. Elles check in. Le monsiuer fait une
ronde tête est une grand moustache. Leur chambre est beau avec deux lis est une trés
beaux vue. Elle est sur la plage. Sur la mere il y a des bateaux. Elles fait du soleil. Dans la
soir elle a dîner dans une restaurant. À côté il y a un garcon avec une costume blue. Après
le diner elles boire du vin rouge dans la bar. Les deux garcon d'italien ils voir la mère et sa
fille. Ils sont d'amour. Ils parlent et boire de alcohol. Aprés ils fait du dancing. Le jour aprés
ils fait du sightseeing avec Tony et son autobus rouge. Il est bold. Après le sightseeing ils
visite un marche. La dame grosse a une hat rouge. Le monsieur grand a un hat noir. La fille
grand amour le garcon petite mais grosse. Sur le soir ils separé - le grand monsieur avec
la petite mais grosse dame et la grand fille avec le petite mais trés grand monsieur. Le jour
après ils revenir a Suede avec les deux monsieurs. </TEXT>
<INFO TASK_NAME="VOYAGE_ITALIE" GROUP_SUBJECT="MAIN" SUBJECT_LEVEL="2"
SOURCE_SCHOOL="XXXX"/>
</SAMPLE>
INTRODUCTION
DEMO HERE
THEORETICAL BACKGROUND
French L2 in a developmental perspective
• Many projects since the 1980s (examples)
– ESF-project (Perdue, 1993, L2 French, different L1s)
– InterFra project (Bartning, 1997 and later) (Swedish L1)
– FIFI/DURS project (Schlyter, 1986 and later, Granfeldt, 2003)
(Swedish L1)
– Myles & Mitchell (2002 and later) (FLLOC project, English L1)
• Empirical objectives of this research:
– arrive at rich and empirically valid descriptions of how French
interlanguage develops over time.
– identify features at different linguistic levels which are
developmentally related.
• Some syntheses are emerging:
– Bartning & Schlyter (2004): A proposal of six stages of
development.
– Véronique et al. (2009): A proposal of three stages
THEORETICAL BACKGROUND
Benchmarking grammatical development of French L2
(Bartning & Schlyter, 2004)
• Objectives:
– Describe developmental sequences in French L2 for a number of morphosyntactic phenomena
– Establish general learner stages/profiles with respect to grammatical development
• Data:
– Oral corpora of French L2 (L1 = Swedish).
– Post-puberty learners (N=35, 80 recordings)
• Method:
– Frequency analysis and linguistic profiling
– Manual and semi-automated tagging of transcriptions
A model with 6 profiles/stages (sample); stages run from Initial through Intermediate to Advanced.

Stage | 1 | 2 | 3 | 4 | 5 | 6
% finite lexical verbs (elle boit vs elle boire) | 50-75 | 70-80 | 80-90 | 90-98 | 100 | 100
% 3rd pers. plural irreg. lexical verbs (elles vient vs elles viennent) | _ | _ | a few cases | 50 | few errors | 100
Tense use | Pres. | Pres., (P.C.) | Pres., P.C., (Impf) | Pres., P.C., Impf | Pres., P.C., Impf, P-Q-P, Cond | Pres., P.C., Impf, P-Q-P, Cond, P.Simp.
% gender agreement (NP art + N) | 55-75 | 60-80 | 65-85 | 70-90 | 75-95 | 90-100
Granfeldt (2003); Bartning & Schlyter (2004)
METHOD
Direkt Profil
• Objectives:
– To implement the model of Bartning & Schlyter (2004)
– To develop an easy-to-use system for automated annotation, extraction and frequency analysis of as many as possible of the features in B&S's work
– To develop a system for defining developmental stages/profiles
• Method:
– Constructing an interlanguage partial parser for L2 French
– Connecting the parser to a module for machine learning
– Constructing an interface
• We have expanded on B&S's original work with respect to:
– Type of data (written rather than oral)
– Quantity of data
– Additional features (more morphosyntactic features, lexical and quantitative features)
Overview of Direkt Profil
The development corpus CEFLE
CEFLE corpus

Task name | Elicitation type | Words
Homme | Pictures | 17260
Souvenir | Pers. Narrative | 14365
Italie | Pictures | 30840
Moi | Pers. Narrative | 30355
Total | | 92820

Selection of CEFLE analyzed

Stage | Text length | Sent. length
Stage 1 (N=23) | 78 | 6.9
Stage 2 (N=98) | 161 | 8.4
Stage 3 (N=97) | 212 | 9.8
Stage 4 (N=58) | 320 | 11.6
Control (N=41) | 308 | 15.2
• CEFLE: Corpus Ecrit de Français Langue Etrangère
• 400 texts written under controlled conditions by 85 Swedish and 22 French
students, 4 texts per learner (317 texts used here).
• Manual assignment of “stage” to one text from each learner using B&S
criteria (Voyage en Italie)
Granfeldt, Nugues et al. (2006)
ANNOTATION
• We developed an annotation scheme based on the B&S (2004)
framework.
• The concept of a noun or verb group is the grammatical
representation of most phenomena in this framework, and it is
essential to the Direkt Profil annotation.
• Many syntactic annotation frameworks for French take this
into consideration
– An example from Gendner et al. (2004):
et mademoiselle qui <NV> appelait </NV> au secours ! ... ou
plutôt non , <NV> on ne l' entendait </NV> plus ... <NV> elle
était </NV> peut-être morte ...
• This annotation, however, makes no provision for the specific
details of the B&S framework
ANNOTATION (cont’d)
• The Direkt Profil annotation is an XML-based mark-up, split into 5 levels:
1. Tokenisation
2. Identification of prefabricated structures (c’est; je m’appelle etc.)
3. POS-tagging (Det, Prep, Pron, V(être/avoir), Conj)
4. Group detection/chunking: rule-based (decision tree), using a set of
grammatical words (« mots vides », Tesnière, 1959; Vergne, 1998)
5. Chunk classification: rule-based feature checking between elements.
The sentence Ils parlons dans la bar is annotated as
<segment class="c5148"><tag pos="pro:nom:pl:p3:mas">
Ils</tag> <tag pos="ver:impre:pl:p1"> parlons </tag></segment>
dans <segment class="c3071"> <tag pos ="det:fem:sg">la</tag>
<tag pos="nom:mas:sg">bar</tag> </segment>
c5148 reads: “Lexical verb/Present tense/3rd.pers.PL/no_agreement”
c3071 reads: “Det_Noun_NP/singular_det/without_gender_agreement”
– Features are finally counted and raw occurrences are converted to
percentages (where relevant)
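The feature checking and the final counting step can be pictured with a small sketch. It takes POS strings shaped like those in the example above (det:fem:sg, nom:mas:sg), checks gender and number agreement between a determiner and a noun, and converts the counts to a percentage. The function names and the tag-splitting logic are assumptions made for illustration; they are not Direkt Profil's actual rules.

```python
def parse_pos(pos):
    """Split a tag such as 'det:fem:sg' into category, gender and number."""
    parts = pos.split(":")
    gender = next((p for p in parts if p in ("mas", "fem")), None)
    number = next((p for p in parts if p in ("sg", "pl")), None)
    return parts[0], gender, number


def det_noun_agreement(det_pos, noun_pos):
    """Return (gender_ok, number_ok) for a determiner-noun pair."""
    _, det_gender, det_number = parse_pos(det_pos)
    _, noun_gender, noun_number = parse_pos(noun_pos)
    return det_gender == noun_gender, det_number == noun_number


def percent_gender_agreement(pairs):
    """Percentage of Det+Noun pairs that agree in gender (None if no pairs)."""
    if not pairs:
        return None
    agreeing = sum(1 for det, noun in pairs if det_noun_agreement(det, noun)[0])
    return 100.0 * agreeing / len(pairs)


# 'la bar' comes from the annotated example; the other two pairs are hypothetical.
pairs = [
    ("det:fem:sg", "nom:mas:sg"),  # la bar: gender mismatch, number agrees
    ("det:fem:sg", "nom:fem:sg"),  # hypothetical pair with full agreement
    ("det:mas:sg", "nom:mas:sg"),  # hypothetical pair with full agreement
]
print(det_noun_agreement(*pairs[0]))
print(round(percent_gender_agreement(pairs), 1))
```

A real implementation would also have to handle ambiguous or unknown forms, which this sketch ignores.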
The dictionary
• The engine uses a dictionary of French
inflected forms available freely from
Association des Bibliophiles Universels
(ABU)
• We have corrected and complemented it, and
converted it to XML.
• We have also added frequency-of-use
information from the Lexique database
(New, Pallier & Ferrand, 2005)
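One use of such a dictionary is to measure how many tokens of a text are not attested French forms. The sketch below is a minimal illustration only: a tiny in-memory set stands in for the dictionary of inflected forms, and the tokenizer and all names are hypothetical.

```python
import re

# Hypothetical stand-in for the dictionary of inflected forms.
KNOWN_FORMS = {"la", "fille", "est", "et", "elle", "a", "une", "robe", "grand"}


def tokenize(text):
    """Very rough word tokenizer: keeps runs of letters, lowercased."""
    return re.findall(r"[^\W\d_]+", text.lower())


def percent_unknown(text, known_forms=KNOWN_FORMS):
    """Percentage of tokens that are not found in the dictionary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    unknown = [t for t in tokens if t not in known_forms]
    return 100.0 * len(unknown) / len(tokens)


# Sentence taken from the learner text; 'blue' is not a French form.
sentence = "La fille est grand et elle a une robe blue."
print(percent_unknown(sentence))
```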
DEFINING STAGES/PROFILES
• Using the criteria in Bartning & Schlyter (2004), two researchers
manually classified 82 texts of the sub-corpus Le voyage en Italie
(part of CEFLE).
• The classification was subsequently re-used for all texts from
the same learner, resulting in 317 classified texts.
• We trained classifiers using automatically extracted phenomena
as features representing the learners’ texts.
• Currently 142 phenomena (features/attributes) are used when
establishing a learner profile/stage.
• We used C4.5 (Quinlan, 1986), LMT (Landwehr et al., 2003), and
Support Vector Machines (Boser et al., 1992) from the Weka
collection (Witten & Frank, 2005)
RESULTS
Annotation
[Figure: Direkt Profil (v.1.5.1) recall and precision, 0% to 100%, for Sta. 1, Sta. 2, Sta. 3, Sta. 4 and Contr]
Granfeldt, Nugues et al., 2005
RESULTS
CLASSIFICATION using all features
Granfeldt & Nugues, 2007
A sample decision tree
% NPs with gender agreement <= 93
| % nominative pronouns <= 4: 1 (7.0/1.0)
| % nominative pronouns > 4
| | % NPs with num+gen agreement <= 94: 1 (2.0)
| | % NPs with num+gen agreement > 94
| | | % pluperfect verbs in S-V agreement <= 0
| | | | S-V agreement w/ modal verbs <= 10
| | | | | Average sentence length <= 15
| | | | | | % of the next 2,000 words <= 0: 1 (2.0/1.0)
| | | | | | % of the next 2,000 words > 0
| | | | | | | % D-N-A in agreement <= 0: 2 (11.0)
| | | | | | | % D-N-A in agreement > 0
| | | | | | | | % D-A-N in agreement <= 50
| | | | | | | | | % of the next 2,000 words <= 1: 2 (8.0/1.0)
| | | | | | | | | % of the next 2,000 words > 1
| | | | | | | | | | % prepositions <= 9
| | | | | | | | | | | % vbs in the imperfect <= 0
| | | | | | | | | | | | % mod+inf verbs in S-V agreement <= 33: 2 (4.0)
| | | | | | | | | | | | % mod+inf verbs in S-V agreement > 33: 3 (3.0/1.0)
| | | | | | | | | | | % vbs in the imperfect > 0: 3 (2.0)
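To make the reading of the tree concrete, its first branches can be transcribed as nested conditions. The dictionary keys below are hypothetical feature names; only the thresholds and stage labels are taken from the tree, and the deeper branches are deliberately not transcribed.

```python
def classify_top_of_tree(features):
    """Follow the first branches of the sample tree.

    Returns a stage label when a leaf is reached, or None when the text
    falls into a branch that this sketch does not transcribe.
    """
    if features["pct_np_gender_agreement"] <= 93:
        if features["pct_nominative_pronouns"] <= 4:
            return 1
        if features["pct_np_num_gen_agreement"] <= 94:
            return 1
        return None  # deeper sub-tree (pluperfect, modal verbs, ...)
    return None  # the '> 93' branch is not part of the sample shown


# Hypothetical feature values for one text.
example = {
    "pct_np_gender_agreement": 80,
    "pct_nominative_pronouns": 3,
    "pct_np_num_gen_agreement": 70,
}
print(classify_top_of_tree(example))
```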
Attribute selection
• We ran an attribute selection procedure in order to identify the best
features at this point.
• To evaluate the 142 attributes, we measured the information gain for each
attribute with respect to the class. This method is derived from ID3 and is
part of the Weka software.
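For reference, the information gain of an attribute is the entropy of the class minus the weighted entropy that remains after splitting on that attribute. The sketch below uses a single binary threshold on a numeric attribute, which is a simplification; the data and the threshold are made up, and the code does not reproduce Weka's procedure.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Gain from splitting a numeric attribute at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder


# Made-up data: % gender agreement for six texts and their stages.
agreement = [60, 65, 70, 85, 90, 95]
stages = [1, 1, 1, 2, 2, 2]
print(information_gain(agreement, stages, threshold=75))
```

A threshold that separates the classes cleanly yields a gain equal to the class entropy, while a threshold that leaves all texts on one side yields zero.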
Top 10 features according to InfoGain metric
Average merit | Feature
0.4371 | % Determiner Noun agreement (gender errors)
0.3351 | % Unknown words (i.e. not in dictionary)
0.3232 | % NPs with gender agreement (including adjectives)
0.2925 | Average sentence length
0.2565 | % Prepositions (out of all parts-of-speech)
0.2082 | % S-V agreement with modal verbs followed by infinitive
0.1953 | % Noun Adjective with agreement (gender and number)
0.1793 | % S-V agreement with auxiliary in passé composé
0.1739 | % S-V agreement with être/avoir 3ppl (all tenses)
0.153 | % K1Tokens (out of all tokens)
Granfeldt & Nugues, 2007
Results after feature selection
(top 20 attributes)
Granfeldt & Nugues, 2007
Direkt Profil and teachers’
assessment: a correlation study
• An example of an applied study with Direkt Profil
• Several scholars have suggested that work on developmental
sequences and stages could be used as a means of assessing the
language development of a particular individual at a given time
(Clahsen, 1985; Pienemann & Johnston, 1987; the Rapid Profile program,
Pienemann & Mackay, 1992; Brindley, 1998)
Research questions
1. What is the correlation between the
developmental stage and teachers’
assessments of the same texts?
(RQ1)
2. To what extent can the
developmental stage predict
teachers’ ranking of a particular
text? (RQ2)
Method
– 50 texts from the CEFLE corpus (Ågren, 2005) were selected (Task: Le
voyage en Italie picture series)
– The learner texts had previously been manually analysed according to
developmental stage following the criteria in B&S

Stage 1 (man) | Stage 2 (man) | Stage 3 (man) | Stage 4 (man) | Natives
10 texts | 10 texts | 10 texts | 10 texts | 10 texts

– The texts were also analysed by Direkt Profil, resulting in
two separate indications of developmental stage (manual
and automated)
Method (cont’d)
Seven experienced upper secondary school teachers rated the 50 texts
on a six-point scale (6 = highest level).
They were asked to assess the texts in three domains:
(a) “Form”, i.e. language (grammar, lexicon, spelling etc.)
(b) “Content and Communication” (content in relation to the pictures,
the communicative success of the text)
(c) “Overall”, i.e. combining a and b (in a way they found suitable)
The teachers also stated, for each assessment, how certain they were of
their rating (five-point scale, where 5 = completely certain and
1 = completely uncertain).
RESULT: Median and distribution of ratings for
form (language)
Granfeldt & Ågren, 2009
RESULT: Inter-rater agreement between teachers

Assessment | Krippendorff's α (Kalpha)
Form | 0.738
Content and communicative success | 0.749
Overall | 0.750

Granfeldt & Ågren, 2009
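Krippendorff's α compares the disagreement observed among the ratings of each text with the disagreement expected if ratings were assigned by chance. A minimal sketch for complete data follows. It uses the interval (squared-difference) metric, which is an assumption: the slides do not state which metric was used, and an ordinal metric would be a natural alternative for graded ratings. The ratings are made up.

```python
def krippendorff_alpha_interval(units):
    """Krippendorff's alpha with the interval metric.

    `units` is a list of rating lists, one per rated item. Items with
    fewer than two ratings are ignored because they cannot be paired.
    Raises ZeroDivisionError if all ratings are identical.
    """
    units = [u for u in units if len(u) >= 2]
    values = [v for u in units for v in u]
    n = len(values)

    # Observed disagreement: squared differences within each item.
    observed = sum(
        sum((a - b) ** 2 for a in u for b in u) / (len(u) - 1)
        for u in units
    ) / n

    # Expected disagreement: squared differences across all ratings.
    expected = sum((a - b) ** 2 for a in values for b in values) / (n * (n - 1))

    return 1.0 - observed / expected


# Made-up ratings: three texts, each rated by three teachers.
ratings = [[4, 4, 5], [2, 3, 2], [6, 6, 6]]
print(krippendorff_alpha_interval(ratings))
```

Perfect agreement gives α = 1, and values near 0 indicate agreement no better than chance.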
RESULT: Correlating developmental stage and teachers’
assessments
Answering Research Question 1:
(natives excluded) | Assessment of form (median) | Assessment of content/communicative functions (median) | Overall assessment (median)
Developmental stage (man) | 0.908 | 0.902 | 0.883
Direkt Profil | 0.872 | 0.876 | 0.865
Instructional level | 0.774 | 0.780 | 0.776
The developmental stage is better correlated with the assessments of the
teachers than instructional level.
Granfeldt & Ågren, 2009
RESULT: Regression analysis
Answering Research Question 2:
(excl. natives) | Overall assessment (r²)
Developmental stage | 0.735
Direkt Profil | 0.703
Instructional level | 0.566

Approx. 70% of the variance in the teachers’ ranking of the
texts can be explained by the developmental stage as
analysed by Direkt Profil.
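The r² figure is the share of variance in one variable that is accounted for by a linear fit on the other. The sketch below shows the computation on made-up numbers. It assumes a simple least-squares regression with a single predictor, which may differ from the analysis actually run.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up data: stage assigned to eight texts and a median teacher rating.
stage = [1, 1, 2, 2, 3, 3, 4, 4]
rating = [1, 2, 2, 3, 3, 4, 5, 5]
print(round(r_squared(stage, rating), 3))
```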
Conclusion
• We have presented a system for assessment of developmental
stage/profile in French as a second language.
– The system implements the current theory of stages/profiles of development in
French.
• The system consists of
– an interlanguage partial parser for French L2 called Direkt Profil and
– a machine-learning module connected to it.
• Results:
– An evaluation of the annotation showed mixed results, depending very much on
the developmental stage of the writer.
– Results from classification experiments show:
• Best results with a 3-stage classification: a mean F of 0.82
• Stage 1 is the most problematic
• The texts from the natives are relatively easy to classify: a mean F of 0.91
• A large feature set does not seem to be necessary (at least not for this
data)
• Using an attribute/feature selection method, we have identified a list of “10
best attributes”
Problems
“Briefly, the language produced by learners is about
the worst imaginable type of language for NLP.”
(Tschichold, 2007)
– Lexical spelling (orthographe lexicale) is a
problem – incorrect forms lead to increased
ambiguity and to incorrect annotation
– Attribute selection is not sufficiently studied.
– Amount of data is still insufficient.
Future work
• Optimising annotation:
• Procedures to address the spelling problem
• Review the rules
• Ongoing student tests with a stochastic parser (trained on the Le
Monde corpus)
• Adding more texts from higher stages of development
• Expanding to other languages (Italian L2)
• Continue working with other assessment schemes, i.e. the
Common European Framework of Reference (Granfeldt, 2008)
• Thank you for your attention!
• Direkt Profil is free to use
• Available at this address:
– http://profil.sol.lu.se
• Acknowledgments
The profiling team in Lund: Pierre Nugues, Suzanne Schlyter, Malin
Ågren, Edin Kuckovic, Emil Persson, Fabian Kostadinov, Lisa
Persson
This work was supported by the Swedish Research Council
Grant number 2004-1674