
ANR-06-CORP-006
Échange de corpus d'apprentissage multimodaux (MULCE)
Eurocall 2010 Workshop proposal
"Dissemination and comparison of research findings: developing
Contextualized Learning and Teaching Corpora (LETEC)"
Eurocall 2010, Bordeaux,
Wednesday 8 September 2010
Workshop coordinators
Maud CIEKANSKI (University of Paris 8), Marie-Laure BETBEDER (University of
Franche-Comté)
Mulce Project coordinator
Thierry CHANIER (University of Clermont-Ferrand), ANR Corpus in Social and Human
Sciences.
The Eurocall 2010 workshop will be the second scientific event for Mulce [1], after our first
symposium at EPAL07 (Grenoble).
This document is an initial workshop proposal for the Eurocall 2010 conference, Bordeaux.
This is a full-day workshop intended to gather researchers working on corpora from different
communities. The morning session will focus on researchers from outside Eurocall, in particular
CSCL and CEHL members (involved in learning corpora, not necessarily in
language learning). The afternoon session will highlight language learning applications. Only
the afternoon session will be integrated into the Eurocall format.
Sincerely,
Maud Ciekanski and Marie-Laure Betbeder
Workshop coordinators
[1] Mulce is funded by the ANR Corpus et Outils en SHS (ANR-06-CORP-006). Mulce gathers members of
several laboratories and universities: LRL (Université Blaise Pascal), LIFC (Université de Franche-Comté) and
CREET (The Open University), coordinated respectively by Thierry Chanier, Christophe Reffay and Marie-Noëlle Lamy.
1. Synopsis of the workshop proposal
Title of workshop: Dissemination and comparison of research findings: Developing Contextualized
Learning and Teaching Corpora (LETEC) http://mulce.univ-fcomte.fr/
Workshop coordinators: Marie-Laure Betbeder and Maud Ciekanski
Target audience: Practitioner-researchers from the two main communities working on learning and
teaching corpora (CALL community and Computing Environment for Human
Learning (CEHL) community).
Prior knowledge required: The workshop welcomes experienced participants who contribute to the
development of learning and teaching corpora (LETEC) in any way: research,
pedagogic developments, tools and interface, and corpora of various types
(monolingual, bilingual, spoken, written, multimodal, specialized, learner, etc.).
Contents: Whilst it is becoming increasingly easy to save traces of interaction in online
educational exchanges, there is at the same time a growing interest in the research
community for the construction of data sets allowing for the study of the learning
processes themselves. However, such data sets are rarely structured into corpora,
and comparing or re-analysing them is difficult. The workshop is a concrete step in
this direction. For a deeper collaboration within and between our communities, we
suggest sharing structured data collections. The Mulce project
(http://mulce.univ-fcomte.fr/) aims at proposing a structure for Teaching and Learning Corpora
(including pedagogical and research contexts), paying particular attention to the
logging and analysis of ‘traces’ of interaction. Two main corpora (asynchronous
data and synchronous data) have been built according to this structure.
The workshop proposes a dialogue in two phases (morning and afternoon) on
sharing corpora and tools to improve interaction analysis from different fields
(CSCL, CALL, and CEHL). The morning programme focuses on the CSCL and
CEHL perspectives whereas the afternoon programme focuses on the CALL perspective
and on research on spoken corpora.
Part of the workshop defines the notion of a ‘Teaching and Learning Corpus’, shows
its main structure and browses some parts of the structured interaction data
developed as part of the Mulce project. Several of the activities will use analysis
tools (Calico, Tatiana), standards (TEI, XML), data annotation (multimodal
interaction, spoken interaction) and the use of corpora (Mulce platform to browse
and analyze a shared corpus; Computer Learner Corpora).
Two reasons motivated the choice of the Eurocall 2010 conference to present the
achievements of the Mulce (Multimodal Learning Corpus Exchange) project, after
our previous workshop at EPAL07 (Echanger Pour Apprendre En Ligne,
Grenoble):
- The dissemination of our work in the CALL community, which involves several
SIGs concerned with corpora, on the European and international scales
(approximately 300 participants from 30 countries);
- The proximity of the conference setting (Bordeaux), which enables different
French communities working on corpora (CSCL, CEHL and spoken corpora)
to join the Eurocall audience.
In addition, the Eurocall conference offers a framework for
publication in Recall (Eurocall association) and Alsic (ADALSIC association),
according to the usual publishing procedures.
Workshop objectives and methodology: The speakers will first prepare an extract from their
learning interaction corpus in the format of their own tool, and will give the workshop
audience the possibility of using their demo after the workshop (access to their tool and
platform, possibility of working on their own corpora).
Participants will discover how to transform data and to create their own corpora,
based on the Mulce format, how to use a variety of tools for interaction analysis
(Calico, Tatiana), how to annotate and to specify multimodal data and spoken data
(TEI, specific format), and the applications of such corpora (e.g. Computer learner
corpora).
Presentation time: A full day organized in two phases in order to encourage dialogue on language
learning corpora, and on the specificity of research related to this topic, between
the following research communities:
- the CSCL (Computer-Supported Collaborative Learning) and CEHL
communities
- the CALL (Computer Assisted Language Learning) community
From the CSCL community: in addition to our current partners in the Mulce project
(Calico, Bruillard; Tatiana, Lund), and after fruitful contacts in CSCL09, we expect
to invite international specialists such as Suthers or Harrer.
From the CALL community, we aim to gather several specialists concerned with the
project: interaction analysis (Hampel), learner corpora (Granger), spoken corpora
(Jacobson).
A round-table conference at the end of the day will bring together the coordinators
of 3 Eurocall SIGs working on related issues: “Computer-Mediated
Communication” (R. O’Dowd), “Natural language Processing/Intelligent CALL”
(C. Tschichold) and “CorpusCALL” (A. Boulton).
Provisional workshop programme:
Outside the Eurocall format
9:45: Workshop Introduction and Welcome Address (workshop coordinators)
10:00: Structures for corpora in CSCL: new challenges? (A. Harrer, University of
Duisburg-Essen)
10:30: Benefits of structuring learning and teaching corpora for the understanding of
online learning and online interactions: The Mulce platform (Christophe Reffay,
ENS Cachan)
11:00: Coffee Break
11:15: Analysis Tool presentation I- Corpus exchange and interoperability : The
Calico project (E. Bruillard, ENS Cachan and Alain Mille, Université Lyon1)
11:45: Analysis Tool presentation II- the Tatiana project (K. Lund, University of
Lyon2)
12:15: Feedback, questions, discussion on structure, instrumentation, collaboration
and sharing in CSCL
12:30: Break
Integrated into the Eurocall format
14:00: Corpus-based research in CALL: what are we looking for? (M.N Lamy,
Open University)
14:45: Data processing I- Online multimodal interaction corpora : alignment,
annotation, transcription (R. Hampel, Open University)
15:15: Kurt Kohn, University of Tübingen, Germany
15:45: Coffee Break
16:00: Use of corpora in research- Tools and questioning interface on
heterogeneous data (Mulce project)
16:30: Use of corpora in teaching- Applications of research on Computer Learner
Corpora in CALL (S. Granger, Catholic University of Leuven)
17:00: Round-table conference and open discussion on the potentialities of
Disseminating and comparing research findings: E. Bruillard (University of Caen),
T. Chanier (University of Clermont-Ferrand), M-N. Lamy (Open University), R.
O’Dowd (University of Leon), C. Tschichold (University of Wales Swansea), A.
Boulton (University of Nancy2).
18:00: Workshop ends
AV equipment provided: Access to videoconferencing system for some of the sessions (led by
speakers at a distance).
Fees to attend: Our objective is to create a rich dialogue between researchers from different
communities. Since the members of the CSCL and CEHL communities are not
necessarily members of Eurocall, we propose two options:
- Non-Eurocall members may attend the full session for 45 euros per
half-day, i.e. 90 euros for the full session (without paying Eurocall fees);
- Eurocall members (who pay Eurocall fees) may attend the full session
at half price (45 euros for the day).
Coordinators' qualifications: Marie-Laure Betbeder is a lecturer in Computer Science (Computer
Science laboratory at the University of Franche-Comté) in the area of Technology Enhanced
Learning. She is interested in the analysis of interaction in collaborative
distance training and learning situations. Since 2006, as a member of Mulce, she has
also been studying the structure of the data stemming from such situations, in order to
build shareable corpora that can be used by other members of the community.
Maud Ciekanski is a lecturer in Applied Linguistics and Distance Education. Her
main research focus is interaction analysis in language learning
contexts, multimodal communication and intercultural communication. Since 2006,
she has been a member of Mulce and is in charge of the analysis tasks. Her previous
research concerns self-directed language learning, autonomy and adviser training.
2. Introduction and organisation
2.1. Our conference of choice
We have chosen Eurocall 2010 (http://www.eurocall-languages.org) as the appropriate
international scientific event for our Mulce workshop. There are two reasons for this choice:
(1) to ensure dissemination of the outputs of the Mulce project within the CALL community
both across Europe and beyond (about 300 delegates from 30 different countries are expected
to attend the conference); (2) the location of the conference (Bordeaux) means that the French
CSCL (Computer-Supported Collaborative Learning) community and French colleagues
working on oral corpora can easily attend.
Eurocall 2010 is co-organised by Eurocall (European Computer Assisted Language Learning),
and by the ‘Adalsic’ association. These associations manage the refereed journals Recall and
Alsic (http://alsic.org) respectively. Following the conference a special issue of each journal
will be published, with articles selected according to the usual procedure (using three
referees). A framework for publication will thus be available for our colloquium presenters.
The colloquium will take place over one day, in two stages. Stage 1 will focus on the CSCL
community; Stage 2 will be oriented to the CALL community. Overall the programme of the
day will allow these two groups to network together around the concept of corpus and
research allied to it.
We plan to invite international CSCL experts (D. Suthers), as well as our French partners –
those with whom we share tools for analysis (E. Bruillard, ENS Cachan; K. Lund, ICAR; A.
Mille, LIRIS). Concerning the CALL aspects of the colloquium, we will bring together
experts in online communication for language learning (R. Hampel), others with
expertise in the neighbouring domain of learner corpora (S. Granger), and others again with
experience of working on corpora from a linguistic point of view (K. Kohn). Finally, we will
invite to a Round Table the three Eurocall SIG leaders who work on themes close to ours, namely
"Computer-Mediated Communication" (R. O'Dowd), "Natural Language Processing /
Intelligent CALL" (C. Tschichold) and "CorpusCALL" (A. Boulton).
2.2. Rationale
"Dissemination and comparison of research findings:
developing Learning and Teaching Corpora (LETEC)"
Our workshop builds on the colloquium "Corpus d’apprentissage en ligne : conception,
réutilisation, échange" [2] organised by the MULCE (MUltimodal contextualized Learner
Corpus Exchange) project team at EPAL07 (France). It involves researchers from diverse
backgrounds, inviting them to examine data collected during online learning sessions, as
well as the tools and research methods used, with a view to building shareable corpora to be
made available to different groups of researchers.
The objective of the workshop is to bring together researchers and practitioners who helped
create the existing corpora, or who wish to participate in the creation of new corpora from
online learning modules, using corpus research methodologies from EIAH (Environnements
Informatiques pour l'Apprentissage Humain, Computer Environments for Human Learning)
or those that have been or are being developed in CALL.
[2] oai:edutice.archives-ouvertes.fr:edutice-00161113_v1
Whilst it is becoming increasingly easy to save traces of interaction in online educational
exchanges, there is at the same time a growing interest in the research community for the
construction of data sets allowing for the study of the learning processes themselves.
However, such data sets are rarely structured into corpora, and comparing or re-analysing
them is difficult. When constructing a corpus, there is a need to systematically assemble the
data around converging themes, aiming to cover the chosen themes exhaustively, then to
organise and structure these data according to shared standards (XML, TEI, etc.). Finally, the
data need to be accessible and downloadable online, via search or annotation tools. Because
the data are complex and non-homogeneous, a system of synchronisation and internal
linkages is required, including access traces, interaction traces, learner productions, tests,
interviews etc. Making sense of the learner interactions after the event is a priority.
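As a minimal sketch of the synchronisation and internal linkage described above, the following Python fragment stores heterogeneous items (an interaction trace, a learner production, an access trace) in a single XML document, with time offsets on a shared timeline and cross-references between items. All element and attribute names (corpus, item, kind, start, ref) and the sample data are hypothetical, chosen for illustration only; they are neither the Mulce structure nor TEI markup.

```python
import xml.etree.ElementTree as ET


def build_corpus(items):
    """Build a small XML corpus from (id, kind, start_seconds, text, ref) tuples.

    Element and attribute names are illustrative assumptions, not a standard.
    """
    corpus = ET.Element("corpus")
    for item_id, kind, start, text, ref in items:
        attrs = {"id": item_id, "kind": kind, "start": str(start)}
        if ref is not None:
            attrs["ref"] = ref  # internal link to another item in the corpus
        elem = ET.SubElement(corpus, "item", attrs)
        elem.text = text
    return corpus


def linked_to(corpus, target_id):
    """Return the ids of all items pointing at target_id, ordered by start time."""
    hits = [e for e in corpus.iter("item") if e.get("ref") == target_id]
    hits.sort(key=lambda e: float(e.get("start")))
    return [e.get("id") for e in hits]


# Invented sample data: one chat turn, a document produced in reply, a log entry.
sample = [
    ("chat1", "interaction", 12.0, "Shall we start the summary?", None),
    ("doc1", "production", 95.5, "Draft summary, version 1", "chat1"),
    ("log1", "access", 40.0, "Learner opens the shared editor", "chat1"),
]
corpus = build_corpus(sample)
print(linked_to(corpus, "chat1"))
print(ET.tostring(corpus, encoding="unicode"))
```

Keeping every item on one timeline and letting items reference each other by identifier is one possible way to reconstruct, after the event, which traces and productions relate to a given interaction; a real corpus would follow the shared standards mentioned above.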
We will dedicate the morning to reflection on how to research corpora, and how corpora are
used within CSCL, highlighting questions of specification, instrumentation, implementation
and interoperability, all of which are aspects that inform our understanding of the
conditions for supporting multiple analyses and re-analyses. Participants will show examples
of environments and tools for, among other things, helping researchers to manage,
synchronize, visualize and analyze their data in order to create new representations that will
make it easier to understand how computer-mediated collaboration works. In these examples,
online collaboration, in a variety of domains, will be a main focus. In the afternoon,
researchers working on corpus-building in linguistics and applied languages will come
together. These disciplinary areas present many new challenges, not least because of the
importance of synchronicity and multimodality. The activities will cover a range of domains
of application within which the notion of corpora has become central to research, such as
corpora of online learning of languages, learner corpora and corpora of spoken language.
Highlighting potential cross-fertilisation between the chosen methodologies, tools and
methods of application (notably in the area of language learning), participants will support
their points with demonstrations of software using corpus extracts. Part of the discussion
will focus on ethics and rights issues. The workshop will be conducted in English, with
examples from different languages. Interfaces and tools will be in French and in English.
2.3. Presentation format
The workshop is mainly targeted at CALL community practitioners and researchers, but also, more
widely, at the CEHL (Computing Environment for Human Learning) community. Speakers
will focus on operational aspects of their research. They will prepare an extract from their
interaction corpus (language or other online learning situations), in the format of their own
analysis tool. Speakers will enable the audience to test their tool with the given corpus extract
after the workshop. To this end, speakers will fill out a short form describing the corpus and the
analysis tool: the pedagogical context, a short description of the corpus extract and of the format
used by the tool, a short description of the downloadable tool (together with a download link
and access code), and a description of the research questions associated with the tool. The form
data will then be published in the workshop proceedings.
2.4. Proposed agenda (1st draft)
The following agenda is a work in progress. Speakers have not yet been formally contacted but
are in touch with one or more members of the Mulce project.
Morning: outside the Eurocall format

9h00-9h45 : Welcoming of participants

9h45- 10h00 : Workshop agenda presentation (two perspectives : CSCL and CALL)

10h00-10h30: Structures for corpora in CSCL: new challenges? (A. Harrer,
University of Duisburg-Essen)

10h30-11h00: Benefits of structuring learning and teaching corpora for the
understanding of online learning and online interactions (C. Reffay, ENS Cachan)
Coffee Break

11h15-11h45: Analysis Tool presentation I- Corpus exchange and interoperability :
The Calico project (E. Bruillard, University of Caen and Alain Mille, University of
Lyon1)

11h45-12h15: Analysis Tool presentation II- the Tatiana project (K. Lund, University
of Lyon2)

12h15-12h30: Feedback, questions, discussion on structure, instrumentation,
collaboration and sharing in CSCL

12h30: Lunch
Afternoon: integrated into the Eurocall format

14h00-14h45: Corpus-based research in CALL: what are we looking for? (M.N Lamy,
Open University)

14h45-15h15 : Data processing I- Online multimodal interaction corpora : alignment,
annotation, transcription (R. Hampel, Open University)
15h15-15h45 : (Kurt Kohn, University of Tübingen, Germany)
Coffee Break

16h00-16h30 : Use of corpora in research- Tools and questioning interface on
heterogeneous data (Mulce project)

16h30-17h00 : Use of corpora in teaching- Applications of research on Computer
Learner Corpora in CALL (S. Granger, Catholic University of Leuven)

17h00-18h00: Round-table conference and open discussion on the potentialities of
Disseminating and comparing research findings: E. Bruillard (University of Caen), T.
Chanier (University of Clermont-Ferrand), M-N. Lamy (Open University), R.
O’Dowd (University of Leon), C. Tschichold (University of Wales Swansea), A.
Boulton (University of Nancy2).

18h00 : End of the workshop
2.5. Workshop proceedings
The workshop proceedings, coordinated by Marie-Laure Betbeder and Maud Ciekanski, will
gather rights-free papers together with a synthesis of the workshop’s discussions. As well as
the papers, a note on access and download procedures for the corpora and tools will be made
available. The proceedings will be published in the Edutice online archive.
Mulce Project
Maud Ciekanski, Marie-Laure Betbeder, Thierry Chanier, Marie-Noelle Lamy and Christophe
Reffay.