Annex I - “Description of Work”

COMPETITIVENESS AND INNOVATION FRAMEWORK PROGRAMME
ICT Policy Support Programme (ICT PSP)
Theme 5 - Multilingual web
ICT PSP call identifier: CIP-ICT-PSP-2009-3
ICT PSP Theme/objective identifier: 5.2: Multilingual Web content management: standards and best practices
Grant agreement for:
THEMATIC NETWORK
Annex I - “Description of Work”
Project acronym: MultilingualWeb
Project full title: Advancing the Multilingual Web, Thematic Network
Grant agreement no.: 250500
Date of preparation of Annex I (Version 2): 26 October 2009
Date of approval of Annex I by Commission: (to be completed by Commission)
Date of start of the project: 1st January 2010
Name of the Network coordinator: Richard Ishida
W3C – The World Wide Web Consortium
e-mail: ishida@w3.org
Tel: +1 6173951127
Name of the administrative and financial coordinator: Céline Bitoune
GEIE ERCIM (European Research Consortium for Informatics and Mathematics)
e-mail: celine.bitoune@ercim.org
Tel: +33 4 92385077
Version 3 (26/10/2009)
Page 1
of 62
Table of Contents
PART A of Annex I – Project Summary and budget breakdown
A1. Project summary form (copy of A1 form of the GPFs)
A2. List of beneficiaries
Shortname corrected to match with those entered into NEF
A3. Budget breakdown form (copy of A3.1 form of the GPFs)
PART B of Annex I
Project Profile
B1. PROJECT DESCRIPTION AND OBJECTIVES
B1.1. Project objectives
Project background
Project objectives
B1.2. EU and national dimension
B2. IMPACT
B2.1. Expected impact and outcomes
B2.2. Long term viability
B2.3. Availability of common results, consensus building, openness, sustainability
B3. IMPLEMENTATION
B3.1. Consortium and key personnel
1: ERCIM/W3C
2: Bioloom Group
3: CNR-ILC (I)
4: Facebook Ireland
5: The University of Applied Sciences (UAS) Potsdam
6: Institut Josef Stefan (JSI)
7: Institutul de Cercetari pentru Inteligentia Artificiala (ICIA)
8: The Language Technology Centre
9: Lionbridge Belgium
10: Microsoft Ireland
11: Opera Software
12: SAP AG
13: The Translation Automation User Society (TAUS)
14: AALTO-KORKEAKOULUSAATIO
15: University of Oviedo (ILTO)
16: Universidad Politécnica de Madrid (UPM)
17: The Language Resource Centre
18: University of Economics, Prague
19: Transware Ltd (WeLocalize)
20: XML-INTL
The European Commission's Directorate-General for Translation
The Localization Industry Standards Association (LISA)
B3.2a. Chosen approach
i) Workshops
ii) Associated ‘practical work items’
iii) Risk analysis and contingency plan
B3.2b. Work plan
i) GANTT CHART
ii) Performance Monitoring Table to show success indicators
Notes:
iii) Workplan Tables tabular descriptions (registered online using NEF)
WT1: Work package list
WT2: Deliverables list
WT3: Work package descriptions WP01
WT3: Work package descriptions WP02
WT3: Work package descriptions WP03
WT3: Work package descriptions WP04
WT3: Work package descriptions WP05
WT4: List of Milestones (n/a)
WT5: List of Tentative Reviews
WT6: Summary effort table
B3.3. Project management
B3.4. Dissemination / Use of results
B3.5. Resources to be committed
PART A of Annex I – Project Summary and budget breakdown
Part A of Annex I comprises the following sections, which are generated automatically
by the NEF online tool from the information provided in the GPFs:
A1. Project summary form (copy of A1 form of the GPFs)
A2. List of beneficiaries
Shortname corrected to match with those entered into NEF
List of participants:
Participant no. | Participant organisation name | Participant short name | Country
1 (Co-ordinator) | GEIE ERCIM | ERCIM/W3C | France
2 (Participant) | Bioloom Group | Bioloom | Germany
3 (Participant) | Consiglio Nazionale delle Ricerche | CNR | Italy
4 (Participant) | Facebook Ireland | Facebook | Ireland
5 (Participant) | Fachhochschule Potsdam | UAS Potsdam | Germany
6 (Participant) | Institut Jozef Stefan | IJS | Slovenia
7 (Participant) | Institutul de Cercetari Pentru Inteligentia Artificiala | ICIA | Romania
8 (Participant) | Language Technology Centre Ltd. | LTC | UK
9 (Participant) | Lionbridge Belgium | Lionbridge | Belgium
10 (Participant) | Microsoft Ireland Research | Microsoft | Ireland
11 (Participant) | Opera Software ASA | Opera | Norway
12 (Participant) | SAP AG | SAP | Germany
13 (Participant) | J.D. van der Meer Beleggingen B.V. | TAUS | Netherlands
14 (Participant) | AALTO-KORKEAKOULUSAATIO | AALTO | Finland
15 (Participant) | Universidad de Oviedo | UO | Spain
16 (Participant) | Universidad Politécnica de Madrid | UPM | Spain
17 (Participant) | University of Limerick | University of Limerick | Ireland
18 (Participant) | Vysoka Skola Ekonomicka V Praze | VEVP | Czech Republic
19 (Participant) | Transware Limited | Welocalize | Ireland
20 (Participant) | XML-INTL Ltd. | XML-INTL | UK
Unfunded Participant | European Commission Directorate-General for Translation | DGT | Luxembourg
Unfunded Participant | The Localization Industry Standards Association | LISA | Switzerland
A3. Budget breakdown form (copy of A3.1 form of the GPFs)
CIP-ICT PSP-2009-3
Thematic Network - MultilingualWeb
PART B of Annex I
Project Profile
Information on the Thematic Network
Objectives
Set a foundation for improving support on the Web for languages of the European Union and its
trade partners, improving the efficiency of processes for creating and localizing content, both by
machine translation and more traditional methods, and improving support for multilingual content
and data on the Web.
Establish a network between stakeholders for the improvement of the multilingual Web, in order to
promote the adoption of current standards and best practices, explore the needs for future standards
work and best practices, and create a basis for long-term synergies between the participants, who
come from a variety of disciplines.
Improve content development in (X)HTML and CSS by helping content authors better understand
the standards and best practices they should be following.
Help user agent developers to identify and correctly enable support for multilingual standards and
best practices.
Activities and Outcomes
Four workshops, hosted by partners, open to public participation, with the following goals:
• Sharing of experiences and knowledge about existing standards and best practices.
• Discussion and recommendations about gaps that need to be addressed.
• Detailed discussion of general topic areas: the standards and best practices landscape,
Web authoring, translation tool support, and a further area to be decided by the partners
during the project.
• Community building, with the goal of establishing a long-term platform for working on
topics concerning the multilingual web.
Minutes and recommendations to be published on the project Web site for public consumption.
Supported by archived, moderated mailing lists to allow public discussions to complement the
workshops, a wiki and a number of dissemination methods, including a Twitter stream, blog
aggregation and conference presentations. The high visibility of the W3C site and its prestige
for standards and best practices will also make a significant contribution to the visibility of the
results.
Practical work items, developed by the W3C with input from partners, and made available to the
general public from the W3C site:
• A publicly available tool for checking markup of web sites with regard to requirements of
the multilingual Web (like the W3C HTML validator) should help educate and inform
content developers about best practices and standards in an easy to use, and therefore
effective manner, and should improve the quality of language support in Web pages.
• A publicly available set of educational materials, developed by the W3C, that should also
help HTML and CSS content developers create language-friendly Web content.
• A set of test results, provided by partners, for tests on the W3C site should help developers
of Web pages better understand what techniques are widely supported, and should also
inform the development of the previously mentioned training materials. The results will also help
user agent developers to find and rectify gaps in support for multilingual features in their
products.
Face to face meetings of partners to provide input to and review of practical work items.
Consortium
The consortium includes a range of participants from 15 countries who have not yet cooperated
in a joint effort of this scale. They come from areas that include standardisation, content development and
deployment, localisation, translation tools and machine translation, language technology
development, training, usability, social networking, browser development and digital libraries.
Partners come from industry, standards bodies, user and industry representative bodies, research
and academia.
Since the Web is at the centre of the project, the W3C, as the main organisation for Web
technology standardisation, is a natural leader for this effort. The W3C will provide its expertise
in organizing working groups and workshops as coordinator for the Thematic Network. It will
also provide logistic support in terms of mailing lists, conference management systems, etc., and
offers the use of its high-ranking and highly visible web site for dissemination of results, the project
home page, etc.
The participants will be expected to pool their knowledge of standards and best practices and help
arrive at recommendations for future work via the workshops and face-to-face meetings. They
will also provide program committee support for planning workshop agendas and choosing public
participants. In addition, all workshops and face-to-face meetings will be hosted by one of the
partners. Partners will also be called upon to help disseminate the information being
generated, to contribute advice for and review the practical work items involving
development work, and to contribute test results for an internationalization test suite.
Impact
• Contribution to the understanding of, and the relations between, standards in the area of the
multilingual web, and improved visibility and use of existing standards and best practices.
• A starting point for future, potentially long-term projects in the area of multilingual web
standardisation, best practices and tools development.
• As a means to create such projects, a strong relationship between the participants across
organisations, scientific disciplines and industrial application areas.
• Improved use of multilingual standards and best practices in the creation of pages using
(X)HTML and CSS by content developers.
• Improved support for multilingual features in Web user agents.
B1. PROJECT DESCRIPTION AND OBJECTIVES
B1.1. Project objectives
Project background
The World Wide Web is important to communication in all walks of life. As the share of
English Web pages decreases and the share of pages in other languages used in the European Union,
and around the world, increases, ensuring the multilingual viability of the
World Wide Web is paramount.
The European community must address this with respect to the needs of the citizens within its borders,
but also with respect to its trade and relations with external communities around the world.
One of the overall objectives of the ICT PSP is to accelerate the development of an ‘inclusive
information society’, allowing access to the Web for all kinds of citizens. One of the hurdles it cites is
the lack of interoperability of solutions across the Member States.
The overview of Theme 5: Multilingual Web, states that overcoming language barriers by enabling
cross-lingual access to Web resources suffers from a lack of language-friendly Web conventions.
Standards and best practices are a means of increasing interoperability and encouraging coherence
across advances in ICT. They also provide targets that push applications intended for creation, display
and management of content to consider the requirements for supporting multilingual use of the Web.
Responding to the concerns expressed above will involve bringing people together from the multiple
disciplines that feed into the multilingual Web to understand what best practices and standards exist
currently, and to look forward to what advances will further reduce the barriers to an inclusive
information society.
Although important standardisation work on establishing a base for multilingual deployment of the
Web has been and is being addressed by organizations such as the W3C and the IETF, for example
use of Unicode in Web technologies, roll-out of Internationalized Domain Names, development of
standardised language tags, etc., people producing multilingual content for the Web feel that there
remain a number of barriers to full multilingual roll-out of information and tools, and these need to be
addressed.
These barriers, in a range of areas, reduce the efficiency of, or prevent, the work of those attempting to
provide a truly multilingual Web experience. They affect the ability to produce, localize, manage and
share information and applications on the Web.
Standards and best practices enable interoperability of data, which in turn maximises the potential for
access to information, ensures longevity and usability of data, and improves the efficiency of
processes for producing, localizing and disseminating information.
Project objectives
These objectives address the stated aim of Objective 5.2: Multilingual Web content management:
standards and best practices:
“to promote Web standards, best practices and partnerships for multilingual Web content
management, in particular the authoring, versioning and maintenance of (parallel)
multilingual Web sites, portals or repositories.”
The purpose of this Thematic Network proposal is to establish both a cross-disciplinary network and a
forum for stakeholders drawn from different areas involved in the realisation of the multilingual Web,
in particular the authoring, versioning and maintenance of multilingual Web sites, portals or
repositories. The main objectives are:
1. To improve the transparency and usability of the World Wide Web across the European
Union and the rest of the world by encouraging and facilitating adoption of multilingual Web
standards and best practices.
2. By holding a number of workshops, to provide an opportunity for learning about and
discussion of what standards and best practices are currently available and what is currently
missing, and to do so by bringing together people from a range of fields involved in realising
the multilingual Web, including production of content, usability of hypertext, localization,
language tools, Web deployment and research.
Partners will bring together the concerns of publishers using large content management
systems, people working on archives, providers of tools for social networking, government
and official bodies, researchers and developers working on translation and language
technologies, markup specialists, browser developers, usability experts, industry bodies,
and more. The workshops will also be open to public participation.
3. To use consensus to formulate recommendations for work on standards and best practices that
can then be taken up by relevant organizations. Relevant organizations may include
workshop attendees or other organizations throughout the industry, including future projects
organized by the European Commission.
4. To raise the visibility among the public of available standards and best practices, and
encourage their use, and promote involvement in development of new standards.
5. To stimulate the exploration of issues, and of solutions to them, outside the framework of the project,
and to promote cross-functional collaborations in addressing the problems.
6. To establish new relationships and partnerships between people and organizations, to form a
cross-domain network of contacts interested in furthering the work of producing standards
and best practices for the multilingual Web.
7. To support the development of specific tools for improving the multilingual Web that will be
developed at the W3C. These include a validator (similar to the W3C’s widely used HTML
validator) that content authors can use to check pages for internationalization issues in
markup and style sheets, training materials on internationalization topics related to Web page
design, and results for tests related to internationalization features of current browsers. Work
on these practical initiatives is not funded by the Thematic Network project, but participants
in the Thematic Network will meet to provide input on their development and support for
their use.
B1.2. EU and national dimension
Recognizing the opportunities and challenges that Europe’s diversity provides, the 2008 Commission
Communication on multilingualism available at
http://ec.europa.eu/education/languages/pdf/com/2008_0566_en.pdf sets out aims both to capitalise on
the opportunities and to address the challenges of Europe’s multilingual heritage, in terms of enabling
greater multilingual skills among European citizens and supporting the use of regional and minority
languages in Europe.
The purpose of this Thematic Network is to explore and make recommendations for improving
standards and best practices related to the World Wide Web. The Web is today one of the most
significant technologies involved in enabling widespread use of regional and minority languages, the
understanding of others’ languages through conversion of one language to another, the coexistence of
multiple languages within a single technology or content, and opportunities for language-learning.
However, the Web is still young, and there are barriers to multilingual support that still need to be
addressed, and best practices that need to be followed more widely to maximise its multilingual
potential. In this respect, the proposed Thematic Network is central to the policies of the EU.
Section 7 of the Commission Communication refers specifically to translation and new technologies,
and to the need for language-aware technologies that promote content creation in multiple languages,
and the improvement of translation tools and processes. These are topics that will be given specific
attention during the work of this Thematic Network proposal.
The EU supports a number of projects related to multilingual or general language resources and the
consortium partners are associated with a number of highly relevant current projects, such as the
following:
The CNGL CSET project (www.cngl.ie) is a large research project funded by Science Foundation
Ireland (SFI). This research project integrates language technology, digital content management and
localisation research in a unique way. It has a research track focused on standards and guidelines for
global content creation and translation, which aligns well with the MultilingualWeb initiative.
Information sharing and potentially joint workshops between these two initiatives should benefit
both. Microsoft Ireland is a partner in CNGL.
CNGL has established some collaboration with the EuroMatrixPlus project for MT. That would open
a further potential collaboration point for the adoption of content standards, and for collaboration with MT
researchers to ensure that there is a good match and that MT systems are better able to handle Web
content. This would also be a topic for the CNGL project, which includes an MT research track.
The Rosetta Foundation was also established as a spin-off from the CNGL and the University of
Limerick in Ireland to provide a localisation technology platform and infrastructure that is accessible
and affordable. It is not-for-profit and is being supported by the CNGL, the LRC and industrial
partners. It will maintain and develop this platform and deploy it for not-for-profit organisations.
The EuroMatrixPlus Project is based on novel combinations of statistical techniques and linguistic
knowledge sources as well as hybrid machine translation architectures.
FLaReNet, CLARIN and LanguageGrid are examples of European (the first two) or worldwide (Asian)
initiatives that are mainly focused on the standardization and the use of language technologies from
an academic LT perspective. The MultilingualWeb project brings in perspectives from the significant
web actors with more practical and short-term requirements. The synergies result from trying to
accommodate the most advanced techniques in web technologies, the latest recommendations in web
document annotation and the newest NLP technologies able to process the huge amounts
of web data in good time.
In the FLaReNet Thematic Network, which is bringing together experts in the field of LRs and LTs,
standards are one of the main themes of investigation and a contact point between the two projects.
Discussion is ongoing about what standards and best practices are currently available and what is still
needed in the future.
The CLARIN project is an example of a community-specific project focusing on linguistic data from
a large number of languages. The proposed project will help to establish a common research
infrastructure in the social sciences and humanities, and lead to a better understanding of the
common needs for multilingual information in this domain.
The Japanese LanguageGrid initiative is an infrastructure, critically based on interoperability, aimed
at improving accessibility to multilingual content and providing language services to users, by
allowing new custom language services to be implemented via combination of existing ones.
Nicoletta Calzolari, of CNR-ILC, is the coordinator of the FLaReNet initiative and Dan Tufis, of ICIA, is
also a member of FLaReNet. Nicoletta Calzolari is also the Chair of the Scientific Board of the CLARIN
project, and Dan Tufis is its Vice-Chair. Both of them are
in the expert group of the LanguageGrid initiative. They would act as liaison between the
MultilingualWeb project and the respective initiatives for the benefit of all these projects.
The T4ME project is currently under negotiation and is due to start in early 2010. It will provide a
reference structure for tools, a clearinghouse for useful language resources, lexica, corpora for
building language applications, tools like parsers and taggers, and will also incorporate service
concepts which will harvest resources from the Web and factor them into publicly available
resources. We expect to collaborate closely with the T4ME partners, particularly around the main
issue of standards and interoperability.
The Europeana project is a large scale initiative providing online access to European cultural heritage
and knowledge. It is particularly relevant since its results are only available online, and pose the
challenge of multilingual access.
The LTC is involved in the MORMED and OrganiK projects, and will report on progress of these
projects during workshops. We will also need to track and liaise with activities on the MONNET and
COSYNE projects. SAP is involved with the MONNET project.
THESEUS is a German national project supported by the BMWi (Federal Ministry of Economics and
Technology), with a European and global dimension, looking at business models for the World Wide
Web and other internet based networks such as the "Internet of Things" and the "Internet of Services."
The European and global dimension mainly concerns language- and culture-related aspects of
the exchange and interchange of information and knowledge in automated use case scenarios that
must rely on innovative communication techniques, protocols and workflow capabilities as well
as process management abilities. Example areas are one-to-many cross-cultural translation
automation, and language data exchange that is beyond current approaches such as those employed by
the TAUS Data Association for translation memory content and terminology. Because some of the
Network partners have direct communication with THESEUS organisations, a smooth and fruitful
exchange is guaranteed.
Yet another source for the cultural exchange dimension is the MEDAR Specific Support Action under
the Seventh Framework Programme that deals with translation automation aspects between European
countries and the Mediterranean Arabic speaking countries.
UPM has applied for the project "Lingu@net World Wide" that has been selected for funding in
Action KA2 Languages Multilateral Projects 2009, Lifelong Learning Programme of the EC. UPM is
also responsible for the Technology Work-package in the project, which includes all the IT technical
work necessary to implement and integrate new languages and new resources in Lingu@net Europa.
Possible synergies between the MultilingualWeb network and Lingu@net World Wide include the
following:
• Support of translation tools for the creation of multilingual web sites. Advancing the
integration of translation tools using the XLIFF and TMX standards with CMSs would facilitate the
process of adding more languages to Lingu@net World Wide. At the same time, progress in that
integration could benefit from a practical case such as Lingu@net World Wide.
• Usability and accessibility of multilingual web sites. Understanding the implications for
usability and accessibility of adding more languages to a site, or of having pages that combine
different languages.
As a follow-on to previous multilingual web projects and contracts, UPM also has students working
on the development of open-source software as part of their master's thesis projects. For example, one of
these projects will release a web-based tool to facilitate the integration of translations in CMSs.
B2. IMPACT
B2.1. Expected impact and outcomes
The project Web site will be the main host for the information arising from the project.
The W3C site is central to the Web, and carries substantial credibility as a source of information. The
workshops will be announced on the W3C web site, which will guarantee high visibility. This will be
instrumental in increasing the impact of the project on those involved in engineering multilingual
Web solutions. In addition, although the project is focused on the European region, the W3C is
global in scope, and has the ability to relay the information generated by this project to people all
around the world.
The project will gather together experts who are involved in many different aspects of multilingual
web content development to discuss and share experiences with existing standards and best
practices. The knowledge that is pooled in the workshops will be reported in documents that will be
made available publicly and will be well publicised in a number of different ways. This will lead to
greater awareness of available standards and best practices, which should in turn lead to broader and
faster adoption of them.
One of the assets of this Thematic Network is the diversity of its partners. This means that
its impact will extend across a wide range of stakeholders in the multilingual Web, and that its
outcomes will create synergies rather than silos.
The partners not only bring a wealth of experience to the table, but have the ability to influence and
spread the word among large numbers of people. Most are regularly involved in speaking at
conferences, participating in other workshops, and other means by which information is disseminated
throughout the industry. In addition, a number of other channels will be available to spread
information about the workshop outputs, as described in section 3.4.
The workshops should also point out gaps in the standards and best practices and focus attention
on where the industry should be moving in order to further improve efficiency and effectiveness for
the multilingual Web. These topics should also address the reduction of overheads and ways to increase
the ability to capitalise on the rich multilingual and multicultural contribution of online communities.
A number of partners represent organizations that have the ability to directly influence work on gaps
in the standards and best practices, and have at their heart a mission to improve the efficiency and
effectiveness of the creation and management of multilingual content. All partners are represented by
influential people in their respective fields who have the ability to effect change in the industry.
The breadth of experience on the partner list will also be a significant contribution to the success of
the project. Partners include organizations working with browser development, social networking
tools, localization tools, localization services, large content management systems, standards for Web
and multilingual technologies, language resources, computational linguistics, data mining and
archival, user and vendor forums, and more. Partner representatives are or have been associated with
many important groups and initiatives, including OASIS, DocBook, RELAX NG,
XSL, ITS, ISO/IEC JTC1/SC34, OSCAR, OAXAL, FlaReNet, ISO/TC 37/SC4WG4, ISO TDG12,
ELRA, Unicode, Glossasoft and XLIFF.
Furthermore, workshop discussions will be open to public attendees, not just the partners. This should
bring in a broader range of key industry players and points of view than the already diverse set
of partners who have volunteered to participate.
The partners and public participating in the workshops, and the recommendations arising from those
meetings, will address the creation and management of multilingual Web sites and content, the
minimisation of costs and inefficiencies associated with producing multilingual content, and ways of
capitalising on the rich multilingual and multicultural contribution of online communities. The
workshops and their deliverables will also significantly raise the visibility and adoption of standards
and best practices in these areas.
In addition to the benefits brought to the project by the expertise of the partners, the thematic network
should carry benefit to the activities that the partners are involved in that dovetail with the aims of the
network. We include here some of the synergies with industrial partners.
Companies like Facebook have made a great deal of investment in technology and process to help
web developers in creating multilingual web applications. Facebook believes that the
MultilingualWeb project will enable the web community at large to benefit from such technologies
and processes, especially when complemented by the collective knowledge of like companies.
Facebook is very active in promoting open source initiatives. For a list of contributions, see
developers.facebook.com/opensource.php.
Opera, a browser developer, is heavily involved in work with the W3C standards body and would look to
communicate work on the multilingual web to other groups within Opera that feed into specification
development within the W3C. Such collaboration would help to ensure that individual initiatives
are harmonised and do not conflict in any way.
Henny Swan's main remit at Opera is to work with developers raising awareness of web standards and
universal access. She contributes to the User Agent Accessibility Guidelines Working Group at the
W3C and works internally to ensure that Opera products are usable for people with disabilities. She
is also Co-Lead of the Web Standards Project (WaSP) Internationalisation Group (ILG). WaSP
actively promotes web standards globally, and the ILG chapter specifically promotes
internationalisation and localisation. A large part of the work being done currently is to localise
WaSP's InterAct web standards Curriculum for use in 21 countries. Opera has also developed a Web
Standards Curriculum for developers and designers to learn how to write better Web pages
(http://www.opera.com/company/education/curriculum/).
Lionbridge is also involved in other standards organizations such as OASIS, where it is a member
of the XLIFF technical committee, and believes that a liaison with XLIFF could facilitate the exchange
and management of translation.
Lionbridge is also currently working on the launch of its translation platform, as an SaaS commercial
offering. This will provide the opportunity to organizations (large or small) or independent translators
to benefit from the translation technology that Lionbridge has used to deliver its services. The
discussions about standards in the MultilingualWeb project will feed into that development.
TAUS has been enabling the sharing of language resources in a single repository, which will
stimulate innovation and automation in the translation industry. In pilot projects, TDA
members have already shown that translation leveraging and MT performance using larger shared
corpora of parallel text can increase by 30% to 50%. The MultilingualWeb network will give TAUS
access to more organizations that can benefit from the shared data repository and the experiences of
TAUS members.
TAUS is also starting a Tracker that will aggregate relevant data and experiences from all MT users.
This service will allow all participating organizations to quickly find specifications of MT systems,
use cases and best practices. TAUS is very open to collaborating with other parties in developing this
into a very valuable resource.
SAP believes that, due to its participants, the network will create a possibility for direct exchange
between many important players for the global web. Coverage reaches from the World Wide Web
Consortium as a central steward for core Web standards such as HTML, to the Localization Industry
Standards Association as a major umbrella organization related to localization and translation services
(with its own set of standards, some of which, for example Term Base eXchange, have found their
way to the International Organization for Standardization).
The network's main SAP contact is working in the context of standardization bodies such as the
World Wide Web Consortium, and the Organization for the Advancement of Structured Information
Systems (OASIS). This involvement should provide good opportunities to carry information in both
directions: to the network and from the network.
In addition to the workshops, the project will provide support for the development of practical work
items that will be developed by the W3C. These are largely, but not exclusively, focused on
encouraging the uptake of standards and best practices in the main backbone of Web content, HTML
and CSS. The impact of these practical work items will be to actively spread best practices and
guidelines related to existing Web standards throughout large numbers of content developers and
authors, and promote the uptake of best practices and standards in major browsers.
One such work item is an internationalization checker, which will be used by content developers all
over Europe and the world to check Web page markup for conformance to standards and best
practices. A tool of this kind can provide a very attractive, and therefore effective, way for content
developers to learn about standards and best practices.
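To give a sense of what such a check involves, the following minimal sketch tests a page for two basic internationalization features: a language declaration on the html element and a character encoding declaration. It is an illustration only and does not describe how the W3C checker is built; the class name, function name and warning texts are assumptions made for this example.

```python
from html.parser import HTMLParser


class I18nFacts(HTMLParser):
    """Collects two basic internationalization facts about an HTML page."""

    def __init__(self):
        super().__init__()
        self.html_lang = None  # value of the lang attribute on <html>, if any
        self.charset = None    # declared character encoding, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.html_lang = attrs["lang"]
        elif tag == "meta":
            if attrs.get("charset"):
                # Short form: <meta charset="utf-8">
                self.charset = attrs["charset"]
            elif (attrs.get("http-equiv") or "").lower() == "content-type":
                # Long form: <meta http-equiv="Content-Type"
                #                  content="text/html; charset=utf-8">
                content = (attrs.get("content") or "").lower()
                if "charset=" in content:
                    self.charset = content.split("charset=", 1)[1].strip()


def check_page(markup):
    """Return a list of warnings for the given HTML markup."""
    parser = I18nFacts()
    parser.feed(markup)
    parser.close()
    warnings = []
    if parser.html_lang is None:
        warnings.append("no language declared on the html element")
    if parser.charset is None:
        warnings.append("no character encoding declared")
    return warnings


if __name__ == "__main__":
    print(check_page("<html><head><title>Test</title></head></html>"))
```

A real checker would cover many more cases, such as HTTP headers, text direction and conflicting declarations, but even this small example shows how a tool can point authors to the relevant best practice at the moment they need it.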
Another item is a set of training materials, which will be freely available for educators to teach people
about basic standards and best practices related to designing for the multilingual Web. This is in
addition to the fact that methodologies from different fields of knowledge will be shared and
compared throughout the project.
A third work item relates to browser tests. These tests have already been influential in helping major
browser developers recognise and improve support for multilingual users. For example, W3C tests for
complex script support in Web fonts recently played a role in improving font-linking features in
several major browsers. This is just one example of how the tests have directly impacted multilingual
support in major browsers.
Various indicators will be used to measure the impact of the project. These are described in detail
later in the proposal. They include the following:
- Workshop audiences, in terms of quantity and quality
- The number of hits for the documents containing workshop outputs
- The number of hits for the internationalization checker
- The number of test results provided for internationalization-related browser tests
B2.2. Long term viability
Project partners include bodies and groups concerned with improving the multilingual industry
through the development of standards and best practices and dissemination of information about
them. These include, for example, people involved with W3C, TAUS, LISA, LRC and FlaReNet.
These players are well placed to foster adoption of and future work on the recommendations arising
out of the workshops.
The mailing lists described in the next section will continue to provide a forum for discussion for as
long as is needed after the end of the project.
The practical work items developed by the W3C with the input of the partners will be for the use of
the general public and will be developed under open source licences. The usefulness of these items in
assuring better support for the multilingual Web in content creation and browsers will have only just
begun by the end of the project, and will continue into the foreseeable future.
Reports and minutes will also be hosted and kept available on the project Web site, which, since it is
managed by the W3C, has a commitment to stable, long-term access by the public.
Several organizations involved in the network have the capability of taking the network further after
completion of this project. Towards the end of the project the partners will explore establishing some
continuing liaison/coordination mechanism.
Partners can also discuss whether they would like to continue the work under the umbrella of the
W3C as a special interest group within the Internationalization Activity.
B2.3. Availability of common results, consensus building, openness, sustainability
Shortly after each workshop, the minutes and recommendations of the workshops will be publicly
available from the project Web site, which enjoys a prestigious position with regard to site ranking
and attracts a large number of visitors: in 2008, www.w3.org saw a daily average of 10.1 million page
views. The minutes and recommendations will additionally be publicised via the sites of other partners,
via a Twitter channel that will be set up to report news and events in real time, and via other means
such as blog postings, tagged workshop photos on Flickr, and other social networking channels. These
factors will assist in raising the visibility of the discussions that take place far above and beyond the
confines of the project partner organizations and the timeframe of the project itself.
The ongoing milestones in the development of the project will be announced on the W3C
Internationalization Activity home page (http://www.w3.org/International/) and will be available via
its related RSS feeds.
Workshop discussions will be supplemented by a minimum of three publicly-archived mailing lists,
hosted and maintained by the W3C. One mailing list will be for coordination of workshop logistics,
admin-related activities, etc. and will be mainly for partners. Another mailing list will be for general
communication on any of the topics addressed during the project, and will be a publicly viewable list
that the general public can also subscribe to. A third mailing list will support discussion of the
practical work items. These mailing lists will be available for use throughout the project, and beyond.
The W3C has a great deal of experience in hosting community mailing lists of this type, and effective
methods for maintaining them, granting subscriber access, spam control, etc.
There will also be a publicly accessible project home page for coordinating work and pointing to
workshop and face-to-face outputs, ongoing status of work, etc. This will be on the W3C site.
A wiki, also hosted by the W3C, can be set up whenever it is needed to support collaborative working.
In addition to this, the workshops themselves will be open to participation by non-partners, and we
will specifically target other key players in the industry by invitations to ensure that they are also
involved in the discussion process.
Further dissemination of information by the partners will be strongly encouraged through conference
presentations, publications and specialised training courses.
Findings will also be disseminated via conference channels such as Unicode, LISA, Localization
World, and other major conferences.
The practical work items will be hosted on the W3C site, and will be for the use of the general public.
The work items do not constitute funded deliverables of the project, but their development will be
strongly influenced by the partners, and their contributions under the umbrella of the thematic
network will be recognised. These items will be developed under open source licences and are
resources that will be available to the general public. Partners will be expected to contribute support
in the expectation that their ideas will be freely available to the public, once incorporated into the
work items.
B3. IMPLEMENTATION
B3.1. Consortium and key personnel
Partners represent a wide spread of stakeholders in the multilingual Web and much of the value of the
project proposal lies in the quality of the partners represented. Public participation in the workshops
will further expand and strengthen this wide view of the issues. The fact that partners are drawn from
such a range of fields, together with the synergies and contacts that the meetings will produce,
constitutes a valuable aspect of the project.
The 20 funded partners and 2 non-funded partners represent entities from 15 different European
countries.
The following table summarises the specialities of the partners. Detailed information and personnel
information is given in the remainder of this section, after the table.
Partner | Area of activity
W3C/ERCIM | World Wide Web Standards
Bioloom Group | Semantic technologies, Machine learning, Language technology
Consiglio Nazionale delle Ricerche (CNR-ILC) | Language resources and technologies, FlaReNet
Facebook Ireland | Social networking, Localization
University of Applied Sciences (UAS) Potsdam | Information sciences, Digital libraries
Institut Josef Stefan | Machine learning, Data mining
Institutul de Cercetari Pentru Inteligenta Artificiala (ICIA) | Computational linguistics, Multilingual web services
Language Technology Centre | Translation tools, Language services
Lionbridge Belgium | Language services, Translation tools
Microsoft Ireland Research | Localization, localization tools, Browser development
Opera Software | Browser development, Localization
SAP | Translation services, Content engineering
Translation Automation User Society (TAUS) | Translation technologies and services user community
(AALTO) | Machine translation, Text mining, Machine learning
Universidad de Oviedo (ILTO) | Multilingual web technologies, multilingualism, usability, translation research
Universidad Politécnica de Madrid (UPM) | Multilingualism, Accessibility, Usability
University of Limerick (LRC) | Localization research and education
Vysoka skola Ekonomicka v Praze (UoE) | Markup, Electronic publishing Systems
Transware Ltd (WeLocalize) | Language services, Globalization consulting
XML-INTL Ltd. | Open standards-based translation technology
Localization Industry Standards Association (LISA) (non-funded partner) | Localization industry association, Standards
European Commission's Directorate-General for Translation (non-funded partner) | Institutional translation services
The table also records, for each partner, participation under the following headings: Program committee, Workshop attendee, Workshop host, Practical work items.
The roles for all partners are largely the same. All, except LISA, are expected to attend and
participate in the workshops and face-to-face meetings that constitute the core of the project. All are
also expected to participate in program committee work prior to each workshop: helping to create
workshop agendas, proposing themes and topics for discussion, reviewing position papers for
selection of presenters at the workshop, and reviewing the workshop deliverables.
Note that the workshops will be open to public participation, so the discussions will not be solely
driven by the partners. Members of the public who wish to participate in and present ideas at the
workshop will be asked to submit a brief position statement. The program committee will select
proposed public attendees on the basis of these position statements.
Several partners have also expressed a desire to provide facilities for the workshops. This being a
project related to multilingualism, it will be an advantage to hold meetings in diverse locations around
Europe as it will increase exposure of attendees to a variety of local people, cultures and issues, and
should also make it easier for local people to attend.
Partners will be encouraged to discuss or work on specific topics in more detail between workshops
by drawing together groups of interested individuals into task forces. The project mailing lists can be
used to support the task force discussions, and the W3C can make available teleconference facilities,
IRC channels, and such where needed. The use of a Joomla-based web site provides opportunities for
partners in such task forces to publish information to the site relevant to their discussions, and the
W3C can also provide wikis to support such collaborations where needed.
Partners will also be expected to review and provide feedback on the associated ‘practical work
items’. This involves participation in face-to-face meetings, but will also involve discussions on a
mailing list. It is also expected that some partners will submit test results to support the knowledge
provided on the W3C site about the behaviour of major browsers vis-a-vis multilingual support.
Partners should also look for and champion opportunities to provide training using the materials
developed by the W3C with the input of the partners.
The W3C will provide resources and know-how to develop the internationalization checker tool, put
together the training curriculum and compile the results for the internationalization tests. These
deliverables build upon work already done at the W3C.
As mentioned previously, the W3C is able to provide support in terms of mailing lists, wikis,
teleconferencing, etc. It also has a lot of experience in conducting successful workshops and in
managing working groups.
Funded partners
1: ERCIM/W3C The coordinator for the MultilingualWeb Network is ERCIM, which is the
parent organisation for the European part of the W3C (World Wide Web Consortium). The W3C is an
organization of currently over 400 members worldwide from research and industry, headed by the
Web's inventor, Sir Tim Berners-Lee. It has considerable experience in hosting public workshops and
bringing together diverse constituents to formulate common plans. The project will be supported out
of the Internationalization Activity of the W3C, which has been involved since 1998 in producing
specifications, guidelines, and best practices, in doing internationalization education and outreach,
and in reviewing new Web technologies for internationalization issues. The W3C will be looking for
opportunities to develop standards or guidelines for Web internationalization from ideas that are
discussed during the project. It will also commit to development of practical work items associated
with the project.
Richard Ishida is the Internationalization Activity Lead at the W3C. He has worked with a
wide range of W3C technical Working Groups dealing with internationalisation issues in
emerging Web technologies, such as HTML, XHTML, CSS, SVG, SSML, XML, WAI, etc.
He also added guidelines, education and outreach to the work of the Internationalisation
Activity. He previously worked for Xerox and has a background in translation and
interpreting, translation tools, and internationalization consultancy.
Céline Bitoune is ERCIM Projects group financial coordinator. She has written two financial
guides to European Commission financial rules for participation in funded projects
(Frameworks 6 and 7), widely distributed within and beyond the ERCIM consortium. She
has also recently taken over the coordination of 2 World Wide Web Consortium (W3C)
projects: WAI-AGE and MobiWeb2.0 (CSA).
2: Bioloom Group was founded in 2001 as an independent ICT consulting organisation with a
focus on business intelligence and business process management in strong interaction with semantic
technologies, machine learning and advanced multilingual language technologies to deliver intelligent
solutions for global transcultural communication. The bioloom way also includes information bionics
where we learn from nature's information handling and processing, and investigate how these findings
can be effectively transformed into efficient technical and computational solutions. Bioloom Group's
customer base ranges from traditional ICT enterprises to organisations in the fields of life sciences
and bioinformatics. Within the MultilingualWeb network Bioloom Group will contribute and share
with the community its knowledge on transcultural communication and computational intelligence,
and work collectively on new designs and solutions for the evolution of a transcultural, multilingual
web generation.
Jörg Schütz is a computer scientist and IT philosopher with over 30 years of business
experience in different ICT fields ranging from databases through language technologies to
virtualization. He has consulted and lectured world-wide, is a member of several scientific
and industry associations, and advises the European Commission as an expert, evaluator and
reviewer. He was also active in standardisation bodies, and working groups on data and
knowledge exchange formats such as ITS, OLIF, SALT and TMF. Jörg has studied computer
science, mathematics and medicine, holds a PhD in AI and Machine Translation, and
received an Honorary Professorship for Machine Translation and for Information Sciences
from the University of the Saarland in Saarbrücken, Germany. In 2001, he founded the
Bioloom Group, an organization that consults on and develops solutions for the next
generation of intelligent computing.
3: CNR-ILC (I) - Istituto di Linguistica Computazionale “Antonio Zampolli” del CNR has a
recognised international and national leadership in the area of Language Resources and Technologies
under technological, scientific, political and strategic aspects. Its mission is to improve and foster
language technologies through new methods and techniques for managing digital content and
understanding human language. It promotes basic research in areas where the need of significant
innovations emerges, fostering the synergies among different disciplinary competences, and ensuring
synergies between basic and applied research. It is the promoter of innovative objectives in the
international community, by fostering new “paradigms” in the field. In synergy with European
initiatives, it is the pioneer of the vision of an open and distributed infrastructure of resources and
tools, to be integrated in various services and systems, and has launched the notion of Language
Resources as the central component of the linguistic infrastructure.
CNR-ILC has coordinated many EC projects, some in the standardisation area such as EAGLES and ISLE,
and is represented in the most influential international and national Committees, Boards, and
Associations. Currently, CNR-ILC is the coordinator of the EC eContentplus Thematic Network
“Fostering Language Resources NETwork” (FLaReNet), and is part of the ESFRI CLARIN and
KYOTO FP7 projects.
CNR-ILC will act in synergy with the major international initiatives for standards and
interoperability, such as those of various ISO Committees it is leading or is a member of.
Nicoletta Calzolari Zamorani. Director of research at CNR-ILC. Honorary PhD
(Copenhagen University). She has promoted internationally the fields of Language Resources
and Standardisation and has a long experience in the coordination of many international,
European and national projects and strategic initiatives, currently coordinating the FLaReNet
EC Thematic Network. Member of ICCL, Chair of the Scientific Board of CLARIN,
Convener of the ISO/TC 37/SC4WG4 and of ISO TDG12, member of ACL Exec, past Vice-President of the ELRA Board, chair of the ELRA PCom, founding member of the Italian
Forum for HLT, member of many International Committees and Advisory, Executive or
Editorial Boards (among which ELSNET, SENSEVAL, ECOR, SIGLEX, WRITE).
Conference Chair of LREC since 2004. General Conference Chair for COLING-ACL'2006.
Chief co-editor of the International Journal Language Resources and Evaluation (Springer).
Monica Monachini Senior Researcher at CNR-ILC. Field of expertise: computational
lexicology and lexicography, lexical semantics, methods and models for the developments of
lexicons and lexical architectures. Active in many standardisation activities for harmonising
information in lexica and corpora. Involved in international projects for Language
Engineering and standardisation initiatives. Member of the EAGLES, SIMPLE and ISLE
Working Groups. Technical lead for the Pisa team in the eContent INTERA and
LIRICS projects. Currently involved in the FLaReNet Thematic Network and FP7 KYOTO
project. UNI delegate for ISO/TC37/SC4; ISO TDG13 convenor.
Claudia Soria, PhD. Researcher at CNR-ILC. Field of expertise: computational lexicology
and lexicography, lexical semantics, methods and models for the developments of lexicons
and lexical architectures, representation of language resources, metadata. Active in many
standardisation activities for harmonising information in lexica and corpora. Involved in
international projects for Language Engineering and standardisation initiatives. Member of
the ISLE Working Groups. Involved in European projects INTERA, LIRICS, CLARIN,
FLaReNet, KYOTO. UNI delegate for ISO/TC37/SC4: WG4, TDG7 and TDG2 expert.
4: Facebook Ireland, founded in February 2004, is a social utility that helps people
communicate more efficiently with their friends, family and coworkers. The company develops
technologies that facilitate the sharing of information through the social graph, the digital mapping of
people's real-world social connections. Anyone can sign up for Facebook and interact with the people
they know in a trusted environment. Facebook is a part of millions of people’s lives all around the
world. Facebook is in 57 languages, and that number is growing. Their ‘crowd-sourcing’ model for
localization will be of particular interest to the consortium.
Ghassan Haddad is the Director of Localization at Facebook, where he is responsible for
defining and implementing the company's globalization strategy, including its crowdsourcing model. Prior to Facebook, he was Director of Software Engineering and
Localization at PayPal where he was responsible for enabling PayPal as a payment solution in
almost two hundred countries, 30+ currencies, and 15+ languages. He has over twenty years
of experience in academia, language research & technology, management, and software
development. He was Assistant Professor at Iowa State University and has held several
middle and upper management positions at Intergraph, Berlitz, eTranslate, PayPal and now
Facebook. Ghassan holds a Ph.D. in Linguistics from the University of Illinois at Urbana-Champaign.
5: The University of Applied Sciences (UAS) Potsdam has a focus on
information sciences, which encompasses the areas of archivists, librarians and documentation
specialists. In all three areas, the Web has become a major part of education and research / industry
projects. In addition, the creation of digital libraries with multilingual content is a key work item for
the future of library and information sciences. The University will contribute to the proposed project
by bringing in the perspective of multilingual digital libraries and other cultural heritage institutions,
including: requirements and best practices for creating and accessing multilingual online library catalogues,
building bridges between monolingual library information systems (e.g. classifications, thesauri) with
(semantic) web technologies, and evaluating relations between standards in the library area vs. the
web area. The last topic will focus on standards about language identification, which currently are
applied in a heterogeneous way across library and web communities. The contribution in this area
will be complemented by implementations, based on existing work by the key personnel.
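To illustrate the heterogeneity mentioned above: library records commonly carry three-letter ISO 639-2 bibliographic codes, while Web content uses BCP 47 language tags, which use the two-letter code where one exists. The following sketch maps a few such codes. The table covers only a handful of languages, and the function name is hypothetical; it is not drawn from the existing work of the key personnel.

```python
# ISO 639-2 bibliographic codes and the matching BCP 47 primary language subtags.
# Only a small illustrative subset is listed here.
BIBLIOGRAPHIC_TO_WEB = {
    "eng": "en",  # English
    "fre": "fr",  # French
    "ger": "de",  # German
    "spa": "es",  # Spanish
    "ita": "it",  # Italian
    "dut": "nl",  # Dutch
    "cze": "cs",  # Czech
}


def to_web_language_tag(library_code):
    """Return the Web language subtag for a bibliographic code, or None if unknown."""
    return BIBLIOGRAPHIC_TO_WEB.get(library_code.strip().lower())


if __name__ == "__main__":
    for code in ("ger", "FRE", "xxx"):
        print(code, "->", to_web_language_tag(code))
```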
Felix Sasaki has more than 10 years' experience in dealing with research and industry topics
related to the multilingual web. In his PhD and research work he examined the benefits of
web technologies for representing, exchanging and processing multilingual data, like
linguistic corpora as a basis for natural language processing. During his time in the
Internationalization Activity of W3C, he contributed to internationalization aspects of
emerging web technologies, and implemented several key internationalization technologies
himself. In his current position at University of Applied Sciences of Potsdam, he is building
bridges between the library and archives world and the web, with a focus on multilingual
applications.
6: Institut Josef Stefan (JSI) is the central research institution for natural sciences in
Slovenia. It consists of over 900 researchers within 25 departments working in the areas of
computer science, physics, and chemistry and biology. The Department of Knowledge
Technologies is one of the largest European research groups working in the areas of machine
learning and data mining. It has approx. 50 researchers covering different aspects of data
analysis with special emphasis on textual data, social networks/graphs, complex data
visualization, cross modal analysis, temporal (stream) data and in particular on scalability of
approaches and deployability of research results in real world environments. In recent
years the research has shifted towards semantic technologies, where the main goal is to combine
modern statistical data analytic techniques with more traditional logic based knowledge
representations and reasoning techniques. The department developed several software tools,
among others: Text-Garden suite of text mining tools, OntoGen system for ontology learning,
Document-Atlas for complex visualization.
Marko Grobelnik is an expert in the areas of analysis and knowledge discovery in large
complex databases. In particular, his areas of expertise comprise: Data Mining, Text
Mining, Semantic Technologies, Network Analysis, and Complex Data Visualization.
Marko collaborates with major European and US academic institutions and industries
such as Microsoft Research, New York Times, Accenture, Nature and British
Telecom. Marko is the author of several books in the area of machine learning, data
mining and semantic technologies, and of many scientific papers. He is also
W3C AC representative for JSI, CEO of the company Quintelligence and associated
with the company Cycorp Europe.
7: Institutul de Cercetari pentru Inteligenta Artificiala (ICIA) is one of the
centres of excellence in Romania with a high scientific reputation, built in its 15 years of existence by
an active presence in the international research community. It is also the internet service provider for
the institutes of the Romanian Academy (62 institutes). The results obtained at ICIA in POS tagging,
chunking, word alignment, parsing, word sense disambiguation, information retrieval, language
learning, question answering in open domains, wordnet development, ontology based language
processing have received international recognition, been reported in major conferences and been used by
various researchers all over the world. They developed a platform of multilingual web-services
(SOAP & UDDI & WSDL), widely used (for Romanian and English) by various partners in Europe
and USA.
Professor Dan Tufis is the director of the Research Institute for Artificial Intelligence of the
Romanian Academy and professor of Computational/Statistical Linguistics and Machine
Translation at the University "A.I. Cuza". He participated in more than 25 international
projects and authored more than 200 papers, published in peer-reviewed conferences and
journals.
Dr. Radu Ion is a senior researcher at ICIA and he leads the research in the area of question
answering for open domains. He developed several widely used NLP tools (tokenizer, POS
tagger, lemmatizer, word aligner, etc). He is also responsible for internet service provision by
our institute.
8: The Language Technology Centre (LTC) possesses more than 16 years of extensive
experience in evaluating, developing and implementing advanced language technology solutions.
LTC has an excellent international reputation as a language service provider, software house and
consultancy. LTC prides itself on its high profile multilingual service team that offers and coordinates a multitude of language services. Also, LTC offers consulting services in multilingual
process optimization, business information and workflow management systems. In addition, LTC has
a comprehensive range of computer-assisted language services including computer-assisted
translation, software and website localization. LTC provides these services to commercial
organisations as well as EU bodies.
LTC has been offering an array of ICT services around its products such as product customization,
Software as a Service, training and support since it began selling software. The use and promotion of
best practices for multilingual content management on the Web has been a
business driver for LTC for many years.
Dr. Adriane Rinsche, Managing Director of LTC, founded the company in 1992 and
continues to lead design for the company’s products, which include LTC Worx and LTC
Communicator. She designed the first and most mature business information system for the
language industry, known today as LTC Organiser. Rinsche has a PhD in Computational
Linguistics from Bonn University in Germany.
Philip McConnell, who has an MBA, an honours degree in Computer Science and over 25 years in
the software industry, is currently the company’s Head of Software Engineering, responsible
for the maintenance of numerous multilingual Web sites and portals.
9: Lionbridge Belgium draws on the expertise of over 4000 employees and 10,000 linguistic
resources, to cover over 100 languages and has a long history of managing large-scale translation and
language-related projects in areas as varied as IT, automotive, eLearning, software localisation, life
sciences, entertainment, telecommunications, aerospace, and public affairs. Lionbridge also
specialises in face-to-face, over-the-phone, and simultaneous interpretation as well as interpreter
training and testing. Finally, translating for the European institutions and other EU or national public
bodies and NGOs is another core business of our company.
Lionbridge has adopted a number of integrated tools (content management system, web-based
translation memory platform, etc.) that accelerate production and ensure consistency of global content
from authoring through translation and publication. Because i18n is a critical first step in producing a
globally acceptable software product, our goal is to educate our customers on how to develop global-ready applications that are ultimately easier and less expensive to localise.
Jeff Knorr is currently the Director of Internationalization Services for Lionbridge. He has
been involved in the globalization industry for over 16 years, with 10 of those years spent
leading the Lionbridge I18n Services team. Having a formal education and degree in Electrical
Engineering, Jeff began his globalization career in 1993 as an engineer for International
Language Engineering (ILE). Since that time he has been involved in many different areas of
localization and internationalization services.
Joachim Schurig is the translation memory architect and Senior Technical Director for
Logoport. He joined Lionbridge in February 2005 with Lionbridge's acquisition of Logoport.
Prior to Lionbridge, he was the managing director of Logoport Software in Berlin, Germany, a
company he founded in 2000 to develop Logoport and introduce Internet translation memory
services to the industry.
Eric Blassin is currently the vice president of language technology for Lionbridge. Eric has
20 years' experience in the localization and technology industries. At Lionbridge he has
managed several regional operations, worldwide technology and IT systems and, most
recently, the successful rapid deployment of Logoport. He began his career with Digital
Equipment Corp. as an engineer in the International Systems engineering group and was involved
in the internationalization architecture of several platforms.
Jim Compton, currently Worldwide Director of Technical Services Excellence, leads the
global technical services team and focuses on: empowering Lionbridge Operations to drive its
process and technology offering along a vector of constant improvement, developing
Lionbridge's culture of innovation and collaboration, and addressing emerging areas of need.
10: Microsoft Ireland Research. Founded in 1975, Microsoft (Nasdaq “MSFT”) is the
worldwide leader in software, services and solutions that help people and businesses realize their full
potential. The Microsoft European Development Center in Ireland (Microsoft Ireland Research), is
focused on engineering excellence. It conducts the full lifecycle of software development from
research and development, to engineering and localisation across many of Microsoft's different
business groups. EDC is involved in a wide array of different projects from working on the Windows
Media Centre, to Windows Live development, security, anti-virus research, and the localisation of
over one hundred products and services, from Microsoft Office to MSN and the Xbox 360, into over
30 languages. The teams at EDC collaborate with sister centres based in Denmark, India and China
and are part of the Microsoft Product Group R&D organisation.
Dag Schmidtke is a Senior International Project Engineer responsible for localization
language technology in the Microsoft Office International Product Group. He joined
Microsoft in 1991 and has extensive experience in software and content localization. He has
completed Masters degrees in linguistics and computational linguistics, and currently works
with web adoption and deployment of machine translation and other language services, as
well as Global English authoring guidelines and related tool support initiatives within
Microsoft.
11: Opera Software develops the Opera Web browser, a multi-lingual, high-quality, multiplatform product for a wide range of platforms, operating systems and embedded Internet products –
including Mac, PC and Linux computers, mobile phones and PDAs, game consoles, and other
devices like the Nintendo Wii, DS, Sony Mylo, and more.
Opera’s vision is to deliver the best Internet experience on any device. Opera’s key business objective
is to earn global leadership in the market for PC/desktops and embedded products. Opera’s main
business strategy is to provide a browser that operates across languages, devices, platforms and
operating systems, and can deliver a faster, more stable and flexible Internet experience than its
competitors.
Henny Swan, Web Evangelist, is also co-lead of the Web Standards Project International
Liaison Group and has a strong background in mobile and web accessibility. She also
contributes to the W3C User Agent Accessibility Guidelines working group. She blogs at
www.iheni.com, www.opera.com/developer and www.webstandards.org. She will be
contributing to this project together with Pal Eivind Jacobsen Nes who manages
internationalisation within Opera.
12: SAP AG (www.sap.com) delivers products and services that help accelerate business
innovation for its customers. Today, customers in more than 120 countries run SAP applications –
from distinct solutions addressing the needs of small businesses and midsize companies to suite
offerings for global organizations. SAP Language Services (SLS) is SAP’s central translation services
unit. It draws on a network of Localization Service Providers and various technologies to provide
services for over 30 languages.
Christian Lieske, one of SAP's Knowledge Architects, is a member of SLS. He works on
internationalization and translation approaches with a focus on content engineering and
process automation. He is a long time contributor to the OASIS XLIFF Technical Committee,
and co-editor of the W3C Recommendation "Internationalization Tag Set".
13: The Translation Automation User Society (TAUS) is a community of users
and providers of translation technologies and services. The ambition of the TAUS community is to
translate a manifold of content in an increasing number of languages through technology adoption,
service innovation and cross-industry collaboration. TAUS focuses on the whole spectrum of
authoring, translation and globalization processes and technologies. Simply by enabling organizations
to share relevant information, identify good practices, find technologies and experts, benchmark
processes and leverage their buying influence, TAUS helps companies to save management time,
avoid the risk of mistaken decisions, and save money. And by exchanging experiences and insights,
TAUS members cut back dramatically on the learning and implementation costs of new translation
technologies.
Jaap van der Meer is a language industry pioneer and visionary who was involved with
pioneering term extraction and translation memory software and standards initiatives for the
localization industry in the 80s. He inspired and funded the founding meetings of the LISA
organization. He was President and CEO of ALPNET from 1997 till 2004. In 2005, he
established the Translation Automation User Society (TAUS), along with founding members
from most of the leading IT companies.
14: AALTO-KORKEAKOULUSAATIO (the merger of three Finnish universities: The
Helsinki School of Economics, Helsinki University of Technology and The University of Art and
Design Helsinki). Department of Information and Computer Science (ICS) is the leading computer
science department in Finland with 8 professors and about 100 staff altogether. The department has
been selected for 1994–1999, 2000–2005 and 2006–2011 as one of the Finnish national Centres of
Excellence in Research. The department is an active member in the PASCAL and PASCAL2 networks
of excellence under the EC 6th and 7th framework programmes. The Computational Cognitive
Systems group at AALTO/ICS conducts research on artificial systems that combine perception,
action, reasoning, learning and communication. A specific focus is on adaptive language technology.
Central research themes of the group include adaptive machine translation, modeling of conceptual
spaces, automatic and language independent extraction of terminologies, and learning ontologies from
examples.
Dr. Timo Honkela, Chief Research Scientist, is the head of the Computational Cognitive
Systems research group at AALTO. In the 1980s, Honkela was responsible for semantic
processing in the Kielikone project that was developing a large-scale natural language
interface for Finnish. He served at VTT Information Technology as a project manager in
the Glossasoft project (EU Telematics, 1993-95), which developed methods and tools for dealing
with internationalisation and localisation in software development. Recently, Honkela has
been the responsible director of the EU-funded MedIEQ project, which developed quality labeling
methods for web-based medical text resources using semantic technologies. He had an
initiating role in the development of the Websom method for visual information retrieval and
text mining. Honkela has served as a professor both at AALTO and at the UIAH Media Lab.
He has published approximately one hundred scientific articles in the areas of language
technology, knowledge engineering, statistical machine learning and cognitive modeling. He
is a former chairman of the Finnish Artificial Intelligence Society. Honkela is currently the
chair of the IFIP working group on knowledge representation and reasoning (WG 12.1) and a
member of the executive board of European Neural Network Society.
15: University of Oviedo (ILTO) has international expertise in two fields relevant for the
thematic network: web technologies and languages and translation. As regards the first area of
knowledge, the Master in Web Engineering provides specific training on web architecture, design and
standards to produce usable and accessible websites. Besides, some researchers from the field of
Philology and Translation Studies have been studying the translation of websites, the new textual
model of hypertexts and the implications of the choice of languages and other cultural aspects in
websites. Recently a new research group, ILTO (Internationalization, Localization, Translation,
Oviedo) has been formed in the University of Oviedo from the convergence of researchers and
lecturers from these disciplines, among which are the participants in the thematic network, Cristina
Valdés, José Emilio Labra, César Acebal and Alberto Fernández Costales.
Cristina Valdés is a lecturer in the University of Oviedo, where she teaches British Culture,
Translation and English Language. She has completed a PhD on advertising translation and
has developed her research towards website translation and communication. Now she
coordinates the research group ILTO (Internationalization, Localization, Translation, Oviedo)
in the University of Oviedo. She has experience in international training and research
programmes on translation and inter-cultural communication.
16: Universidad Politécnica de Madrid (UPM) is the oldest and largest of the Spanish
Technical Universities. UPM currently has more than 3,000 faculty members, around 38,000
undergraduate students, and 6,000 postgraduate students. UPM’s Schools cover most Engineering
disciplines, including Telecommunications and Computer Science. UPM has a strong commitment to
R&D and Innovation. It ranks first among Spanish universities in European Union R&D funding,
having around 15% of the total number of European Union funded projects. The contribution of the
university to knowledge creation through its scientific publications is also very relevant.
The UPM members engaged in this project are willing to share their experience in multilingualism,
accessibility and usability in web design and development, acquired through projects such as
Lingu@net Europa (www.linguanet-europa.org). They are committed to proposing and discussing
themes for the workshops, reviewing and contributing position papers and attending the workshops,
with the aim to find solutions for Multilingual CMSs based on the application of standards and best
practices. They are also volunteering to host one of the meetings or workshops.
Encarna Pastor is a full professor specialising in the field of networking, multimedia
applications, Internet technologies, multilingualism, accessibility and usability in web design
and collaborative work environments. She has undertaken research projects in these fields
and acquired an extensive research and development experience through her involvement in a
number of European Commission-funded projects as well as nationally funded projects in
collaboration with research and industrial organizations.
Luis Bellido is an associate professor specialising in computer networking, multimedia
applications, and Internet technologies. In particular, he has extensive experience in
multilingualism, accessibility and usability in web design and development acquired through
his participation in a number of European Commission-funded projects as well as nationally funded projects in collaboration with research and industrial organizations.
17: The Localisation Research Centre (LRC) is a research centre of the University of
Limerick, established in 1995 at the Department of Computer Science and Information
Systems. It is the focal point and the research and educational centre for the localisation
community worldwide. The LRC works with worldwide digital publishers and their partners
who are interested in future technologies and processes for globalisation, internationalisation,
localisation and translation (GILT). The LRC provides relevant, well-researched, content-rich
information on future trends and technologies. The Centre is a unique industry and academic
collaboration which provides an unparalleled network of expertise. As part of its activities,
the LRC maintains a strong online presence (www.localisation.ie) and publishes the only
dedicated, peer-reviewed and scientifically indexed localisation journal Localisation Focus –
The International Journal of Localisation (since 1996). The LRC also organises an annual
conference (since 1996) and a summer school (since 2001). Internationally, the LRC
collaborates with the W3C, OASIS and industry associations and support structures such as
LISA and Localization World.
Reinhard Schäler has been involved in the localisation industry in a variety of roles since
1987. He is the founder and editor of Localisation Focus - The International Journal of
Localisation, a founding editor of the Journal of Specialised Translation (JosTrans), a former
member of the editorial board of Multilingual Computing (Oct 97 to Jan 07, covering 70
issues), a founder and CEO of The Institute of Localisation Professionals (TILP), and a
member of OASIS. He has published more than 50 articles, book chapters and conference
papers on language technologies and localisation. He has been an invited speaker at EU and
international government-organised conferences in Africa, the Middle East, South America
and Asia. He is a Principal Investigator in the Centre for Next Generation Localisation
(CNGL), a lecturer at the Department of Computer Science and Information Systems (CSIS),
University of Limerick, and the founder and director of the Localisation Research Centre
(LRC) at UL, established in 1995.
18: University of Economics, Prague - Department of Information and Knowledge
Engineering specializes in teaching and research in the area of knowledge representation and
processing, data mining, AI, linguistics and Web technologies.
Jiří Kosek will represent the university on this project. He has more than 10 years' experience
in providing XML consultancy and training in the Czech Republic and worldwide. His
special focus is on documentation and electronic publishing systems. Jiří is an active member
in several standardization bodies, including OASIS (DocBook TC and RELAX NG TC),
W3C (XSL WG and ITS WG) and ISO/IEC JTC1/SC34.
19: Transware Ltd (WeLocalize) was founded in 1997, and is a privately held, venture
backed company. In order to increase in-house translation capacity, the company acquired Bits
Translations in Germany in 2000. A merger with GSSI of Portland, OR in 2002 provided expanded
geographic coverage, and enhanced both the depth and breadth of services offered. In December of
2005 Welocalize completed the acquisition of Connect Global Solutions in Dublin, Ireland. In 2006
Welocalize made 2 further acquisitions with M2 in the US and Transco in China. In 2007 Welocalize
acquired TechIndex (Japan) and Localize (USA). Finally, in 2008 Welocalize acquired Ireland-based
Transware for the strength of their marketing based clients and translation management workflow
technology (GlobalSight).
Welocalize combines specialised language services, globalization consulting, internationalization and
testing solutions with a well-organized localization methodology required to translate software, e-business applications, web sites, documentation, multimedia, eLearning services, mobile applications
and other electronic content and business systems into foreign languages. Welocalize specialises in
the Enterprise Applications, eLearning, Life Sciences, Media and Telecommunications industries.
David Clarke currently works with Welocalize, Ireland as a Technical Manager. David has
twelve years' experience in the localisation industry, having held engineering, senior
engineering & resource management positions with Language Management International,
BiTS Übersetzungen (later acquired by Welocalize), Bowne Global Solutions and Connect
Global Solutions (acquired by Welocalize in 2005).
Current responsibilities include technical ownership of enterprise-level client programs,
coordination of engineering and publishing production through specialised production
business units, localisation workflow planning, training and technical sales-support.
20: XML-INTL is an organization dedicated to providing leading scalable Web 2.0 Computer
Assisted Translation tools based on Open Standards and an Open Architecture in order to reduce the
cost of translation and improve interoperability with other systems.
Andrzej Zydroń, CTO of XML-INTL, has sat on numerous OSCAR and OASIS Open
Standard Technical Committees. He worked on the GMX-V (Global Information Management
Metrics eXchange) standard, as well as xml:tm (XML based text memory) and heads up the
new OASIS OAXAL (Open Architecture for XML Authoring and Localization) reference
architecture technical committee. He has worked in IT since 1976 at Xerox, SDL, Oxford
University Press, Ford of Europe and DocZone in the fields of content management systems,
document imaging, terminology systems and localization.
Non-funded partners
The following partners are not eligible to participate as funded partners, but they are keen to be
involved, and should bring valuable knowledge and experience to the project.
The European Commission's Directorate-General for Translation is one of
the largest translation services in the world. Located in Brussels and Luxembourg, it has a permanent
staff of some 1,750 linguists and 600 support staff, and also uses freelance translators all over the
world. Known as the DGT after its English initials, the service translates written text into and out of
all the EU's official languages, exclusively for the European Commission.
Spyridon Pilos and Manuel Tomas Carrasco Benitez will represent the "Language
applications sector of the informatics unit".
The Localization Industry Standards Association (LISA) represents the concerns
of the globalization industry: companies involved in the adaptation of business activities, products,
and services for multiple markets. LISA develops open standards for the representation of linguistic
assets used in globalization and the related processes of localization and internationalization. LISA
will represent the concerns of commercial and governmental bodies active in globalization in the
development of the CE Thematic Network and will help publicize the results. Since they do not
receive funding for travel, they may not participate in the workshops in person, but they will
contribute to the program committee work.
Arle Lommel is director of standards for LISA. He has worked for LISA since 1998 and
holds degrees in linguistics and folklore studies. He has led the standards development effort
at LISA since 2003 and has also worked on business data gathering and analysis for LISA.
B3.2a. Chosen approach
The project will last for two years. It will bring together stakeholders from a range of affected areas to
discuss where we are currently in terms of providing for a truly multilingual Web, and where we need
to concentrate effort as we move forward.
i) Workshops
The strategic objectives will be achieved with the aid of four workshops, led by the partners but open
to public participation. Minutes and recommendations from the workshops will be available to the
public.
The first workshop will range widely over the issues of the multilingual Web, to establish a context
and a general picture of the landscape. Partners will share their experience with standards and best
practices, describe initiatives in which they are currently involved and establish initial proposals for
topics to be treated during subsequent workshops.
Note that it will be possible to continue discussions begun in the workshops using the mailing lists
that will be set up for the project.
The subsequent workshops will focus on standards and best practices related to more specific areas,
and the areas where further standardisation or best practices are needed. The second and third
workshops will focus on authoring and translation, respectively. The topic of the final workshop will
be decided during the project. This enables us to focus on issues that become clear after the start of
the project, and enables the group to decide which issues are of highest priority out of the many that
could be discussed.
The coordinator will produce an on-line document summarising the outcome of the workshops,
accompanied by the workshop minutes. These will be available to the public from the W3C site, and
will be announced and brought to the attention of the public in a number of ways, from news items on
sites to Twitter.
There will also be two face-to-face meetings organized for partners, during which we will hold
general project review discussions and discussions related to the practical work items described
below.
As Network Coordinator, the W3C has a substantial amount of experience in the running and
facilitation of successful workshops and working groups. In addition, it provides world-class tools to
support collaborative work, including, for example, IRC channels with command-bots for meeting
management, minute-taking, etc., that can be used for face-to-face and teleconference meetings. The
W3C also provides support for management of publicly-archived mailing lists, wikis and other
collaborative tools as part of its normal business.
ii) Associated ‘practical work items’
Although one of the major goals of the project is to establish a network of stakeholders, it will also
provide an opportunity for partners to actively support the development, in ways described below, of
some practical initiatives that will be carried out in parallel at the W3C. These initiatives are aimed at
supporting the development of the multilingual Web by providing an online internationalization
validator, training materials related to web internationalization, and a set of test results for
internationalization-related features of major browsers.
The actual work on those initiatives is not funded by the Thematic Network; however they provide a
channel for the network to extend its influence in an additional way. These items provide the network
with an opportunity to contribute to a practical outcome in addition to the strategic discussions
involved in the core workshops. They also introduce an element to the network which will extend
beyond the time constraints of the current project, since the items in question will continue to be
refined and used long after the project terminates.
The main vehicle for partners to contribute to the development of these items will be the face-to-face
meetings described in the list of work packages. These will provide opportunities to review, suggest
ideas for, and contribute feedback on the practical work items. Partners will also be able to contribute
test results and support the delivery of training. In addition to the face-to-face meetings, there will be
a dedicated mailing list to allow for discussion at any point during or after the project.
The following is a description of the practical work items:
[1] Internationalization checker tool: HTML authors are largely unaware of defects in their markup
or style sheets related to internationalization. There is a need for a tool that authors can run on content
to assess its international readiness. The tool should not only point out problem areas, but should also
point to advice on what to do if a problem is reported.
This will be an online tool similar to the W3C's HTML Validator and the MobileOK checker, but
aimed at checking web pages for internationalization issues. For example, it will report errors,
warnings and other advice to page authors on a range of topics that will include such things as
character encodings and declarations, language declarations, use of directional markup, non-normalized class or id names, byte order marks, navigational links, etc. Feedback on issues identified
will explain the issues in simple terms, and link to existing guidelines and best practices as well as
further reading, so that it acts as an educational tool, rather than simply listing errors and warnings.
Any member of the public will be able to submit any X/HTML or CSS file for checking by specifying
the URI. The tool may also address other technologies, such as SVG.
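The kind of check described above can be illustrated with a minimal sketch. It is not the W3C checker: the names `check_page` and `_DeclarationCollector` are hypothetical, the input is raw bytes rather than a URI, and only three of the listed topics are covered (byte order mark, character encoding declaration, language declaration). A real checker would also need to consider HTTP headers and many other cases.

```python
from html.parser import HTMLParser

UTF8_BOM = b"\xef\xbb\xbf"


class _DeclarationCollector(HTMLParser):
    """Hypothetical helper: records the declarations this sketch inspects."""

    def __init__(self):
        super().__init__()
        self.html_lang = None
        self.meta_charset = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            # Accept either the HTML or the XHTML form of the attribute.
            self.html_lang = attrs.get("lang") or attrs.get("xml:lang")
        elif tag == "meta":
            if "charset" in attrs:
                self.meta_charset = attrs["charset"]
            elif (attrs.get("http-equiv") or "").lower() == "content-type":
                content = (attrs.get("content") or "").lower()
                if "charset=" in content:
                    self.meta_charset = content.split("charset=", 1)[1].strip()


def check_page(raw: bytes):
    """Return a list of (level, message) findings for one page."""
    findings = []
    if raw.startswith(UTF8_BOM):
        findings.append(("info", "UTF-8 byte order mark found at start of file"))
        raw = raw[len(UTF8_BOM):]
    collector = _DeclarationCollector()
    # Assumption: decode as UTF-8 and replace undecodable bytes.
    collector.feed(raw.decode("utf-8", errors="replace"))
    if not collector.meta_charset:
        findings.append(("warning", "no character encoding declaration found in markup"))
    if not collector.html_lang:
        findings.append(("warning", "no language declaration on the html element"))
    return findings


if __name__ == "__main__":
    page = b"<html><head><title>t</title></head><body>x</body></html>"
    for level, message in check_page(page):
        print(f"{level}: {message}")
```

In a tool of the kind described, each finding would additionally carry a link to the relevant guideline or best practice, so that the report educates as well as lists problems.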
This work will leave behind a durable and widely useful legacy from the project work. The partners
will be involved in providing ideas for included features, and reviewing and testing the checker as it
is developed. (Some partners may wish to provide assistance in developing the tool. This would be
by agreement with the W3C.)
The tool will be available to the general public on the W3C site under open source licences (at a
minimum the W3C software licence), and promoted along with the W3C's similar tools. It is expected
that the tests will eventually also be integrated into the existing validators.
An initial version of the internationalization checker will be available for use by the public from an
early stage in the project and features and bug fixes will be added via regular updates throughout the
project.
[2] Internationalization training: The W3C will develop a training package for web content
developers that will be made available for delivery by and to the public at large. The partners will
assist in the development by providing suggestions for content and by providing review feedback at
face-to-face meetings. Discussions may also take place on the email lists. (Some partners may wish to
provide support for development of the training. This would be by agreement with the W3C.)
It is expected that project partners will also be able to help by providing facilities and organising
attendees for the delivery of a certain number of courses.
The training package will address such things as encoding & language declarations, composite
messages, dealing with text expansion, navigation, etc., an overview of Unicode and related concepts,
ITS, and so forth.
Training materials will be made available for download from the W3C site under open source licences
(at a minimum the W3C document licence).
[3] Internationalization test suite results gathering: Tests for internationalization features on Web
user agents are extremely useful for highlighting areas where user agent support can be improved and
alerting content authors to features that may not work interoperably. The W3C already has a number
of internationalization-related tests in various test suites, including tests on the i18n area of the site
(http://www.w3.org/International/tests/), and is developing more on an ongoing basis. Several of the
i18n tests have been picked up by browser development teams to internally promote new features and
test prior to deployment. Reporting the results of the tests not only significantly heightens the
visibility of these features among user agent developers, but also serves to educate and inform content
developers.
This work package will bring together partners to review tests and provide information on test
results for user agents on a range of platforms and devices. (Some partners may also wish to assist in
the development of the tests themselves. This would be by agreement with the W3C.) During the
face-to-face meetings, partners will also have the opportunity to suggest and discuss additional tests that
might be useful.
The test results will be made publicly available on the W3C site. The work begun in this way is
expected to continue after the end of the project, as further tests are developed. The first results
should be added early in the lifecycle of the project, and additional results added throughout the life
of the project.
iii) Risk analysis and contingency plan
Each risk is listed with its Evaluation & Description [1] (Provenance; Probability; Impact) and its Contingency Plan.

Risk: Lack of external participants in Workshop
Evaluation: external; medium; medium
Contingency plan: The call for papers will be issued 3 months ahead of the Workshop days, allowing for
frequent pre-checking of the registrations and a timely decision to change the date of the event or to
adapt it according to the audience (focused on interests of internal partners).

Risk: Lack of internal participants in Workshop
Evaluation: internal; low; high
Contingency plan: Before issuing the call for papers, a consultation between partners will be launched
(via email and conference call). Planning and interest within the consortium will thus be discussed at an
early stage, allowing the Network Coordinator to take the appropriate decisions to maximise attendance
among partners.

Risk: Withdrawal of a partner due to a lack of available resource
Evaluation: external; low; medium
Contingency plan: For the Workshop attendance the partners are expected to spend 10 days; thus it is
unlikely that they won’t be able to allocate the resource. The Network Coordinator will work with the
partner to try to find an alternative solution.

Risk: Withdrawal of a partner due to lack of interest
Evaluation: internal; low; medium
Contingency plan: The Network Coordinator has paid careful attention in the choice of the partners with
regard to their interest in the domain. At this stage none of the participants are at risk.

Risk: 2nd F2F (general discussions about practical work items): delay in the W3C work in producing the
deliverable expected to be reviewed at this stage
Evaluation: internal; low; medium
Contingency plan: It has already been agreed with the W3C that the work will be done. Work on the
internationalization checker will be weighted towards the early stages of the Thematic Network. This
should considerably reduce the risk of delay in providing the deliverable expected in M14, in which the
meeting is scheduled. Nevertheless, depending on the outcome of the 1st review, the Network
Coordinator will be able to postpone this meeting. If no dates can be found to do so, interactive
communication tools (such as IRC and wikis) and mailing lists could be used to hold the discussion and
provide the minimum required amount of input to the partners for the review of this work.

Risk: Failure to deliver practical work items that provide acceptable impact
Evaluation: internal; low; medium
Contingency plan: The development of practical work items is based on existing experience and
approaches at the W3C. These have produced highly successful initiatives in the past and the
expectation is that this experience will help significantly here. In addition, initial versions of the work
items will be produced as early as possible in the project and made available for public use, and then
incrementally improved during the course of the project. This will allow for early feedback, and
corrective action where necessary.

[1] Evaluation is expressed through three keywords characterising the provenance (internal vs
external), the probability (low, medium, high), and the impact level (low impact, medium impact,
high impact) respectively.
B3.2b. Work plan
i) GANTT CHART
The following Gantt chart shows the timing of tasks and deliverables related to the achievement of the project's goals.
[Gantt chart for MultilingualWeb: month-by-month timeline across the 1st and 2nd years, with markers at 6 m, 12 m, 18 m and final. Legend: Workshop, F2F Meetings, Deliverables. Chart rows:]
Workpackage 1: Overall Project Coordination (duration 24 months)
Task 1.1 Project Administrative and Financial Coordination
Task 1.2 Project Network Coordination
Workpackage 2: Workshop “The landscape of multilingual Web standards and best practices” (duration 6 months)
Task 2.1 Workshop (local organiser: UPM, location: Madrid, Spain)
Task 2.2 Dissemination
Task 2.3 Practical work items: 1st f2f meeting (local organiser: RACAI, location: Romania)
Workpackage 3: Workshop “Authoring the multilingual Web” (duration 6 months)
Task 3.1 Workshop (local organiser: CNR-ILC, location: Pisa, Italy)
Task 3.2 Dissemination
Task 3.3 Practical work items
Workpackage 4: Workshop “Translation tool support” (duration 6 months)
Task 4.1 Workshop (local organiser: ILTO, location: Oviedo, Spain)
Task 4.2 Dissemination
Task 4.3 Practical work items: 2nd f2f meeting (local organiser: LRC, location: Limerick, Ireland)
Workpackage 5: Final Workshop (duration 6 months)
Task 5.1 Workshop (local organiser: EC DGT, location: Luxembourg)
Task 5.2 Dissemination
Further rows: Members General Assembly; Project reviews; Periodic Payments
NB: reporting of task 2 activities under each WP is merged in the same deliverable as task 1 activities
ii) Performance Monitoring Table to show success indicators
Indicator No. | Objective/expected result | Indicator name | Expected progress, Year 1 | Expected progress, Year 2
1 | Number of attendees at workshop 1 | Wkshop 1 attendees | 80 |
2 | Number of attendees at workshop 2 | Wkshop 2 attendees | 40 |
3 | Number of attendees at workshop 3 | Wkshop 3 attendees | | 40
4 | Number of attendees at workshop 4 | Wkshop 4 attendees | | 80
5 | Number of hits on announcements of workshops on W3C I18n home page | I18n page announcements | 2000 | +2000 (=4000)
6 | Hit counts for project website | Website hits | 15,000 | +20k (=35,000)
7 | Views of workshop reports on project web site | Report views | 200 | +300 (=500)
8 | External citations/mentioning of project in papers, conferences, talks, etc. | Project mentions | 10 | +12 (=24)
9 | I18n checker in use | I18n checker | First version of checker produced for FTF discussions, and checker available for use by the public. | New version of checker produced.
10 | Subscribers to the public mailing list set up for discussion of standards and best practices. | Mailing list subscribers | 50 | +20 (=70)
11 | Number of tweets sent out on the W3C Twitter channel | W3C Tweets | 24 | +24 (=48)
Notes:
Items 1-4: These are target figures. The final numbers may be slightly above those stated.
Item 5: Unfortunately it is not possible to obtain figures for the number of views of announcements on the W3C
home page, so the indicators will use the number of hits on news items on the W3C Internationalization home
page (for which figures are available from the b2evolution logs). The expectation is that the number of hits on
the W3C home page will easily be more than double the total number of hits on the I18n home page.
Item 9: We will also track and report the number of hits on the i18n checker; but we cannot commit to list these
hits as an indicator since the development of practical items is undertaken outside the project.
Item 11: The W3C Twitter channel currently reaches around 5000 people, with interests ranging across the web
space. The proposal is to publish on average 2 posts per month on this channel related to the topics covered by the
MultilingualWeb project.
iii) Workplan Tables tabular descriptions (registered online using NEF)
These work plan tables (WT1 to WT6) specify the main elements of the work plan. They are also
registered online using NEF:
- WT1 Work package list
- WT2 Deliverables list
- WT3 Work Package Descriptions
- WT4 List of Milestones – not applicable
- WT5 List of tentative Reviews
- WT6 Summary effort table
WT 1: Work package list
List of work packages
WP Number | WP Title | Lead beneficiary number [1] | Person months [2] | Start month [3] | End month [4]
WP01 | Overall Project Coordination | 1 | 5.5 | 1 | 24
WP02 | The landscape of multilingual Web standards and best practices | 1 | 9 | 1 | 6
WP03 | Authoring the multilingual Web Project | 1 | 8.9 | 7 | 12
WP04 | Translation tool support | 1 | 9 | 13 | 18
WP05 | Final Workshop | 1 | 4.7 | 19 | 24
Total | | | 37.1 | 1 | 24
[1] Number of the beneficiary leading the work in this work package.
[2] The total number of person-months allocated to each work package.
[3] Relative start date for the work in the specific work packages, month 1 marking the start date of the project,
and all other start dates being relative to this start date.
[4] Relative end date, month 1 marking the start date of the project, and all end dates being relative to this start
date.
WT2: Deliverables list
NOTE for Thematic Networks: Thematic Network Proposals specify the effort in Person-Days. In the technical annex, however, effort is specified in Person Months.
For simplicity the calculation 1 Person month = 20 Person days can be used, e.g. 4 person
days would be entered in NEF as 0.2 person months.
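The conversion in the note can be written out as a short sketch. The function name `person_days_to_months` is hypothetical, and rounding to two decimals is an assumption based on how the figures appear in the tables of this annex.

```python
PERSON_DAYS_PER_MONTH = 20  # simplification stated in the note: 1 person month = 20 person days


def person_days_to_months(person_days: float) -> float:
    """Convert an effort in person days to person months.

    Assumption: results are rounded to two decimals, matching the
    precision used in the tables of this annex.
    """
    return round(person_days / PERSON_DAYS_PER_MONTH, 2)


if __name__ == "__main__":
    # The example from the note: 4 person days -> 0.2 person months.
    print(person_days_to_months(4))
```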
List of deliverables – to be submitted for review to EC
Deliverable Number | Deliverable Title | WP number | Lead beneficiary number | Estimated indicative Person months | Nature [1] | Dissemination level [2] | Delivery date [3]
D01.1 | Detailed overall management and bodies management, including the quality assurance plan | 01 | 1 | 0.30 | R | CO | 2
D01.2 | Report on internal and external communication tools | 01 | 1 | 0.30 | R | CO | 3
D02.1 | First Workshop conference (program, minutes and key findings) | 02 | 1 | 0.20 | R | PU | 6
D02.2 | Proceedings of First workshop | 02 | 1 | 0.20 | R | PU | 6
D02.3 | First F2F Reports (Minuted recommendations) | 02 | 1 | 0.20 | R | PU | 6
D01.3 | Annual Progress Report including a simplified Summary Financial Report | 01 | 1 | 0.20 | R | CO | 12
D03.1 | Second Workshop conference (program, minutes and key findings) | 03 | 1 | 0.20 | R | PU | 12
D03.2 | Proceedings of Second workshop | 03 | 1 | 0.20 | R | PU | 12
D03.3 | Practical work items: Report on First implementation of internationalization checker | 03 | 1 | 0.50 | R | PU | 12
D04.3 | Second F2F Reports (Minuted recommendations) | 04 | 1 | 0.20 | R | PU | 15
D04.1 | Third Workshop Conference (program, minutes and key findings) | 04 | 1 | 0.20 | R | PU | 18
D04.2 | Proceedings of Third workshop | 04 | 1 | 0.20 | R | PU | 18
D01.4 | Annual Progress Report including a simplified Summary Financial Report | 01 | 1 | 0.20 | R | CO | 24
D05.1 | Final Workshop conference (program, minutes and key findings) | 05 | 1 | 0.20 | R | PU | 24
D05.2 | Proceedings of Final workshop | 05 | 1 | 0.20 | R | PU | 24
Total | | | | 3.50 | | |
[1] R = Report, P = Prototype, D = Demonstrator, O = Other
[2] PU = Public
PP = Restricted to other programme participants (including the Commission Services)
RE = Restricted to a group specified by the consortium (including the Commission Services)
CO = Confidential, only for members of the consortium (including the Commission Services)
[3] Month in which the deliverables will be available. Month 1 marking the start date of the project, and all
delivery dates being relative to this start date.
WT3: Work package descriptions
Work package number: WP01
Work package title: Overall Project Coordination
Start month: 1
End month: 24
Lead beneficiary number: 1
Objectives
The first objective of this work package is to ensure the overall management, providing the legal,
financial and administrative management framework for the project. The second objective is dedicated
to the coordination of the Network activities, including a quality assurance process to ensure the
monitoring and assessment of the progress and results across the Network.
Description of work and role of partners
Legal, Financial and Administrative objectives:
• Act as the project interface with the European Commission and manage all administrative and
financial issues in full compliance with the work plan and the EC Grant Agreement;
• Identification and implementation of indicators to measure the progress of the Network towards its
objectives (Network activity and efficiency);
• Preparation and maintenance of consortium agreement;
• Coordinate the preparation of the Progress Report and organise the project reviews.
Networking Coordination objectives:
• Supervision, support and assessment of the technical and scientific work performed by the WPs;
• Set up the appropriate collaborative tools to support the communication exchange between partners;
• Monitoring of the time schedule and implementation of any appropriate actions to correct delays;
• Implementing the recommendations of EC.
Task 1.1: Project Administrative and Financial Coordination [ERCIM]
• Contractual management (implementation of the Grant Agreement, consortium evolution and
conflict resolution) and EU reporting;
• Monitoring of the central budget dedicated to the Network activities and distribution of the
Community contribution;
• Production of the quality plan in liaison with the NCO including monitoring of risks;
• Preparation and management of the consortium agreement, which includes:
- list of pre-existing know-how among the partners,
- list of knowledge generated by the partner.
• Meeting organisation and support (annual EC review meeting and member general assembly);
• Day-to-day support to the Network Coordinator.
Task 1.2: Network Coordination [W3C]
This task will be led by Richard Ishida, Internationalization Activity Lead at the W3C. The main
responsibility of this task is to ensure the achievement of the work plan throughout the entire project
duration and will include:
• Identifying, addressing and mobilising the relevant communities to ensure balanced and
representative attendance of workshops;
• Establishment of intra-project communication tools (mailing list, shared workspace);
• Coordination of the dependencies across all work packages;
• Stimulating and supporting scientific exchange within the project;
• Preparing the Progress Reports to inform the Commission annually on the progress of the objectives
of the project, including achievements and identification of any deviations compared to the work plan.
Description of WP Deliverables
D01.1) Management and quality assurance plan: This deliverable will describe the internal
management organisation and include in annex the consortium agreement signed between the
partners. The section about the quality plan will define and describe (i) quality processes (e.g.,
deliverable preparation, review preparation and post-review follow-up, Workshop-specific processes
such as call timelines), (ii) methods and indicators which will be set up for the tracking of risks and
the appropriate type of risk response. The document will be prepared by the AFC. [month 2]
D01.2) Communication and mobilisation plan: The document will outline the overall approach to
mobilise the community, and the use of various means and tools for communication to achieve this
goal. The W3C will set up the mailing lists and create the project web page. A document will
summarize the mailing list names and participants, provide information on the archive process at
W3C (readable by the public and open to public subscription) and screen shots of the project website
home page. The document will be prepared by the AFC, and signed off by the NCO. [month 3]
D01.3) Annual Progress Report including a simplified Summary Financial Report: Progress report:
W3C will provide a summary of the Network activities run during the period including achievements
and attainment of any milestones, progress indicators and deliverables identified in Description of
Work. [month 12]
D01.4) Annual Progress Report including a simplified Summary Financial Report: Progress report:
W3C will provide a summary of the Network activities run during the period including achievements
and attainment of any milestones, progress indicators and deliverables identified in Description of
Work. [month 24]
Person months per participant
Participant number | Participant short name | Person-months per participant
1 | ERCIM | 5.5
Total | | 5.5
Schedule of relevant milestones
Milestone number | Milestone name | Lead beneficiary number | Delivery date from Annex I | Comments
M01.1 | Advance payment N°1, N°2, final | 1 | M1, 14, 27 |
M01.2 | Website set-up, communications and collaborative tools available | 1 | M2 |
M01.3 | Project reviews | 1 | M13, 25 |
List of WP Deliverables
Deliverable Number | Deliverable Title | Lead beneficiary number | Estimated indicative person months | Nature | Dissemination level | Delivery date
D01.1 | Management and quality assurance plan | 1 | 0.30 | R | CO | 2
D01.2 | Communication and mobilisation plan | 1 | 0.30 | R | CO | 3
D01.3 | Annual Progress Report including a simplified Summary Financial Report | 1 | 0.20 | R | CO | 12
D01.4 | Annual Progress Report including a simplified Summary Financial Report | 1 | 0.20 | R | CO | 24
Total | | | 1.00 | | |
WT3: Work package descriptions
Work package number: WP02
Work package title: The landscape of multilingual Web standards and best practices
Start month: 1
End month: 6
Lead beneficiary number: 1
Objectives
To survey the overall situation with regard to standards and best practices for the
multilingual Web in Europe, identifying what is available and what gaps exist, and make
preliminary plans for future workshop topic areas. Participants will share information about
what they are working on in this area, and we will look for opportunities for synergy.
To launch the work on the practical work items, and solicit input from partners on proposed
plans for their development.
Description of work and role of partners
Task 2.1: Workshop [Task leader & host: Partner UPM]
The main component of this work package is a 2-day workshop with the theme “The
landscape of multilingual Web standards and best practices”. During the workshop, partners
and any other participants (the latter subject to acceptance of position papers) will share
their experience with standards and best practices, describe initiatives in which they are
currently involved and establish initial proposals for topics to be treated during subsequent
workshops.
This workshop will look at the landscape from a high level and range widely, whereas
follow-on workshops will be more restricted in terms of subject area, and more focused on
uncovering issues. Participants will also make an initial proposal for the theme of the fourth
workshop.
Since this workshop is largely about sharing information, efforts will be made to attract an
audience of around 80 attendees. An upper limit for attendees will be decided by discussion
with the partner hosting the event. Partners will be asked to assist in identifying, inviting and
encouraging attendees. Details of how we will attract attendance at the workshops will be
provided in the communication and mobilisation plan, but are likely to include approaches
that have worked well for W3C workshops in the past. In addition to targeted invitations and
announcements by partners, these include announcements on the W3C, W3C Offices, and
Internationalization Activity home pages and in the events section of the W3C site. This
provides very high visibility for the workshop. In addition, announcements can be made on
various lists related to internationalisation and localisation and using lists of contacts from
organizations such as LISA, TAUS, LRC, the Commission, etc. Also we can make use of
Facebook, Linked-In and other social media, such as the W3C Twitter channel (with around
5000 readers). We may also consider press releases for this workshop and/or the final one.
Attendees will have free entry into the meeting, but will need to self-cater and support their
own travel costs.
Task 2.2: Dissemination [Task leader: W3C.]
The coordinator will produce and announce minutes of the workshop and a summary of the
conclusions of the workshop. These documents will be hosted on the project Web site.
Task 2.3: Practical work items: 1st f2f meeting [Task leader: W3C. Host ICIA]
Initial work will begin on the internationalization checker at the W3C and a curriculum for
training materials will be drawn up for discussion.
Partners will meet face-to-face to discuss the proposed features for the internationalization
checker and the proposed curriculum.
At the face-to-face, participants will also be introduced to the W3C internationalization test
suite, in preparation for contributions to the results pool. Participants will have the
opportunity to volunteer to provide results for particular platforms. The work of providing the
result information will be ongoing throughout the project.
Description of WP Deliverables
D02.1) First Workshop conference (program, minutes and key findings): The deliverable will
analyse the attendance and presentations, reproduce the minutes and the recommendations.
In addition, a list of the documentation related to the organisation of the event will be
provided with this document. The minutes, recommendations and presentation slides will be
made available from the web site. The document will be prepared by the NCO assisted by the
AFC. [month 6]
D02.2) First F2F Reports (Minuted recommendations): The deliverable will analyse the
attendance, reproduce the minutes and summarise discussions. It will clearly list the input
received from the participants. The document will be prepared by the NCO assisted by the
AFC. [month 6]
Person months per participant
Participant number | Participant short name | Person-months per participant
1 | ERCIM | 1.20
2 | BIOLOOM | 0.40
3 | CNR | 0.40
4 | FACEBOOK | 0.40
5 | UAS POTSDAM | 0.40
6 | IJS | 0.40
7 | ICIA | 0.50
8 | LTC | 0.40
9 | LIONBRIDGE | 0.40
10 | MICROSOFT | 0.40
11 | OPERA | 0.40
12 | SAP | 0.40
13 | TAUS | 0.40
14 | AALTO | 0.40
15 | UO | 0.40
16 | UPM | 0.50
17 | UNIVERSITY OF LIMERICK | 0.40
18 | VEVP | 0.40
19 | WELOCALIZE | 0.40
20 | XML-INTL | 0.40
Total | | 9.00
Schedule of relevant milestones
Milestone number | Milestone name | Lead beneficiary number | Delivery date from Annex I | Comments
M02.1 | Workshop event | 1 | M5 |
M02.2 | F2F meeting | 1 | M3 |
List of WP Deliverables
Deliverable Number | Deliverable Title | Lead beneficiary number | Estimated indicative person months | Nature | Dissemination level | Delivery date
D02.1 | First Workshop conference (program, minutes and key findings) | 1 | 0.20 | R | PU | 6
D02.2 | First F2F Reports (Minuted recommendations) | 1 | 0.20 | R | PU | 6
Total | | | 0.40 | | |
WT3: Work package descriptions
Work package number: WP03
Work package title: Authoring the multilingual Web Project
Start month: 7
End month: 12
Lead beneficiary number: 1
Objectives
Workshop participants share their experiences with standards, guidelines, best practices and
initiatives related to authoring content for the Web, and discuss areas needing attention.
Produce a set of recommendations for work that is not being adequately addressed, with
prioritization.
Continued work on the internationalization checker and training course development.
Put in place a mechanism for partners to provide results for tests in W3C internationalization
test suite, and begin to incorporate the first results.
Description of work and role of partners
Task 3.1: Workshop [Task leader & host: Partner CNR-ILC]
The main component of this work package is a 2-day workshop with the theme “Authoring the
multilingual Web”.
During the workshop, partners and any other participants (the latter subject to acceptance of
position papers) will share their experiences with standards, guidelines, best practices and
initiatives related to authoring content for the Web, and discuss areas needing attention.
There will also be the opportunity for a small number of selected subject matter experts to
present short educational sessions relating to practical techniques for authors (e.g. latest
developments in language tagging, character and language declarations in HTML5, current
status of IDNA, issues with content management systems and authoring tools, etc.)
The workshop should include authoring of corporate content using content management
systems and organization websites, but also personal authoring in such things as blogs and
social networking environments.
Topics can include authoring practices related to automated checking of character encoding,
language and other declarations, CSS styling features, translatability issues, navigating
around multilingual sites, use of language tagging, IDNA, authoring for mobile devices, etc.
This workshop will be open to the public. The minutes and a summary report of the findings of
the workshop will be made publicly available on the project Web site. All position papers and
slides presented will also be available. In particular, any presentations from subject matter
experts sharing practical best practices will be made available on or linked to from the W3C
site.
Unlike the first workshop, this workshop will be organized more along the lines of a roundtable, where the main objective is for participants to discuss topics together with a view to
uncovering and listing issues and opportunities for future work. Consequently, this workshop
will be smaller than the first workshop. The workshop will be announced widely, using most
of the means described for WP2, but the role of partners in identifying and inviting specific
attendees will be particularly important, as those thus targeted will probably constitute the
majority of attendees besides the partners themselves. The aim will be to restrict the size of
the workshop to around 40 participants – many more than this will make discussions
unmanageable. Targeted attendees will be discussed by the partners during program
committee teleconferences. Other applicants will be admitted to the workshop only if the
program committee agrees, based on their position paper submission. For more details, see
the forthcoming communication and mobilisation plan.
Task 3.2: Dissemination [Task leader: W3C]
The coordinator will produce and announce minutes of the workshop and a summary of the
standards and best practices discussed, and recommendations for new work in this area.
These documents will be hosted on the project Web site.
Task 3.3: Practical work items [Task leader: W3C]
The coordinator will continue implementing the internationalization checker and training
materials, taking into account the discussions during the face-to-face meeting in WP2, and
produce versions that can be reviewed during the next work package.
Description of WP Deliverables
D03.1) Second Workshop conference (program, minutes and key findings): The deliverable
will analyse the attendance and presentations, reproduce the minutes and the
recommendations. In addition, a list of the documentation related to the organisation of the
event will be provided with this document. The minutes, recommendations and presentation
slides will be made available from the web site. The document will be prepared by the AFC
and the NCO. [month 12]
D03.2) Practical work items: Report on First implementation of internationalization checker:
This report will summarise progress on the internationalization checker and training
materials to date to provide input to the face-to-face meeting planned for WP4. The document
will be prepared by the NCO. [month 12]
Person months per participant
Participant number | Participant short name | Person-months per participant
1 | ERCIM | 1.20
2 | BIOLOOM | 0.40
3 | CNR | 0.50
4 | FACEBOOK | 0.40
5 | UAS POTSDAM | 0.40
6 | IJS | 0.40
7 | ICIA | 0.40
8 | LTC | 0.40
9 | LIONBRIDGE | 0.40
10 | MICROSOFT | 0.40
11 | OPERA | 0.40
12 | SAP | 0.40
13 | TAUS | 0.40
14 | AALTO | 0.40
15 | UO | 0.40
16 | UPM | 0.40
17 | UNIVERSITY OF LIMERICK | 0.40
18 | VEVP | 0.40
19 | WELOCALIZE | 0.40
20 | XML-INTL | 0.40
Total | | 8.90
Schedule of relevant milestones

Milestone number   Milestone name    Lead beneficiary number   Delivery date from Annex I   Comments
M03.1              Workshop event    1                         M11
List of WP Deliverables

Deliverable  Deliverable title                       Lead         Estimated      Nature  Dissemination  Delivery
number                                               beneficiary  indicative             level          date
                                                     number       person-months
D03.1        Second Workshop conference              1            0.20           R       PU             12
             (program, minutes and key findings)
D03.2        Practical work items: Report on First   1            0.50           R       PU             12
             implementation of internationalization
             checker
Total                                                             0.70
WT3: Work package descriptions
One form per work package

Work package number:       WP04
Work package title:        Translation tool support
Start month:               13
End month:                 18
Lead beneficiary number:   1
Objectives
During the workshop, partners and any other participants (the latter subject to acceptance of
position papers) will share their experiences with standards, best practices and techniques
related to enabling efficient and effective translation of Web based content, and discuss areas
needing attention.
Partners will review work on the practical items and input further suggestions for their
development. The coordinator will produce a new version of the internationalization checker
and the training materials.
Description of work and role of partners
Task 4.1: Workshop [Task leader & host: Partner UO (ILTO)]
The main component of this work package is a 2-day workshop with the theme “Translation
tool support”.
Relevant topics will include the W3C's Internationalization Tag Set (ITS) specification,
standards supporting internationalized content creation, XLIFF, TMX, and other standards
supporting management of translation information, translation technologies and language
tools, and so on.
Unlike the first workshop, this workshop will be organized more along the lines of a roundtable, where the main objective is for participants to discuss topics together with a view to
uncovering and listing issues and opportunities for future work. Consequently, this workshop
will be smaller than the first workshop. Partners will play a key role in inviting attendees for
this workshop. The remarks about this in the description of WP3 also apply here.
This workshop will be open to the public. The minutes and a summary report of the findings of
the workshop will be made publicly available on the project Web site. All position papers
and slides presented will also be available. In particular, any presentations from subject
matter experts sharing practical best practices will be made available on or linked to from the
project Web site.
During this workshop, the group will also need to take a firm decision on the theme for the
4th and final workshop.
Task 4.2: Dissemination [Task leader: W3C]
The coordinator will produce and announce minutes of the workshop and a summary of the
standards and best practices discussed, and recommendations for new work in this area.
These documents will be hosted on the project Web site.
Task 4.3: Practical work items [Task leader: W3C. Host UL LRC.]
Partners will meet face-to-face to review the internationalization checker and the proposed
curriculum.
The coordinator will then produce a second version of the internationalization checker and
training materials, taking into account the discussions during the face-to-face meeting.
Partners supplying results for the internationalization test suite will continue to do so as new
tests are written or new user agent versions are introduced.
Description of WP Deliverables
D04.1) Third Workshop conference (program, minutes and key findings): The deliverable will
analyse the attendance and presentations, reproduce the minutes and the recommendations.
In addition, a list of the documentation related to the organisation of the event will be
provided with this document. The minutes, recommendations and presentation slides will be
made available from the web site. The document will be prepared by the AFC and the NCO.
[month 18]
D04.2) Second F2F Reports (Minuted recommendations): The deliverable will analyse the
attendance, reproduce the minutes and summarise discussions. It will clearly list the input
received from the participants. The document will be prepared by the NCO assisted by the
AFC. [month 18]
Person months per participant

Participant number   Participant short name    Person-months per participant
1                    ERCIM                     1.20
2                    BIOLOOM                   0.40
3                    CNR                       0.40
4                    FACEBOOK                  0.40
5                    UAS POTSDAM               0.40
6                    IJS                       0.40
7                    ICIA                      0.40
8                    LTC                       0.40
9                    LIONBRIDGE                0.40
10                   MICROSOFT                 0.40
11                   OPERA                     0.40
12                   SAP                       0.40
13                   TAUS                      0.40
14                   AALTO                     0.40
15                   UO                        0.50
16                   UPM                       0.40
17                   UNIVERSITY OF LIMERICK    0.50
18                   VEVP                      0.40
19                   WELOCALIZE                0.40
20                   XML-INTL                  0.40
                     Total                     9.00
Schedule of relevant milestones

Milestone number   Milestone name    Lead beneficiary number   Delivery date from Annex I   Comments
M04.1              Workshop event    1                         M17
M04.2              F2F meeting       1                         M14
List of WP Deliverables

Deliverable  Deliverable title                       Lead         Estimated      Nature  Dissemination  Delivery
number                                               beneficiary  indicative             level          date
                                                     number       person-months
D04.1        Third Workshop Conference               1            0.20           R       PU             18
             (program, minutes and key findings)
D04.2        Second F2F Reports (Minuted             1            0.20           R       PU             15
             recommendations)
Total                                                             0.40
WT3: Work package descriptions
One form per work package

Work package number:       WP05
Work package title:        Final Workshop
Start month:               19
End month:                 24
Lead beneficiary number:   1
Objectives
Workshop participants share their experiences with standards, guidelines, best practices and
initiatives related to another specific topic, to be decided during the project, and discuss
areas needing attention.
Produce a set of recommendations for further standards and best practices that are not being
adequately addressed, with prioritization.
Description of work and role of partners
Task 5.1: Workshop [Task leader: Partner W3C. Host EC DGT Luxembourg]
The main component of this work package is a 2-day workshop with a theme that will be
decided by participants of previous workshops.
The current plan is to divide the workshop into two parts. One part will involve small group
discussion on a topic to be decided during the project, as per WP3 and WP4. The other part
of the workshop will be open to a larger audience, in the same way as the workshop in WP2,
and will aim to disseminate information about the findings of the project.
During the discussion-oriented part of the workshop, partners and any other participants (the
latter subject to acceptance of position papers) will share their experiences with standards,
best practices and techniques related to the chosen theme, and discuss areas needing
attention.
It may be used to extend discussions started in earlier workshops, to provide some continuity
and follow-on for those ideas, or one or more new topics may be introduced. Examples could
include such things as the following, or other ideas agreed upon by the partners between the
first and third workshops:
1. Navigating localized sites, content negotiation and other approaches to delivering to the
user the content that they need.
2. Meeting the needs of minority languages. Could be held in conjunction with the Digital
World project, looking at minority languages in Europe, but also how Europe should apply
its experience and knowledge to support cultures trying to introduce the Web outside Europe.
3. Review and feedback on standards and specifications currently in development.
4. Experience sharing about barriers and lessons in handling deployment of multilingual
information.
5. Wrap up for the project and discussion of ways to continue the work after the end of the
project.
This workshop will be open to the public. The minutes and a summary report of the findings of
the workshop will be made publicly available on the project Web site. All position papers and
slides presented will also be available. In particular, any presentations from subject matter
experts sharing practical best practices will be made available on or linked to from the
project Web site.
For information about plans to attract audiences for the workshop, see the notes in the
previous WP descriptions.
Task 5.2: Dissemination [Task leader: W3C]
The coordinator will produce and announce minutes of the workshop and a summary of the
standards and best practices discussed, and recommendations for new work in this area.
These documents will be hosted on the project Web site.
Description of WP Deliverables
D05.1) Final Workshop conference (program, minutes and key findings): The deliverable will
analyse the attendance and presentations, reproduce the minutes and the recommendations.
In addition, a list of the documentation related to the organisation of the event will be
provided with this document. The minutes, recommendations and presentation slides will be
made available from the web site. The document will be prepared by the AFC and the NCO.
[month 24]
Person months per participant

Participant number   Participant short name    Person-months per participant
1                    ERCIM                     0.90
2                    BIOLOOM                   0.20
3                    CNR                       0.20
4                    FACEBOOK                  0.20
5                    UAS POTSDAM               0.20
6                    IJS                       0.20
7                    ICIA                      0.20
8                    LTC                       0.20
9                    LIONBRIDGE                0.20
10                   MICROSOFT                 0.20
11                   OPERA                     0.20
12                   SAP                       0.20
13                   TAUS                      0.20
14                   AALTO                     0.20
15                   UO                        0.20
16                   UPM                       0.20
17                   UNIVERSITY OF LIMERICK    0.20
18                   VEVP                      0.20
19                   WELOCALIZE                0.20
20                   XML-INTL                  0.20
                     Total                     4.70
Schedule of relevant milestones

Milestone number   Milestone name    Lead beneficiary number   Delivery date from Annex I   Comments
M05.1              Workshop event    1                         M23
List of WP Deliverables

Deliverable  Deliverable title                       Lead         Estimated      Nature  Dissemination  Delivery
number                                               beneficiary  indicative             level          date
                                                     number       person-months
D05.1        Final Workshop conference               1            0.20           R       PU             24
             (program, minutes and key findings)
Total                                                             0.20
WT4: List of Milestones
Milestones are points where major results have successfully been achieved as the basis for the
next phase of work, or are control points at which decisions are needed; for example a
milestone may occur when a major result has been achieved, if its successful attainment is a
pre-requisite for the next phase of work. Another example would be a point when a choice
between several technologies will be made as the basis for the next phase of the project.
List of milestones

Milestone  Milestone name                         WP       Lead beneficiary  Delivery date       Comments
number                                            numbers  number            from Annex I [1]
M01.1      Advance payment N°1, N°2, final                 1                 M1, 14, 27
M01.2      Website set-up, communications                  1                 M2
           and collaborative tools available
M01.3      Project reviews                                 1                 M13, 25
M02.1      Workshop event                                  1                 M5
M02.2      F2F meeting                                     1                 M3
M03.1      Workshop event                                  1                 M11
M04.1      Workshop event                                  1                 M17
M04.2      F2F meeting                                     1                 M14
M05.1      Workshop event                                  1                 M23
WT5: List of Tentative Reviews
Reviews should ideally be synchronised with ends of project reporting periods – which may
coincide with the major milestones of the project. A tentative planning has to be indicated
using the following template table:
Tentative schedule of project reviews

Review number   Tentative timing [2]   Planned venue of review            Comments, if any
RV 1            After month 12         European Commission, Luxembourg
RV 2            After month 24         European Commission, Luxembourg
[1] Month in which the milestone will be achieved. Month 1 marking the start date of the project, and all
delivery dates being relative to this start date.
[2] Month after which the review will take place. Month 1 marking the start date of the project, and all dates being
relative to this start date.
WT6: Summary effort table
This table indicates the number of person months over the whole duration of the planned work, for each work package (WP) by each participant.
Project effort by beneficiary per work package

Beneficiary  Short name    WP1   WP2   WP3   WP4   WP5   Total per Beneficiary
01           ERCIM (W3C)   5.5   1.2   1.2   1.2   0.9   10
02           Bioloom             0.4   0.4   0.4   0.2   1.4
03           CNR                 0.4   0.5   0.4   0.2   1.5
04           Facebook            0.4   0.4   0.4   0.2   1.4
05           UAS Potsdam         0.4   0.4   0.4   0.2   1.4
06           IJS                 0.4   0.4   0.4   0.2   1.4
07           ICIA                0.5   0.4   0.4   0.2   1.5
08           LTC                 0.4   0.4   0.4   0.2   1.4
09           Lionbridge          0.4   0.4   0.4   0.2   1.4
10           Microsoft           0.4   0.4   0.4   0.2   1.4
11           Opera               0.4   0.4   0.4   0.2   1.4
12           SAP                 0.4   0.4   0.4   0.2   1.4
13           TAUS                0.4   0.4   0.4   0.2   1.4
14           AALTO               0.4   0.4   0.4   0.2   1.4
15           UO                  0.4   0.4   0.5   0.2   1.5
16           UPM                 0.5   0.4   0.4   0.2   1.5
17           LRC                 0.4   0.4   0.5   0.2   1.5
18           VEVP                0.4   0.4   0.4   0.2   1.4
19           welocalize          0.4   0.4   0.4   0.2   1.4
20           XML-INTL            0.4   0.4   0.4   0.2   1.4
Total                      5.5   9     8.9   9     4.7   37.1
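The row and column totals of the summary effort table can be cross-checked with a short calculation. The following is a minimal sketch, not part of the Annex: the figures are transcribed from the table above as read here, the names `EFFORT`, `DEFAULT_ROW` and `totals` are illustrative, and decimal arithmetic is used so that sums of tenths stay exact.

```python
from decimal import Decimal

# Person-months per work package (WP1..WP5), as read from the WT6 table.
# Most beneficiaries share the same row; only the exceptions are spelled out.
DEFAULT_ROW = ("0", "0.4", "0.4", "0.4", "0.2")
EFFORT = {
    "ERCIM (W3C)": ("5.5", "1.2", "1.2", "1.2", "0.9"),
    "CNR": ("0", "0.4", "0.5", "0.4", "0.2"),
    "ICIA": ("0", "0.5", "0.4", "0.4", "0.2"),
    "UO": ("0", "0.4", "0.4", "0.5", "0.2"),
    "UPM": ("0", "0.5", "0.4", "0.4", "0.2"),
    "LRC": ("0", "0.4", "0.4", "0.5", "0.2"),
}
for name in ("Bioloom", "Facebook", "UAS Potsdam", "IJS", "LTC", "Lionbridge",
             "Microsoft", "Opera", "SAP", "TAUS", "AALTO", "VEVP",
             "welocalize", "XML-INTL"):
    EFFORT[name] = DEFAULT_ROW


def totals(effort):
    """Return (per-work-package totals, per-beneficiary totals)."""
    per_wp = [sum(Decimal(row[i]) for row in effort.values()) for i in range(5)]
    per_beneficiary = {name: sum(Decimal(v) for v in row)
                       for name, row in effort.items()}
    return per_wp, per_beneficiary


per_wp, per_beneficiary = totals(EFFORT)
for wp, value in zip(("WP1", "WP2", "WP3", "WP4", "WP5"), per_wp):
    print(wp, value)
print("Overall", sum(per_wp))
```

With these inputs the per-work-package sums agree with the Total row of the table, and the overall sum agrees with the 37.1 person-months stated there.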
B3.3. Project management
The project management is designed to support the implementation of the network’s work plan and the
completion of its contractual obligations vis-à-vis the European Commission, in compliance with the
ICT-PSP detailed rules and procedures. To this end, the MultilingualWeb Network will rely on a
simple, efficient management structure, described in the figure below.
[Figure: MultilingualWeb management structure. General Assembly (one representative from every
beneficiary, chaired by the NCO); EC liaisons; Project Coordination Team (Administrative and Financial
Coordinator (AFC) and Network coordinator (NCO)); Workpackages 2 to 5.]
The management structure is composed of the following bodies:
• General Assembly (GA);
• Project Coordination Team composed of the Administrative & Financial Coordinator (AFC) and
the Network coordinator (NCO).
The composition, objectives, tasks, and resources of these bodies are detailed hereafter:
General Assembly (GA)
The General Assembly is the Network's contractual decision-making board, chaired by the NCO. Every
beneficiary shall be entitled to send one voting representative to the General Assembly, and its
decisions are legally binding on all partners. The General Assembly will meet at least once a year,
during the 2nd and 4th Workshops, and, if needed, electronic votes will be organised to address particular
issues when they arise.
With regard to the Network's contractual obligations, the General Assembly is responsible for:
(i) All contractual changes related to the Beneficiaries of the project and to the Consortium
Agreement signed in parallel with the Grant Agreement;
(ii) Distribution and management of the EC grant;
(iii) Validation of the annual work performed by the partners and implementation of
corrective actions if required.
The General Assembly will also be asked to:
(iv) Assist in the evaluation and validation of the progress of the work packages, approve
all official deliverables and propose corrective actions in case of problems;
(v) Act as a joint organisational committee to define the workshop agenda and the
targeted attendance, to optimise the outcomes of each event, and to validate the workshop
reports and recommendations.
All decisions will be taken seeking consensus. If required, decisions shall be taken by a majority of
two thirds of the General Assembly Members present.
Administrative and Financial Coordinator (AFC)
ERCIM will ensure the administrative and financial coordination of the network and will be the official
contact point for the network vis-à-vis the European Commission. Mrs Céline Bitoune will be the
ERCIM representative appointed as AFC. Her detailed roles and responsibilities are described in work
package 1.
The AFC will be responsible for ensuring that the project is carried out efficiently and in accordance
with the contractual obligations agreed with the European Commission. Internally, the AFC shall
report and be accountable to the General Assembly. In addition, she will provide regular assistance
and support to the network coordination (deliverable process, templates, liaison with partners related
to the organisation of the different workshops, etc.).
Network coordinator (NCO)
The NCO is responsible for ensuring the overall coordination of the project, in line with the
strategic decisions of the General Assembly. The NCO will monitor work at project level and liaise
with related projects and initiatives.
In addition, the NCO will establish links inside this community and supervise the work packages
accordingly. As such, he is also responsible for the successful organisation of the different workshops,
for defining the workshop agenda and the targeted attendance, and for the validation of the workshop reports
and recommendations. He will drive the different work packages towards their objectives and also
sign off the deliverables produced.
The NCO will provide a project home page hosted by W3C for coordinating work and pointing to
workshop and face-to-face outputs, ongoing status of work, etc., and will set up a communication
environment dedicated to the project to support collaborative activities (wiki, mailing lists).
Communication Tools
As Network Coordinator, the W3C will provide world-class tools to support collaborative work,
including, for example, IRC channels with command-bots for meeting management, minute-taking,
etc, that can be used for face-to-face and teleconference meetings. The W3C also provides support
for management of publicly-archived mailing lists, wikis and other collaborative tools as part of its
normal business.
Risks Analysis
Considering the specific objectives of this Network, whose success will rely on regular attendance
at the Workshops and F2F meetings, a section has been dedicated to listing the potential risks, classifying
them according to three criteria: provenance, probability and impact level.
Conflict Resolution
As a reminder, the AFC is responsible for ensuring that the project is properly carried out and that the
agreements are fulfilled (both those stated in the Consortium Agreement and in the EC rules and
guidelines). Effective conflict resolution within the project begins with an understanding that partners
report to the Network coordinator, who in turn reports to the AFC. Should the conflict remain, the
situation should be reported to the General Assembly, which will resolve the issue through a vote of the
Partners’ representatives.
Dissemination
All partners agree that dissemination materials, deliverables and other foreground produced by the
Project will be made available under an attribution, non-commercial, share-alike creative commons
license. The non-funded partners will have to sign a form when participating, which is intended to
provide a similar license.
Consortium Agreement (CA) and IPR
A Consortium Agreement will be signed by all Beneficiaries before the entry into force of the grant
agreement between ERCIM and the European Commission. It will state the management rules (rules
of participation and funding distribution) and mechanisms of the project (role and responsibilities of
the General Assembly). With regard to the IPR issues, the Consortium will identify if exclusion of
background is required by partners and include in the CA the provisions for protecting Pre-Existing
Know-How. However, considering the type of instrument and the objectives of the project, the results
and outcomes should be reported in the public domain.
B3.4. Dissemination / Use of results
We are currently envisaging the creation of a Joomla-based web site on the ERCIM server. This
would allow partners to submit news items (rather like blog pages), or perhaps other types of
information, directly to the site using a username+password and a blog-like content submission
form. The W3C has some experience in Joomla-based sites for other EC projects (for example,
http://www.primelife.eu/, http://gridcomp.ercim.org/, http://www.vph-noe.eu/, and
http://net-wms.ercim.org/).
We have already acquired the multilingualweb.eu domain name for this site. The advantage of that
will be that if one of the partners or someone else would like to continue to manage the site after the
end of the funded project, we can easily transfer the data and the domain name to that organization,
who would then run it from their own servers - all this with no disruption to users.
The current expectation is that one of the main components of the site home page would be a news
stream, where we add information about the project, announce workshops and workshop outputs,
but also mention external developments that are related to multilingual web standards. We expect to
be able to also aggregate into the same stream useful blog posts from other people/organisations, and
a twitter stream dedicated to the project. (We have already acquired the twitter id multilingweb for
this purpose - multilingualweb was not available, but is also getting a little long for a twitter id).
All this information would then be available to anyone via RSS feeds. This would mean that
someone could follow the news in their RSS feed aggregator (e.g. Bloglines, Outlook, etc.), and we
could also aggregate the information in things like Planet I18n Web
(http://www.w3.org/International/planet/) and a Facebook page, which would further widen our
ability to reach people with news.
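The aggregation idea described above, where several RSS feeds are merged into a single news stream, can be illustrated with a minimal sketch. It is not the project's implementation: the two sample feeds, their example.org links, and the function names `parse_items` and `merge_feeds` are hypothetical, and a real aggregator would fetch the feeds over HTTP and handle malformed input.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feeds standing in for the project news stream and an
# external blog; real feeds would be downloaded rather than embedded.
PROJECT_FEED = """<rss version="2.0"><channel><title>Project news</title>
<item><title>First workshop announced</title><link>http://example.org/news/1</link>
<pubDate>Mon, 01 Mar 2010 09:00:00 +0000</pubDate></item>
<item><title>Workshop minutes published</title><link>http://example.org/news/2</link>
<pubDate>Sat, 20 Mar 2010 12:00:00 +0000</pubDate></item>
</channel></rss>"""

BLOG_FEED = """<rss version="2.0"><channel><title>Partner blog</title>
<item><title>Notes on language tags</title><link>http://example.org/blog/7</link>
<pubDate>Mon, 15 Mar 2010 08:30:00 +0000</pubDate></item>
</channel></rss>"""


def parse_items(rss_text, source):
    """Extract title, link and publication date from each RSS 2.0 <item>."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates use the RFC 822 format, which email.utils parses.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def merge_feeds(feeds):
    """Merge items from all feeds into one stream, newest first."""
    merged = []
    for source, text in feeds.items():
        merged.extend(parse_items(text, source))
    return sorted(merged, key=lambda entry: entry["published"], reverse=True)


stream = merge_feeds({"project": PROJECT_FEED, "blog": BLOG_FEED})
for entry in stream:
    print(entry["published"].date(), entry["source"], entry["title"])
```

Sorting on the parsed publication date, rather than on feed order, is what lets items from different sources interleave in a single stream.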
Of course, the home page would also point to other useful resources. We can also, using relevant
keywords, point to search streams for twitter, google and bloglines, and some delicious tags, as we
do currently on the i18n web planet page (top right).
Setting up a Facebook page to aggregate information, such as the news items from the home page,
photos of meetings from flickr, etc, would enable people to bring the information into their own
Facebook streams. Facebook is one of our partners, and will assist in the setup of this page.
The practical deliverables developed in parallel to the project will be publicly available, and pointed
to from a variety of places, including key W3C pages.
The workshops themselves will provide a means for dissemination of results among subject experts
and the general public. The workshop results will be publicly available on the Web, and announced
in a number of places, including key pages on the W3C site.
B3.5. Resources to be committed
In accordance with the specific funding mechanisms for Thematic Networks, the MultilingualWeb
initiative will be funded through a lump sum for partners and flat rates based on scale-of-unit costs for
the Network coordination. The total funding amounts to 414K euro over the full project duration for
20 Beneficiaries. This grant aims to cover both the implementation of the Network and the attendance
of Network meetings.
Contractual Members will receive a fixed grant of 16K euro for two years (8K euro
annually, composed of 5K euro for travel and 3K euro for work plan implementation costs).
The overall European Commission grant is broken down as follows:
                          Number of      Flat rate (based on      Lump sum for     Lump sum for
                          beneficiaries  scale of unit cost) for  attendance of    implementation
                                         coordination costs,      meetings costs   costs
                                         full duration
Coordinator (Partner 1)   1              6000                     10000
Partner 2-10              9              54000                    90000            54000
Partner 11-20             10             40000                    100000           60000
TOTAL NETWORK GRANT                                                                414000

All amounts are in euro.
The Coordination budget is a flat rate based on scale-of-unit costs composed of 3000€ per year per
beneficiary for the first 10 beneficiaries and 2000€ per year per beneficiary for beneficiaries 11 to
20. The detailed budget breakdown per activity can be presented as follows:
[Figure: Apportionment of Network Funding. Pie chart with segments labelled Coordination costs,
Dissemination, SCO Networking activities, and all partners Networking activities; segment values
shown: 4%, 3%, 18%, 75%.]
The above chart shows that 96% of the EC grant will serve the main activities of the Network
(implementation of the Network and the attendance of Network meetings) and dissemination
(Workshop announcement and material). It includes in particular the costs related to the logistical
organisation of the Workshop (room rental, facilities, catering, attendees’ materials...) and the Support
team efforts.
The management costs represent only 4% of the total funding, and are linked to the contractual
obligations of the Network agreed with the European Commission.
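The grant figures in this section can be cross-checked with a short calculation. The sketch below is illustrative only and rests on stated assumptions: the function names are hypothetical, the per-year rates are those quoted in the text (3000€ and 2000€ for coordination, 5K euro for travel, 3K euro for implementation), and the coordinator is assumed to have no implementation lump sum, since the breakdown table shows no such entry for Partner 1.

```python
YEARS = 2            # full project duration, per the text
BENEFICIARIES = 20   # total number of beneficiaries


def coordination_flat_rate(years=YEARS):
    """3000 EUR/year for each of the first 10 beneficiaries, 2000 for 11-20."""
    return years * (10 * 3000 + 10 * 2000)


def attendance_lump_sum(years=YEARS, beneficiaries=BENEFICIARIES):
    """5000 EUR/year of travel funding for every beneficiary."""
    return years * beneficiaries * 5000


def implementation_lump_sum(years=YEARS, beneficiaries=BENEFICIARIES):
    """3000 EUR/year for every beneficiary except the coordinator.

    Assumption: Partner 1 is excluded because the breakdown table lists no
    implementation amount for the coordinator.
    """
    return years * (beneficiaries - 1) * 3000


def total_grant():
    """Sum of the three components of the European Commission grant."""
    return (coordination_flat_rate()
            + attendance_lump_sum()
            + implementation_lump_sum())


print("Coordination:", coordination_flat_rate())
print("Attendance:", attendance_lump_sum())
print("Implementation:", implementation_lump_sum())
print("Total:", total_grant())
```

Under these assumptions the three components add up to the 414K euro stated as the total network grant.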
Additional Resources
The funding of this type of instrument is made on the basis of lump sums and flat rates (based on
scale-of-unit costs) rather than a reimbursement of actual eligible costs. As a consequence, the
estimated real costs to be incurred by the Beneficiaries in relation to their real efforts are
expected to reach roughly twice this grant (taking into account real salaries of person-months and
real overhead). In other words, the primary source of funding is the partners’ own resources.
On the other hand ERCIM/W3C will make a significant contribution of resources for the development
of the practical work items: the internationalization checker, the training curriculum, and the
compilation of test results are all activities that fall outside the funding for the Thematic Network.
Hereafter are listed the other own resources which will be brought to the project:
Participant No.   Participant short name   Own resource
1                 ERCIM (W3C)              • Intra-project communication tools (mailing list, shared
                                             workspace)
                                           • Audio conferencing facilities
7                 LTC                      Web-based multilingual business and linguistic process
                                           automation