Software Translation Process using Alchemy Catalyst

Version 2.0
Kris Kniaz
February 9, 2006
Copyright © by Kris Kniaz
Translation Process
Revision History
Version | Date      | Author(s)  | Description
1       | 12/5/2005 | Kris Kniaz | Initial structure.
2       | 2/9/2006  | Kris Kniaz | Resource file translation scenarios added.
Filename: Translation Process.doc
Introduction
Translation is a crucial part of the software localization process. Every organization that
wants to provide global software or internet services must deal with this aspect of the
international rollout. In the complex environment of a software project, this relatively simple
task (if the application is properly designed and built for localization) creates additional
dependencies and productivity issues for the team:
1) Translations should be done in context: a language specialist who fully understands how
the text appears on the screen in the context of the user action will invariably produce a
better translation.
2) Applications (especially those originally written for the English market) are very often not
ready for translation: sentences are broken into separate strings that are put together by code
instructions, strings are reused in different contexts (for example, fax as a noun and fax as a
verb), and application functional content is not separated from the code (for example, error
messages are often located as string constants or variables in the code).
3) Translation must integrate with standard engineering build, test and deployment processes.
Functional content such as names of buttons, menu options and error messages obviously
requires translation, but it is connected to code (and very often is the code). In Microsoft .NET,
localized program strings should be contained in resource files (a.k.a. resx files). Those
files are compiled with the code and deployed as satellite assemblies by the engineers during
the build process.
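The second problem listed above (sentences assembled from fragments by code) can be illustrated
with a short sketch. It is written in Python rather than .NET purely for brevity; the resource
keys, the message text and the German wording are hypothetical. The point is that each locale
stores one complete sentence with named placeholders, so the translator is free to reorder them.

```python
# Hypothetical in-memory resource tables; in a .NET application these
# strings would live in per-locale resx files instead.
RESOURCES = {
    "en-GB": {"files_deleted": "{count} files were deleted from {folder}."},
    "de-DE": {"files_deleted": "Aus {folder} wurden {count} Dateien gelöscht."},
}


def message(locale: str, key: str, **values: object) -> str:
    """Look up one complete sentence and fill in its named placeholders."""
    return RESOURCES[locale][key].format(**values)


if __name__ == "__main__":
    print(message("en-GB", "files_deleted", count=3, folder="Inbox"))
    print(message("de-DE", "files_deleted", count=3, folder="Inbox"))
```

Had the sentence been built in code as separate fragments around the two values, the German
version could not have moved the folder name to the front without a code change.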
Most organizations start with inefficient, “manual” processes that usually involve exporting
functional strings to Excel spreadsheets or Word files and then importing the translated files
manually or through scripts.
Mature organizations implement translation and localization management software from one of
the leading vendors such as Trados, Idiom Technologies or Alchemy.
Translation software is built around the Translation Memory (TM) pattern, which is defined as a
“collection of units of associated text strings in language pairs from previous translations
which can be suggested to translators translating similar content and language pair document.”
Management of the localization process revolves around storing, indexing and reusing
Translation Memories. In the typical case, a translation engineer creates a project for the base
(neutral) language that contains base-to-base TMs. The base project is then used to create
base-to-target projects. As a result, one typically ends up with a collection of
parent-to-child TM projects.
This paper presents typical translation processes for a .NET application, implemented using
the Catalyst product from Alchemy Software (www.alchemy.ie).
Overview of the Catalyst product
The main features of Catalyst are summarized below:
Feature | Explanation
TM storage | In Catalyst, Translation Memories are stored in the project file (extension ttk). This has benefits and drawbacks: one can easily share TM information by simply emailing the project, but the lack of a centralized repository (such as the database offered by other vendors) creates file management overhead.
Comparison Expert | The Comparison Expert is used to compare two application files. It detects missing, added and modified resources and records these changes in the Results Toolbar and an optional XML report file. This Expert is useful in determining the scope of change between revisions of software.
Leveraging | Alchemy CATALYST provides Translation Memory technology called ezMatch™. ezMatch allows translators to re-use previous translations. The Leverage Expert guides users through using ezMatch technology and is designed to maximize the amount of translations that can be leveraged from multiple file formats.
Pseudo Translation | Pseudo Translation simulates the effects of translation on the application files. It does this by substituting vowel characters in the source files with diacritical or accented characters. Pseudo Translation can be used early in the product development cycle to determine whether an application can be translated easily. For example, one may use it to determine whether a product crashes when a series of strings is translated, or whether a series of strings will still fit if they expand by 15% during translation.
Validation | Validation automates the detection of common localization errors normally introduced during the translation process. The Validate Expert also has a companion technology, the Runtime Validation Expert, which allows users to validate applications as they run on the Windows desktop.
Power Translation | Power Translation is used to automate the lookup and translation of translation units in the active Project TTK file. It operates on translation units and helps the translator locate and translate matching terms in active translation memories.
Spell checking | Custom dictionaries.
Database Connector | Ability to create TMs by reading content directly from a database through a provided connector.
License Management | Either through the additional License Manager or through the built-in ability to borrow a license for up to two weeks.
ezParse | Ability to define custom XML schemas for non-standard resource files.
Command line interface | Most of the features of Catalyst can be called from the command line. This feature could be used during the build process.
Free translation version | Catalyst has a free, stripped-down version of its software. This version could be used by translators who do not require advanced features such as leveraging or power translating. The free version reduces the overall TCO for the product.
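To make the Pseudo Translation idea concrete, here is a minimal sketch of the technique as the
table describes it: vowels are swapped for accented characters and the string is padded to
simulate 15% expansion. The vowel mapping, the padding character and the function name are
assumptions made for illustration; this is not Catalyst's actual substitution table or algorithm.

```python
import math

# Illustrative vowel mapping (an assumption, not Catalyst's own table).
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(text: str, expansion: float = 0.15, pad_char: str = "~") -> str:
    """Replace vowels with accented forms and pad to simulate text expansion."""
    padding = math.ceil(len(text) * expansion)
    return text.translate(ACCENTED) + pad_char * padding


if __name__ == "__main__":
    print(pseudo_translate("Save file"))
```

Note that this naive version would also alter vowels inside format placeholders, so a real
pseudo-translation pass would need to protect them.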
Creation of Master Translation Project file
The first task in the localization process is the creation of the base application. This may or
may not be a “real” (that is, deployed or even deployable) application, especially if your
website offers different features to different user profiles. You should create your base
application using a neutral culture (in Microsoft .NET localization terminology); a decent
compromise is the English locale:
Figure 1: Import of *.resx files from Site Folder A and Site Folder B (under Root) into
MASTER.ttk (en/en).
In this process (see Figure 1), resource files (resx and custom XML) are imported into
Catalyst. For custom localization files (custom XML schema etc.), the import rules must be
entered into Catalyst via the ezParse interface before the import.
There are two options for importing files: file import and file/folder import. The latter option
is preferable because it preserves the structure of the source directories, so Catalyst is able
to replicate the source file structure during export.
Figure 2 shows the initial view of the project after import.
Catalyst parses each file and extracts strings. Even though the system treats each string pair
as an independent unit of work (Translation Memory), it remembers the relationship between
strings, parent files and folders. This is shown in the navigational area on the left.
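As a rough illustration of what “extracting strings” from a resx file involves, the sketch
below reads the name/value pairs from a resx-style XML document using Python's standard
library. The sample content and resource names are hypothetical, and this is not how Catalyst
itself is implemented; it only shows the shape of the data that ends up as translation units.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical, heavily trimmed resx content: each <data> element carries a
# resource name and the string value to be translated.
SAMPLE_RESX = """<root>
  <data name="SaveButton.Text">
    <value>Save</value>
  </data>
  <data name="FaxMenu.Text">
    <value>Fax</value>
  </data>
</root>"""


def extract_strings(resx_xml: str) -> Dict[str, str]:
    """Return a mapping of resource name to string value."""
    root = ET.fromstring(resx_xml)
    strings: Dict[str, str] = {}
    for data in root.findall("data"):
        value = data.find("value")
        if value is not None and value.text is not None:
            strings[data.get("name", "")] = value.text
    return strings


if __name__ == "__main__":
    for name, text in extract_strings(SAMPLE_RESX).items():
        print(f"{name}: {text}")
```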
The initial project settings specify English as both the source and the target language. This
project needs to be treated as the master for all other projects. In the case of a new country
rollout, we will use it as the basis for translation into a specific language.
Figure 2
New country rollout
The first step in the new country rollout process is the creation of the Catalyst project file
for the specific TM pairing. The source language should always be left as your base locale (for
instance en-GB); the target language needs to be set in the locale navigational area.
The project file name should reflect the specific language pair. Therefore our naming convention
should be: [project actual name].[Source locale].[Target locale].ttk.
For example:
Website.en-GB.en-GB.ttk is the master file for the UK code base
Website.en-GB.de-DE.ttk contains the translation into German
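The naming convention above is easy to enforce in build or housekeeping scripts. The following
is a minimal sketch; the function names are hypothetical, and it assumes that locale codes
never contain a dot.

```python
from typing import Tuple


def ttk_file_name(project: str, source_locale: str, target_locale: str) -> str:
    """Build a name of the form [project].[source locale].[target locale].ttk."""
    return f"{project}.{source_locale}.{target_locale}.ttk"


def parse_ttk_file_name(file_name: str) -> Tuple[str, str, str]:
    """Split a conforming file name back into project, source and target locale."""
    parts = file_name.rsplit(".", 3)
    if len(parts) != 4 or parts[3] != "ttk":
        raise ValueError(f"not a conforming TTK file name: {file_name!r}")
    project, source_locale, target_locale, _ = parts
    return project, source_locale, target_locale


if __name__ == "__main__":
    print(ttk_file_name("Website", "en-GB", "de-DE"))
    print(parse_ttk_file_name("Website.en-GB.en-GB.ttk"))
```

A master project can then be recognized as any file whose source and target locales are equal.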
The locale-specific project file is shipped to translators (via the QuickShip option, which
essentially creates a self-extracting executable of the Catalyst project file), who work on the
file using the translator/lite version of the system. Before the file is shipped, the
translation process owner should “lock” certain strings, i.e. prevent them from being
translated. Generally this should be rare; however, it applies particularly in cases where
resource files are used to store business logic.
Figure 3: MASTER.ttk (en-GB/en-GB) is translated into DE translated.ttk (en-GB/de-DE), from
which *.de-DE.resx files are exported to Site Folder A and Site Folder B under Root.
After receiving the translated file from the translator, a senior language resource should
approve all translations within the tool (which changes the visual status of every string
from an eye to a checkmark) and give the file back to the engineers.
Figure 4: Translation workflow. Boxes in the diagram: translation CR; code complete; prep work
request; engineer creates the locale ttk; translator works with the ttk file; translation
manager reviews the ttk file and approves the translation; engineer exports the resource
files; build manager builds the devstage site; ready for language QA; language QA checks
translation issues and enters language QCs; rework if necessary; ship section.
In the last part of the process, locale-specific resx files are extracted and included in the
next build. Alternatively, the approval of the translated strings may happen after the
development-stage or development-integration build (depending on your process) has been
approved. After the build, language QA takes place in the translation testing environment.
In case of issues, defects are entered and the translation process repeats.
After launch you must maintain the locale-specific TMs, most appropriately in the SCM
system, since they represent your reference data. This adds the overhead of keeping the TTK
files in sync between the master and localized versions. You should automate this process
using Alchemy’s command-line interface.
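One small piece of that automation can be sketched without touching Catalyst at all: detecting
which localized TTK files are older than the master and therefore need to be re-synchronized.
The sketch below compares file modification times only. It deliberately does not invoke the
Catalyst command line, because this paper does not list its options; the function name and the
directory layout (all TTK files in one folder) are assumptions.

```python
from pathlib import Path
from typing import List


def stale_localized_projects(master: Path, project_dir: Path) -> List[Path]:
    """Return localized TTK files last modified before the master TTK file."""
    master_mtime = master.stat().st_mtime
    return sorted(
        ttk
        for ttk in project_dir.glob("*.ttk")
        if ttk.resolve() != master.resolve() and ttk.stat().st_mtime < master_mtime
    )


if __name__ == "__main__":
    folder = Path(".")
    master_file = folder / "Website.en-GB.en-GB.ttk"  # hypothetical master project
    if master_file.exists():
        for stale in stale_localized_projects(master_file, folder):
            print(f"needs re-sync: {stale.name}")
```

A build script could run such a check after every change to the master project and fail the
build, or trigger the re-synchronization step, when the returned list is not empty.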
Parallel translation process
For very large websites or applications, translation cannot be handled as a single unit of
work; it must be divided into chunks that several translators can handle at the same time.
Catalyst supports this scenario with the “section” concept. The idea is to export parts of the
site into “subprojects” to be sent to translators. After translation, the sections must be
imported into the main project and the section projects should be discarded.
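Deciding how to cut a site into sections is a planning step that happens outside Catalyst. As a
minimal sketch, assuming one section per top-level site folder, the grouping could look as
follows. The function name and the sample paths are hypothetical; the section file names simply
follow the folder name.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def split_into_sections(resource_paths: List[str]) -> Dict[str, List[str]]:
    """Group resource files into one section project per top-level folder."""
    sections: Dict[str, List[str]] = defaultdict(list)
    for path in resource_paths:
        top_folder = PurePosixPath(path).parts[0]
        sections[f"{top_folder}.ttk"].append(path)
    return dict(sections)


if __name__ == "__main__":
    plan = split_into_sections([
        "folderA/index.en-GB.resx",
        "folderA/help.en-GB.resx",
        "folderB/checkout.en-GB.resx",
    ])
    for section, files in plan.items():
        print(section, files)
```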
Figure 5: Parallel translation. Site Folder A and Site Folder B (*.en-gb.resx) of
DE translated.ttk (en-GB/de-DE) are exported as section projects folderA.ttk (Translator A)
and folderB.ttk (Translator B) and later imported back.
Please note that the original names (and extensions) of the resource files that were loaded
into the master project will be preserved in the localized projects. This is not a big problem,
because files with correct names and extensions can always be exported from the localized
project. Alternatively, we could maintain a neutral version of each resx file (with no language
extension) and use it as the master. Figure 5 shows the parallel translation process in detail.
Depending on the project timeline, we should export the main project into as many section
projects as is appropriate. Those section projects are handled by translators and translation
managers. Translation QA of the parallel translation process should follow the main workflow
illustrated in Figure 4. Your translation build schedule should be aligned with the delivery of
translated chunks.
Figure 6 and Figure 7 show the screens for importing and exporting sections from Catalyst.
Figure 6
Figure 7
Maintenance – adding new page or section
In this scenario we assume that there is an existing master translation project as well as
one or several projects containing TMs for live non-English sites. As mentioned before, we need
to keep the master and localized versions in sync. Therefore we would add the new folder or
new page.en-GB.resx file to all ttk projects at the same time and start parallel translation
efforts (by exporting sections) when appropriate and congruent with the release schedule.
Maintenance – Adding new string to existing resource file
Catalyst manages translation of the application TMs. It does not manage the application strings
themselves; therefore one cannot add or remove strings inside Catalyst. If we need to add a
new application string to a resource file, we need to follow a slightly different process,
illustrated in Figure 8. A developer needs to remove the old resx file from the master project
and re-import the new version. Subsequently, all localized projects need to be recreated (by
changing the target language and saving the master under localized names according to our
naming convention). Engineers then need to leverage (see Figure 9) the translated TMs from the
old localized projects into the new versions. Untranslated strings then need to be exported to
translators and, once the translated TMs are received, re-imported into the new projects.
Finally, the local resx files must be exported and the application rebuilt and QA'd.
Figure 8
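The leverage step in this process can be pictured as a lookup of each string in the new master
against the old translations, leaving a remainder for the translators. The sketch below uses
exact source-text matching only, which is a simplifying assumption; it is not a description of
how ezMatch works. The function name and the sample strings are hypothetical.

```python
from typing import Dict, List, Tuple


def leverage(
    old_tm: Dict[str, str], new_sources: List[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Reuse old translations for unchanged strings; list the rest as untranslated."""
    leveraged: Dict[str, str] = {}
    untranslated: List[str] = []
    for source in new_sources:
        if source in old_tm:
            leveraged[source] = old_tm[source]
        else:
            untranslated.append(source)
    return leveraged, untranslated


if __name__ == "__main__":
    old_de = {"Save": "Speichern", "Cancel": "Abbrechen"}
    reused, todo = leverage(old_de, ["Save", "Cancel", "Send fax"])
    print(reused)
    print(todo)
```

Only the strings in the second list would need to be exported to translators.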
Figure 9: Leveraging after a resource file change. The new Index.en-GB.resx replaces the old
file in the master project; the master is saved as the new de-DE project; translations are
leveraged from the old de-DE project; the remaining strings are exported as a section
(folderA.ttk) to Translator A and imported back.
Admittedly, the overhead of this process is huge; therefore we should add a number of
“spare” strings to every resource file to avoid it.
Issues with Using Alchemy in Translation Projects
No software tool is perfect, and we have to deal with the limitations and compromises made by
its designers and architects. Alchemy is no exception, so when working with it be mindful of a
few pitfalls:
1. Leverage once: The biggest efficiency in applying Alchemy on projects is its ability to
reuse translation memories in the process called “leveraging”. Once certain key phrases
or dictionaries are translated, the resulting TMs can be used to translate the same
phrases occurring in other documents. In this way we can achieve both significant
efficiency gains and consistency in translating the terms. Alchemy supports this
process well; unfortunately, if rework of the “base” is required the model breaks,
because in Alchemy leveraging can be done only once. Hence changes in the base require
that you discard already translated sections and leverage everything from the
beginning.
2. Workflow history is often lost: The translation process is workflow driven, so preserving
the history of who translated and approved a particular TM is an important feature. It
appears that the workflow history is maintained by the Alchemy database, so when
merging sections back into the bigger database the individual history is often lost.
3. Alchemy databases can grow big: Translated TMs should be kept in the SCM system as
a “trusted” source. This works relatively well for applications with a limited number of
strings. For applications or websites with large numbers of strings, the Alchemy
databases grow to many MBs, which presents logistical challenges. A centralized,
database-driven system might do a better job.