Presentation - Localization World

The value of Post Editing - IBM Case Study
Frank X. Rojas, Jian Ming Xu, Santi Pont Nesta, Álex Martínez Corrià, Salim Roukos, Helena Chapman, Saroj K. Vohra
June 2011
100 years of progress and innovation
© 2011 IBM Corporation
IBM Case Study – MT Post Editing
• Introduction
• MT Innovation
• Process Overview
• Findings
• Conclusion / Recommendations
IBM World Wide Translation Operations
One Stop Shop for all Translation Services:
• Marketing Material
• Machine Translation
• Legal/Safety/Contracts
• Multimedia
• Publications
• Overall End to End Process Management
• Francization
• Cultural Consultancy
• Product Integrated Information
• Centralized DTP
• Web
Scale:
• Process ~2.8 B Words, ~60 language pairs
• Translate ~0.4 B Words
• 24 Centers World Wide
• ~115 Translation Suppliers
IBM Professional Translation Services
• Consistent Quality Standards
• Global Brand Identity
• Professional Quality Standards
• Unit Cost: >50% Reduction
• Professional Memory: 72% -> 85% Re-Use
• Human Skill
[Chart: unit cost by year, 2001 to 2009 (axis 0 to 250); labels: Traditional, Technology, Process Mgmt]
• Future:
  – Ability to reduce cost using conventional methods is reaching its limits
  – Business pressure for additional cost elimination
  – Looking to MT Technology as next wave to reach business goals
Historical Perspective
[Timeline figure, 2006 to 2012, with three engine eras along the axis: Rule Based MT Engines, Statistical MT Engines, Hybrid MT Engines]
• RTTS introduced in 2006 as platform for speech and text translation, developed by IBM Research
• RTTS licensed to IBM partners
• eSupport (www) “Translate This Page”: JPN pilot / rule engine
• June 2008: MT portal, generic crowdsourcing, text translation services
• eSupport “Translate This Page” switch to n.Fluent
• Initial n.Fluent/WWTO Spanish MT pilot
  – Improve efficiency of professional translators
• n.Fluent customized with WWTO translation memories
• 2010 MT piloting
  – Pilot: SPA, ITA, FRE, GER
  – New E2E process
  – Partnership: WWTO/n.Fluent
  – 8.6 M words
• 2011 MT Training
  – Pilot: GER, BPR, JPN, CHS
  – MT payment profiles ready
• 16.0 M words target
MT Critical Success Metrics
• Necessary and sufficient conditions to measure success
  – 5.0 M words sampled
  – Minimum of 3 languages
  – Net Contribution to ROI by MT Engine: 10% of payable words should be MT
  – No more than 5% adverse impact to Overall Quality Index
  – No more than 5% impact to Customer Satisfaction
• Lack of industry metrics and guidance
  – Active research on MT technology, but no guidance on operational impacts
  – A business vacuum existed on how to integrate MT services
  – No operational process had been defined for MT services
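The success conditions listed on this slide can be read as a simple checklist. The sketch below is one possible encoding, not IBM's actual tooling: the class and field names are hypothetical, and it assumes that "10% of payable words should be MT" means at least 10% and that the quality and satisfaction limits are relative drops.

```python
from dataclasses import dataclass


@dataclass
class PilotResult:
    """Hypothetical summary of an MT pilot (all field names are assumptions)."""
    words_sampled: int         # total words sampled in the pilot
    languages: int             # number of languages covered
    mt_payable_share: float    # fraction of payable words that came from MT
    quality_index_drop: float  # adverse impact on Overall Quality Index (0.03 = 3%)
    csat_drop: float           # adverse impact on Customer Satisfaction (0.03 = 3%)


def meets_success_criteria(r: PilotResult) -> dict:
    """Evaluate each condition from the slide and an overall verdict."""
    checks = {
        "words_sampled": r.words_sampled >= 5_000_000,
        "languages": r.languages >= 3,
        "mt_payable_share": r.mt_payable_share >= 0.10,  # assumed to be a minimum
        "quality_index": r.quality_index_drop <= 0.05,
        "customer_satisfaction": r.csat_drop <= 0.05,
    }
    checks["all"] = all(checks.values())
    return checks


# Illustrative input only: the share and drop values below are made up.
example = PilotResult(
    words_sampled=8_600_000,
    languages=4,
    mt_payable_share=0.12,
    quality_index_drop=0.02,
    csat_drop=0.01,
)
print(meets_success_criteria(example))
```

Keeping each condition as a separate entry makes it visible which criterion blocks qualification when the overall check fails.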
Recent Digital Innovations with Biggest Impact in the Business World*
• IBM’s Watson Q&A computer
• Google’s autonomous car
• Technologies to understand and produce natural human speech
• Instantaneous, high-quality machine translation
• Smartphones / App phones in the developing world
*Andrew McAfee is a principal research scientist at the MIT Sloan School of Management
Real-Time Translation Server (RTTS) & n.Fluent
Real Time Translation Server (RTTS)
• IBM's MT engine
• RTTS provides machine translation for n.Fluent & other applications
• APIs allow other applications to access these translation services.
• Customization tools – Domains, chat-specific models, …
• Commercially licensed to IBM partners
Language pairs to/from English: العربية, 中文, Deutsch, Français, 日本語, Italiano, 한국어, Português, Español, Русский
[Chart: BLEU quality (axis 0 to 0.5) by words: Base 29k, 180k, 350k]
n.Fluent
• IBM's machine translation application
• Providing machine translation services for:
  – Text, web pages, and documents (Word, Excel, …)
  – Instant Messaging chats (via IM plug-in)
  – Mobile translation application (BlackBerry and others)
• Enabled with LEARNING via crowdsourcing (internal 450K IBMers)
• Deployed for eSupport self-service tech support (external)
MT Post Editing End to End Workflow
[Workflow diagram: English source -> TM Pre-Process (TM match analysis: 100% Exact Match vs. New/Changed) -> MT Pre-Process (MT model & MT) -> Shipment of the Localization Kit (NLV Folder) with TM and MT -> Editing Session (CAT translation) -> Trans. quality testing]
Editing Session:
1. Show best choice
2. Select best choice (Post Edit rules)
3. Commit language
• Upfront & on-going MT tuning via IBM TM professional translations
  – Professional translation = Best context
• Matching methods
  – Traditional TM
  – Machine TM
  – breaks down content @ segment level
  – breaks down segments @ block level using MT models
  – reconstructs segments preserving formats/mark-up tags
• MT service level integration
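The workflow slide notes that segments are reconstructed with their formats and mark-up tags preserved. The slides do not say how this is done; the sketch below shows one common approach, placeholder masking, as an assumption. The function names and the `__TAGn__` placeholder format are hypothetical, and `translate` stands in for any MT call.

```python
import re

# Matches simple inline mark-up such as <b>, </b> or <ph id="1"/>.
TAG = re.compile(r"<[^>]+>")


def mask_tags(segment):
    """Replace each tag with a numbered placeholder; return masked text and tags."""
    tags = []

    def repl(match):
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG.sub(repl, segment), tags


def unmask_tags(text, tags):
    """Put the original tags back in place of their placeholders."""
    for i, tag in enumerate(tags):
        text = text.replace(f"__TAG{i}__", tag)
    return text


def translate_preserving_tags(segment, translate):
    """Translate a segment while keeping its mark-up tags untouched."""
    masked, tags = mask_tags(segment)
    return unmask_tags(translate(masked), tags)


def toy_translate(text):
    """Stand-in for an MT engine: a fixed word substitution, not a real system."""
    return (text.replace("The application", "La aplicación")
                .replace("unprotects", "desprotege")
                .replace("files", "archivos"))


print(translate_preserving_tags("The application <b>unprotects</b> files.", toy_translate))
```

This only works if the engine passes placeholders through unchanged; a production system would also need to handle tags that must move when word order changes.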
MT Pre-processing
[Diagram: ALL segments are matched against the TM and split into 100% Exact Match and New/Changed. The “no match segments” are sent to MT for translation. An initial MT corpus, drawn from a general parallel training corpus and a domain specific parallel training corpus, is used to build a dynamic, domain specific MT model. The Localization kit carries the 100% Exact Match (TM) content together with the MT output.]
• Initial MT corpus
  – done before start of project
TM Editing Environment
[Screenshot: TM Environment before editing. Text: "Xxx xxx xx xxx xxx xxx. The application unprotects files before exporting them. Yy yyy yyy"]
Translation Memory proposals:
  0 - The application unprotects files before exporting them.
  1[m] – La aplicación desprotege archivos antes de exportarlos. (MT)
  2[f 85%] - La aplicación protege los archivos antes de exportarlos (TM)
  [Ctrl + 1]
Translator options:
• Ignore fuzzy and MT
• Post edit MT
• Post edit fuzzy
Two Seconds Rule: Translators are trained on several strategies to make a quick choice
[Screenshot: TM Environment after editing. Typed: "Xxx xxx xx xxx xxx xxx. La aplicación desprotege los archivos antes de exportarlos. Yy yyy yyy"]
Productivity Measurements
• Each event
  – Start segment
  – Choose action
    1. accept match [~0 time]
    2. edit match [X time]
    3. reject match [manual translation]
  – End segment
• MT productivity evaluation log (MTeval Log)
  – N events
  – Words | Time | Existing Proposal | Used Proposal | ...
  – Proposal types: EM: Exact, RM: Replace, FM: Fuzzy, MT: Machine, NP: No Proposal
  – Used Proposal:
    A) = “best” Existing Proposal
    B) = “alternative” Existing Proposal
    C) = reject all Existing Proposals, 100% human labor
• Examine productivity per payment category
  – SUM(Words) / SUM(Time)
  – Use of IBM Business Analytic Tool (SPSS)
  – Trim events that fall below the 5th percentile (slowest) or above the 95th percentile (fastest)
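The per-category calculation described above (trim the slowest and fastest events, then divide SUM(Words) by SUM(Time)) can be sketched as follows. This is a standard-library stand-in, not the SPSS procedure the slide refers to. The event keys `category`, `words` and `time` are hypothetical, and trimming on the per-event words/time rate within each category is an assumption about how the percentiles were applied.

```python
from collections import defaultdict
from statistics import quantiles


def productivity_by_category(events, lower_pct=5, upper_pct=95):
    """Return {category: SUM(words) / SUM(time)} after percentile trimming.

    `events` is an iterable of dicts with the hypothetical keys
    "category" (e.g. "MT", "FM", "NP"), "words" and "time" (seconds).
    """
    by_category = defaultdict(list)
    for event in events:
        # Assumption: events with no recorded time (e.g. instant accepts)
        # carry no rate information and are skipped.
        if event["time"] > 0:
            by_category[event["category"]].append(event)

    result = {}
    for category, evs in by_category.items():
        rates = [e["words"] / e["time"] for e in evs]
        if len(rates) >= 3:
            # 99 cut points; index p - 1 is the p-th percentile.
            cuts = quantiles(rates, n=100, method="inclusive")
            low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
            trimmed = [e for e, r in zip(evs, rates) if low <= r <= high]
            evs = trimmed or evs  # fall back if trimming removed everything
        result[category] = sum(e["words"] for e in evs) / sum(e["time"] for e in evs)
    return result
```

Summing words and time before dividing weights long segments more heavily than averaging per-event rates would, which matches the SUM(Words) / SUM(Time) definition on the slide.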
Single Shipment EXAMPLE
                        MT                                  NO MT
Used     Segments   Words    Time   Prod_W_T     Segments   Words    Time   Prod_W_T
         (count)    (sum)   (sum)   (median)     (count)    (sum)   (sum)   (median)
1-EM          0        .       .       .           1350     10593    3022     2.00
2-RM          4       18      43     .42            239      3905    3085     1.50
3-FM        129     1419    3870     .46            334      5610    9466      .71
5-MT        111     1777    4071     .50              0         .       .       .
6-NP        133      697    3393     .20              9       131     412      .33
Total       377     3911   11377     .37           1932     20239   15985     1.67
Key metrics
• Total # events: 2,309 (377 + 1,932)
• Total words: 24,150
  – 3,911 w/ MT match
  – 20,239 w/o MT match
• Total time: 27,362
  – 11,377 w/ MT match
  – 15,985 w/o MT match
• MT impact to productivity
  – MT: 0.44 words/sec [1777 words / 4071 sec]
  – NP
    • 0.21 w/ MT match
    • 0.32 w/o MT match -> Baseline (placebo)
  – rate(MT) / rate(NP): 1.37, i.e. the translator can complete 37% more words in the same time.
• MT Leverage: 71.8% [1777 / (1777+697)]
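The key metrics for this shipment follow directly from the word and time sums in the table. The short script below reproduces that arithmetic; the variable names are illustrative, and the numbers are the ones shown on the slide.

```python
# Word and time sums taken from the single-shipment table.
mt_words, mt_time = 1777, 4071              # 5-MT row: MT proposal used
np_mt_words, np_mt_time = 697, 3393         # 6-NP row: MT match existed but was not used
np_base_words, np_base_time = 131, 412      # 6-NP row: no MT match (baseline)

rate_mt = mt_words / mt_time                      # words per second when MT is post-edited
rate_np_with_mt = np_mt_words / np_mt_time        # manual translation with an MT match shown
rate_np_baseline = np_base_words / np_base_time   # manual translation, no MT match

productivity_ratio = rate_mt / rate_np_baseline   # rate(MT) / rate(NP)
mt_leverage = mt_words / (mt_words + np_mt_words) # share of MT-matched words where MT was used

print(f"rate(MT)            = {rate_mt:.2f} words/sec")
print(f"rate(NP) w/ MT      = {rate_np_with_mt:.2f} words/sec")
print(f"rate(NP) baseline   = {rate_np_baseline:.2f} words/sec")
print(f"rate(MT) / rate(NP) = {productivity_ratio:.2f}")
print(f"MT leverage         = {mt_leverage:.1%}")
```

Note that these rates are sums of words over sums of time, so they differ from the Prod_W_T column in the table, which reports the median per-event rate (for example .50 for 5-MT versus 0.44 here).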
MT Impact on Fuzzy Match: 4Q10 Findings
• When FM & MT matches exist simultaneously
• Productivity: rate(MT) / rate(NP):
  a. Case: Translator edits FM
  b. FM-MT combined case
  c. Case: Translator edits MT
• Overall
  – Machine matches are not as good as professional (fuzzy) matches.
  – Including MT matches has no statistically significant impact on fuzzy productivity.
    • SPA is the highest sample case
[Bar chart: productivity ratio (axis 0.00 to 8.00) for the series FM, FM-MT and MT, by language: FRE, GER, ITA, SPA]
FM-MT Pick Rate: FRE 28.6%, GER 4.4%, ITA 57.6%, SPA 46.9%
** Findings subject to change with additional sampling.
MT Key Metrics: 4Q10 Findings
Language   # Events   Words New/Changed   MT (% of NP)   MT Leverage
FRE           20417              209347           2.87         68.9%
GER           36634              250238           1.32          5.4%
ITA           78483              715557           2.70         46.2%
SPA          783238             7424298           1.74         55.2%
Total        918772             8599440
• 8.6 M words sampled in real time translation service.
• SPA: Qualified MT engine 4Q10
• ITA: Qualified MT engine 4Q10
• FRE: Qualified MT engine 1Q11
  – While rate(MT) / rate(NP) is high, the findings were not statistically significant in 4Q.
• GER: Insufficient productivity from MT engine
** Findings subject to change with additional sampling.
Overall Savings Assessment
• Overall savings %
  – Word savings due to MT efficiency
    • Convert time savings -> MT payment factor %
  – MT payment factor x [MT % words + NP % words]
    • Results in fewer payable words.
• MT productivity savings drive overall savings
  – These are not the same because of the MT % distribution.
• Supply chain has to consider cost of MT services
** Findings subject to change with additional sampling.
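To illustrate why MT productivity savings and overall savings are not the same number, here is a minimal sketch of one possible reading of "MT payment factor x [MT % words + NP % words]". It is not IBM's actual cost model. It assumes that only words in the MT and NP categories are paid at the payment factor, that all other words are unaffected, and that the payment factor of 0.80 is a made-up value. The word counts come from the single-shipment example earlier in the deck.

```python
def overall_savings(payment_factor, mt_share, np_share):
    """Fraction of all words saved if MT and NP words are paid at `payment_factor`.

    Simplifying assumption: words outside the MT and NP categories are
    paid exactly as before, so they contribute no savings.
    """
    if not 0.0 <= payment_factor <= 1.0:
        raise ValueError("payment_factor must be between 0 and 1")
    return (1.0 - payment_factor) * (mt_share + np_share)


# Shares from the single-shipment example (24,150 words in total).
total_words = 24150
mt_share = 1777 / total_words          # words where the MT proposal was used
np_share = (697 + 131) / total_words   # words translated with no proposal used

payment_factor = 0.80  # hypothetical: MT/NP words paid at 80% of the full rate
savings = overall_savings(payment_factor, mt_share, np_share)
print(f"MT + NP share of words: {mt_share + np_share:.1%}")
print(f"Overall savings:        {savings:.1%}")
```

Under these assumptions a 20% reduction on MT and NP words yields only about a 2% reduction overall, because MT and NP words make up roughly a tenth of the shipment. This is the distribution effect the slide points to.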
Pay for MT Words Translated not MT Matches
• We pay for final results (MT payable words), not MT matches
  – MT matches are considered “opinion” until chosen by a human
  – Too many opinions, and opinions from immature MT models, are less efficient.
• Actual MT payable words have value beyond the specific project
  – Post-edited words are reused in future and unknown MT contexts
• Engine has to deliver consistent MT payable words
  – Minimum needed to qualify an MT engine for compensation
    • High MT productivity [rate(MT) / rate(NP)]
    • High MT leverage [% of MT matches used]
  – Compensation to be based on MT payment factor
Variance across Languages
• There is no single maturity path when modeling MT engines across many languages.
• IBM Pilot: each trained MT engine is a unique asset.
  – Some languages require more modeling/tuning than others.
  – Language pairs that service “Loose -> Structured” languages are struggling
    • German requires more effort than Spanish
• Are there limitations to statistical MT engines?
  – New thinking may need to be explored.
• Each MT engine will have separate MT payment factors.
Perspective of MT Post Edit Pilots
[Diagram: Translation Service Hierarchy. Vertical axis: Quality / Reliability, from HIGHER to LOWER. Memory assets range from Domain Specific to General. Labels: WWTO “human”, n.Fluent “machine”, New.]
Tiers, from higher to lower quality/reliability:
• Professional Translation Services (Professional LSP): all IBM, external/internal
• Community Translation Services (Controlled Social Crowd): Pubs / UI, external (2011 Pilots)
• Volunteer Translation Services (General Crowds): internal IBM
• Free Services (Individual): internal IBM
MT Post Editing has impacts across the entire Translation Service Hierarchy
MT Post Editing Project – Key Lessons
1. Professional (Human) memories are the best assets and deliver the highest quality.
2. Professional memories are a key asset for MT success.
3. All Memory assets need to be protected and managed.
4. Flow of memories between Professional and Machine must be properly balanced.
5. Dynamic modeling offers significant advantage over static modeling.
6. Continuous business analytics is needed to optimize machine assets.
7. A single cost model per language is needed, independent of MT services/engines.
8. An aggressive yet cautious approach is warranted to go forward.
MT Post Editing does improve productivity and efficiency of a localization supply chain.