A HAND-HELD MULTIMEDIA TRANSLATION AND
INTERPRETATION SYSTEM FOR DIET MANAGEMENT
Albert Parra Pozo†, Andrew W. Haddad‡, Mireille Boutin‡ and Edward J. Delp†
†Video and Image Processing Lab (VIPER)
‡Computational Imaging Lab (CIL)
School of Electrical and Computer Engineering
Purdue University, West Lafayette, Indiana, USA
ABSTRACT
We propose a system for helping individuals who follow a medical
diet maintain this diet while visiting countries where a foreign language is spoken. Our focus is on diets where certain foods must
either be restricted (e.g., metabolic diseases), avoided (e.g., food intolerance or allergies), or preferably consumed for medical reasons.
However, our framework can be used to manage other diets (e.g.,
vegan) as well. The system is based on the use of a hand-held multimedia device such as a PDA or mobile telephone to analyze and/or
disambiguate the content of foods offered on restaurant menus and
interpret them in the context of specific diets. The system also provides the option to communicate diet-related instructions or information to a local person (e.g., a waiter) as well as obtain clarifications
through dialogue. All computations are performed within the device
and do not require a network connection. Real-time text translation is a challenge. We address this challenge with a light-weight,
context-specific machine translation method. This method builds on
a modification of existing open source Machine Translation (MT)
software to obtain a fast and accurate translation. In particular, we
describe a method we call n-gram consolidation that joins words in
a language pair and increases the accuracy of the translation. We developed and implemented this system on the iPod Touch for English
speakers traveling in Spain. Our tests indicate that our translation
method yields the correct translation more often than general purpose translation engines such as Google Translate, and does so almost instantaneously. The memory requirements of the application,
including the database of pictures, are also well within the limits of
the device.
Index Terms— computational linguistics, statistical learning,
multimedia systems
1. INTRODUCTION
Diet plays an important role in health management. For example,
the symptoms or risk factors of many diseases can be decreased by
diet modification. In some extreme cases, for example peanut allergies, the consumption of even a minute amount of certain nutrients
can have devastating health consequences. In other cases, such as
diabetes or inborn errors of metabolism, the consumption of certain
nutrients must be carefully monitored and limited in order to maintain an individual’s health.
This work is partially supported by the U.S. Department of Homeland Security's VACCINE Center under Award Number 2009-ST-061-CI0001. Address all correspondence to E. J. Delp (ace@ecn.purdue.edu).
In unfamiliar settings, in particular when traveling to a foreign
country, maintaining a medical diet can be a challenge. Indeed, gastronomy often varies from region to region, so tourists naturally
expect to be confronted with unknown dishes and ingredients. But
while many consider sampling the local gastronomy an important
part of the travel experience, people who must follow a medical diet
are often reluctant to embark on such journeys for fear of putting
their health at risk. Medical diets are especially difficult to deal with
when traveling to a region where a foreign language is spoken. Indeed, without the ability to understand menus, it is impossible to
make informed food choices.
A device capable of automatically translating menus in real-time
could thus be a useful tool for people following a medical diet. Unfortunately, the best electronic translators typically rely on a remote
network-connected server to obtain the translation. Moreover, few
automatic translators are able to give context specific information,
let alone medical diet specific information. Indeed, the problem of
maintaining a medical diet in a foreign setting goes beyond translation: it is about interpretation, disambiguation, and communication.
For example, the short text descriptions of the food items offered in
a menu leave a lot of room for interpretation, even for a person who
is fluent in the local language. This is because the name of a dish
may not be descriptive, or the ingredients used to prepare the dish
may vary in a particular region or even at a particular restaurant.
Furthermore, certain medical diets involve strict preparation guidelines (e.g., to avoid cross-contamination), and determining whether
these guidelines are/can be followed necessitates a non-trivial dialogue with the staff in charge of preparing and serving the food.
We propose a system to address these issues. The system, which
is summarized in Figure 1, is based on the use of a hand-held multimedia device such as a PDA or a mobile telephone. The system operates in the following manner. The user types the desired
dish/ingredient (e.g., arroz a la cubana) into a prompt in the Graphical User Interface (GUI). The text is then translated using a modified (context-specific) machine translation engine. The best possible translations are then listed in the order in which they are retrieved
from the database, along with multimedia information (e.g., pictures).
The same list of results can also be weighted based on the translation results and the n-best list. The user can then browse the
multimedia database to obtain more information about the dish or
the ingredients. When appropriate, information/questions aimed at
a waiter or other knowledgeable foreign-speaking person are suggested; that person's answers are translated back to the user.
Before leaving for a foreign country, the user downloads a region and language specific configuration and database. From then
on, the system can operate without a network connection. First, the
user must set the parameters of the medical diet under consideration.
This can be done by selecting from a list of pre-defined diseases and
conditions. It can also be done by selecting from a list of ingredients/nutrients to either avoid or favor. When considering a given dish
or food item on a foreign menu, the user then enters the text of the
menu describing this item (in the local language). The text entered
is then translated and interpreted automatically by the device using
text, audio and still images. When appropriate, information or instructions to be transmitted to the waiter are suggested by the device
(e.g., "Please note that I am on a restricted low-protein diet because
of a metabolic disease. Therefore my meal must only consist of fruits
and vegetables."). When selected by the user, a translated version of
the text (in the local language) is displayed on the screen and can be
shown directly to the waiter. When needed, questions for the waiter
are also suggested by the device in order to disambiguate the ingredients of a dish (e.g., "Does this salad contain croutons?"). When
selected, these questions are displayed in the local language along
with a choice of answers to be selected by the waiter (e.g., "Yes.",
"No.", or "Let me ask and I will get back to you.").
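As a rough illustration of this flow (a minimal sketch only; the dish data, profile contents, and suggested sentence below are invented for illustration and are not the application's actual data or code), the following Python fragment flags a looked-up dish against a user profile and proposes a request for the waiter.

    # Minimal sketch of the flagging step described above.
    # The dish database, profile contents and suggested phrase are
    # illustrative placeholders, not the application's actual data.

    DISHES = {
        "arroz a la cubana": {"rice", "tomato sauce", "fried egg", "fried banana"},
    }

    def flag_dish(dish_name, avoided_ingredients):
        """Return the ingredients of a dish that conflict with the user's profile."""
        ingredients = DISHES.get(dish_name, set())
        return ingredients & avoided_ingredients

    def waiter_request(flagged):
        """Suggest a sentence (shown translated on the device) asking to remove flagged items."""
        items = ", ".join(sorted(flagged))
        return f"Please prepare this dish without the following ingredients: {items}."

    profile = {"fried banana", "sugar"}          # e.g., a diabetes profile
    flagged = flag_dish("arroz a la cubana", profile)
    if flagged:
        print(waiter_request(flagged))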
Real-time text translation using a hand-held device is a challenge
because of the limited amount of data storage, memory (RAM),
and power available. We address this challenge with a light-weight,
context-specific machine translation method that leverages the use of
pictures and text descriptions. In this method, a step we call n-gram
consolidation is applied to the database (phrase-tables) obtained from statistical machine translation training. This allows us
to dramatically reduce the size of the database while increasing the
search speed. This also contributes to efficient use of the device's energy. Furthermore, by decreasing the number of entries in the
database, the accuracy is also enhanced. The system's effectiveness is further enhanced by displaying the output as a ranked list of the best translations, to
let the user decide which result better fits the context. Another key
is the use of a browsable multimedia database, which provides additional information to the user, such as images or ingredients. As a result, our application can be used in a real-time network-independent
environment and produce highly accurate results.
The computational complexity and memory requirements of our
methods are low enough so that the system can be implemented using many commercially available hand-held devices. To demonstrate
this, we developed and implemented this system on the iPod Touch
for English speakers traveling in Spain. Our tests indicate that our
translation method yields the correct translation more often than general purpose translation engines such as Google Translate, and does
so almost instantaneously. The memory requirements of the application, including the database of pictures, are also well within the limits
of the device.
A review of the existing diet management tools is presented in
Section 2.1. A summary of the state of the art in Spanish-English machine translation (MT) systems is given in Section 2.2. An explanation of our modifications to Moses and the first test version is outlined
in Section 3. The experimental results are shown in Section 4. We
outline the implementation of the method in Section 5. Some conclusions and thoughts on future improvements are given in Section
6.
2. EXISTING METHODS
2.1. Diet management
Up until very recently, medical diets have been managed primarily
using (printed) diet-specific food databases. In many cases, a pencil, some paper, a scale and a calculator were also required. In this traditional scenario, the individual reads the nutrition facts and ingredient list printed on the label in order to analyze the content of the food with the help of the database; when the intake of certain nutrients must be restricted or precisely recorded, the scale and calculator are then used to determine how much of these nutrients has been consumed; the amount is then recorded on a piece of paper for later analysis by a trained dietician.
Fig. 1: Block diagram of our proposed menu translation and interpretation system.
Without access to the nutrition label (for example in a restaurant), the user must question the people who prepare the food in
order to obtain the required information. This typically involves a
dialogue between the chef/food preparation staff and the individual.
With the widespread availability of smartphones and other multimedia hand-held devices, many electronic tools are now available
to assist individuals who must follow a medical diet [1]. For example, text messaging can be used to send reminders to diabetes patients.
One can also build on the Bluetooth capabilities of such devices to
remotely record and monitor blood pressure readings. With
the higher resolution pictures, improved memory capacity, and faster
processors of the recent versions of these devices, it may even be
possible to automatically identify and measure the food consumed
from ”before and after” pictures of a plate [2, 3].
2.2. Translation
Machine translation (MT) methods fall into two main categories:
Rule-Based MT (RBMT) and Statistics-based MT (SMT). RBMT
provides a text translation based on the grammatical, morphological
and syntactical rules of the languages in question. The advantage
of this method is that it deals with grammar rules and lexicon, and
any variation of an input can be handled. The disadvantage is that
extensive knowledge of both the source language and the target
language is required in order to build the rules, and a lot of effort has
to be invested in the database creation and modification. SMT provides a text translation based on probabilistic correlations between
large corpora of bilingual texts. It usually relies on large databases
(billions of words) to provide a good-quality translation, but it has the
advantage that it only needs source-language and target-language
corpora.
The results presented at the 2009 Workshop on Statistical Machine Translation for the open source systems indicate that Apertium
(RBMT) and Moses (SMT) [4] are candidates for this work. However, we found Moses a better choice for food-related items, given
that its database is easier to manipulate than Apertium's.
Moses is an SMT system that uses a phrase-table built from a
given parallel corpus for translation. Its main features are confusion
network decoding [5] and word lattices [6], allowing the translation
of ambiguous inputs. It also uses factored translation models [7],
adding part-of-speech tags or lemma information to the phrase-table.
Moses uses external tools for word alignment (GIZA++ [8]) and language modeling (SRILM [9]). It uses a beam-search heuristic algorithm to quickly find the highest-probability translation among the
exponential number of choices (roughly similar to [10]). This environment has also been previously tested with success in different
contexts. The main advantage of the SMT method is that it only
needs a pair of translated texts (source and target languages). The
different training tools are in charge of the data preparation and further training, and provide a phrase-table with all the probabilities
and extra information needed for translation [11]. The training data
has to be large enough to allow efficient training. This is one of
the inconveniences of working with restaurant menus: since there is
no reliable Spanish-English cuisine parallel corpus available,
we had to build ours manually. However, the database only needs
food-related vocabulary and grammar, so a database of about 7,000
lines should be adequate. As Moses inputs are simple translated texts
(not formatted), it was easy to find at least 70% of the data online,
and the rest was added manually.
3. PROPOSED METHOD
3.1. Multimedia diet management
Multimedia in diet management offers a great advantage over older methods. Images and Text-To-Speech (TTS), as used in our system, provide much needed guidance in a foreign country. Images can disambiguate the result of a translation or verify for the user that an ingredient is not wanted, while TTS provides a much needed interpreter for the user in a foreign country and facilitates a dialogue between the native speaker and the foreign user. We show examples of the use of images and TTS in Figure 5.
We have outlined a scenario where a user with diabetes has been surprised by the inclusion of fried banana in the dish arroz a la cubana. The user is shown a list of dishes and ingredients when searching for arroz (rice) in Figure 5b. In this list of results we see red flags next to all dishes containing ingredients that the user has flagged. In this particular instance, the search resulted in no flagged ingredients, only dishes. Similarly, in Figure 5c we have an ingredient, and the list of dishes containing this ingredient has been marked because they contain ingredients which the user has chosen to avoid. In Figure 5d the user has chosen a dish which contains a flagged ingredient. At this point the user can show a dialog with a translation explaining their medical condition or lifestyle choice and request to have the ingredient(s) removed from their meal (Figure 5e). If the native speaker chooses, the TTS of the text can be heard.
The scenario outlined above is made possible by utilizing a relational database. The database contains dishes, ingredients, images and the relationships between all of the aforementioned entities. The database also includes a list of medical conditions or lifestyles, and relationships between the conditions/lifestyles and the ingredients that affect the condition or lifestyle. When a user wishes to add a particular ingredient for a personalized diet, a one-to-one relationship is added to the flagging table, shown by the (NULL, 7) in Figure 2.
Fig. 2: Schema for the ingredients and conditions relationship.
3.2. Profile management
Before using the application, the user has to create a diet profile. A list of conditions/allergies/lifestyles is shown through the GUI, as well as a list of all the ingredients in the database. Figure 5a shows an example of the screen on an iPod. This is useful when the user is following a personalized diet. Once the options are selected, the database is flagged as explained in Section 3.1. Therefore, after a dish is selected, if it contains one of the ingredients in the user's profile, it appears flagged, and the user can decide whether to choose another dish or request to remove the flagged items.
Fig. 3: Block diagram for the Profile Configuration module.
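The exact schema is summarized in Figure 2 and not reproduced in full here; the following sketch (Python with sqlite3, using hypothetical table and column names) illustrates the kind of relational structure and flagging query described in Sections 3.1 and 3.2.

    import sqlite3

    # Illustrative schema only: the table and column names below are
    # assumptions, not the actual schema summarized in Figure 2.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE ingredient      (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dish            (id INTEGER PRIMARY KEY, name TEXT, image TEXT);
    CREATE TABLE dish_ingredient (dish_id INTEGER, ingredient_id INTEGER);
    CREATE TABLE diet_condition  (id INTEGER PRIMARY KEY, name TEXT);
    -- Flagging table: (condition_id, ingredient_id); a personalized entry
    -- uses a NULL condition, as in the (NULL, 7) example of Figure 2.
    CREATE TABLE flag            (condition_id INTEGER, ingredient_id INTEGER);
    """)

    # Dishes containing at least one ingredient flagged for the active profile.
    FLAGGED_DISHES = """
    SELECT DISTINCT d.name
    FROM dish d
    JOIN dish_ingredient di ON di.dish_id = d.id
    JOIN flag f             ON f.ingredient_id = di.ingredient_id
    WHERE f.condition_id = ? OR f.condition_id IS NULL;
    """
    print(conn.execute(FLAGGED_DISHES, (1,)).fetchall())   # [] until rows are inserted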
3.3. Modifications to Moses
SMT engines work best with large databases, occupying up to gigabytes of
memory, which is too much for most hand-held devices. However,
in our case we can reduce the size of the database by considering
the context. First, the vocabulary can be focused on restaurant menu
translations. Second, menu items are not usually sentences, but just
phrases or simply a couple of words. For example, given that the average
Spanish sentence is 18 words long, restricting entries to short menu phrases allows the database to be reduced
by up to 80%. Therefore, it is possible to make Moses work accurately using a small database. The only drawback is the work
involved in creating this reduced size database. Once the database is
created Moses has to be trained with it.
To build the database a clear understanding of the SMT paradigm
is crucial [12]. The main idea is that the probability that a string
e in the target language is the translation of a string f in the source language, p(e|f), is proportional (applying Bayes' theorem) to
p(e)p(f|e)/p(f). Since p(f) is independent of e, finding the estimate ê is
the same as finding the e that makes the product p(e)p(f|e) as large
as possible. This yields the Fundamental Equation of Machine Translation: ê = arg max_e p(e)p(f|e). The system has to perform
an exhaustive search by going through all the strings e in the native
language. This is why the database has to be manipulated carefully
so as to make the search easier and faster for the decoder. In addition, an
n-gram linguistic model approximates the language model [13]. Its
objective is to predict the next item in a sequence of n words. For
example, the phrase calamares a la romana would produce the 3-grams
calamares a la, a la romana and la romana #, plus the respective 1-grams and 2-grams. It is therefore important to determine the
value of n that optimizes the results. As Spanish dishes are not always literally translated to English (e.g., arroz a la cubana → rice
with fried eggs and banana fritters) a logical equivalence is needed,
in order not to confuse the decoder. There are two possible solutions: 1) standardize a translation (e.g., arroz a la cubana → Cuba-style rice) or 2) work with multiple phrase-tables, some trained and
some not. In this project both of them are used, depending on the
complexity or the clarity of the translation.
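To make this concrete, the toy sketch below (Python) scores candidate English strings e for a Spanish input f with the product p(e)p(f|e); the probabilities and candidate lists are invented for illustration, and a unigram model stands in for the 3-gram language model actually used.

    from math import prod

    # Toy noisy-channel scoring: ê = argmax_e p(e) * p(f|e).
    # All probabilities below are invented for illustration only.
    LM = {"cuba-style": 0.02, "rice": 0.05, "cream": 0.04, "mint": 0.01}   # p(word)
    TM = {("arroz", "rice"): 0.9, ("a la cubana", "cuba-style"): 0.8}      # p(f_phrase | e_word)

    def p_lm(e_words):
        return prod(LM.get(w, 1e-6) for w in e_words)

    def p_tm(f_phrases, e_words):
        return prod(TM.get((f, e), 1e-6) for f, e in zip(f_phrases, e_words))

    def best_translation(f_phrases, candidates):
        return max(candidates, key=lambda e: p_lm(e) * p_tm(f_phrases, e))

    f = ["arroz", "a la cubana"]
    candidates = [["rice", "cuba-style"], ["rice", "cream"]]
    print(best_translation(f, candidates))   # -> ['rice', 'cuba-style']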
Another aspect to bear in mind is the accuracy of the translation when more than one translation is possible. Our solution for
these ambiguities is to use pattern repetition in the phrase-table so
as to modify the probability tables for some words or phrases. For instance, if the phrase-table contains comida → food and comida →
meal, comida will be translated as food with 0.5 probability, and the
same will happen with meal. But if the phrase-table also contains
comida rápida → fast food and comida basura → junk food,
p(comida → food) = 0.75 and p(comida → meal) = 0.25. In a similar way, training can be avoided for those phrases that can lead to confusion and
decrease the translator's accuracy (i.e., Spanish items with little to no
relation to their English form, like fixed price menu ↔ menú del
día). They can be put in separate phrase-tables with a forced 1.0
probability. This way, they do not interfere with the training; at the
same time they still have a high enough probability to be eligible
for a translation. On the other hand, if the decoder finds a similar
structure (e.g., the plural fixed price menus) the string will not be
recognized. Therefore, both singular and plural (and other forms)
have to be manually added to the database.
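The probability shift described above amounts to relative-frequency counting over aligned word pairs; the Python sketch below reproduces the comida example under that simplification (it is not how Moses computes its translation probabilities internally).

    from collections import Counter, defaultdict

    # Word-level relative frequencies over aligned pairs: an illustrative
    # approximation of how repeated patterns in the phrase-table shift
    # the probability of a word translation.
    ALIGNED_PAIRS = [
        ("comida", "food"), ("comida", "meal"),
        ("comida", "food"),   # from "comida rápida" -> "fast food"
        ("comida", "food"),   # from "comida basura" -> "junk food"
    ]

    counts = defaultdict(Counter)
    for src, tgt in ALIGNED_PAIRS:
        counts[src][tgt] += 1

    def p(tgt, src):
        total = sum(counts[src].values())
        return counts[src][tgt] / total

    print(p("food", "comida"), p("meal", "comida"))   # 0.75 0.25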
The training database has to be split into two files: one containing the Spanish words/phrases (one per line), and the other one
containing the English words/phrases (one per line). The one-to-one
databases are single files with both the Spanish and the English pairs
(one per line) and their probability manually set to 1.0. The training files are used to build the n-gram language model with SRILM.
Restaurant menu items do not usually have more than four words,
and they can be easily split and separately translated if they are more
complex. That is why the order does not have a great impact on
training, and a 3-gram language model provides enough information. Once trained, the main database has its own automatically estimated weights. Moses is configured to provide an n-best-list of
results (multiple output), and the user can use his/her best judgment
to determine the most appropriate result (or use the pictures shown
on the device to obtain clarification from the waiter).
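As a sketch of this data preparation (the glossary file name and its tab-separated format are assumptions, not part of the Moses toolkit), the parallel corpus can be split into the two one-phrase-per-line files as follows.

    # Split a hypothetical tab-separated glossary ("spanish<TAB>english" per line)
    # into the two parallel training files described above: one Spanish phrase per
    # line and the matching English phrase on the same line number of the other file.
    with open("menu_glossary.tsv", encoding="utf-8") as src, \
         open("corpus.es", "w", encoding="utf-8") as es, \
         open("corpus.en", "w", encoding="utf-8") as en:
        for line in src:
            spanish, english = line.rstrip("\n").split("\t")
            es.write(spanish.lower() + "\n")   # simple normalization (assumed)
            en.write(english.lower() + "\n")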
Increasing the database size with new dishes is one way to increase the accuracy, but there is another way to obtain better accuracy while maintaining or even reducing the database. The idea
is to match n-grams to their respective translations to increase the
probability of success. Table 1 shows some examples of translations and their corresponding position indices, indicating word relationships. For instance, arroz (SPA position 0) ↔ rice (ENG position 1), and a la
cubana (SPA positions 1,2,3) ↔ Cuba-style (ENG position 2). Moses takes all the Spanish-English pairs and checks the possible combinations, depending on the
n-gram order set during the training process. The most frequent combinations in the phrase-tables are assigned the largest probability. However, as seen in Table 1, there are many different dishes containing the string
...a la... with different position indices.
Spanish              English            Position indices
arroz a la cubana    Cuba-style rice    0=1, 1=0, 2=0, 3=0
crema a la menta     mint cream         0=1, 1=0, 2=0, 3=0
pato a la naranja    duck à l'orange    0=0, 1=1, 2=2, 3=3
cordero a la miel    lamb with honey    0=0, 1=1, 2=1, 3=2
Table 1: Examples of position indices.
There is not a defined structure for these cases, hence there is not
a defined translation for the Spanish string ...a la.... This problem
would be solved by training the system with a very large database,
forcing the phrase-table size to increase to gigabytes. As this is not
an acceptable solution for hand-held devices, it is worth studying
carefully how the overall system works, in order to reduce the possible combinations and thus increase the probability of a good
translation.
We propose to reduce long n-grams by putting multiple words
together, thus balancing the indices on both the Spanish and English sides. For example, instead of trying to find a general translation for ...a la..., it is better to focus the probability on known
dishes that include ...a la..., e.g., arroz aXlaXcubana ↔ Cuba-style rice. Therefore, a possible 4-gram (16 possible matches) is reduced to a simple balanced (0=1, 1=0) 2-gram (4
possible matches). By doing this, the 3-gram count in the database
is reduced by 2.77% (now 1,018), while the 2-gram and the 1-gram
counts are increased by 4.01% (now 6,456) and 8.46% (now 1,475) respectively. This decreases the number of words in the database by
27.57% (now 17,527) and increases its number of lines/entries by
4.16% (now 5,533). This method, which we call “n-gram consolidation”, uses special strings to join words, so they cannot be misinterpreted. Table 2 shows some default translations and their probabilities, for cases where a specific translation for the input has not been
found in the database.
String                       Prob.    String                       Prob.
...de... ↔ ...of...          1.0      ...al... ↔ ...with...        0.5
...al... ↔ ...au...          0.5      ...aXla... ↔ ...with...      0.5
...aXla... ↔ ...à la...      0.5      ...en... ↔ ...in...          0.5
...en... ↔ ...with...        0.5      ...del... ↔ ...of the...     1.0
Table 2: Default translations and probabilities for version 2.0.
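A minimal sketch of the consolidation step itself is shown below (Python); the joining marker X and the arroz a la cubana example follow the text above, but the list of sequences to join is an illustrative assumption rather than our full rule set.

    # Join frequent Spanish function-word sequences into single tokens before
    # n-gram extraction, e.g. "arroz a la cubana" -> "arroz aXlaXcubana".
    CONSOLIDATE = [("a", "la")]     # sequences to join; illustrative subset

    def consolidate(tokens):
        out, i = [], 0
        while i < len(tokens):
            joined = False
            for seq in CONSOLIDATE:
                if tuple(tokens[i:i + len(seq)]) == seq:
                    # Join the sequence and the word that follows it, if any.
                    end = min(i + len(seq) + 1, len(tokens))
                    out.append("X".join(tokens[i:end]))
                    i = end
                    joined = True
                    break
            if not joined:
                out.append(tokens[i])
                i += 1
        return out

    print(consolidate("arroz a la cubana".split()))   # ['arroz', 'aXlaXcubana']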
4. EXPERIMENTAL RESULTS
We have analyzed the performance and storage bottlenecks (available data storage, memory (RAM) and processing power) of a mobile device. We tested the speed and accuracy of both the version
with n-gram consolidation (v2.0) and the one without (v1.0). A 500-entry
list of random Spanish restaurant items was used as input, and
the output was evaluated by a Spanish speaker to determine its accuracy. Table 3 shows the results. The n-best-list option was used
in v2.0, and n was set to 3. All incorrect translations in v2.0 are due
to non-existing words in the database. The errors obtained with v1.0
also included gender and number grammar errors. The same list was
also tested using the Google Translate engine. Google Translate is
not focused on any particular context, and its output can sometimes
be literally correct, but incorrect in a food-related context. For example, andrajos, which is a kind of Spanish stew, is translated to
rag.
Fig. 4: Block Diagram for the Translation Module.
The general accuracy (i.e., when the correct translation is one
among the first 3 best) of v2.0 is 86.8%, while that of v1.0 is 75%.
When v2.0 found the correct translation, it was in the first position
(i.e., the most likely translation) 95.8% of the time.
Engine    Correct    1st    Incorrect    Accuracy    1st
v2.0      434        416    66           86.8%       83.2%
Google    365        -      135          73%         -
v1.0      375        -      125          75%         -
Table 3: Translation accuracy for the two versions of our system
and for Google Translate.
We tested the speed of the method using various food-related
items ranging from one to ten words, with a median of four words.
We tested both items existing in the phrase-tables and non-existing ones. The computation time was around 0.5 s for all of them.
The short computation time is only partly due to the small size of our
database (17,527 words). For example, by replacing our database
with the first 10,000 words from the Spanish-English WMT08 News
Commentary database, the computation time actually increases to one
second per expression on average (see Table 4).
Database           Phrase-table size    Speed
1/4% of WMT08      100,000 words        8.5 s
1/8% of WMT08      50,000 words         4.5 s
1/16% of WMT08     25,000 words         2.5 s
v2.0               17,527 words         0.5 s
1/40% of WMT08     10,000 words         1.0 s
1/160% of WMT08    2,500 words          0.0 s
Table 4: Translation speed comparison for various databases.
Our application requires 17.52 MB of physical memory, including the main executable, the language model file and the phrase-tables (i.e., without the images and ingredient list).
5. SYSTEM IMPLEMENTATION
We implemented our translation method on a second-generation iPod
Touch (ARM11 533 MHz, 128 MB DRAM). To do this, we first
designed a relational database and used it to store images and ingredient lists along with relational information about Spanish dishes and
their ingredients. Second, we created a semantic model and used it
to represent the data during runtime. Third, we developed a parser
to automate the population of the menu database with images and
relational data. Finally, we developed a Graphical User Interface
(GUI). The user’s interaction with the GUI is bidirectional, since all
the data is internally connected and the user can switch dishes and
ingredients, and access additional information. Figure 5d illustrates
an example of the dish browsing screen. Any ingredient may be
tapped on in order to obtain further information.
We implemented the n-gram consolidation version of our translation software (v2.0) with one trained phrase-table and six one-to-one phrase-tables. Moses gives the option to memory-map the language model (LM) and the phrase-tables, a recommended procedure
for large data sets or devices with minimal RAM, which is the case
with mobile devices. In other words, only the phrase-table pairs required to translate the input are loaded into memory.
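The effect of memory-mapping can be pictured with the sketch below; it illustrates the general concept in Python over a hypothetical sorted, line-oriented phrase-table file, not Moses's actual binary table format.

    import mmap

    # Memory-map a (hypothetical) sorted phrase-table: the operating system
    # pages in only the regions that are actually read, so a lookup does not
    # load the whole table into RAM. A real implementation would binary-search
    # the sorted file instead of scanning it with find().
    with open("phrase-table.sorted", "rb") as f:
        table = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        offset = table.find(b"\narroz a la cubana ||| ")
        if offset >= 0:
            end = table.find(b"\n", offset + 1)
            print(table[offset + 1:end].decode("utf-8"))
        table.close()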
We measured the CPU time needed to translate various menu
items after loading the software. The translation times for the Spanish food item arroz a la cubana in five tests under the same conditions showed almost instantaneous results, with an average of 0.09
seconds. Similar times were obtained for different dishes and word
combinations.
The total memory size of the phrase-tables plus the LM file is
2.6 MB, and the size of the entire application in the iPod, including
the executable and the image database, is 9.56 MB. This is smaller than
the desktop version because the portable version does
not need the SRILM tool; the Moses engine's internal LM is used
instead.
The database used for the tests contains 155 images of dishes
and ingredients. The total size of the image database is 5.24 MB. Assuming linearity of growth, increasing the database to 1,000 images
would increase the memory to 37.82 MB, still a reasonable value; the
translation speed would not be affected because it is not dependent
on the size of the database.
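For reference, 155 images occupying 5.24 MB is roughly 34 KB per image, so 1,000 similar images would occupy about 1,000 × 0.034 ≈ 34 MB; adding the roughly 4.3 MB of non-image application data (9.56 MB − 5.24 MB) gives approximately the 37.82 MB figure quoted above.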
6. CONCLUSIONS
We have proposed a system that can aid medical diet management
in foreign countries. The system relies on the use of a hand-held
portable device such as a mobile telephone or PDA to translate, interpret and disambiguate restaurant menu items in a diet-specific fashion.
The profile management system allows the user to personalize a
diet as made necessary by a medical condition or a lifestyle choice
(e.g., vegetarianism). An accurate translation and interpretation of
the restaurant menu item description is obtained in real-time using a
context-specific Machine Translation (MT) engine. This MT engine
was obtained by modifying an existing open source system. The
modifications include the use of a context specific database which
provides an n-best list of possible translations, and a browsable multimedia database.
In our tests, Google Translate yielded the correct translation 73%
of the time. In contrast, our system output the correct translation in
first position 83.2% of the time. Moreover, the correct translation
was within the three top-ranked translations 86.8% of the time.
It would be possible to further increase the accuracy of our system
using the proposed framework, but there is a trade-off between accuracy and the size of the translation tables.
Ambiguities and translation errors are mitigated through the use
of a browsable database of pictures and ingredients along with disambiguation dialogues.
Fig. 5: Snapshots of our GUI on the iPod.
A proof-of-concept system has been implemented in a network-independent
environment using a second-generation iPod Touch. The
real-time translation is fast (0.09 seconds on average) and the application has a memory size of 9.56 MB, including the multimedia
database.
The context-driven, food-related phrase-tables unique to our system have reduced the size of the usual statistics-based database from several
GB to a few MB. Our proposed “n-gram consolidation” step allows
us to prune the database, which further decreases the memory requirements while increasing accuracy. One could further build on
this proof-of-concept system to make it a tool for individuals with
special diets by combining it with a database of nutritional information.
7. REFERENCES
[1] K. Patrick, W. G. Griswold, F. Raab, and S. S. Intille,
“Health and the mobile phone,” American journal of preventive medicine, vol. 35, no. 2, pp. 177–181, August 2008.
[2] F. Zhu, M. Bosch, I. Woo, S. Kim, C. J. Boushey, D. S. Ebert,
and E. J. Delp, “The use of mobile devices in aiding dietary
assessment and evaluation,” IEEE Journal of Selected Topics
in Signal Processing, vol. 4, no. 4, pp. 756–766, August 2010.
[3] B. Six, T. Schap, F. Zhu, A. Mariappan, M. Bosch, E. Delp,
D. Ebert, D. Kerr, and C. Boushey, “Evidence-based development of a mobile telephone food record,” Journal of the American
Dietetic Association, pp. 74–79, January 2010.
[4] C. Callison-Burch, P. Koehn, C. Monz, and J. Schroeder,
“Findings of the 2009 workshop on statistical machine translation,” Proceedings of the Fourth Workshop on Statistical Machine Translation, ser. StatMT ’09, Stroudsburg, PA, USA,
2009, pp. 1–28.
[5] L. Mangu, E. Brill, and A. Stolcke, “Finding Consensus in
Speech Recognition: Word Error Minimization and Other Applications of Confusion Networks,” Computer Speech and Language, vol. 14, no. 4, pp. 373–400, 2000.
[6] R. W. Tromble, S. Kumar, F. Och, and W. Macherey, “Lattice
minimum bayes-risk decoding for statistical machine translation,” Proceedings of the Conference on Empirical Methods in
Natural Language Processing, Stroudsburg, PA, USA, 2008,
pp. 620–629.
[7] P. Koehn and H. Hoang, “Factored Translation Models,” Proceedings of the 2007 Joint Conference on Empirical Methods
in Natural Language Processing and Computational Natural
Language Learning (EMNLP-CoNLL), pp. 868–876.
[8] F. J. Och and H. Ney, “A Systematic Comparison of Various Statistical Alignment Models,” Computational Linguistics,
vol. 29, pp. 19–51, March 2003.
[9] P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico,
N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer,
O. Bojar, A. Constantin, and E. Herbst, “Moses: Open source
toolkit for statistical machine translation,” ACL, The Association for Computational Linguistics, 2007.
[10] C. Tillmann and H. Ney, “Word reordering and a dynamic programming beam search algorithm for statistical machine translation,” Computational Linguistics, vol. 29, pp. 97–133, 2003.
[11] R. Zens and H. Ney, “Efficient phrase-table representation for
machine translation with applications to online MT and speech
translation,” Proceedings of Human Language Technologies
2007, Rochester, New York, April 2007, pp. 492–499.
[12] P. F. Brown, V. J. Pietra, S. A. D. Pietra, and R. L. Mercer,
“The Mathematics of Statistical Machine Translation: Parameter Estimation,” Computational Linguistics, vol. 19, pp. 263–
311, 1993.
[13] P. F. Brown, V. J. D. Pietra, P. V. deSouza, J. C. Lai, and R. L.
Mercer, “Class-based n-gram models of natural language,”
Computational Linguistics, vol. 18, no. 4, pp. 467–479, 1992.