Contextual Inquiry and Task Analysis for Group 3
Group members
Jeremy Syn - Worked on various parts (Problem and Solution Overview, Task Analysis Questions, Storyboard
Sketching)
Michael So - Did the list of 6 tasks, did storyboard sketching, and took part in interviews. Also, did the footnote.
Henry Su - Did the Contextual Inquiry - Interview Descriptions section, and one of the story boards, was there for
the interviews (co-conducted 2 of them)
How-Kil "Eric" Chung - Was there for interviews (like everyone was), also worked on Description of Users,
Analysis of Approach, and parts of Interface Description.
Everyone worked a bit on each other's parts.
Description of Users
Megaman56 - This user is a junior in college. He is a psychology major who chose the field to better understand
social situations, especially ones that come about partly as a result of personal stubbornness.
He was born in the United States, speaks English as his first language, and is also learning Mandarin.
His likes include listening to music and watching movies. He is familiar with computing
for everyday use (such as email) and for research (for college, for example). He uses his cellular phone
in a fairly basic way: as a phone and for texting. He doesn't use it for games or web surfing, and
he rarely takes pictures with it. This user has gone on vacation to Europe (Italy and France), so he
gives us a tourist's point of view, and tourists are one of our key target user groups.
KirbySuperstar1995 - This user is a third-year college student, double majoring in Math and Music. He was born in the
United States and uses English as his primary language. He took Spanish in high school but doesn't have
a strong grasp of it, since it was a rudimentary course taken a long time ago, and he doesn't
know any other second language. He has many family members living in foreign
countries such as France and Spain. Some of his likes are music, math, women, and travel. He feels
uncomfortable being American when visiting overseas countries, because he feels out of place and gives off an
obvious touristy aura. He has a fairly strong technical background and even knows a bit
of programming (C++). He owns an average cell phone with no special features. He has visited his
family in France before, so his experiences should provide us with significant information.
Charmander♥ - This user is a senior in college. His tech level is higher than that of a typical user: he
does not keep up with the latest gadgets and technology, but he has done things like building Facebook applications. He
was born in Hong Kong and grew up bilingual. His hobbies include watching movies and TV and playing
video games. As a user, he dislikes having to repeat himself, for example when people say
something and have to say it again because they were not understood. More
generally, he dislikes communication problems, such as when people who don't understand a language try to use a
dictionary; not only is it slow and disruptive, but it is often wrong as well, because the first definition or word
in the dictionary is often used without regard to meaning. Our rationale for choosing this user (along with the fact
that he was someone we had access to) is that he has dealt with (and still deals with)
communication issues more frequently and in a larger variety of contexts than someone who is
simply traveling. This user gives us a fresh look at the topic.
Footnote:
Our Target Users in General:
We have a diverse set of potential target users for our language translation application. The following is a list of our target users and our
rationale behind them:
* ESL (English as a Second Language) people: Because ESL people are learning a new language (English, to be exact), having
a translation device will ease their learning experience. Our translation application can help them in their studies. They can
take pictures of words or sentences they do not know, which appear in places such as textbooks, chalkboards, or shirts, and
get a translation quickly and easily.
* Vacationers (an American visiting another country, or even visiting a Chinatown, etc.): A vacationer may not
understand the foreign country's language, so to understand it, a vacationer needs a translation device.
* English speakers/readers who want to read a restaurant menu written solely in a foreign language (also common in Chinatown):
Because they cannot read the foreign language printed on the menu, a translation device will make those words
understandable, and they will be able to successfully order their preferred menu item(s).
* People who watch foreign dramas but don't completely understand the language: These people do not
understand the foreign words spoken by the actors. Usually, if they cannot follow by listening,
they turn on the subtitles. But if the only subtitles available are also in the foreign language, a device that translates the subtitles
would be awesome.
* Traditional translation device users (for various purposes): These users are, by definition, already in need of
translations.
* Everyday people who want to communicate in any area where a foreign language is spoken (it doesn't have to be a specific area like
Chinatown; it can be anywhere within a city, as in the case of Charmander♥).
Problem and Solution Overview
The problem that some people have when traveling overseas is that they do not have a strong grasp of the native
language. They may feel lost and confused without proper language aids. Without the
ability to read signs or communicate with the locals, one may feel insecure traveling in that kind of
environment. This problem can even extend to visiting local areas and shops where the residents or workers do
not speak fluent English. When looking at a menu with no translations next to it, the person won't know
what to order. The solution we are proposing is a mobile application that lets you take a picture of some
text, such as menu items or signs, or even of symbolic signs such as road signs, and then translates it
into a language of your choice. For example, if you travel to China, you can take a picture of a sign and your
mobile application will tell you what the sign means. This solution is meant to make traveling
in an unknown, foreign area more comfortable.
Contextual Inquiry - Interview Descriptions:
First of all, due to the nature of the problem our application solves, it was not possible to do true contextual
interviews or to reach all of our potential target users. The reason is that we would either need to find a person
who cannot read English but can read another language (perhaps a very recent immigrant), or we would have to
take an interviewee to a foreign country (where the interviewee does not know the language). Clearly, setting up
such contextual interviews in a matter of a week or two is nearly impossible, not to mention the money required
if we were to go to a foreign country. Thus, we instead used the "recall" strategy mentioned in the article to give
our interviews a contextual flavor. This means that we tried to avoid having the interviewees summarize their
experiences with foreign languages, and instead asked them to talk about specific instances while we asked questions
that would help them recall the finer details. Also, because these interviews were not truly contextual, they were
done in restaurants over a meal. All three interviewees were friends of different group members, so to make the
interviews more objective, we had the people who didn't know the interviewee conduct the interview.
One similarity among the interviewees was that none of them were seriously dependent on translation
devices or human translators. That is, they either needed them only during their travels (pleasure, not business), or
they were already bilingual to some extent. However, these "light-duty" users are still an important subset of the
user group, because we expect that many of our users would use the application either for recreational purposes
or for infrequent reference. Another shared aspect among the three interviewees was that all of them are college
students, albeit from different technological backgrounds. Because younger people tend to have less trouble
learning new languages, similar interviews with older subjects may reveal additional information.
The first interviewee, MegaMan56, talked about his vacation to France and Italy. He went with friends who
did not speak or read French or Italian, but they had a tour guide. In one instance, the tour guide let them loose.
MegaMan56 and his friends wanted to go to the Notre Dame Cathedral, but did not know how to get there. He
asked a street officer, who was in a bad mood because she was being pestered by English-speakers, and she did
not know English very well. He noted that to avoid this uncomfortable situation, he would have needed to
research the information before leaving, as many signs were not in English.
The second interviewee, KirbySuperStar1995, talked about his trip to France with his family, to celebrate a
wedding of an extended family member. Only one of his family members was bilingual in English and French,
so he sometimes had to figure things out on his own. For example, he went to a barbershop where the hairdresser
did not speak any English at all. He noted that because no translator was present, hand gestures proved to be
useful. He also mentioned that it would be nice if he could write down his intentions and have them translated
automatically into French. However, he didn't carry around a translation device or dictionary, because firstly, he
didn't have one, and secondly, he didn't feel the need to get one. He felt that it would only draw negative
attention because it would make him seem more tourist-like to the local French. He did, however, carry around
some "learn-it-now" CDs, to quickly learn some commonly used French phrases. Upon reflecting on his travel,
KirbySuperStar1995 realized that different situations varied in difficulty. For instance, eating at a restaurant was
relatively easy because most waiters knew some English, and you can always point and look at pictures. Riding a
taxi was also relatively trouble-free because he could just point to a map and gesture, "go here". However, using
the subway system was very difficult, because there were no English translations, and the maps were pretty
complicated to begin with. Lastly, the interviewee mentioned that as a whole, it is feasible to travel around a big
foreign city without a translator (human or machine), but in the more rural areas, where there is less bilingualism,
a translator may be necessary.
The third interviewee, Charmander♥, talked about several experiences dealing with the language barrier. In
one instance, he wanted to ask a restaurant owner for some information, to help him build a Facebook
application. However, the restaurant owner did not know English, and so just shooed his team out. Clearly, a
translator, be it person or device, would have prevented this outcome. Charmander♥ also talked about his
immigration experience to the US from Hong Kong, when he was in elementary school. Although Hong Kong
was bilingual as well, there were many words Americans used that weren't used in Hong Kong. He
most often encountered this when reading textbooks. He thus had to carry around an
electronic translation device to translate those new English words into Chinese. Unfortunately, the device was only
capable of translating one word at a time, so the meaning of a sentence sometimes still wasn't clear:
translating a word at a time, the device could not put the words in context. At times, the teacher was helpful
and had other bilingual students translate for him. However, he noted that the student translators (and even adult
translators, for that matter) were imperfect, as they often cut out "non-essential" words for the sake of
conciseness, at the expense of an authentic translation. As he grew up, he often found himself in the
opposite situation: now, he often has to translate from Chinese to English. In this case, the classic electronic
translator is not as helpful unless you know how to type in the Chinese characters, perhaps using pinyin
or zhuyin. He circumvented this problem by guessing the [English] answer and checking whether the Chinese
translation given matched the character in question.
List of Tasks
Easy
* Choosing from a variety of possible translations - There are many instances where a word can have multiple
translations in another language. Our application supports viewing these multiple translations, whereas certain
other translation devices give the user only one translation.
* Selecting language options - This encompasses not only what languages the user wants to translate to and from, but
also the ability to customize which language dictionaries are stored on the phone. This task is supported because
a user usually has some languages he or she wants translations from and others the user will never need.
For instance, a user may frequently visit Japan and China, so the user will most frequently use
the Chinese and Japanese translations provided by our application. Having a French dictionary would therefore
be useless to that particular user. But if the user decides to take a vacation to France, the user has the
ability to download a French dictionary onto the phone. And when the vacation is over, the user can remove that
dictionary.
Moderate
* Provide meaning for symbols and signs - In foreign areas, such as overseas countries and shopping malls,
there are many pictorial signs that are unfamiliar to the user. Our application will be
able to tell the user what these pictures mean, with a short verbal description. The signs do
not even need to be in a foreign country: a sign could be used in the user's native country and the user may still not
know what it means. The user will finally know what it means thanks to our translator application.
* Save result for future reference - There may be words that the user frequently sees or uses, so saving those
words eliminates the hassle of repeatedly translating them. For example, if the user has a
small bladder and goes to the bathroom often, and has a poor memory and
will not remember the translation for bathroom even after translating it once, having
a saved translation of a bathroom sign is helpful. This is also useful for users who wish to learn the foreign
language; the saved results feature can be used like flash cards.
Hard
* Selecting region of picture to translate (should not be necessary for 75% of cases) - For example, there
could be many signs next to each other, but the user only wants to translate a specific sign. Our application
supports the ability to take a picture, and then select the desired region in the scene for translation, eliminating the
undesirable parts of the scene.
* Advanced options such as multi-shot mode, where every x seconds, a picture is taken (may be good for
TV shows) - For example, the user is watching a foreign show with foreign subtitles. Chinese dramas, for
instance, have Chinese subtitles on screen, and the user watching them does not know how to read or understand
Chinese. The foreign subtitles usually stay on the screen for several seconds, and then a new subtitle comes on
the screen, and so on. With the multi-shot mode, the application will automatically take pictures in
succession every x seconds (x being specified by the user). So with multi-shot mode the user will easily
collect pictures of the different subtitles, which change about every x seconds. And with those pictures, the user can
have the translations done via our application.
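The region-selection task above can be sketched in a few lines. This is only a minimal illustration in Python, assuming the picture is available as a grid of pixel values and that the touch-screen selection arrives as two corner coordinates. The name crop_region and its parameters are hypothetical and are not part of any Android API.

```python
def crop_region(image, left, top, right, bottom):
    """Return the part of the picture inside the selected rectangle.

    `image` is a list of rows, each row a list of pixel values (an assumed
    representation). `right` and `bottom` are exclusive pixel coordinates.
    """
    height = len(image)
    width = len(image[0]) if height else 0

    # Accept selections dragged in any direction (e.g. bottom-right to top-left).
    left, right = sorted((left, right))
    top, bottom = sorted((top, bottom))

    # Clamp to the picture so a selection dragged past the edge is still valid.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)

    return [row[left:right] for row in image[top:bottom]]


# A 4x4 test picture whose pixel value encodes its position (row * 10 + column).
picture = [[r * 10 + c for c in range(4)] for r in range(4)]
selected = crop_region(picture, 1, 1, 3, 3)  # [[11, 12], [21, 22]]
```

Only the cropped region would then be handed to text recognition and translation, which is what lets the user ignore the neighboring signs.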
Task Analysis Questions
1. Who is going to use the system?
The users are people who travel to overseas countries or other areas in general whose main language is
foreign to them. This system could also be useful for those who want to start learning a new language.
All of the users we interviewed have some sort of experience in one of these areas.
2. What tasks do they now perform?
The tasks the users perform now are utilizing dictionaries, electronic translation devices, and human
translators.
3. What tasks are desired?
Well, the main task that is desired is being able to read languages that you would otherwise not be able to
read without having learned it. The point of this system is to remove that language barrier and to allow
the user to communicate with the foreign environment with ease.
4. How are the tasks learned?
The tasks will be learned through a simple instruction manual, which will be pictorially based to convey
the usage more easily to the user. Each step of the process will have a brief description to go with
the picture, providing only the most essential information to the user. These tasks should not be difficult
to execute.
5. Where are the tasks performed?
These tasks are performed in areas where the user is not familiar with the surrounding language. When
the user goes to a foreign land, as KirbySuperstar1995 did when he visited France,
and sees something he cannot read, he can pull out his phone, capture the image, and then
translate it to English so that he can read it.
6. What's the relationship between user and data?
The user can save the images and translations that they have taken so that they can look at them again at a
later time.
7. What other tools does the user have?
The users are also able to take multi-shots of images to catch fast changing texts such as certain billboard
signs or words on a television screen. The user can also pick their language options so that more than one
language is available to them.
8. How do users communicate with each other?
Users will be able to send other users the images that they take, which may be useful to them as
well. For example, if you are traveling with a family member but temporarily separate from them, you
can send them your images so that the next time they come around they'll already have the translation.
9. How often are the tasks performed?
The tasks are performed whenever the user feels like he wants to know what something means but he can't
read it because of the language barrier.
10. What are the time constraints on the task?
The user won't always have a lot of time on their hands. They may not be able to stay there all day and so
the user would have to take pictures quickly and move on in order to experience all the events that the
area has to offer.
11. What happens when things go wrong?
If things go wrong, the user can just retake the picture or refocus on a part of the picture or look up an
alternative definition that would make the context clearer.
Interface Design
Functionality summary
There are a lot of things we can do with the pictures. First, you can take a picture. That implies that you can
store pictures and organize/manage them as well. You can send the pictures to others, which will be useful
especially with translations. You can also view the pictures in different ways (zoom, highlight, and
selection), which will be useful especially when specifying what to translate and, if you want the translation
directly on the picture, where to put the text. Finally, you can extract text or specific pictures (like signs) from
the image (if applicable).
Text can also be manipulated in this program. Any text keyed in or taken from a picture can be translated.
Several translations can be available as well, especially if there is not enough context. The translation can use
locally or remotely stored dictionaries (the cell phone should act as a first resource and also as a cache).
For both pictures and text, the "to" and "from" languages should be easy to change for the specific picture or text.
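The "first resource and cache" idea can be illustrated with a short Python sketch. It assumes that each dictionary on the phone is a simple mapping from a word to a list of possible translations, and that the remote service is some function we can call. The class name TranslationLookup, the remote_lookup parameter, and the data layout are all hypothetical.

```python
class TranslationLookup:
    """Look up translations on the phone first, then fall back to a remote service."""

    def __init__(self, local_dictionaries, remote_lookup):
        # local_dictionaries: {(from_lang, to_lang): {word: [translations]}}
        self.local = local_dictionaries
        # remote_lookup: callable (word, from_lang, to_lang) -> list of translations
        self.remote_lookup = remote_lookup
        # Results fetched remotely are remembered so they are only fetched once.
        self.cache = {}

    def translate(self, word, from_lang, to_lang):
        """Return every known translation so the user can choose among them."""
        dictionary = self.local.get((from_lang, to_lang), {})
        if word in dictionary:
            return list(dictionary[word])

        key = (from_lang, to_lang, word)
        if key not in self.cache:
            self.cache[key] = list(self.remote_lookup(word, from_lang, to_lang))
        return list(self.cache[key])
```

Returning a list rather than a single word keeps the "several translations" feature available no matter where the answer came from.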
There's also an area outside of the main functionality that we got from the contextual interviews, which is
like a tourist guidebook but more detailed. It has some common/basic phrases and monuments/attractions like
most tour books do, but the common phrases are vocalized. We also have a database of signs and important
colors and features (for example, in Japan, stop signs are red inverted triangles rather than octagons, and the "go"
traffic light is often described as blue rather than green). We will also have a description of the culture and how
people in the country interact, so a person can blend in better and be less intrusive (not be like the typical annoying tourist).
We also have a "multi-shot" ability, where the application automatically takes a shot at a set time interval.
This can be helpful when text changes (like billboards or movies) and is also useful for walking around so you have
some context. You can play the shots back like a movie, complete with translations. Where the translations appear (such as
on or below the original text) can be changed in a separate "options" menu.
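As a rough illustration of multi-shot mode, the Python sketch below computes when shots would be taken for a given interval and then drops consecutive shots whose recognized text did not change, so that playback shows each subtitle once. It assumes text recognition has already produced a (time, text) pair for each shot. The function names capture_times and collapse_repeats are hypothetical.

```python
def capture_times(interval_seconds, duration_seconds):
    """Return the times (in seconds) at which a shot is taken, starting at 0."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    shots = int(duration_seconds // interval_seconds)
    return [i * interval_seconds for i in range(shots + 1)]


def collapse_repeats(frames):
    """Keep only the first of any run of consecutive frames with the same text.

    `frames` is a list of (time, recognized_text) pairs in time order.
    """
    segments = []
    for time, text in frames:
        if segments and segments[-1][1] == text:
            continue  # same subtitle as the previous shot, nothing new to show
        segments.append((time, text))
    return segments
```

With an interval shorter than the time a subtitle stays on screen, most shots are duplicates, so collapsing them keeps the playback short and avoids translating the same text twice.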
Using the map functionality in Android, you can also keep a pin on the map that corresponds to the
picture/movie, so you know where the picture was taken, and also it will help anyone you send the picture to.
An options menu is also available. From here, you can change things such as where the translated text should
appear by default (directly on the original part of the picture, a little to the side, or in a separate box/text file).
Language options are also available, like which languages are available directly from the phone (since
dictionaries take space, it would be nicer to keep only those that the user uses). Naturally, other things like the
default to and from languages should be in here as well (on first run, the application should auto-detect the language
Android is set to, and choose the "from" language depending on GPS).
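The default-language idea could work roughly as follows. This Python sketch assumes the phone can report its locale as a string such as "en_US" and that the GPS position has already been turned into a two-letter country code by some other service. The function default_languages and the small COUNTRY_TO_LANGUAGE table are hypothetical and cover only a few countries for illustration.

```python
# Illustrative table only: the main language for a handful of country codes.
COUNTRY_TO_LANGUAGE = {
    "FR": "fr",
    "IT": "it",
    "ES": "es",
    "JP": "ja",
    "CN": "zh",
}


def default_languages(device_locale, country_code):
    """Guess the default (from_language, to_language) pair on first run.

    The "to" language comes from the phone's own language setting. The "from"
    language comes from the country the user is in, or None when the country is
    unknown, in which case the application should ask the user.
    """
    to_language = device_locale.replace("-", "_").split("_")[0].lower()
    from_language = COUNTRY_TO_LANGUAGE.get((country_code or "").upper())
    return from_language, to_language
```

These are only starting values; as described above, the user should still be able to change both languages for any specific picture or text.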
User Interface Description/Sketches
3 scenarios of example tasks/sketches
Easy - Choosing from a variety of possible translations
This scenario shows an example of when multiple translations would be useful. The user is in a foreign
library and looks at a sign above the bookshelves. He doesn't know what it means, so he pulls out his mobile
phone and snaps a picture of it. The mobile application then translates the sign into English for the user to read,
but the translation doesn't make any sense to him. He then uses the application's multiple-translation function and
finds the next available translation, which this time makes sense contextually.
Moderate - Save result for future reference
This scenario first shows a user using the application to translate a Chinese word ("male", in this case referring to
the bathroom). He decides to save the translated word. He then shops around the mall, and twenty
minutes later, needs to use the bathroom again. He is confused by all the different Chinese signs everywhere,
so he goes back to the application to look at his "Favorite translations" list. On it, he finds "male" and clicks it, and
up comes the original picture with the Chinese word. He recognizes it on a door and happily runs there.
Hard - Selecting region of picture to translate
This scenario depicts a user selecting a region of the picture to translate. In the first frame,
there is the user and a bunch of signs in front of him. In the second frame, the user wants to translate one of the
signs. In the third frame, the user takes out his cell phone and takes a picture of the signs. Then in the fourth
frame, he uses the touch screen cropping feature to select the sign he wants to translate. When he has selected the
desired sign, he hits translate. In the fifth frame, the user gets a translation of what the sign says. And finally,
in the last frame, the user reacts to the translation.
Analysis of Approach
Android affords specific technologies, not to mention specific features as well, that will be particularly
beneficial to our project. The first is naturally going to be the ability to take relatively detailed pictures. In order
to recognize text from a picture, we should be able to take decent pictures so our text recognition program will be
able to decipher the text (especially between text and non-text pixels). It should also have the capabilities to use
and store said picture, and Android affords a certain level of RAM, nonvolatile memory (for storage) and
computing speed necessary for somewhat intensive mobile applications like this. We can use a touch screen and a
keyboard and other inputs of that nature. This will make inputting text (for translation, for specifying which part
of the picture/screen to look at, option selection and navigating databases/dictionaries, etc) easier, as opposed to
just the typical cell phone input, which would force us to find less optimal ways of navigating. Android also
affords Internet access, so we can use the Internet to store data (like a database of
words/dictionaries, etc.), since an Android phone, being mobile, has limited storage space. This will
allow us to keep frequently used data on the phone itself (for fast access and to save
memory) while still keeping a large database of extensive and complete information online. Android is also
built on a Linux kernel, so much open source code can in principle be ported to the mobile device, although it may need adapting.
Windows programs might also be usable through WINE, although this is not guaranteed on a phone, and WINE does not run
Apple software. Being able to reuse existing code would be particularly useful for
finding translation programs and image-to-text conversion programs. It would also give us leeway in how we
use the code and programs that we end up using. Internet access will also be useful for accessing databases online and
parsing web pages (like getting several different translations by using different services). Internet access is also
imperative for sharing information with others, whether through personal contacts or through general help from other users of
the application (user generated content and other such things).
There are plenty of other devices that are used for translation purposes. The biggest issue with those is that
they often don't have a camera or have poor camera quality. Since taking pictures is the central task of our application, this is
a major reason why other devices are not suited. And while other devices will often have better storage,
computing, and human interaction abilities (such as touch screens and bigger screens), they will often not be
portable, which severely decreases the mobility of the device. Our other main competitor, the traditional machine
translator, has been described as something that costs a lot for something that doesn't even work. These devices will
often translate word by word or by common phrases without attention to grammar, sentence structure or content.
Also, the translations will often be wrong because the devices cannot analyze the words. There is no recourse and
no choice among multiple translations in these situations either. They also don't have the benefit of a large database that is readily
available through wireless communication and the Internet. Finally, human translators have issues as
well; although they are the most reliable in both image-to-text conversion and translation, they often introduce some
lost-in-translation problems (the translated phrase will work, but some of the meaning in the diction may be
lost). Also, they require constant upkeep (both emotional and economic) and they reduce independence for
the user. As for PDAs, Internet access may be prohibitively costly, and there is a keypad
issue for issuing commands (although a PDA still has a touch screen). However, the issues aren't that different, since
PDAs and smart phones are converging anyway.
The approach we are taking is good for a few reasons. Firstly, having this application on a small mobile device
is good since people would rather carry less than more for the same functionality. Also, with the camera, the user is under less
pressure to decipher the text they see and try to type or draw it on the screen (drawing is a lot less
reliable anyway with current technology). There's also the issue of the user being unfamiliar with the
character set of the language in question, so they can't type it in. Being able to automate a lot of functions that
would otherwise put stress on the user is important. The touch screen will also make certain on-screen options
more intuitive.
There are cons to the approach as well, however. Even if we use commercial code, the language translation will
never be 100% perfect with current technology (or at least not at the level of a human translator). Speed may also
be a slight issue (we haven't tested it yet, however). This approach is also potentially costly.