Designing Mobile Interfaces for Novice and Low-Literate Users Bill Thies Microsoft Research India Joint work with Indrani Medhi, Thomas Smyth, Emma Brunskill, Kentaro Toyama, Ed Cutrell, Somani Patnaik, Latif Alam, Satish Kumar, and Saman Amarasinghe USID 2009 September 20, 2009 Mobile Phones in the Developing World Population in Billion 6.7 1 3.5 3.2 World Cell Phone Attained Population Users Secondary Education1 D. Bloom, Measuring Global Educational Progress, 2006 2 World Bank, 2000 3.1 1.0 Live to Bank Account Age 602 Holders Usability Barriers (Indrani Medhi) • Conducted ethnographic observations of 125 people on traditional text-based interfaces • Navigation difficulties: – Navigating hierarchical structures – Mapping soft-keys • Input difficulties: – Using scroll bars – Using checkboxes – Constructing SMS and USSD syntaxes • Language difficulties: – Specialized terms (e.g., transaction, jaundice) do not translate to local language In Ph Ke RSA Design Recommendations Case 1: Text-Based UI • Provide local language support (in both text and audio) • Minimize hierarchical structures • Avoid requiring non-numeric text • Avoid menus that require scrolling • Minimize soft-key mappings Design Space flexible Free-form speech Input method Live Operator Structured speech Typing inflexible Spoken Dialog Text-Based IVR Forms, SMS, etc. Interactive Voice Response Text Audio Output method Graphical UI flexible Graphics [+ Audio] Focus 1: Text vs. Spoken Dialog, Graphical UI Task: transfer money to a peer Participants: 58 non-literates (up to 6th standard), Bangalore Text Based Spoken Dialog Task completion 0% 72% Time taken — 5 min Help needed — 4 prompts Graphical UI Rich multimedia 100% UI (without text) 13 min 14 prompts Conclusions: • Non-text designs are strongly preferred over text-based designs • While task-completion rates are better for rich multimedia UI, speed is faster and less assistance is required on spoken-dialog system Design Recommendations Case 2: Rich Client UI • Recommendation: graphical UI with spoken input? flexible Free-form speech Input method Live Operator Structured speech Typing inflexible Spoken Dialog Text-Based IVR Forms, SMS, etc. Interactive Voice Response Text Audio Output method Graphical UI flexible Graphics [+ Audio] Focus 2: Text vs. Live Operator Task: report patient health symptoms Participants: 13 literate health workers and hospital staff, Gujarat Append to current SMS: 11. Patient’s Cough: No Cough Rare Cough Mild Cough Heavy Cough Severe Cough (with blood) Error rate Time taken - Press 1 - Press 2 - Press 3 - Press 4 - Press 5 — printed cue card— Text (Menus) Text (SMS) Live Operator 4.2% 4.5% 0.45% 1.7 min 1.6 min 2.3 min Conclusions: • Live operator interface is only one with sufficient accuracy for health data • This model is also simple to adopt and cost-effective in India (call centers cheap) • Results caused partner to switch upcoming TB program from text to operator Design Recommendations Case 3: Reporting Short Data • Recommendation (in India): use a live operator • Our proposition: Operators are under-utilized for mobile data collection • Benefits: – Lowest error rate – Less education and training needed – Most flexible interface • Challenges: – Servicing multiple callers Peer-to-Peer Media Sharing (Thomas Smyth) • If users are properly incentivized, they will overcome many barriers (slides abridged – more details to be published soon) Enabling User-Generated Content • User-Generated Content has come to define the Web – Original attraction of the Web….everyone can be a publisher – Now…Blogs, review sites, digital video, forums, news comments, … – Empowers ordinary citizens with a voice + a global audience “75% of all content on the Web is user-generated.” — Reggie Bradford, CEO of Vitrue “35% of U.S. Internet users have posted some sort of user-generated content online.” — Home Broadband Adoption 2006, Pew Internet & American Life Project • How do you enable someone to generate content… – With a low-end phone? – With limited literacy? – In their local language? Promising avenue: Leverage voice Solution: An Audio Wiki • Allow users to publish information: – Using a phone rather than a computer – Using voice rather than text • Audio recording and playback, but keypad-driven navigation – Not attempting a dialogue-based system recording, playback navigation • Rich space of applications spanning citizen’s journalism, political activism, dissemination of agriculture & health information, ... • Research challenge: making it usable – Interactive voice response (IVR) typically frustrating – Research: adaptive interfaces, audio linking, flexible playback Rich Space of Emerging Services • VoiKiosk / Spoken Web [IBM Research, ICTD 2009] – 4 months;1,000 users; 20,000 calls – Killer app: personal advertising – Toll-free number • Providing an audio frontend or analog to Twitter – – – – TwitWoop AudioBoo TwitSay TwitterFone – MySay – VoiceField – TweetCall – TweetMic But not a single one is available in India • Opportunity to redefine the “browser” for audio content Conclusions Mobile phones have usability barriers for novice and low-literate users – Use voice and graphical interfaces – Consider a call center when appropriate • If users are properly incentivized, they will overcome many barriers – As evidenced by mobile video sharing – Entertainment is a powerful motivator Future opportunity in enabling usergenerated content for novice users – Can voice services mirror the Internet? – Key challenges for user interface designers