Evaluation Methods
April 20, 2005
Tara Matthews
CS 160

In 160 We’ve Covered…
• Task Analysis & Contextual Inquiry
• Cognitive Walkthrough
• Heuristic Evaluation
• Wizard of Oz (WOZ) usability study w/ paper prototypes

There are many more methods…
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

How to choose a method?
• Stage of study
  – formative, iterative, summative
• Pros & cons
• Metrics
  – depends on what you want to measure
• Qualitative vs. quantitative
• Research perspective
  – CS vs. psychology vs. sociology

Pros & Cons
• Realism
• Precision
• Generalizability
• Time & cost
• Researcher expertise

Methods
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Survey
• Online / paper questionnaires distributed to target audience
• Can be used to
  – tabulate quantitative data
  – gather qualitative feedback (opinions, feelings, etc.)
• Useful at any time in study

Survey
• Pros
  – Easy to get a large number of responses.
  – Quick and easy to conduct.
  – Highly generalizable.
• Cons
  – Self-selection of respondents.
  – Participants often only offer enough information to answer the question.
  – Can miss details.
  – Low in realism and precision.

Interview
• Evaluators formulate questions on the issues of interest.
• Interview representative users, asking them these questions to gather the desired information.
• Interviewer reads questions to user, who replies verbally; interviewer records responses.

Interview
• Pros
  – Quick and easy to conduct.
  – Gives designer quick feedback on a range of ideas.
  – Can get a person’s initial reaction to an idea.
  – Can get detailed information from a person.
• Cons
  – Often takes place away from natural setting.
  – Question wording or interviewer “body language” can bias answers.
  – High probability of false positives and missed problems (e.g., users may not have a clear idea of how an app will be used).
  – Can miss details if interviewer doesn’t know what issues to draw out.

Controlled Lab Experiment
• In lab, manipulate one feature of a system to assess the causal effect of that change on other behaviors of the system.
• Example:
  – in lab, show users 4 versions of a website:
    • blue, yellow, red, and black text
  – measure time to find specific words
  – compare

Controlled Lab Experiment
• Pros
  – Provides precise, quantifiable data.
  – Easier to draw inferences from data.
  – Relatively quick.
  – Can get a medium-sized number of participants.
• Cons
  – Short duration of a lab experiment may not be enough to allow users to become accustomed to an app.
  – Not a natural setting; interaction may not be normal.

In-lab Observation
• Participants come to lab to "use" an interface
• Given sample tasks to complete with it
• Evaluators observe and possibly audio- or videotape
• Participants may "think out loud"
• Can use anything from a lo-fi prototype (for a project in the design stage) to an almost-complete interface
• Evaluators note participants’
  – emotions, exclamations, facial expressions, and other "qualitative" data
  – quantitative data such as time to complete a task or number of errors

In-lab Observation
• Pros
  – Relatively quick.
  – Can get a medium-sized number of participants.
• Cons
  – Observations are subjective and error prone.
  – Short duration of lab observation is not enough time for user to get accustomed to using the interface.
  – Not a natural setting; interaction may not be normal.

Controlled Field Experiment
• In natural setting, manipulate one feature of a system to assess the causal effect of that change on other behaviors of the system.
• Example:
  – Participants use 3 different input devices in their own office: mouse with 1, 2, or 3 buttons
  – Perform a set of tasks
  – Measure differences

Controlled Field Experiment
• Pros
  – Less intrusive than most other evaluation methods.
  – Provides more precise data than field observation.
  – Can observe natural behavior of user (though some part of the system will be controlled/unnatural).
• Cons
  – More intrusive than field observation.
  – Less natural than field observation.

Field Observation Study
• Evaluator makes direct observations of “natural” systems
• Takes care to not intrude on / disturb those systems
• A.K.A. “ethnography”

Field Observation Study
• Pros
  – Only way to observe natural behavior of user & interaction between user & tools.
• Cons
  – Difficult and time consuming.
  – Hard to get permission to observe people.
  – Observations are subjective and error prone.
  – Cannot make strong interpretations from observations.
  – Not very generalizable.

Heuristic Evaluation
• Pros
  – Quick and easy.
• Cons
  – Nielsen’s heuristics may not be as relevant to non-GUIs.
  – Results in false positives and missed problems, especially when experts are not part of target audience.

Cognitive Walkthrough
• Pros
  – Quick and easy.
• Cons
  – Results in false positives and missed problems when evaluator is different from target audience.

Automated Observation Study
• Techniques include
  – video or audio recording of user
  – pop-up screens
  – screen shots
  – time logging
  – logging users’ actions (collecting statistics about detailed system use)

Automated Observation Study
• Pros
  – Eases burden on observers for data collection & analysis.
• Cons
  – Setup is often more time-consuming to complete.
  – Harder to get approved if it involves analysis of videotape or audiotape.
  – May miss nuanced/interpretive details.

Experimental Simulation
• In-lab experiment that is as much like some real situation as possible.
• Example:
  – ground-based flight simulator
  – behaves as closely as possible to a real flight
  – still under researcher control

Experimental Simulation
• Pros
  – Still fairly precise.
  – More realistic than in-lab experiment.
• Cons (same as lab exp.)
  – Short duration of a lab experiment may not be enough to allow users to become accustomed to an app.
  – Not a natural setting; interaction may not be normal.

Claims Analysis
• Claim = statement that a certain aspect of a design (e.g., a button or scrollbar) has psychological implications, reflected in how capably a user can use that design
• UI artifacts are listed along with their design features & pros/cons
• Helps
  – select among alternative designs
  – clarify questions to be analyzed through user testing by stating how the design should work (in claims)

GOMS
• A method to describe user tasks and how a user performs those tasks with a specific interface design
• Views humans as information processors
  – Small number of cognitive, perceptual, and motor operators characterize user behavior
• To apply GOMS:
  – Analyze task to identify user goals (hierarchical)
  – Identify operators to achieve goals
  – Sum operator times to predict performance
• GOMS =
  – Goals: What a user wants to accomplish
  – Operators: Cognitive or physical actions that change the state of the user or the system
  – Methods: Groups of goals and operators
  – Selection rules: Determine which method to apply

GOMS
• Pros
  – Predict human performance before committing to a specific design in code or running user studies
  – Many studies have validated the model (it works)
• Cons
  – Assumes error-free, skilled user behavior
  – No formal recipe for how to perform analysis
  – Significant time investment

Computer Simulation
• Creating a complete & closed system that models the operation of the concrete system without users.
• Example:
  – geophysical process going on in connection with the eruption of Mount St. Helens

Computer Simulation
• Pros
  – Supposedly high in realism (depends on accuracy of data/system replication)
• Cons
  – Low in precision & generalizability

Formal Theory
• Formulating general relations (propositions, hypotheses, or postulates) among a number of variables of interest.
• Pros
  – Relatively generalizable
• Cons
  – Not realistic or precise

How to choose a method?
• Stage of study
• Pros & cons
  – Realism
  – Precision
  – Generalizability
  – Time & cost
• Researcher expertise
• Metrics
• Qualitative vs. quantitative
• Research perspective

Methods
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Early Stage
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Iterative & Summative Stages
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Realism
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Precision
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Generalizability
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Time & Cost
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Researcher Perspective
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Metrics: examples
• Traditional GUIs:
  – efficiency (time to complete task)
  – accuracy (# of errors)
  – simplicity
• Peripheral Displays:
  – awareness (recall)
  – distraction (dual-task behavior)
  – aesthetics

Peripheral Displays
• Survey
• Interview
• Controlled-lab experiment
• In-lab observation
• Controlled field experiment
• Field observation study
• Heuristic Evaluation
• Cognitive Walkthrough
• Contextual Inquiry
• Automated observation user study
• Experimental simulation
• Claims analysis
• GOMS
• Computer simulation
• Formal theory

Questions?
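
Worked example: GOMS
• The GOMS slides say to identify operators for a goal and then "sum operator times to predict performance". The Python sketch below is one possible way to do that arithmetic; it is not from the lecture. The operator codes, their durations, and the two "delete a file" methods are all hypothetical, chosen only to illustrate the idea (the durations are roughly in the range of commonly cited keystroke-level estimates, but treat them as placeholders).

```python
# Illustrative operator durations in seconds.
# ASSUMPTION: these values are placeholders for this sketch, not figures
# taken from the slides; substitute measured or published times as needed.
OPERATOR_SECONDS = {
    "M": 1.35,  # mental preparation
    "H": 0.40,  # move hand between keyboard and mouse
    "P": 1.10,  # point at a target with the mouse
    "K": 0.28,  # press a key or mouse button
}


def predict_time(operators):
    """Predict task time by summing the duration of each operator in order."""
    return sum(OPERATOR_SECONDS[op] for op in operators)


# ASSUMPTION: two hypothetical methods for the same goal, "delete a file".
METHODS = {
    # think, reach for mouse, point at file, click, think, point at menu item, click
    "menu": ["M", "H", "P", "K", "M", "P", "K"],
    # think, reach for mouse, point at file, click, hand back to keyboard, think, press key
    "shortcut": ["M", "H", "P", "K", "H", "M", "K"],
}


def compare_methods(methods):
    """Return predicted seconds per method, rounded for readability."""
    return {name: round(predict_time(ops), 2) for name, ops in methods.items()}


if __name__ == "__main__":
    for name, seconds in compare_methods(METHODS).items():
        print(f"{name}: {seconds:.2f} s")
```

• Like GOMS itself, this sketch assumes error-free, skilled behavior: it adds fixed operator times and has no notion of mistakes, learning, or variation between users.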