Reports
Gordon Rugg, January 2006

Reports involve the respondent reporting aloud about something. There are various well-established forms of report, such as think aloud technique (where the respondent thinks aloud about what they are doing) and critical incident technique (where the respondent describes some past incident which was in some way critical). These forms of report can be fairly neatly categorised in terms of two criteria.

One criterion is the tense: future, present or past. Sometimes the reporting is about something that might happen in the future (as in scenarios, where you ask the respondent what would happen if a given situation arose). Sometimes it’s about what is happening now, as when a respondent is thinking aloud about their experience of trying to use a piece of software. Sometimes it’s about the past, as in critical incident technique.

Another criterion is the person. Some reports involve the respondent reporting their own future/present/past actions: first person reports. Others involve the respondent reporting on the future/present/past actions of somebody else: third person reports. Third person reports overlap with projective techniques, where you ask someone to respond as if they were some other specified person (e.g. asking carers to respond as if they were a patient). They can be useful for getting at “back” versions and for handling tasks where it’s logistically difficult to do first-person reports (for instance, any task involving language, such as calming down an angry customer). There is, however, the risk that the results from third person reports will be affected by attributions; there’s a relevant literature on attribution theory.

It’s also possible in principle to do second-person reports, for example when piloting instructions and materials for a study. You can ask one or two pilot respondents to work through the instructions etc., and to give a running commentary on what they think you’re trying to do in the study.
This could reveal some useful things about how your intended respondents might construe the instructions: for instance, whether they think there’s some sort of trick question involved. Fixing such things should improve the validity and robustness of your findings. This approach isn’t widely used, but it’s one we’re going to investigate in more depth.

This article isn’t a comprehensive summary of all the types; it focuses mainly on the ones that we happen to use most often because they’re most suited to our purposes. We haven’t given a comprehensive bibliography, but we have given enough keywords to make it pretty easy to track down the original texts when you want more detail.

Think aloud technique

Think aloud technique is pretty much what it sounds like. You ask someone to do a task, and to think aloud about what they are doing while they are doing it. This is useful for a lot of purposes, and allows you to get at various kinds of knowledge which are difficult or impossible to reach via other methods. The most obvious of these is short term memory, but think aloud technique is also useful for giving insights into whether people are tackling a task using pattern matching or sequential reasoning; it’s also useful for identifying which things they bother with, and which things they don’t notice. As ever, there are various limitations; for instance, the action of thinking aloud interferes with some types of task, so you don’t get valid insights, and analysing the output can be challenging. It can be very useful in preliminary investigations, as a way of identifying things worth following up with other techniques.

Think aloud technique has been around for a long time, under various names. These include concurrent verbalisation and on-line self-report. In some fields, learners are taught to think aloud as part of the learning process, so that the instructor can check that the learner is paying attention to the right things (for example, when driving a car).
It’s closely related to other techniques such as critical incident technique and scenarios (which involve reporting on past and hypothetical future events respectively). It can also be used projectively, when you ask the respondent to answer as if they were someone else (for instance, asking a nurse to answer as if they were an elderly patient). This can be useful for identifying where there are systematic misunderstandings between groups.

The basic concept is simple: you tell the respondent what the task is, and ask them to think aloud while doing it. If they are silent for more than a set length of time (e.g. five seconds) then you use a prearranged prompt to get them talking again (e.g. “Could you tell me what you’re thinking about now?”). These prompts should not be leading questions (e.g. “Are you looking at the background of the picture?”).

The task needs to be one where thinking aloud won’t cause interference. There are obvious problems with some verbal tasks such as interpreting or negotiating; these can be tackled to some extent by first recording the respondent doing the task in their normal way, and then playing back the recording to them, and asking them to give a commentary based around the recording. There are less obvious problems with some tasks which involve problem-solving and compiled skills, where the fact that the respondent is thinking explicitly about what they are doing causes interference (probably because they are shifting into sequential reasoning for a task they would normally tackle using pattern matching and/or parallel processing). The document on questioning methodology elsewhere on this website explains these terms, if you’re not already familiar with them.

The actual data collection using this technique is usually pretty straightforward.
One thing to watch for is respondents using visual signals which will be lost if you use only audio recording (for instance, saying “that bit of the page” and pointing to an area of the web page they’re commenting on). The other most common problem during recording is respondents either talking too much or too little. This problem can be reduced by doing a quick demonstration as part of the respondent’s briefing, in which you do a think-aloud about something completely unrelated to what they will describe, so you don’t cue them. For instance, if they’re doing a think-aloud about car advertisements, you might do a think-aloud about a painting or diagnosing an electrical fault, or whatever area of expertise you happen to have; hobbies are useful for this.

If you’re using this technique for reconnaissance, then you won’t need to do elaborate analysis. If you’re using it for your main data collection, then you will probably hit problems of some sort with the analysis. A major source of problems is that the data from this technique is usually messy, unclear and unstructured. If you have a clearly defined research question, you may be able to analyse the results straight off the tapes; if you have to transcribe the data, then this can be very time-consuming (in the order of ten hours of transcription per hour of tape, depending on how good your typing is and how loquacious your respondents are).

One thing worth looking at is what your respondents do in the first few seconds after starting the task. For some tasks you’ll get an instant response, within a second: for instance, an immediate response to an advertisement or to a Web page. This tells you that they’re using pattern matching, and responding to the image as a whole even before they’ve read any of the words on it. (What the implications are in a given case is another question, which should give you plenty of food for thought.)
For this reason, it’s worth structuring your data collection so that the respondent doesn’t see the task until after you’ve started recording, so you can record that immediate response.

Another thing worth looking at is where your respondents go quiet and start thinking. This tells you something about where problematic areas might be. A similar issue is swearwords; these are invaluable indicators of problems, particularly if you’re looking at task design or product design. You should not discourage respondents from swearing. If you are getting someone else to transcribe for you, then make sure that they don’t sanitise the transcript by leaving out or changing the swearwords. The same applies to silent areas: the convention on transcripts is to use one full stop per second of silence (so “....” shows four seconds of silence). “Um” and “er” sounds are also worth noting, for the same reason, particularly when the respondent is otherwise articulate.

You can analyse the output qualitatively, by identifying the things which are mentioned, and/or by representations such as cognitive causal maps. You can also analyse it quantitatively, by recording which things are mentioned by which respondents. A frequently used way of doing this is a table, with each column representing a respondent, and each row showing a thing which has been mentioned; each cell then shows whether or not the appropriate respondent has mentioned that thing. Alternatively, the table can show how often each respondent mentions each thing, but this is considerably more work, and makes an implicit assumption that the number of mentions corresponds directly to the thing’s importance, which may not be the case. The table below shows some hypothetical data arranged in this way, with responses for nurses and for patients (n1-n4 and p1-p4 respectively) dealing with problems affecting hospital out-patients with hand injuries.

Problem mentioned                          Nurses (n1-n4)   Patients (p1-p4)
bathing oneself                            4 of 4           4 of 4
feeding oneself                            4 of 4           4 of 4
combing hair                               4 of 4           4 of 4
doing washing up                           3 of 4           4 of 4
changing TV channels with remote control   1 of 4           4 of 4

These fictitious results show that all the nurses, and all the patients, correctly identified bathing, feeding and combing hair as important; most of the nurses identified doing the washing up as important, but only one nurse identified changing TV channels with the remote control as being an important potential problem.

Think aloud technique has been applied to a lot of areas. It works well for investigating people’s perceptions of artefacts and products (it’s widely used for investigating perceptions of advertisements and of software interfaces, including Web pages); it also works well with assessing how usable something is, and with investigating how people tackle tasks and problems (which can be useful in training and education, especially if you compare how experts and novices tackle something).

One problem with data from this technique is that respondents often raise tantalising points, but don’t unpack them. One simple solution is to use think-aloud technique in combination with laddering, and either unpack the points when they are mentioned, or work through them all at the end of the think-aloud session in a follow-up laddering session.

The classic source for think-aloud technique is Newell & Simon’s work. It’s also described in the textbook on Human Computer Interaction by Dix et al.

Some examples of student projects using this technique:
Glenn McIntyre and Kim Ridsdale used it to investigate which features of a website were perceived as important for assessing the security of the website and the quality of product being offered by the website respectively.
Mira Chernikova used this technique to compare perceptions of websites for Australian tourist sites among Russian, Dutch and British respondents, and found some interesting differences between these cultures in relation to their perceptions of the websites. (This involved collecting data in three languages, and then translating it before analysing it, which is why this technique is seldom used for cross-cultural work, even though it gives useful insights; Andy Hurd’s use of card sorts for cross-cultural elicitation shows a different way of tackling this problem.)
Colette Best used think-aloud technique to investigate which things users wanted from websites providing online tutorials (as opposed to what the literature suggested to be the key things which online tutorials should provide).
Zoe Szymansky used think-aloud technique to investigate gender bias in website design.

Scenarios

Scenarios involve asking respondents to report on what they, or some other person, would do if a given situation arose. This can be useful for handling what-if cases and cases which couldn’t be handled via present-tense reports: for instance, dangerous situations or ones which would be logistically difficult to arrange.

Scenarios can be used to explore possibilities systematically: for instance, exploring all of the options identified as possibilities in a public consultation exercise, or all the logically possible solutions to a design problem. Neil Maiden and colleagues produced an elegant example of this via a software tool which took the prototypical script for an interaction (in this case, using a “hole in the wall” machine to withdraw money), and then automatically generated scenarios for various types of error which could occur at each point in this script (for instance, the scenario that the respondent entered their PIN incorrectly).

Scenarios can be very useful, but need to be handled with care.
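
The systematic generation described above, in which each point in a script is combined with each type of error, can be illustrated with a short Python sketch. This is not the tool produced by Maiden and colleagues; it only shows the general idea, and the step names and error types below are hypothetical, chosen purely for illustration.

```python
from itertools import product

# Hypothetical script for withdrawing money from a cash machine
# (an assumption for illustration, not taken from the original tool).
SCRIPT = ["insert card", "enter PIN", "choose amount", "take card", "take cash"]

# Hypothetical, deliberately generic error types (also an assumption).
ERROR_TYPES = ["step omitted", "step done incorrectly", "step done too late"]


def generate_error_scenarios(script, error_types):
    """Return one scenario description per (script step, error type) pair."""
    scenarios = []
    numbered_steps = enumerate(script, start=1)
    for (position, step), error in product(numbered_steps, error_types):
        scenarios.append(f"At step {position} ({step}): {error}")
    return scenarios


scenarios = generate_error_scenarios(SCRIPT, ERROR_TYPES)
for line in scenarios:
    print(line)
```

Each generated line could then be put to a respondent as a what-if question. The list grows as the number of steps multiplied by the number of error types, so in practice you would probably prune combinations that make no sense for a given step.
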
There is considerable evidence that people are very bad at predicting their own future behaviour in situations that they haven’t encountered before. The “heuristics and biases” literature contains numerous examples of this (Kahneman, Slovic & Tversky’s classic text is a good place to start, though their findings have been re-interpreted by researchers such as Gigerenzer, Wright and Ayton).

Critical incident technique

This technique, as its name suggests, involves focusing on an event which was in some way critical: sometimes because it involved something going horribly wrong, sometimes because the incident involved some particularly important illustrative issues. In this respect it is similar to techniques such as hard case technique (which uses a difficult case to demonstrate a particular point that may not be so obvious in easy cases, or to elicit knowledge about how experts tackle the cases which novices can’t handle) and illuminative incident technique (which focuses on incidents which illuminate some underlying problem whose nature is usually not clearly visible). Critical incident technique is well established in domains such as accident analysis, and has a well formulated procedure which has been described in detail by various authors.

There are obvious potential problems with critical incident technique, which are well recognised among users of the technique. One set of potential problems involves deliberate human bias: those reporting on the incident may have vested interests in presenting a version of events which displays them in a favourable light. A related potential problem involves unintentional human bias, particularly when those involved are depending on their memories of the events.
Human memory is not a passive recording process resulting in something like a grainy photograph; it is a process which is active at both the point of encoding and the point of retrieval, more like a sketch drawing than a photograph, with the artist deciding what to draw and what to omit, and also having to work out afterwards what was represented by a particular set of lines. Just because a memory is vivid, that does not mean that it is necessarily accurate; even vivid memories from an impartial outside observer may suffer from various biases and failings, such as misremembering the sequence of events.

These problems are more of an issue for some uses of this technique than for other uses. For instance, if you are studying the espoused practices of an organisation (i.e. the practices which that organisation claims to follow), then the factual accuracy of versions of a critical incident is less important than the symbolic importance of that event to members of the organisation. If, on the other hand, you are trying to find out the factual events leading to an accident, so you can prevent a similar accident in the future, then these problems are clearly more important. Techniques such as document analysis and indirect observation can be used to complement critical incident technique, as a way of independently checking some facts and of establishing others.