Microsounds: An Experimental Investigation of Sound Cues for Interaction

David Thiel, Mary Czerwinski, and Barry Peterson
Microsoft Research
One Microsoft Way
Redmond, WA 98052 USA
+1 425-936-5637
dthiel@microsoft.com

ABSTRACT
This paper presents the design and experimental evaluation of a set of audio cues specifically created to present the user with high information content without high disruption to the user's attention. Because these sounds are of very short duration (~250 ms), we call them Microsounds. Two experiments test the discriminability of the Microsound designs, as well as their semantic associations with computing events. Results demonstrated that all of our sounds were distinguishable from one another and that, with sounds under 200 ms, an audio designer can effectively evoke consistent positive, neutral, and negative associations with computing events. Benefits of this approach include the possibility of increasing the user's awareness of system status and a potential HCI improvement for vision-impaired users.

Keywords
Auditory user interface, acoustic dimensions, user study, auditory processing, sound, attention, audio

INTRODUCTION
We explore the design of a sound set that can be used to the benefit of the user during human-computer interaction (HCI). By way of introduction to this problem, it is reasonable to ask why one should pursue the use of sound in HCI at all. While the auditory channel is invaluable in our interactions with the real world, its importance is often overlooked. In fact, sound in computing has been used almost exclusively as an alerting mechanism for system events. We believe sound has been underutilized in computing for a variety of reasons, some of which we attempt to address with this research.

Sound cues during interaction are extremely attention demanding, like animation in the visual channel. When user interface feedback is attention demanding, it is the designer's responsibility to ensure that the information passed through that channel is of enough value to warrant the user's attention. The designer must remember: a user can control where he or she looks, but not what he or she hears. The visual features of the user interface are constantly available in a spatial, not temporal, format (assuming the user can find them). Auditory feedback competes for attentional resources asynchronously with the user's focus. This is simultaneously a strength and a weakness of the use of sound during interaction. On the one hand, hearing an alarm is out of the user's control, and it therefore has an alerting effect on the user. On the other hand, if the event associated with the auditory interaction does not warrant distracting the user from the task at hand, user performance suffers from the disruption.

Our goal is to find design heuristics that allow sound designers to fashion sounds that deliver more information with less disruption than today's audio cues. Like screen real estate, a user's attention is a limited resource. Brevity with audio has the potential of conserving cognitive resources. Savings in cognitive resources could also be attained by utilizing an alternative and unused information-processing channel [11]. Finding a useful and usable set of Microsounds could potentially increase satisfaction, presence, and engagement within a computing session.
Therefore, we began with the following design goals to guide our explorations of the Microsound design space:

- Microsounds should be designed as audio cues that provide information to the user in the smallest possible amount of time.
- Microsounds should be cues that the user can choose to easily ignore.
- Microsounds must be distinguishable from each other so that associations with events can be recognized and learned over time.
- Microsounds should communicate distinctions such as "positive", "neutral", and "negative" connotations without training.

Previous Work
Gaver [5] developed the SonicFinder system, leveraging human intuitions about the auditory events that result from the physics of interface events. For instance, the sound for deleting an item sounded like the real-life sound of throwing an item into a trashcan, dropping an item sounded like dropping an item on the floor, and so on. Auditory icons were parametric, in that there were different sound events for one versus many items, big versus small items, etc. Microsounds, in contrast, do not attempt to sound like everyday sounds; instead, they strive to elicit consistent responses from average users that can be used to convey simple user interface event states. By dropping the constraint of sounding like natural sounds, Microsounds can be designed to communicate in the shortest possible duration.

Similarly, Mynatt [9] used realistic sounds as symbols for computing events. Discrete events were mapped to real-world sounds (logging onto the computer resulted in a door-knocking sound). Computing events with more than one level, say lower and higher volumes of network traffic, could be mapped to walking, jogging, and running sounds, respectively. In other words, in order to convey changes in computer "state" information, Mynatt suggested using changes in the states of the sounds associated with various HCI events. We differentiate our research from both Gaver's and Mynatt's in that we intend to convey information in a minimal temporal window, and our approach is to perform systematic, empirical studies in the design of the Microsounds, as advocated elsewhere [8].

Kramer [8] has examined the issues of mapping continuous data to auditory and visual dimensions in order to enhance pattern recognition. That work lies more in the realm of strict data sonification, and it has used redundant audio mappings for the same data elements, instead of a multivariate design approach. This is an important line of investigation, but we are not focusing on pattern recognition during HCI at this point in our research. Discovering the rules for effective sub-250 ms communication in the non-speech audio domain is what we are after. If we can identify an effective set of heuristics for the design of Microsounds, we could envision extending those heuristics to the design of audio displays for continuous events in the future. Therefore, the current work explores heuristics for the design of an auditory cue set that is very brief, yet still discriminable and meaningful to users.

Much of our research has been motivated by work in the applied attention area of cognitive psychology [6, 10, 11]. Work to date in attention suggests that, to the extent that computing tasks require resources that do not necessarily overlap (e.g., come from the same channel or require the same response format), there may be an opportunity we can leverage by using vision and audition in combination.
To this theoretical motivation we have added the assumption that meaningful sounds of shorter duration require significantly fewer cognitive resources over time, which should benefit task performance as well. Many other researchers have documented the benefits of auditory displays [7, 9]. The benefits ascribed to the use of audio in displays include the possibility of increasing the user's awareness of system status, the ability to cue the visual system to a particular spatial location of importance, and potential benefits for visually impaired users.

Another line of research has explored the effectiveness of coding information into the musical realm. For instance, Blattner and Brewster [2, 3] have created systems that use pitch, timbre, rhythm, and direction to code hierarchical and syntactic auditory information. An inherent assumption of this work is that the language-like complexity of these soundscapes is easy to learn and understand. It is possible that the musical language coding that must be learned in these instances takes too much processing and learning time, given the informational payoff to the user.

EXPERIMENT 1
Our first experiment tested the discriminability and semantic associations of ten sounds.

Method

Participants
Ten people (six males) with no reported hearing difficulties, aged 18 to 40, participated in this experiment.

System
The experimental session was scripted primarily using SuperLab Pro 2.0 by Cedrus. During the experimental sessions, users listened to sounds played through Altec Lansing desktop speakers. Our intent was to make the audio playback similar to that used in home or office environments, so volume levels were set uniformly low and headphones were not used. Participants entered responses through a Cedrus button box (see Figure 1).

Figure 1. Cedrus RS6000 six-button interface box.

The script program, sound output, and input were all hosted by a Pentium II 266 MHz with 128 MB of memory, running Windows 95.

Stimuli
Two sets of Microsounds (10 sounds each), hypothesized to be discriminable, were created for the two experiments presented in this paper. Synthesis techniques were used to create the sounds. Synthesis was chosen over naturally occurring sounds because it afforded more precise timbre and amplitude control for sounds of between 20 and 200 ms in duration. Pitched Microsounds were played at equivalent pitches so as not to use pitch as a discriminable feature. Sound 1 did, however, have a pitch relationship between its two events (see Figure 3): an upward fifth relationship was inadvertently used. The impact of this will be discussed in the results section.

Sounds varied in five meaningful ways:
1. Nominal length: determined by the point at which overall volume decayed 20 dB below peak volume (see Figure 6).
2. Voicing: whether the sound was (to borrow a term from speech production) voiced or unvoiced. A sound is said to be voiced when it has a pitched basis, and unvoiced when it is based on a noise source (see Figure 2).
3. Harmonicity: whether the sound's timbre was harmonic or inharmonic (see Figure 3).
4. Events: how many events combined to make a Microsound (1-4) (see Figure 4).
5. Attack time: how sudden the onset of the Microsound was (see Figure 5).

Figure 6. A time-domain depiction of how nominal length was determined.
Figure 2. Sonograms of a voiced and an unvoiced Microsound.
Figure 3. Sonograms of a harmonic and an inharmonic Microsound.
Figure 4. Time-domain examples of events comprising three Microsounds.
Figure 5. Two time-domain examples of varying attacks of Microsounds.

We used time-varying envelopes extensively to change overall volume and timbre over time. This was done in such a way as to suggest percussive events on different kinds of materials and shapes.
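To make these stimulus parameters concrete, the sketch below illustrates, under our own assumptions rather than the authors' actual synthesis patches, how a voiced (harmonic) and an unvoiced (noise-based) Microsound might be generated with a percussive time-varying envelope, and how nominal length can be measured as the point where the level falls 20 dB below peak. All parameter values are illustrative.

```python
import numpy as np

SR = 44100  # sample rate (Hz)

def percussive_env(n, attack_ms, decay_ms):
    """Amplitude envelope: brief linear attack, exponential decay."""
    t = np.arange(n) / SR
    env = np.exp(-t / (decay_ms / 1000.0))
    attack = max(1, int(SR * attack_ms / 1000.0))
    env[:attack] *= np.linspace(0.0, 1.0, attack)
    return env

def voiced_harmonic(f0=440.0, dur_ms=200, attack_ms=0.5, decay_ms=40):
    """Voiced sound: pitched basis built from a harmonic overtone series."""
    n = int(SR * dur_ms / 1000.0)
    t = np.arange(n) / SR
    partials = sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 6))
    return percussive_env(n, attack_ms, decay_ms) * partials

def unvoiced(dur_ms=200, attack_ms=0.5, decay_ms=25, seed=0):
    """Unvoiced sound: white-noise basis."""
    n = int(SR * dur_ms / 1000.0)
    noise = np.random.default_rng(seed).uniform(-1.0, 1.0, n)
    return percussive_env(n, attack_ms, decay_ms) * noise

def nominal_length_ms(signal, window_ms=5):
    """Time at which the windowed peak level first falls 20 dB below
    the overall peak (the paper's definition of nominal length)."""
    w = max(1, int(SR * window_ms / 1000.0))
    level = np.array([np.abs(signal[i:i + w]).max()
                      for i in range(0, len(signal) - w + 1, w)])
    below = np.nonzero(level < level.max() * 10 ** (-20 / 20))[0]
    return below[0] * w * 1000.0 / SR if below.size else len(signal) * 1000.0 / SR

print(nominal_length_ms(voiced_harmonic()))  # ~90 ms with these settings
print(nominal_length_ms(unvoiced()))         # ~55 ms with these settings
```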
Sounds were designed to have a wide range of frequencies so as to be more robust in the face of interference or bandwidth-limited reproduction. The five salient features of each Microsound in Experiment 1 are summarized in Table 1.

Stimulus   Voiced   Harmonicity   Events   Attack (ms)   Length (ms)
01         6        6             2        0             126
02         1        N/A           2        0             83
03         6        2             4        0             215
04         3        6             1        110           193
05         6        5             1        0             50
06         5        1             1        0             128
07         3        2             2        0             75
08         6        6             1        17            161
09         6        6             1        0             151
10         6        2             1        0             215
Avg.       4.8      4.0           1.6      12.7          139.7

Table 1. Summary of the ten sounds and their feature values designed for Exp. 1. (Voiced: 6 = voiced, 1 = unvoiced. Harmonicity: 6 = harmonic, 1 = inharmonic. Events: 1 to n. Length: nominal length, measured 20 dB down from peak.)

Experimental Design and Procedure
The experimental session was divided into two phases. The first phase, which consisted of 180 trials, involved same-different discriminations for all pair-wise combinations of the Microsounds. A trial consisted of a stimulus pair and a single response. For a given trial, the system played Sound A and then, after a fixed inter-stimulus interval, played Sound B (or vice versa). Response time was recorded from the cessation of the second sound to the subject's response of same or different. Sounds were padded with silence so that each sound played for a duration of 250 ms; the inter-stimulus interval (ISI) was 300 ms. In order to equate the number of same and different judgments in the experiment, every Microsound was paired with itself for 9 trials; in addition, each Microsound was paired with each of the other Microsounds once, for a total of 180 trials. To control for presentation-order effects, the system randomized the order of trial presentation. Each session lasted under one hour.

We recorded two dependent measures: participants' same-different responses and the accompanying response times. Participants pressed a red button on the button box when they judged the sounds to be different and a blue button when they judged them to be the same. These button presses were then coded as correct or incorrect, based on the match between the stimulus pair and the response.

The second phase of the experiment consisted of subjective evaluations of the events that might be associated with Microsounds. During this phase, subjects were asked to imagine each Microsound in terms of an event that might occur while computing. Subjects were then asked to rate this imagined event along each of three semantic dimensions: general (negative-neutral-positive), machine functioning (broken-working-running well), and event organization (chaotic-grouped-organized). Each participant performed these tasks in the same order, for a total of 30 subjective ratings (10 sounds by 3 scales). Experimental instructions required participants to respond to each sound by assigning a label to it, according to their initial, subjective interpretation of the event that might have caused the sound. This was facilitated by an on-screen picture that included the labels for each dimension, as well as a visual indication of which button to press for each label. To associate the most negative rating with a sound, participants pressed the red button; similarly, for the most positive rating, they pressed the blue button. Neutral responses were mapped to the middle four gray buttons, each of which was scored as having an equivalent degree of neutrality. The instructional display screen is shown in Figure 7.

Figure 7. The graphic display instructing subjects as to which key to press during subjective ratings.
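The pairing scheme of the discrimination phase is easy to verify: 10 sounds paired with themselves 9 times each yields 90 "same" trials, and the 90 ordered pairs of distinct sounds yield 90 "different" trials. A minimal sketch of this trial-list construction (our own illustration, not the SuperLab script itself):

```python
import itertools
import random

SOUNDS = range(1, 11)  # Microsounds 1..10

same_trials = [(s, s) for s in SOUNDS for _ in range(9)]  # 10 * 9 = 90
diff_trials = list(itertools.permutations(SOUNDS, 2))     # 10 * 9 = 90

trials = same_trials + diff_trials
assert len(trials) == 180
assert len(same_trials) == len(diff_trials)  # same/different judgments equated

random.shuffle(trials)  # randomize presentation order

for sound_a, sound_b in trials:
    # Play sound_a (padded to 250 ms), wait the 300 ms ISI, play sound_b,
    # then time the same/different button press from the offset of sound_b.
    expected = "same" if sound_a == sound_b else "different"
```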
RESULTS

Discrimination
We used response times as a measure of the difficulty of making the discrimination between each sound pair; the faster the response time, the easier the choice. Examination of the reaction times in Figure 8 shows that subjects were, on average, very fast in responding: under 500 ms. There were few asymmetries in the data between whether a sound occurred first or second in a discrimination pair, so data were averaged across presentation order for any given sound. A one-way analysis of variance (ANOVA) was performed on the response times for each sound. No sound was reliably more difficult to discriminate than any other in Experiment 1, as can be seen in Figure 8, which includes the standard error of the mean for each average response time.

Figure 8. Average reaction times (with bars for the standard error of the mean) for each Microsound.

The average percent correct discrimination for all sound pairs was 99%, and the data showed little variation, ranging from 97% to 100%. Statistical analysis revealed no speed-accuracy tradeoff in the data. The high accuracy, in addition to the fast reaction times, indicates that participants were able to easily discriminate the Microsounds.

Subjective Ratings
Average subjective ratings for each Microsound are presented in Table 2. These averages indicate the mean subjective rating associated with each Microsound event. For analysis, each subjective button response was assigned one of three numeric values: negative associations were assigned a value of -1.0, positive associations a value of 1.0, and neutral associations a value of 0.

Sound   General      Machine       Event
                     functioning   organization
1       .70 (.48)    .80 (.42)     1.0 (0.0)
2       -.20 (.42)   -.40 (.70)    -.30 (.67)
3       -.50 (.53)   -.90 (.32)    -.50 (.71)
4       0.0 (.47)    -.10 (.57)    -.10 (.74)
5       -.10 (.57)   -.20 (.42)    0.0 (.47)
6       0.0 (.82)    -.30 (.95)    -.40 (.84)
7       -.20 (.63)   -.20 (.42)    -.40 (.52)
8       .40 (.52)    .40 (.52)     .50 (.53)
9       .40 (.52)    .50 (.53)     .60 (.52)
10      -.40 (.84)   -.40 (.52)    -.60 (.52)

Table 2. Average subjective ratings (with standard deviations in parentheses) for each Microsound (rows) on the three rating scales (columns): general, machine functioning, and event organization.

The average ratings revealed that participants do associate a wide range of qualities with the individual sounds, ranging from -0.90 (strongly negative) to 1.0 (strongly positive). A 3 (rating scale) x 10 (Microsound) repeated measures ANOVA revealed a significant effect of individual sound, F(9,81)=10.9, p<.001, but not of rating scale, F(2,18)=0.87, p=0.4. No significant interaction was observed in the data. Because the scales were not reliably different from each other, we collapsed across them for an overall average subjective rating for each Microsound. Figure 9 shows the average subjective ratings for each Microsound, collapsed across all 3 scales, along with the standard error of the mean.

Figure 9. Overall average of the subjective ratings of events (y-axis: overall average rating, -1 = negative, 1 = positive; averages: Sound 1 = 0.8, 2 = -0.3, 3 = -0.6, 4 = -0.1, 5 = -0.1, 7 = -0.3, 8 = 0.4, 9 = 0.5, 10 = -0.5).

Perusal of Figure 9 shows that Sound 1 was reliably more positively rated, on average, than any of the other Microsounds. This may in part be attributed to the upward fifth pitch relationship of Sound 1. Sounds 8 and 9, while not rated differently from each other, were also reliably more positively rated than the rest of the more neutrally rated sounds. Sounds 10 and 3 were the most negatively rated, on average, with Sound 3 being significantly more negative than all other sounds but Sound 10.
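The scoring and analysis above can be summarized in a short sketch. This is a minimal illustration, not the authors' actual analysis pipeline: it assumes a long-format table ratings.csv with hypothetical columns participant, sound, scale, and button, codes button presses as in the text (red = -1.0, the four gray buttons = 0, blue = +1.0), and runs the 3 x 10 repeated-measures ANOVA with statsmodels' AnovaRM.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Code button presses as described in the procedure.
BUTTON_SCORE = {"red": -1.0, "gray1": 0.0, "gray2": 0.0,
                "gray3": 0.0, "gray4": 0.0, "blue": 1.0}

# One row per participant x sound x scale (hypothetical file and columns).
ratings = pd.read_csv("ratings.csv")
ratings["score"] = ratings["button"].map(BUTTON_SCORE)

# Mean rating per sound, collapsed across the three scales (as in Figure 9).
print(ratings.groupby("sound")["score"].mean())

# 3 (rating scale) x 10 (Microsound) repeated-measures ANOVA.
print(AnovaRM(ratings, depvar="score", subject="participant",
              within=["scale", "sound"]).fit())
```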
EXPERIMENT 2
Experiment 1 demonstrated that we had succeeded in varying five auditory features to develop a set of sounds that subjects could discriminate well. We also observed that subjects associated the sound set with a wide range of negative-to-positive semantic classifications that might be useful during HCI. Before applying these sounds in computing contexts, we wondered whether we could reduce the duration of the Microsounds further while maintaining the distinctiveness and semantic connotations of the set. In other words, could we push the design constraints further in order to come up with an even subtler, yet powerful, sound set for HCI? Experiment 2 investigates this possibility for the Microsound set design.

Methods
The methodology used in Experiment 2 was identical to that used in Experiment 1, so only the differences are mentioned here. The Microsound stimuli were reduced from an average of 139 ms to 43 ms in duration, and the ISI was reduced from 300 ms to 200 ms. Piloting of the stimulus pairs determined that the ISI needed to be reduced in order to minimize the short-term memory burden of comparing any two sounds. There were again two phases to the experiment: the 2-AFC discrimination phase and the subjective ratings of the sounds along the three dimensions of "general," "machine functioning," and "event organization." Each session lasted under one hour.

Participants
Ten people (six males) with no reported hearing difficulties, aged 18 to 40, participated in this experiment. The subjects were different from those in Experiment 1.

Stimuli
In Experiment 2, 10 new Microsounds were designed with a constraint of 50 ms duration, as opposed to the average of 139 ms in Experiment 1. A summary of the salient features used to construct each sound is shown in Table 3.

Stimulus   Voiced   Harmonicity   Events   Attack (ms)   Length (ms)
01         2        3             1        0             53
02         3        6             1        0             58
03         6        6             1        0             54
04         6        6             1        0             58
05         6        5             1        0             42
06         6        2             1        0             45
07         2        2             2        0             51
08         1        1             1        17            10
09         3        2             1        0             26
10         5        5             1        0             34
Avg.       4.0      3.8           1.1      1.7           43.1

Table 3. Summary of the ten sounds and their feature values designed for Exp. 2. (Voiced: 6 = voiced, 1 = unvoiced. Harmonicity: 6 = harmonic, 1 = inharmonic. Events: 1 to n. Length: nominal length, measured 20 dB down from peak.)

RESULTS

Discrimination
As in Experiment 1, response times were used as a measure of discrimination difficulty; the faster the response time, the easier the same-different decision. Once again, very few asymmetries were observed in the data, so we collapsed across presentation order for each Microsound. Reaction times were very fast, on average 422 ms. This, along with the very high percent correct (on average, subjects were 99% correct), demonstrates that the Experiment 2 Microsounds were still highly discriminable, even at the much shorter durations. Examination of the average response times in Figure 10 shows that no sound was, on average, statistically significantly different from any of the others. There were also no significant differences in accuracy, on average, for any of the sounds.
The high accuracy, coupled with the overall average response time of 422 ms, indicates that participants were able to discriminate the Microsounds with no speed-accuracy tradeoff.

Figure 10. Average reaction time and standard error of the mean for each Microsound in Experiment 2.

Subjective Ratings for Experiment 2
Average subjective ratings for each Microsound are presented in Table 4. As in Experiment 1, these averages indicate the mean subjective rating associated with each Microsound event.

Sound   General      Machine       Event
                     functioning   organization
1       .10 (.88)    -.30 (.82)    -.30 (.67)
2       -.60 (.52)   -.50 (.85)    -.60 (.70)
3       .40 (.84)    .60 (.52)     .50 (.71)
4       .20 (.63)    .10 (.88)     0.0 (.47)
5       .20 (.42)    -.10 (.57)    .60 (.52)
6       -.30 (.67)   -.20 (.79)    0.0 (.82)
7       -.20 (.42)   -.60 (.52)    -.50 (.71)
8       .10 (.74)    .10 (.57)     0.0 (.67)
9       .30 (.48)    .10 (.32)     .40 (.70)
10      0.0 (.47)    0.0 (.67)     .20 (.42)

Table 4. Average rating (with standard deviations in parentheses) for each Microsound (rows) on the three subjective scales (columns): general, machine functioning, and event organization.

Inspection of these average ratings shows once again that participants associate a wide range of qualities with the individual sounds, ranging from -0.60 to 0.60. However, this range is much narrower than that observed in Experiment 1, which used longer-duration sound stimuli. A 3 (subjective rating scale) x 10 (Microsound) repeated measures ANOVA revealed neither a significant effect of Microsound, F(9,81)=1.6, p=0.12, nor of scale, F(2,18)=1.5, p=0.2. However, a significant sound x scale interaction was observed, F(18,162)=3.5, p<.001. Unfortunately, no systematic patterns were observable in this interaction, with the scales influencing each sound differentially, as can be seen in Table 4. Although more variable than the ratings in Experiment 1, the individual rating scale responses were collapsed to obtain an overall average subjective rating for each sound. Figure 11 shows the overall average rating for each Microsound, with standard error bars. Subjects rated Microsound 3 the most positive of the set and Microsound 2 the most negative. Microsounds 4, 8, and 10 were rated as most neutral.

Figure 11. Overall average subjective ratings for each Microsound (averages: Sound 1 = -0.2, 2 = -0.6, 3 = 0.5, 4 = 0.1, 5 = 0.2, 6 = -0.2, 7 = -0.4, 8 = 0.1, 9 = 0.3, 10 = 0.1).

DISCUSSION
Overall, the changes made to the stimulus set between the two experiments had a negligible effect on the ease of discrimination, though they did narrow the range of positive and negative semantic associations attached to each sound.

Discrimination
Results from both experiments show that, on average, subjects were highly accurate in discriminating among the Microsounds. Response time results were also similar between the two experiments (average median response times of 416 ms and 422 ms, respectively). The combination of high accuracy and fast discrimination times indicates that both sets of Microsounds were quite discriminable. An interesting finding is that there was little to no reduction in accuracy, nor increase in response time, when the stimuli were shortened from an average of 139.7 ms to 43.1 ms.

Subjective Ratings
The subjective results indicated that the different sounds did convey different meanings to participants; however, these differences were more marked in the first experiment.
While the response variability was high in both experiments (SD = 0.72 and 0.71, respectively), the range of responses was much greater in Experiment 1 (range = -0.90 to 1.0) than in Experiment 2 (range = -0.60 to 0.60).

Heuristics for HCI Sound Design
We have demonstrated, through the manipulation of five audio features, that a small set of acoustically discriminable and semantically meaningful sounds can be designed for human-computer interaction. A number of sound design heuristics were applied when building the stimuli, with the goal of evoking negative, neutral, or positive associations with minimal learning requirements. We discuss these now with the intent of communicating the design lessons of this research.

One of the strongest effects observed across the two experiments was that of harmonicity: the more harmonic the series of overtones in a sound, the more positive the association. The sounds judged most negatively, on average (Sounds 3 and 10 in Experiment 1 and Sound 2 in Experiment 2), had inharmonic series of overtones. The three sounds judged positively, on average (Sounds 1 and 9 in Experiment 1 and Sound 3 in Experiment 2), all had a voiced, harmonic basis.

Before we tested users, we thought that sounds designed with multiple events might be perceived as more organized than single-event sounds. This turned out not to be true. The overall amplitude structures of Sounds 1 and 2 (Experiment 1) are very similar, but subjects rated them, on average, 0.83 and -0.30, respectively. The overriding difference between the two sounds was that Sound 1 was pleasantly tonal, while Sound 2 was unvoiced and reminiscent of slapping a leather couch with a pencil.

We also see a strong difference between unvoiced sounds (which are based on white noise) and voiced sounds (which are based on a pitch). Unvoiced sounds were judged to be near neutral on average, while voiced, harmonic sounds were judged, on average, to be positive. This is clearly an auditory dimension worth considering carefully during auditory display design.

We recommend that audio designers interested in developing audio cues for use during HCI use the above features to characterize the computing events that trigger those sounds. As we have shown, a very brief audio cue can be sufficient to provide the user with useful semantic information about an event.

CONCLUSION AND FUTURE WORK
Two experiments demonstrated the discriminability and semantic associations engendered by a small set of very brief sounds. These sounds were designed with an ear toward improving human-computer interaction through the use of subtle, yet meaningful, audio events. Based on the results of the two experiments reported in this paper, important design heuristics were identified that could be leveraged during HCI. We do not claim that the particular sounds developed for Experiments 1 and 2 are better than any other 20 sounds skillfully designed under the same constraints and heuristics. Our next step is to use these heuristics in the context of realistic HCI scenarios, as peripheral cues or in conjunction with visual information in the display. In addition, the inadvertent use of an upward fifth pitch relationship in Sound 1 of Experiment 1 suggests further study of how pitch relationships in sub-100 ms sounds can be used to elicit consistent associations.
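As an illustration of that suggestion, the sketch below (illustrative parameters, not the authors' actual Sound 1) builds a two-event Microsound whose second event lies a perfect fifth (a 3:2 frequency ratio) above the first:

```python
import numpy as np

SR = 44100  # sample rate (Hz)

def tone(freq, dur_ms, decay_ms=25):
    """Short pitched event with an exponentially decaying envelope."""
    n = int(SR * dur_ms / 1000.0)
    t = np.arange(n) / SR
    return np.exp(-t / (decay_ms / 1000.0)) * np.sin(2 * np.pi * freq * t)

f0 = 880.0
event1 = tone(f0, 60)            # first event
event2 = tone(f0 * 3 / 2, 60)    # second event, an upward perfect fifth
gap = np.zeros(int(SR * 0.010))  # 10 ms of silence between events
microsound = np.concatenate([event1, gap, event2])  # ~130 ms in total
```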
Future studies will explore the effectiveness of all of these minimalist cues, including their usefulness and usability for delivering useful information with minimal disruption. To do this we intend to explore cognitive load issues during Microsound usage, with the hope that using the auditory channel to display brief, semantic information will conserve attentional resources. Anticipating iterative design success, we will eventually move toward working on the auditory display of continuous information dimensions. The Microsounds will be available soon for examination at http:somwhere.on.the.web.

ACKNOWLEDGMENTS
We thank the User Interface Research group at Microsoft and Carol Thiel for helpful comments on previous versions of this document.

REFERENCES
1. Blattner, M.M., Papp, A.L., and Glinert, E.P. (1994). Sonic enhancement of two-dimensional graphics displays. In Auditory Display: Sonification, Audification and Auditory Interfaces, Kramer, G., Ed. Addison-Wesley, Reading, MA, 447-470.
2. Blattner, M.M., Sumikawa, D.A., and Greenberg, R.M. (1989). Earcons and icons: Their structure and common design principles. Human-Computer Interaction, 4, 11-44.
3. Brewster, S. (1997). Navigating telephone-based interfaces with earcons. In People and Computers XII, Proceedings of HCI '97, Thimbleby, H., O'Conaill, B., and Thomas, P., Eds., 39-56.
4. Ellis, S.R., Ed. (1992). Pictorial Communication in Virtual and Real Environments. Taylor & Francis, UK.
5. Gaver, W. (1989). The SonicFinder: An interface that uses auditory icons. Human-Computer Interaction, 4, 67-94.
6. Kahneman, D. (1973). Attention and Effort. Prentice-Hall, Englewood Cliffs, NJ.
7. Kramer, G. (1994). An introduction to auditory display. In Auditory Display: Sonification, Audification and Auditory Interfaces, Kramer, G., Ed. Addison-Wesley, Reading, MA, 1-78.
8. Kramer, G. (1994). Some organizing principles for representing data with sound. In Auditory Display: Sonification, Audification and Auditory Interfaces, Kramer, G., Ed. Addison-Wesley, Reading, MA, 185-221.
9. Mynatt, E.D. (1994). Auditory presentation of graphical user interfaces. In Auditory Display: Sonification, Audification and Auditory Interfaces, Kramer, G., Ed. Addison-Wesley, Reading, MA, 533-555.
10. Schneider, W. and Shiffrin, R. (1977). Controlled and automatic human information processing. Psychological Review, 84, 1-66.
11. Wickens, C.D., Sandry, D., and Vidulich, M. (1983). Compatibility and resource competition between modalities of input, central processing, and output: Testing a model of complex task performance. Human Factors, 25, 227-248.