Microsounds: An Experimental Investigation of Sound
Cues for Interaction
David Thiel, Mary Czerwinski, and Barry Peterson
Microsoft Research
One Microsoft Way
Redmond, WA 98052 USA
+1 425-936-5637
dthiel@microsoft.com
ABSTRACT
This paper presents the design and experimental evaluation of a set of audio cues specifically created to present the user with high information content without high disruption to the user's attention. Because these sounds are of very short duration (~250 ms), we call them Microsounds. Two experiments test the discriminability of the Microsound designs, as well as their semantic associations with computing events. Results demonstrated that all our sounds were distinguishable from one another and that, with sounds under 200 ms, an audio designer can effectively evoke consistent positive, neutral, and negative associations with computing events. Benefits of this approach include the possibility of increasing the user's awareness of system status and a potential HCI improvement for vision-impaired users.
Keywords
Auditory user interface, acoustic dimensions, user study,
auditory processing, sound, attention, audio
INTRODUCTION
We explore the design of a sound set that can be used to the
benefit of the user during human-computer interaction
(HCI). By way of introduction to this problem, it is
reasonable to question why one should pursue the use of
sound in HCI at all. While the auditory channel is
invaluable in our interactions with the real world, its
importance is often overlooked.
In fact, sound in
computing has been used almost exclusively as an alerting
mechanism for system events. We believe sound has been
underutilized in computing for a variety of reasons, some of
which we attempt to address with this research.
Sound cues during interaction are extremely attention-demanding, like animation in the visual channel. When user interface feedback demands attention, it is the designer's responsibility to ensure that the information passed through that channel is of enough value to warrant the user's attention.
The designer must remember: a user can control where
he/she looks, but not what he/she hears. The visual features
of the user interface are constantly available in a spatial,
not temporal format (assuming the user can find them).
Auditory feedback competes for attentional resources
asynchronously with the user’s focus.
This is a
simultaneous strength and weakness of the use of sound
during interaction. On the one hand, hearing an alarm is out
of the user’s control, and therefore has an alerting effect on
the user. On the other hand, if the event associated with the
auditory interaction does not warrant the user being
distracted from the task at hand, user performance suffers
from the disruption.
Our goal is to find design heuristics for sound that allow
sound designers to fashion sounds that deliver more
information with less disruption than is currently done with
today's audio cues. Like screen real estate, a user's attention is a limited resource. Brevity with audio has the
potential of conserving cognitive resources. Savings in
cognitive resources could also be attained by utilizing an
alternative and unused information-processing channel
[11]. Finding a useful and useable set of Microsounds could
potentially increase satisfaction, presence and engagement
with a computing session. Therefore, we began with the
following design goals to guide our explorations of the
Microsound design space:
• Microsounds should be designed as audio cues that provide information to the user in the smallest possible amount of time.
• Microsounds should be cues that the user could choose to easily ignore.
• Microsounds must be distinguishable from each other so that associations with events can be recognized and learned over time.
• Microsounds should communicate distinctions like “positive”, “neutral”, and “negative” connotations, without training.
Previous Work
Gaver [5] developed the SonicFinder system, leveraging
human intuitions about auditory events resulting from the
physics of interface events. For instance, the sound for
deleting an item sounded like the real life sound of
throwing an item in a trashcan, dropping an item sounded
like dropping an item on the floor, etc. Auditory icons
were parametric, in that there were different sound events
for one versus many items, big versus small items, etc.
Microsounds in contrast do not attempt to sound like
everyday sounds.
Yet Microsounds strive to elicit
consistent responses from average users that can be used to
convey simple user interface event states. By dropping the
constraint of attempting to sound like natural sounds,
Microsounds can be designed to communicate in the
shortest possible duration.
Similarly, Mynatt [9] used realistic sounds as symbols for
computing events. Discrete events were mapped to real
world sounds (logging onto the computer resulted in a door
knocking sound). Computing events with more than one
level, say lower and higher volumes of network traffic,
could be mapped to walking, jogging and running sounds,
respectively. In other words, in order to convey changes in
computer “state” information, Mynatt suggested using
changes in the states of the sounds of various HCI events.
We would like to differentiate our research from both
Gaver’s and Mynatt’s in that we intend to convey
information in a minimal temporal window. Our approach
to doing this is to perform systematic, empirical studies in
the design of the Microsounds, as advocated elsewhere [8].
Kramer [8] has examined the issues of mapping continuous
data to auditory and visual dimensions in order to enhance
pattern recognition. This work lies more in the realm of
strict data sonification, and has used redundant audio
mappings for the same data elements, instead of a
multivariate design approach. This is an important line of investigation, but we are not focusing on the issues of pattern recognition during HCI at this point in our research. Discovering the rules for effective sub-250 ms
communication in the non-speech audio domain is what we
are after. If we can identify an effective set of heuristics for
the design of Microsounds, we could envision extending
these heuristics to the design of audio displays for
continuous events in the future. Therefore, the current
work explores some heuristics for the design of a set of auditory cues that is very brief, but still discriminable and
meaningful to users. Much of our research has been
motivated by work in the applied attention area of cognitive
psychology [6, 10, 11]. Work to date in attention suggests
that, to the extent that computing tasks require resources
that do not necessarily overlap (e.g., come from the same
channel or require the same response format), there may be
an opportunity we can leverage using vision and audition in
combination. To this theoretical motivation we have added
the assumption that meaningful sounds of shorter duration
require significantly fewer cognitive resources over time,
which should benefit task performance as well.
Many other researchers have well documented the benefit
of auditory displays [7, 9]. The benefits ascribed to the use
of audio in displays include the possibility of increasing the
user’s awareness of system status, the ability to cue the
visual system to a particular spatial location of importance,
and potential benefits for visually impaired users.
Another line of research has explored the effectiveness of
coding information into the musical realm. For instance,
Blattner and Brewster [2, 3] have created systems that have
used pitch, timbre, rhythm and direction to code
hierarchical and syntactic auditory information.
An
inherent assumption of this work is that the language-like
complexity of these soundscapes is easy to learn and
understand. It is possible that the musical language coding
that needs to be learned in these instances takes too much
processing/learning time given the informational payoff to
the user.
EXPERIMENT 1
Our first experiment tested the discriminability and
semantic association of ten sounds.
Method
Participants
Ten people (six males) with no reported hearing
difficulties, aged 18 to 40, participated in this experiment.
System
The experimental session was scripted primarily using
SuperLab Pro 2.0 by Cedrus. During the experimental
sessions, the users listened to sounds displayed through
Altec-Lansing Desktop speakers. Our intent was to make
the audio playback similar to that used in home or office
environments, so volume levels were set uniformly low and
headphones were not used. Participants entered responses
through a Cedrus button box (see Figure 1).
Figure 1. Cedrus RS6000 six-button interface box
The script program, sound output, and input were all hosted on a Pentium II 266 MHz PC with 128 MB of memory, running Windows 95™.
Stimuli
Two sets of Microsounds (10 sounds each) were created for
the two experiments presented in this paper. Synthesis
techniques were used to create the sounds. Synthesis was
chosen over naturally occurring sounds because it afforded
more precise timbre and amplitude control for sounds of
duration between 20 and 200 ms. Pitched Microsounds
were played at equivalent pitches so as not to use pitch as a
discriminable feature. Sound 1 did have a pitch relationship
between its two events (see Figure 3). An upward fifth
relationship was inadvertently used. The impact of this will
be discussed in the results section. Sounds varied in five
meaningful ways:
1. Voiced: Whether the sound was (to borrow a term from speech production) voiced or unvoiced. A sound is said to be voiced when it has a pitched basis; it is said to be unvoiced when it is based on a noise source (see Figure 2).

Figure 2. Sonograms of a voiced and an unvoiced Microsound.

2. Harmonicity: Whether the sound's timbre was harmonic or inharmonic (see Figure 3).

Figure 3. Sonograms of a harmonic and an inharmonic Microsound.

3. Event: How many events combined to make a Microsound (1-4) (see Figure 4).

Figure 4. Time domain examples of events comprising three Microsounds.

4. Attack Time: How sudden the onset of the Microsound was (see Figure 5).

Figure 5. Two time domain examples of varying attacks of Microsounds.

5. Nominal Length: Nominal length was determined by the point at which overall volume decayed 20 dB below peak volume (see Figure 6).

Figure 6. A time domain depiction of how nominal length was determined.

We used time-varying envelopes extensively to change overall volume and timbre over time. This was done in such a way as to suggest percussive events on different kinds of materials and shapes.

Sounds were designed to have a wide range of frequencies so as to be more robust in the face of interference or bandwidth-limited reproduction. An analysis of the five salient features for each Microsound in Experiment 1 is summarized in Table 1.

Stimulus | Voiced (6) / Unvoiced (1) | Harmonic (6) / Inharmonic (1) | Events (1 to n) | Attack time (ms) | Length (ms, down 20 dB)
01 | 6 | 6 | 2 | 0 | 126
02 | 1 | N/A | 2 | 0 | 83
03 | 6 | 2 | 4 | 0 | 215
04 | 3 | 6 | 1 | 110 | 193
05 | 6 | 5 | 1 | 0 | 50
06 | 5 | 1 | 1 | 0 | 128
07 | 3 | 2 | 2 | 0 | 75
08 | 6 | 6 | 1 | 17 | 161
09 | 6 | 6 | 1 | 0 | 151
10 | 6 | 2 | 1 | 0 | 215
avg | 4.8 | 4.0 | 1.6 | 12.7 | 139.7

Table 1. Summary of the ten sounds and their feature values designed for Exp. 1.
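For readers who want to experiment with this design space, the sketch below shows one way the five features can map onto a synthesized signal: a voiced (pitched) or unvoiced (noise) source, a harmonic or inharmonic set of partials, an attack ramp, and a decaying envelope whose down-20 dB point sets the nominal length. It is an illustration under assumed parameter values, not the authors' synthesis tool chain.

```python
import numpy as np
from scipy.io import wavfile  # for saving the result; any writer works

SR = 44100  # sample rate, Hz

def microsound(f0=440.0, partials=(1, 2, 3, 4), voiced=True,
               attack_ms=0.0, length_ms=150.0):
    """Synthesize one event of a Microsound (illustrative parameters,
    not the values used in the experiments)."""
    n = int(SR * length_ms / 1000)
    t = np.arange(n) / SR
    if voiced:
        # Harmonic if the partial ratios are integers, inharmonic otherwise.
        tone = sum(np.sin(2 * np.pi * f0 * p * t) / (i + 1)
                   for i, p in enumerate(partials))
    else:
        tone = np.random.uniform(-1.0, 1.0, n)  # unvoiced: noise source
    # Time-varying envelope: linear attack, then an exponential decay
    # tuned to sit 20 dB below peak at t = length_ms.
    env = np.minimum(1.0, t / max(attack_ms / 1000, 1e-4))
    env *= 10 ** (-t / (length_ms / 1000))
    return tone / np.max(np.abs(tone)) * env

snd = microsound(partials=(1, 2.76, 5.40), attack_ms=17)  # inharmonic ratios
wavfile.write("microsound.wav", SR, (snd * 32767).astype(np.int16))
```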
Experimental Design and Procedure
The experimental session was divided into two different
phases. The first phase, which consisted of 180 trials,
involved same-different discriminations for all pair-wise
combinations of the Microsounds. A trial consisted of a stimulus pair and a single response. For a given trial, the
system played Sound A, then after a fixed inter-stimulus
interval, played Sound B (or vice versa). Response time
was recorded from the cessation of the second sound to the
subject’s response of same or different. Sounds were
padded with silence so that each sound played for a
duration of 250 ms; the inter-stimulus interval (ISI) was
300 ms. Audio synthesis was used to create 10 sounds that were hypothesized to be discriminable. In order to equate the number of same and different judgments in the experiment, every Microsound was paired with itself for 9 trials. In addition, each Microsound was paired with each of the other Microsounds once, for a total of 180 trials. To control for a presentation order effect, the system randomized the order of trial presentation. Each session lasted under one hour.
We recorded two dependent measures: participants' same-different responses and the accompanying response times. Participants pressed a red button on the button box when they judged that the sounds were different and the blue button when they judged them to be the same sound. These button presses were then coded as either correct or incorrect, based upon the match between the stimulus pair and response.
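One plausible reading of these counts is 90 "same" trials (10 sounds x 9 self-pairings) plus 90 "different" trials (every ordered pair of distinct sounds once). A sketch of such a trial generator follows; this is our reading of the counts, not the authors' SuperLab script.

```python
import itertools
import random

SOUNDS = list(range(1, 11))  # Microsounds 01-10

# 90 "same" trials: each sound paired with itself for 9 trials.
same_trials = [(s, s) for s in SOUNDS for _ in range(9)]
# 90 "different" trials: every ordered pair of distinct sounds once.
diff_trials = list(itertools.permutations(SOUNDS, 2))

trials = same_trials + diff_trials
assert len(trials) == 180
random.shuffle(trials)  # randomize presentation order across the session
```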
The second phase of the experiment consisted of subjective
evaluations of the events that might be associated with
Microsounds. During this phase of the experiment,
subjects were asked to imagine each Microsound in terms
of an event that might occur while computing. Subjects
were then asked to rate this imagined event along each of
three semantic dimensions: general (negative-neutral-positive), machine functioning (broken-working-running well), and event organization (chaotic-grouped-organized).
Each participant performed these tasks in the same order,
for a total of 30 subjective ratings (10 sounds by 3 scales).
Experimental instructions required participants to respond
to the sound by assigning a label to it, according to their
initial, subjective interpretation of the event that might have
caused the sound. This was facilitated by the presentation
of an on-screen picture which included the labels for each
of the dimensions, as well as a visual indication as to which
button to press for each label. To associate the most
negative rating to a sound, participants pressed the red
button; similarly, for the most positive rating, participants
pressed the blue button. Neutral responses were mapped to
the middle four gray buttons, and each of the four buttons
was scored to have an equivalent degree of neutrality. The
instructional display screen is shown in Figure 7.
Figure 7. The graphic display instructing subjects as to which key to press during subjective ratings.
RESULTS
Discrimination
We used response times as a measure of the difficulty of making the discrimination between each sound; the faster the response time, the easier the choice. Examination of the reaction times in Figure 8 shows that subjects were, on average, very fast in responding, under 500 ms. There were few asymmetries in the data between whether a sound occurred first or second in a discrimination pair, and so data were averaged across presentation order for any given sound. A one-way Analysis of Variance (ANOVA) was performed on the response times for each sound. No sound was reliably more difficult to discriminate than any other in Experiment 1, as can be seen in Figure 8, which includes the standard error of the mean for each average response time.
Figure 8. Average reaction times (with bars for standard
error of the means) for each Microsound.
The average percent correct discrimination for all sound
pairs was 99%, and the data showed little variation, ranging
from 97% to 100%. Statistical analysis revealed no speed-accuracy tradeoff in the data. The high accuracy, in
addition to the fast reaction times, indicates that
participants were able to easily discriminate the
Microsounds.
Subjective Ratings
Average subjective ratings for each Microsound are
presented in Table 2. These averages indicate the mean
subjective rating associated with each Microsound event.
For analysis, each subjective button response was assigned
one of three numeric values. Negative associations were
assigned a value of -1.0 and positive associations were
assigned 1.0; neutral associations were assigned 0.
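In code form, this scoring scheme reduces to a small lookup table; the button names below are hypothetical stand-ins for the six physical buttons.

```python
# Hypothetical names for the six buttons on the Cedrus box.
BUTTON_SCORES = {
    "red": -1.0,                      # most negative association
    "gray1": 0.0, "gray2": 0.0,
    "gray3": 0.0, "gray4": 0.0,       # the four gray buttons are equally neutral
    "blue": 1.0,                      # most positive association
}

def mean_rating(presses):
    """Average coded rating for one Microsound on one scale."""
    return sum(BUTTON_SCORES[b] for b in presses) / len(presses)

# e.g. ten participants rating one sound on one scale:
mean_rating(["blue", "blue", "gray2", "blue", "red",
             "blue", "gray1", "blue", "blue", "blue"])   # -> 0.6
```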
Scale | Sound 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
1 (general) | .70 (.48) | -.20 (.42) | -.50 (.53) | 0.0 (.47) | -.10 (.57) | 0.0 (.82) | -.20 (.63) | .40 (.52) | .40 (.52) | -.40 (.84)
2 (machine functioning) | .80 (.42) | -.40 (.70) | -.90 (.32) | -.10 (.57) | -.20 (.42) | -.30 (.95) | -.20 (.42) | .40 (.52) | .50 (.53) | -.40 (.52)
3 (event organization) | 1.0 (0.0) | -.30 (.67) | -.50 (.71) | -.10 (.74) | 0.0 (.47) | -.40 (.84) | -.40 (.52) | .50 (.53) | .60 (.52) | -.60 (.52)

Table 2. Average subjective ratings (with standard deviations in parentheses) by Microsound (column) and rating scale (row): 1=general, 2=machine functioning, 3=event organization.
The average ratings revealed that participants do associate a
wide range of qualities to the individual sounds, ranging
from -0.90 (strongly negative) to 1.0 (strongly positive). A
3 (rating scale) x 10 (Microsound) repeated measures
ANOVA revealed a significant effect of individual sound,
F(9,81)=10.9, p<.001, but not rating scale, F(2,18)=0.87,
p=0.4. No significant interaction was observed in the data.
Since the scales were not reliably different from each other, we collapsed across them to obtain an overall average subjective rating for each Microsound. Figure 9 shows the average
subjective ratings for each Microsound, collapsed across all
3 scales, as well as the standard error of the mean.
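For readers wishing to rerun this style of analysis on their own data, a repeated-measures ANOVA of the same shape can be fit with statsmodels; the file and column names below are assumptions, and the authors do not state what analysis software they used.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format ratings: one row per participant x scale x sound.
df = pd.read_csv("exp1_ratings.csv")  # columns: subject, scale, sound, rating

# 3 (rating scale) x 10 (Microsound) repeated-measures ANOVA.
result = AnovaRM(df, depvar="rating", subject="subject",
                 within=["scale", "sound"]).fit()
print(result)
```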
Figure 9. Overall average of the subjective ratings of events.

Perusal of Figure 9 shows that Sound 1 was reliably more positively rated, on average, than any of the other Microsounds. This in part may be attributed to the upward fifth pitch relationship of Sound 1. Sounds 8 and 9, while not rated differently from each other, were also reliably more positively rated than the rest of the more neutrally rated sounds. Sounds 10 and 3 were most negatively rated, on average, with Sound 3 being significantly more negative than all other sounds except Sound 10.
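As a quick check, the per-sound averages plotted in Figure 9 can be recomputed directly from Table 2 by collapsing (averaging) over the three scales.

```python
import numpy as np

# Table 2 rows (general, machine functioning, event organization)
# for Microsounds 1-10 in Experiment 1.
ratings = np.array([
    [ .70, -.20, -.50,  .00, -.10,  .00, -.20, .40, .40, -.40],
    [ .80, -.40, -.90, -.10, -.20, -.30, -.20, .40, .50, -.40],
    [1.00, -.30, -.50, -.10,  .00, -.40, -.40, .50, .60, -.60],
])

# Column means collapse the three scales into one rating per sound,
# e.g. 0.83 for Sound 1 and -0.63 for Sound 3.
print(ratings.mean(axis=0).round(2))
# [ 0.83 -0.3  -0.63 -0.07 -0.1  -0.23 -0.27  0.43  0.5  -0.47]
```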
EXPERIMENT 2
Experiment 1 demonstrated that we had succeeded in
varying 5 auditory features to develop a set of sounds that
subjects could discriminate well. We also observed that
subjects associated the sound set with a wide range of
negative-positive semantic classifications that might be
useful during HCI. Before we began to apply these sounds
in computing contexts, we wondered if we might be able to
reduce the duration of the Microsounds further, while
maintaining the distinctiveness and semantic connotations
of the set. In other words, could we push the design
constraints further in order to come up with an even subtler,
yet powerful, sound set for HCI?
Experiment 2
investigates this possibility for the Microsounds set design.
Methods
The methodology used in Experiment 2 was identical to
that used in Experiment 1, so only the differences will be
mentioned here. The Microsound stimuli were reduced on average from 139 ms to 43 ms in duration, and the ISI was changed from 300 ms to 200 ms.
Piloting of the
stimulus pairs determined that the ISI needed to be reduced
in order to minimize short-term memory burden when
comparing any two sounds. There were again two phases to
the experiment, the 2-AFC discrimination phase, and the
subjective ratings of the sounds along the 3 dimensions of "general," "machine functioning," and "event organization." Each session lasted under one hour.
Participants
Ten people (six males) with no reported hearing difficulties, aged 18 to 40, participated in this experiment. The subjects were different from those in Experiment 1.
Stimuli
In Experiment 2, 10 new Microsounds were designed with a constraint of 50 ms duration as opposed to an average of 139 ms in Experiment 1. A summary of the salient features used to construct each sound is shown in Table 3.

Stimulus | Voiced (6) / Unvoiced (1) | Harmonic (6) / Inharmonic (1) | Events (1 to n) | Attack time (ms) | Length (ms, down 20 dB)
01 | 2 | 3 | 1 | 0 | 53
02 | 3 | 6 | 1 | 0 | 58
03 | 6 | 6 | 1 | 0 | 54
04 | 6 | 6 | 1 | 0 | 58
05 | 6 | 5 | 1 | 0 | 42
06 | 6 | 2 | 1 | 0 | 45
07 | 2 | 2 | 2 | 0 | 51
08 | 1 | 1 | 1 | 17 | 10
09 | 3 | 2 | 1 | 0 | 26
10 | 5 | 5 | 1 | 0 | 34
avg | 4.0 | 3.8 | 1.1 | 1.7 | 43.1

Table 3. Summary of the ten sounds and their feature values designed for Exp. 2.
RESULTS
Discrimination
As in Experiment 1, response times were used as a measure
of discrimination difficulty; the faster the response time,
the easier the same-different decision. Once again, very few
asymmetries were observed in the data. Therefore, we
collapsed across presentation order for each Microsound.
Reaction times were very fast, on average 422 ms. This,
along with the very high percent correct (on average,
subjects were 99% correct) demonstrates that the
Experiment 2 Microsounds were still highly discriminable,
even using the much shorter duration audio stimuli.
Examination of the average response times in Figure 10
shows that there were no sounds that, on average, were
statistically significantly different from any of the others.
There were also no significant differences in accuracy, on
average, for any of the sounds. The high accuracy, coupled
with the overall average response time of 422 ms, indicates
that participants were able to discriminate the Microsounds,
with no speed-accuracy tradeoff.
Figure 10. Average reaction time and standard error of the
mean for each Microsound in Experiment 2.
Subjective Ratings for Experiment 2
Average subjective ratings for each Microsound are
presented in Table 4. As in Experiment 1, these averages
indicate the mean subjective rating associated with each
Microsound event.
Scale | Sound 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
1 (general) | .10 (.88) | -.60 (.52) | .40 (.84) | .20 (.63) | .20 (.42) | -.30 (.67) | -.20 (.42) | .10 (.74) | .30 (.48) | 0.0 (.47)
2 (machine functioning) | -.30 (.82) | -.50 (.85) | .60 (.52) | .10 (.88) | -.10 (.57) | -.20 (.79) | -.60 (.52) | .10 (.57) | .10 (.32) | 0.0 (.67)
3 (event organization) | -.30 (.67) | -.60 (.70) | .50 (.71) | 0.0 (.47) | .60 (.52) | 0.0 (.82) | -.50 (.71) | 0.0 (.67) | .40 (.70) | .20 (.42)

Table 4. Average rating (with standard deviations in parentheses) for each Microsound (columns) on three subjective scales (rows): 1=general, 2=machine functioning, 3=event organization.
Inspection of these average ratings shows once again that
participants do associate a wide range of qualities to the
individual sounds, ranging from -0.60 to 0.60. However,
this range is much narrower than that observed in
Experiment 1, which used longer duration sound stimuli. A
3 (subjective rating scale) x 10 (Microsound) repeated
measures ANOVA revealed neither a significant effect of Microsound, F(9,81)=1.6, p=0.12, nor of scale, F(2,18)=1.5,
p=0.2. However, a significant sound x scale interaction
was observed, F(18,162)=3.5, p<.001. Unfortunately, there
were no systematic patterns observable in this interaction,
with the scales influencing each sound differentially, as can
be seen in Table 4. Although the ratings were more variable than in Experiment 1, the individual rating scale responses were again collapsed to obtain an overall average subjective rating for each sound.
Figure 11 shows the overall average rating for each
Microsound, as well as standard error bars. Subjects rated
Microsound 3 the most positive of the set, while subjects
rated Microsound 2 the most negative. Microsounds 4, 8,
and 10 were rated as most neutral.
Figure 11. Overall average subjective ratings for each
Microsound.
DISCUSSION
Overall, the changes made to the set of stimuli between the
two experiments had a negligible effect on the ease of
discrimination, though they did narrow the range of positive and negative semantic associations evoked by each sound.
Discrimination
Results from both experiments show that, on average,
subjects are highly accurate in discriminating among the
Microsounds. Response time results were also similar
between the two experiments (average median response
times of 416 ms and 422 ms, respectively).
The
combination of high accuracy and fast discrimination times
indicates that both sets of Microsounds were quite
discriminable. An interesting finding is that there is little to
no reduction in accuracy nor increase in response time
when the stimuli are shortened from an average of 139.7
ms to 43.1 ms.
Subjective Ratings
The subjective results indicated that the different sounds did convey different meanings to participants; however, these differences were more marked in the first experiment.
While the response variability was high in both experiments (SD = 0.72 and 0.71, respectively), the range of responses was much greater in Experiment #1 (Range = -0.90 to 1.0) than in Experiment #2 (Range = -0.60 to 0.60).
Heuristics for HCI Sound Design
We have demonstrated, through manipulation of five audio
features, that a small set of acoustically discriminable and
semantically meaningful sounds can be designed for
human-computer interaction. A number of sound design
heuristics were applied when building the stimuli with the
goal of evoking negative-neutral-positive associations with
minimal learning requirements. We discuss these now with
the intent to communicate our design lessons based on this
research.
One of the strongest effects observed across the two
experiments was that of harmonicity. The more harmonic
the series of overtones in a sound, the more positive the
association. The sounds that were judged most negatively,
on average, (Experiment #1, Sounds 3 and 10, and
Experiment #2, Sound 2) had inharmonic series of
overtones. The three sounds judged positively, on average,
(Experiment #1, Sounds 1 and 9, and Experiment #2,
Sound 3) all had a voiced harmonic basis.
Before we tested users, we thought that sounds designed
with multiple events might be perceived as more organized
than a single event. This turned out not to be true. The
overall amplitude structures of Sounds 1 and 2 (Experiment #1) are very similar, but subjects rated them, on average, 0.83 and -0.30, respectively. The overriding difference between the two sounds was that Sound 1 was pleasantly tonal while Sound 2 was not voiced and was reminiscent of slapping a leather couch with a pencil.
We see another strong difference between unvoiced sounds
(which are based on white noise) and voiced sounds (which
are based on a pitch). Unvoiced sounds were judged to be
near neutral on average while voiced/harmonic sounds
were seen, on average, as positive. Obviously this is an
auditory dimension worth thinking carefully about during
auditory display design.
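As a compact summary of these lessons, the sketch below encodes the valence heuristics as a simple feature chooser; the function and feature names are ours, offered as an illustration of the design rules rather than a tested implementation.

```python
def microsound_features(valence):
    """Map an event's valence ('positive', 'neutral', or 'negative')
    to the audio features the experiments associated with it."""
    if valence == "positive":
        # Voiced, harmonic sounds were judged positive; the upward
        # fifth in Sound 1 (Exp. 1) strengthened the effect.
        return {"voiced": True, "harmonic": True, "pitch_step": "up_fifth"}
    if valence == "negative":
        # Inharmonic overtone series were judged most negative.
        return {"voiced": True, "harmonic": False, "pitch_step": None}
    # Unvoiced (noise-based) sounds were judged near neutral.
    return {"voiced": False, "harmonic": None, "pitch_step": None}
```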
We recommend that audio designers interested in
developing audio cues for use during HCI use the above
features to characterize the computing events which trigger
those sounds. As we have shown, a very brief audio cue
can be sufficient to provide the user with useful semantic
information about the event.
CONCLUSION AND FUTURE WORK
Two experiments demonstrated the discriminability and
semantic associations engendered by a small set of very
brief sounds. These sounds were designed with an ear
toward improving human-computer interaction through the
use of subtle, yet meaningful audio events. Based on the
results of the two experiments reported in this paper,
important design heuristics were identified that could be
leveraged during HCI. We would be overstating the case to suggest that the particular sounds developed for Experiments #1 and #2 are better than any other 20 sounds skillfully designed to follow the same constraints and heuristics. Our next step is to use these heuristics in
the context of realistic HCI scenarios, as peripheral cues or
in conjunction with visual information in the display. In
addition, the inadvertent use of an upward fifth pitch relationship in Sound 1 of Experiment #1 suggests that further study be done with sub-100 ms sounds to explore how effectively pitch relationships can be used to elicit consistent
associations. Future studies will explore the effectiveness
of all of these minimalist cues, including their usefulness
and usability for delivering useful information with
minimal disruption. To do this we intend to explore
cognitive load issues during Microsound usage, with the
hope that using the auditory channel to display brief,
semantic information will conserve attentional resources.
Anticipating iterative design success, we will eventually
move toward working on the auditory display of continuous
information dimensions. Microsounds will be available soon
for examination at http:somwhere.on.the.web.
ACKNOWLEDGMENTS
We thank the User Interface Research group at Microsoft
and Carol Thiel, who read and provided helpful comments
on previous versions of this document.
REFERENCES
1. Blattner, M.M., Papp, A.L., and Glinert, E.P. (1994).
Sonic enhancement of two-dimensional graphics
displays. In Auditory Display: Sonification, audification
and auditory interfaces. Kramer, G. Ed. Addison-Wesley, Reading, MA, 447-470.
2. Blattner, M.M., Sumikawa, D.A., and Greenberg, R.M.
(1989). Earcons and icons: Their structure and common
design principles. Human-Computer Interaction, 4,
11-44.
3. Brewster, S. (1997).
Navigating telephone-based
interfaces with earcons. In People and Computers XII,
Proceedings of HCI ’97, Thimbleby, H., O’Conaill, B.
& Thomas, P. Eds., 39-56.
4. Ellis, S.R. Ed. (1992). Pictorial Communication in
virtual and real environments. Taylor & Francis, UK.
5. Gaver, W. (1989). The SonicFinder: An interface that
uses auditory icons. Human-Computer Interaction, 4,
67-94.
6. Kahneman, D. (1973). Attention and Effort. Prentice-Hall, Englewood Cliffs, NJ.
7. Kramer, G. (1994). An introduction to auditory display.
In Auditory Display: Sonification, audification and
auditory interfaces. Kramer, G. Ed. Addison-Wesley,
Reading, MA, 1-78.
8. Kramer, G. (1994). Some organizing principles for
representing data with sound. In Auditory Display:
Sonification, audification and auditory interfaces.
Kramer, G. Ed. Addison-Wesley, Reading, MA, 185-221.
9. Mynatt, E.D. (1994).
Auditory presentation of
graphical user interfaces. In Auditory Display:
Sonification, audification and auditory interfaces.
Kramer, G. Ed. Addison-Wesley, Reading, MA, 533-555.
10. Schneider, W. & Shiffrin, R. (1977). Controlled and
automatic human information processing. Psychological
Review, 84, 1-66.
11. Wickens, C.D., Sandry, D., & Vidulich, M. (1983).
Compatibility and resource competition between
modalities of input, central processing, and output:
Testing a model of complex task performance. Human
Factors, 25, 227-248.