
Are We There Yet?: The Role of Gender on the Effectiveness and Efficiency of
User-Robot Communication in Navigational Tasks
THEODORA KOULOURI, STANISLAO LAURIA AND ROBERT D. MACREDIE
Department of Information Systems and Computing, Brunel University, UK
SHERRY CHEN
Graduate Institute of Network Learning Technology, National Central University, Taiwan
______________________________________________________________________________________
Many studies have identified gender differences in communication related to spatial navigation in real and virtual worlds. Most of this
research has focused on single-party communication (monologues), such as the way in which individuals either give or follow route
instructions. However, very little work has been reported on spatial navigation dialogues and whether there are gender differences in
the way that they are conducted. This paper will address the lack of research evidence by exploring the dialogues between partners of
the same and of different gender in a simulated Human-Robot Interaction study. In the experiments discussed in this paper, pairs of
participants communicated remotely; in each pair, one participant (the instructor) was under the impression that s/he was giving route
instructions to a robot (the follower), avoiding any perception of gendered communication. To ensure the naturalness of the
interaction, the followers were given no guidelines on what to say; however, each had to control a robot based on the user’s
instructions. While many monologue-based studies suggest male superiority in a multitude of spatial activities and domains, this
study of dialogues highlights a more complex pattern of results. As anticipated, gender influences task performance and
communication. However, the findings suggest that it is the interaction – the combination of gender and role (i.e., instructor or
follower) – that has the most significant impact. In particular, pairs of female users/instructors and male ‘robots’/followers are
associated with the fastest and most accurate completion of the navigation tasks. Moreover, dialogue-based analysis illustrates how
pairs of male users/instructors and female ‘robots’/followers achieved successful communication through ‘alignment’ of spatial
descriptions. In particular, males seem to adapt the content of their instructions when interacting with female ‘robots’/followers and
employ more landmark references compared to female users/instructors or when addressing males (in male-male pairings). This study
describes the differences in how males and females interact with the system, and proposes that any female ‘disadvantage’ in spatial
communication can disappear through interactive mechanisms. Such insights are important for the design of navigation systems that
are equally effective for users of either gender.
______________________________________________________________________________________
1. INTRODUCTION
How we talk about places and objects in the world challenges researchers from a variety of
disciplines. This research has resulted in the development of theories of human communication,
cognition and behaviour and informs the design of computer applications and user interfaces
including those related to spatial information, such as Geographic Information Systems, dialogue
systems for robot navigation and spatially-aware artificial agents.
It is widely recognised that there are large individual and group differences in how people
process and communicate spatial information, related to, amongst other things, core spatial and
verbal abilities (Vanetti and Allen, 1988), age (Golding, 1996), education, and previous
experience (Newcombe et al., 1983). Research has shown that observed individual differences
may also be ‘stylistic’ – that is, relating to preference rather than ability (Barkowsky et al., 2007).
Gender-related differences have also been consistently reported in research looking at a variety
of spatial tasks; contrary to popular belief, though, evidence remains inconclusive about male
superiority in this area (Lawton, 1994; Allen, 2000a).
Research into communicating spatial information has identified a complex pattern of findings
with regards to gender differences and spatial and linguistic strategies employed. In particular,
men frequently formulate their instructions using the cardinal system and metric distances,
whereas women prefer to include references to proximal landmarks (Ward et al., 1986; Lawton,
1994). Research on wayfinding shows that women rely on local landmarks for orientation as
well, whereas men tend to employ a global perspective that makes use of spatial relations
within the environment (Coluccia et al., 2007a). Researchers have suggested that the
wayfinding strategy that is more commonly associated with men is more efficient and robust.
There is corroborating evidence that men outperform women in various navigational tasks
(Chen et al., 2009; Coluccia and Losue, 2004), though gender differences seem to reduce or
disappear as the task becomes easier (Coluccia and Losue, 2004) or the field of view increases
(Czerwinski et al., 2002). Lawton (1994) suggests that gender differences are not always
observed in spatial tasks, but that when they are, the results tend to favour males. Lawton
(1994) also draws attention to the choice of methodological approach used in studies – namely,
abstract lab tasks versus real-world tasks – which is argued to be a contributory factor in the
diversity of findings reported. Findings from psychometric tasks (e.g., mental rotation)
consistently favour males, but results are less clear when it comes to more ecologically-valid
tasks, for instance map learning and navigation in a campus (Montello et al., 1999). In other
domains of spatial research, men have been found to draw, read and interpret maps more
accurately than women (Allen 2000a; Beatty and Troster, 1987; Coluccia et al., 2007b). Allen
(2000a) attributes this advantage to better spatial working memory, though he points out that
these differences apply to specific aspects of map reading and interpretation and should not be
readily extended to general spatial abilities.
The complexity in the findings suggests that there is a need for further targeted and systematic
investigation of gender-related differences. The research study reported in this paper seeks to
contribute to the existing corpus of research through an analysis of route instructions, with the
focus on route-based dialogue systems where human users give route instructions to computer-based devices, such as robots. Route instructions are an interesting area in the study of spatial
language as they are not only produced to describe the world, but also to elicit a particular
navigational behaviour from an individual (or system) which will result in their reaching a
destination efficiently (Daniel and Denis, 1998).
2. ROUTE INSTRUCTIONS: BACKGROUND
Spatial language typically and naturally occurs in dialogue. As such, we hardly ever produce
route instructions without an intended recipient. Moreover, communicating route knowledge is
a collaborative, goal-oriented process, anchored in a specific spatial context. This makes it a
prototypical dialogic situation. Spatial language is a lively area of research but, surprisingly, the
overwhelming majority of studies explore spatial language in monologue – and often in highly-controlled and artificial settings.
There has been a range of empirical studies that have sought to bridge the gap between
abstract and real-world route instruction tasks, yet this research tends to be from a particular
perspective/set of assumptions or simplifications that are problematic. The most important of
these is that the research often sees language production and comprehension in isolation,
lacking interactivity between the parties. Among the most cited studies are those of Allen,
whose contributions include a framework for the analysis of route instructions (Vanetti and
Allen, 1988), investigation of individual differences (Allen, 2000a; 1997) and an account of
properties of route instructions that facilitate wayfinding (Allen, 2000b). Similarly, the Human
Cognition Group in Paris (Daniel and Denis, 1998; 2004; Denis, 1997) has provided analyses of
spontaneously-produced route instructions, looking at the effect of their conciseness and
effectiveness on wayfinding. In all of the studies mentioned so far in this section, the
instructions that the participants followed had been independently produced by either another
group of subjects beforehand or by the experimenter.
Golding (1996) also describes a study in which subjects gave instructions, answering questions of
the type: “how do you get to X?” and “where is X?”. While Golding (1996) acknowledges route
explanations as a question-answering process that supports the addressee’s goals and identifies
common ground between the parties involved, it is surprising that the addressees, who were
informed confederates, were allowed minimal contribution to the interaction (being limited to
providing only ‘yes’/‘no’ answers). The importance of the element of interaction that is missing
from these studies is highlighted by Allen (2000b). The study’s findings suggest that instructors
adhere to certain conventions related to the principles of “referential determinacy” and “mutual
understanding” captured in the Collaborative Model of dialogue (Clark, 1996). That is, people
are expected to produce route descriptions which minimise the uncertainty along the route, by
concentrating on linguistic elements that: (i) provide specificity and additional information
about the environment at points that offer potential orientation problems (such as crossroads);
and (ii) that are easy for the listeners to make sense of. The participants in Allen’s (2000b) study
followed scripts of route instructions that differed in terms of these elements and it was found
that navigation errors increased when route instructions violated these principles.
The lack of interaction in many studies is a simplification that is problematic, assuming that
understanding how people produce and comprehend language in isolation can lead to an
understanding of how people communicate. A normal dialogic situation is, however, more than
an information transfer between speakers, and empirical research using dialogue paradigms has
provided ample evidence of how the dialogic situation shapes language (e.g., Garrod and
Anderson, 1987; Pickering and Garrod, 2004; Clark, 1996). In studies of this kind, language is
seen as a collaborative activity in which partners introduce, negotiate and accept information.
This is illustrated in studies that identify partner adaptation in a variety of tasks and contexts
(see, for example, Brennan and Clark (1996); Schober (1993); Fischer (2007)). The interaction
that occurs through dialogue means that interlocutors gradually align their linguistic
expressions. This is evident in Garrod and Anderson’s (1987) maze task experiment which found
that participants converged on similar spatial descriptions. Evidence of alignment has also been
found within route-giving dialogues, where participants aligned in terms of reference and
perspective (Filipi and Wales, 2009; Schober, 2009). Whereas monologue-based accounts treat
language production and language comprehension as distinct autonomous processes, the
Interactive Alignment Model of dialogue (Pickering and Garrod, 2004) assumes that they are
closely coupled to each other in dialogue. According to the model, as the dialogue proceeds
interlocutors come to align their language at many levels (phonological, lexical, syntactic,
semantic, reference frames and situation models). In other words, an interlocutor matches the
most recent utterance from his/her partner with respect to lexical choice, lexical meaning,
syntax, etc. Alignment acts as a mechanism to promote mutual understanding and highlights
the collaborative nature of dialogic communication.
A specific value that can be gained from using dialogue methods to study route instructions is
that they can elucidate the effects of comprehension on the subsequent production of route
information (Pickering and Garrod, 2004). They can also help to identify and understand the
relation between route descriptions and natural communication phenomena that are
suppressed in monologue situations – such as feedback, clarification and confirmation requests,
and repairs (Muller and Prévot, 2009). Such events are important as they can be used to classify
the effectiveness and efficiency of communication and are likely to be important in future
innovations in human-computer navigation systems.
3. AIMS OF THE STUDY
The study reported in this paper adopts a dialogic approach to explore the conducting and
completion of route tasks – an approach fruitfully applied in recent studies of car-based systems
that stress the importance of navigation as a collaborative task (such as Forlizzi et al., 2010).
This allows us to investigate the influence of gender – not only at an individual but also at a pair
level – on communication efficiency and effectiveness by looking at the performance of spatial
route tasks and the content and structure of the instructions given and the responses to them.
Although existing literature points to males having an advantage in monologic direction-giving
and -following tasks, it remains an open question whether this advantage persists in interactions
with other people, whether of the same or opposite gender. Through applying a dialogic
approach, this study aims to determine whether males are better at comprehending, executing
and negotiating route instructions in real-time interaction with their partner. Moreover, the
study explores the hypothesis that men are capable of producing more efficient route
instructions than women. Essentially, it tests the hypothesis that pairs of male participants will
outperform pairs that consist of at least one female, in a setting where the instructors believe that
the follower is a robot (therefore avoiding bias as a result of gender perceptions).
The study’s focus, however, is not only to elucidate differences in performance but also in the
content and structure of the instructions given, by looking at the use of certain linguistic
components (namely, delimiters and landmark references) that are associated with effective
instructions (Vanetti and Allen, 1988; Allen, 2000b; Michon and Denis, 2001). This will provide
evidence in relation to the recurring finding (discussed in section 1) that women favour
landmark-based references more than men do, in both roles of follower and instructor – which
could, in turn, be used to improve the design of computer-based navigation systems.
To address these study aims, an experiment was designed to elicit natural dialogues which
contained spontaneously-generated route instructions within a controlled spatial network. The
details of the method are set out in the next section.
4. METHOD
The study employed a modified version of a Wizard-of-Oz experiment: in a Wizard-of-Oz
experiment, two people interact, one of whom is under the impression that s/he is talking to a
system. The instructors/users in this experiment were made to believe that they were
interacting directly with a robot (the follower). However, in order to ensure the naturalness of
the interaction, the ‘robots’/followers were also naive participants and they were given no
guidelines on what to say and no dialogue script. The domain used in the experiment was
navigation in a town and the user had to guide the robot to six designated locations. The
cooperative nature of the task lay in two additional characteristics. First, in each pairing, only
the user knew the destinations and had a global view of the environment, so the ‘robot’ had to
rely on the user’s instructions and location descriptions. Secondly, the user needed the ‘robot’s’
descriptions to determine its current position and perspective. Participants were able to freely
interact and develop their own strategies to carry out the experimental and discourse task.
Placing a ‘robot’ (rather than making explicit that it was another person) at the other end of the
communication channel serves three purposes. Firstly, the obvious merit of this approach is
that the results can be used in the future design of robotic/computer systems and embodied
conversational agents. Secondly, communication in normal conversational settings makes use
of assumptions and shared knowledge as well as general linguistic conventions. These features
are often transparent to those involved and are likely to be confounding in terms of the aims of
this study. When talking to a ‘robot’, users are expected to avoid using this knowledge and to
depend on assumptions and conventions set up within the course of the particular dialogue
only, allowing clearer insights into their patterns of interaction. Thirdly, it masks the gender of
participants, avoiding gender stereotype issues which might influence the communication, such
as men being less likely to listen to instructions from female voices (see, for example, Jonsson et
al., 2008).
To allow gender differences in route-giving and -following tasks as they emerge from interaction
to be explored, pairs were formed with all possible combinations of roles and gender:
1. Male user/instructor, Male ‘robot’/follower (henceforth referred to as MM)
2. Male user/instructor, Female ‘robot’/follower (MF)
3. Female user/instructor, Male ‘robot’/follower (FM)
4. Female user/instructor, Female ‘robot’/follower (FF)
4.1 Experimental set-up
For the purposes of the experiment, a custom system was developed that supported the
interactive simulation and enabled real-time direct text communication between the user-
‘robot’ pairs. The system connected two interfaces over a Local Area Network (LAN) using
TCP/IP as the communication protocol. The system kept a log of the dialogues and also
recorded the coordinates of the current position of the ‘robot’ at the moment that each
message was transmitted, making it possible to analyse a textual description against a matching
record of the robot’s position and reproduce its path with temporal and spatial accuracy. The
interfaces consisted of a graphical display and an instant messaging facility (the dialogue box).
The dialogue box displayed the participant’s own messages in the top part of the box, with the
messages received by the other participant displayed in the lower part of the box. Figures 1 and
2 show the interfaces operated by the user/instructor and ‘robot’/follower, respectively.
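To make the logging concrete, the sketch below shows, in Python, the kind of record such a set-up could produce for each transmitted message. The field names and JSON-lines format are illustrative assumptions; the original implementation is not described at code level.

```python
import json
import time

def log_message(log_file, speaker, text, robot_x, robot_y):
    """Append one dialogue event, pairing the message with the 'robot's'
    current map coordinates and a wall-clock timestamp."""
    record = {
        "speaker": speaker,                    # "user" or "robot"
        "utterance": text,
        "robot_position": (robot_x, robot_y),  # pixels on the 1024 x 600 map
        "timestamp": time.strftime("%H:%M:%S"),
    }
    log_file.write(json.dumps(record) + "\n")

# Example: the opening exchange of a session
with open("dialogue_log.jsonl", "a") as f:
    log_message(f, "user", "hello", 1000, 530)
    log_message(f, "robot", "hello", 1000, 530)
```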
The interface of the user/instructor displayed the full map of the simulated town. The
destination location was shown in red and the tasks that had been completed were shown in
blue.
Figure 1: The interface for the user/instructor.
The ‘robot’s’/follower’s interface displayed a fraction of the map, the surroundings of the
robot’s current position. The ‘robot’ was operated by the follower using the arrow keys on the
keyboard. The dialogue box also displayed a history of the user’s previous messages. To
simulate the ability of the ‘robot’ to learn routes, after each task was completed a button for
this route appeared on the robot’s/follower’s screen. If the ‘robot’ was instructed to go to a
previous destination, the robot/follower could press the corresponding button and the ‘robot’
would automatically execute the move.
Figure 2: The interface for the ‘robot’/follower.
4.2 Experimental procedure
A total of 56 participants (31 males and 25 females) were recruited from various departments of
a UK university. The allocation of participants to the two roles (user/instructor versus
‘robot’/follower) was random and no computer expertise or other skill was required to take part
in the experiment. The participants were allocated to pairs as shown in Table I.
Table I: The pair configurations
Pair Configuration    Number of Pairs
FF                    5
FM                    7
MF                    8
MM                    8
Users/instructors and ‘robot’/followers were seated in separate rooms equipped with desktop
PCs, on which the respective interfaces were displayed. Participants received verbal and written
instructions related to the task from their role perspective. For the ‘robots’/followers this
included the fact that they were to pretend to be robots. The ‘robots’/followers were also given
a brief demonstration and time to familiarise themselves with the operation of the interface.
The users/instructors were told that they would interact directly with a robot, which for
practical reasons was a computer-based, simulated version of the actual robot. They were
informed that the robot had limited vision, but advanced capacity to understand and produce
spatial language and learn previous routes, reducing the likelihood of users/instructors inferring
during the interactions that the ‘robot’ was actually a person. They were asked to open each
interaction with “hello” (which actually initialised the application used by the ‘robot’/follower)
and end it with “goodbye” (which closed both of the applications used by the pair).
Users/instructors were asked not to employ cardinal reference systems (such as “North”,
“South”, “up”, “down”), since use of reference systems was not a focus of the study and it was
thought that it may lead to confusion/ambiguity since no reference system was provided in the
map. Instead ‘forward’, ‘backward’, ‘right’ and ‘left’ were to be used as directional statements.
The users/instructors were further instructed to use the robot’s perspective. The users were
given no other examples of, or instructions about, how to interact with the robot.
The pairs attempted six tasks presented to each pair in the same order; the user/instructor
navigated the ‘robot’/follower from the starting point (bottom right of the map) to six
designated locations (pub, lab, factory, tube, Tesco, shop). The users/instructors were free to
plan and modify the route as they wished. The destinations were selected to require either
incrementally more instructions or the use of previously taught routes. Dialogues could run
until the task was completed or the user/instructor chose to end them. At the end of the
experiment, participants were debriefed and the full nature of the experimental set-up was
disclosed and explained. Before this disclosure, the users/instructors were probed about their
understanding of the experimental set-up. Each of them confirmed their confidence in the set-up and expressed surprise when told subsequently that they had been interacting with a human
acting as the ‘robot’. This gives confidence that any effects identified in the results are not a
result of language adaptation by the users/instructors arising from them believing that they
were instructing another person.
5. DATA ANALYSIS APPROACH
The study yielded a corpus of 160 dialogues, which comprised 3,386 turns by the participants
(1,853 user/instructor turns and 1,533 ‘robot’/follower turns). The users/instructors produced
1,460 instruction units. Quantitative analysis of relevant data – such as the time taken to
complete each task, and the number of words, turns and instructions in each dialogue – was
undertaken alongside detailed qualitative discourse analysis of the dialogues – which identified
the frequency of miscommunication and the type and granularity of the instructions. The
approaches taken as part of the discourse analysis are outlined in the following three subsections.
5.1 User/instructor utterances: component-based analysis of instruction
units
The 950 instruction turns were segmented into 1,460 instruction units (also referred to as Turn
Constructional Units by Tenbrink (2007) and Minimal Information Units by Denis (1997)). The
primary, initial annotation of instruction units was based on the classification schemes of Denis
(1997) and Tenbrink (2007). The main distinction made in Denis’s (1997) original scheme is
whether the instructions contain references to landmarks. The categories in the scheme used in
this study were: (i) action prescriptions without landmarks (e.g., go forward, turn right); (ii)
action prescriptions with landmarks (e.g., turn left at the pub, cross the street); and (iii)
introduction/description of landmarks with descriptive verbs such as “is”, “see”, or “find” (e.g.,
you’ll see a bridge on your left).
Following Tenbrink (2007), we introduced a subdivision of landmarks, categorising them as:
references to three-dimensional landmarks (such as buildings and bridges); two-dimensional
landmarks (referred to as pathways, such as streets and junctions); or references to the
destination location.
An example of the analysis and tagging is shown in Table II, which is a dialogue turn comprising
four instruction units.
Table II: An example of a dialogue turn comprising four instruction units with tags used – DIR
denotes action statements with verbs of movement, L denotes locations, P denotes pathways,
DES denotes descriptive statements with descriptive/‘state of being’ verbs, D denotes the
destination.
Cross the bridge [DIR L] then turn right [DIR]. Turn right again at the next junction [DIR P]. The
factory is to your left [DES D].
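As an illustration of how the dialogue turn in Table II decomposes under this scheme, the following sketch represents each instruction unit as a small record. The data structure and field names are our own, introduced purely for exposition.

```python
from dataclasses import dataclass
from typing import Optional

ACTION_TYPES = {"DIR", "DES"}     # action prescription vs. landmark description
LANDMARK_TYPES = {"L", "P", "D"}  # location (3D), pathway (2D), destination

@dataclass
class InstructionUnit:
    text: str
    action_type: str                 # "DIR" or "DES"
    landmark: Optional[str] = None   # "L", "P", "D", or None for action-only units

# The four units of the turn shown in Table II
units = [
    InstructionUnit("Cross the bridge", "DIR", "L"),
    InstructionUnit("then turn right", "DIR"),
    InstructionUnit("Turn right again at the next junction", "DIR", "P"),
    InstructionUnit("The factory is to your left", "DES", "D"),
]
```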
Finer-grained component analysis was then performed on the corpus of instruction units. The
analysis used Allen’s Communication of Route Knowledge framework (Vanetti and Allen, 1988),
which considers features and smaller constituents, such as frame of reference and modifiers.
This framework complements the initially used scheme in two respects. Firstly, it further divides
the Pathway category into ‘choice points’, which include junctions, intersections and crossroads,
and ‘pathways’, which include channels of movement (streets, roads, etc.). Secondly, it
introduces delimiters – features that define the instructions and provide differentiating
information about an environmental feature (i.e., a landmark).
The final scheme used in this study, bringing together Denis’ (1997), Tenbrink’s (2007) and
Vanetti and Allen’s (1988) ideas, is outlined in Table III. The tags that were used are shown
within the brackets.
Table III: The final instruction units classification scheme used in this study (developed from the
schemes presented by Denis (1997), Tenbrink (2007) and Vanetti and Allen (1988)).
Action Type                                                                     Tag
Action-only directive based on verb of movement                                 DIR
No action, reference to environmental feature                                   DES

Environmental Feature                                                           Tag
Location                                                                        L
Pathways                                                                        P
Choice points                                                                   C
Destination                                                                     D

Delimiter                                                                       Tag
Distance designations: specify action boundary information, such as the
space separating points of reference (i.e., ‘until you see a car park’,
‘from the bridge to the church’)                                                1
Direction designations: specify spatial relations in terms of an intrinsic
body-based frame of reference (left, right) or cardinal directions (north,
south, up, down, forward, backward)                                             2
Relational terms: prepositions specifying the spatial relationship between
the ‘robot’/follower and an environmental feature, or between environmental
features (on the left of, toward, away from, between, in front of, beside,
behind, across from)                                                            3
Modifiers: adjectives to differentiate features (‘turn left at the big red
bridge’, ‘take the first/second/last road on the left’)                         4
Category 2 delimiters (such as left, right, down, forward) are the basic constituents of a route
instruction since they specify the direction of movement. However, purely directional
instructions are underspecified and provide minimal information to the follower (Tenbrink,
2007). Complementing the directional instructions with action boundary information (provided
by category 1 delimiters), and/or terms that clarify the frame of reference (category 3
delimiters) and specify the target landmark (category 4 delimiters) increases the instruction’s
level of granularity and reduces referential ambiguity (Allen, 2000b; Tenbrink, 2007).
To estimate the specificity and level of granularity of user instructions – of interest given the
study’s focus on efficiency and effectiveness of communication – the number of actions and
delimiters embedded in each instruction were calculated. According to research (Denis, 1997;
Michon and Denis, 2001 and Fischer, 2007), the inclusion of environmental features also
decreases referential ambiguity, so such components were also considered. Examples of the
annotation of the instruction units and the resulting calculation of components are given in
Table IV.
Table IV: Example of component-based annotation of user instructions (DIR: action directive
based on verb of movement; C: choice point; L: location; numbers in the ‘tag’ column signify
delimiter type from Table III).
Instruction Unit                                      Tags: Action, Delimiter and    Number of Components in
                                                      Environmental Feature          the Instruction Unit
Move forward                                          DIR 2                          2
Move forward until you get to the first junction     DIR 2 1 4 C 3                  6
on your right
Move forward until you reach a bridge                 DIR 2 1 L                      4
The annotation is illustrated by considering the most complex instruction unit in the example
captured in Table IV (‘Move forward until you get to the first junction on your right’): the
instruction is a directive statement (DIR) based on the verb of movement, ‘move’; ‘forward’ is a
category 2 delimiter designating direction; ‘until’ is a category 1 delimiter, providing boundary
information for the action, ‘move forward’; ‘first’ is a category 4 delimiter specifying the target
landmark, ‘junction’; the ‘junction’ is a choice point; and the choice point is further
complemented by the category 3 delimiter, ‘on your right’, stating its position in relation to the
frame of reference. This gives six components in the instruction unit.
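A minimal sketch of the component count follows, assuming the tags for each instruction unit are stored as a whitespace-separated string, as laid out in Table IV; the parsing itself is our own illustration rather than the study’s coding tool.

```python
# Count the components of an annotated instruction unit. Each token in the
# tag string is either an action tag (DIR/DES), an environmental-feature tag
# (L/P/C/D) or a delimiter category (1-4).
ACTION_TAGS = {"DIR", "DES"}
FEATURE_TAGS = {"L", "P", "C", "D"}
DELIMITER_TAGS = {"1", "2", "3", "4"}

def component_count(tag_string: str) -> int:
    tokens = tag_string.split()
    return sum(1 for t in tokens
               if t in ACTION_TAGS | FEATURE_TAGS | DELIMITER_TAGS)

# The worked example above: 'Move forward until you get to the first junction
# on your right' is tagged DIR 2 1 4 C 3 and yields six components.
assert component_count("DIR 2 1 4 C 3") == 6
assert component_count("DIR 2") == 2
assert component_count("DIR 2 1 L") == 4
```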
5.2 ‘Robot’/follower utterances: analysis of responses
As the study focuses on both sides of the interaction, ‘robot’/follower turns were also
considered in the annotation. The responses by the ‘robot’/follower, immediately after a user
instruction, were tagged based on whether they were statements (S) or questions (Q), and
whether they contained references to locations (L), pathways (P), choice points (C), the
destination (D) or simple directional designations (i.e., category 2 delimiters, such as left, right,
forward, etc.).
The Interactive Alignment model (see Section 2) proposes that the tendency of interlocutors to
repeat each other’s lexical choices is an indication that they are aligned in terms of lexicon
(Brennan and Clark, 1996). Higher alignment is associated with better understanding and
dialogue success (Costa et al., 2008). Recognising matches is therefore important in making
judgements on the effectiveness of the communication. To this end, the ‘robot’/follower tags
were compared to the corresponding tags of the user instruction and the match rates were
calculated. An example of tagged dialogue is shown in Table V: the first ‘robot’ response (2) is
tagged as a ‘match’, repeating the user’s word “junction”; whereas the second ‘robot’ response
(4) is not a ‘match’.
Table V: Examples of instructions and responses with indications of whether or not they are
matched. The columns denote (from left to right): the speaker (User or Robot), the Utterance
Number, the Utterance, the annotation of the instruction, the annotation of the
‘robot’/follower’s response and the match between instruction and response.
Speaker   Utterance   Utterance                                      Instruction          Response   Match
          Number                                                     Tags                 Tags
User      1           turn right until you come to the junction      DIR 2 1 C
Robot     2           I am at the junction                                                S C        Yes
User      3           turn back, at the junction turn left,          DIR 2, DIR 2 C,
                      destination is on the left                     DES 3 D
Robot     4           please give further instructions                                    S          No
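The match judgement can be sketched as a simple comparison of the environmental-feature tags in the response with those in the preceding instruction. This is an illustrative reading of the coding rule described above, not the authors’ exact implementation (lexical repetition, such as re-using the word “junction”, was also taken into account).

```python
FEATURE_TAGS = {"L", "P", "C", "D"}

def _features(tags: str) -> set:
    """Extract the environmental-feature categories from a tag string."""
    return {t.strip(",") for t in tags.split()} & FEATURE_TAGS

def is_match(instruction_tags: str, response_tags: str) -> bool:
    """A response 'matches' if it re-uses at least one feature category
    present in the instruction it follows."""
    return bool(_features(instruction_tags) & _features(response_tags))

# Table V: 'I am at the junction' (S C) matches the instruction tagged
# DIR 2 1 C, whereas 'please give further instructions' (S) does not.
assert is_match("DIR 2 1 C", "S C") is True
assert is_match("DIR 2, DIR 2 C, DES 3 D", "S") is False
```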
5.3 Annotation of miscommunication
Other judgements related to efficiency of the communication can be drawn from the
identification and analysis of miscommunication. From a theoretical perspective, there are two
types of miscommunication: misunderstandings and non-understandings (Hirst et al., 1994).
Misunderstandings corresponded to execution errors, which refer to instances in which the
‘robot’/follower failed to understand the instruction and deviated from the described route.
The coordinates (x, y) of the ‘robot’s’ position were recorded for each exchanged message and
placed on the map of the town (which was defined as 1024 by 600 pixels), allowing the
movements of the robot to be retraced when undertaking analysis of the dialogues. Execution
errors were determined by matching the coordinates corresponding to each of the
user’s/instructor’s utterances with those returned as a result of their execution by the
‘robot’/follower. An excerpt of a dialogue containing an execution error is shown in Table VI.
Figure 3 illustrates the route which the user described and the robot followed during the
interaction presented in Table VI. The ‘robot’/follower accurately executed the instructions in
utterances 5, 6 and 7. However, the ‘robot’/follower misunderstood the next instruction
(utterance number 8) and ended up in an unintended location.
Table VI: An excerpt of a dialogue containing an execution error. The columns denote (from left
to right): the speaker (User or Robot), the Utterance Number, the ‘robot’ coordinates and time
that the utterance was sent and the utterance.
Speaker   Utterance   Coordinates and        Utterance
          Number      Time Stamp
User      1           1000,530 @13:37:32     Hello
Robot     2           1000,530 @13:37:36     Hello
User      3           1000,530 @13:37:42     we are going to Tesco
Robot     4           1000,530 @13:38:05     ok. directions please
User      5           1000,530 @13:38:20     go straight ahead and turn right at the junction
User      6           909,464  @13:38:47     then go straight and follow the road round the bend to the left
User      7           902,358  @13:39:12     you will pass a bridge on your right, continue going straight
User      8           675,259  @13:39:35     then cross the bridge and turn left
User      9           561,117  @13:40:08     Tesco will be on the right hand side and that is the destination
Figure 3. The ‘robot’s’ execution of the instructions given in the dialogue presented in Table VI: the solid white line illustrates the
accurately executed route; the grey long dashed line represents the route that the instructor described but the ‘robot’ failed to execute;
the grey dotted line shows the deviation from the intended route; the numbers in brackets along the executed route indicate the
utterances communicated at that point.
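One way to operationalise this matching of described and executed positions is a simple distance threshold on the logged coordinates. The tolerance and the coordinates in the usage line below are hypothetical, included only to illustrate the idea.

```python
def deviates(expected_xy, logged_xy, tolerance=30.0):
    """Flag an execution error when the 'robot's' logged position after an
    instruction lies further than `tolerance` map pixels from the position
    implied by the described route."""
    dx = expected_xy[0] - logged_xy[0]
    dy = expected_xy[1] - logged_xy[1]
    return (dx * dx + dy * dy) ** 0.5 > tolerance

# Hypothetical positions (not taken from the study's logs):
print(deviates(expected_xy=(600, 150), logged_xy=(561, 117)))  # True -> execution error
```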
The second type of miscommunication considered in the analysis are the utterances by the
‘robot’/follower that signalled non-understanding (typically formulated as clarification requests)
(Gabsdil, 2003). The annotation of non-understandings follows the definition provided by Hirst
et al. (1994) and Gabsdil (2003). Non-understandings occur when: (i) the ‘robot’/follower forms
no interpretation of the user/instructor’s utterance; (ii) the ‘robot’/follower is uncertain about
the interpretation s/he obtained; or (iii) the utterance is ambiguous to the ‘robot’/follower,
leading to more than one interpretation of the instruction. Table VII contains examples of
utterances corresponding to these different sources of non-understanding, but it should be
made clear that the analysis did not consider each source separately. Non-understandings also
included cases in which the ‘robot’/follower understood the meaning of the instruction but had
a problem with its execution. An example of this final type of non-understanding is where the
user/instructor is telling the ‘robot’/follower to move forward, but the instruction cannot be
executed given the ‘robot’s’ current location on a t-junction, as in example (iv) in Table VII.
Allen (2000b) practically demonstrated the validity of combining deviations from the described
route with instances in which followers expressed non-understanding (i.e., they did not know
where to go next) into a single measure – termed ‘information failure’. This approach was
adopted in the study, with the two types of miscommunication (execution errors and non-understandings) being combined in one measure.
Table VII: Examples of non-understandings produced by the ‘robot’/follower.
Example of           Speaker   Utterance
Non-understanding
(i)                  User      Turn left.
                     User      There is a pub. The building next to you.
                     Robot     Please instruct which way exactly.
(ii)                 User      You must turn to your left and go to the end of the junction.
                               Then you turn right.
                     Robot     Turn right when I can see the tree?
(iii)                User      Go back to last location.
                     Robot     Back to the bridge or back to the factory?
(iv)                 User      Go forward.
                     Robot     There is a fork in the road.
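Under the assumption that each annotated event in a task carries one of these labels, the combined ‘information failure’ measure reduces to a per-task count; the label names in the sketch below are invented for illustration.

```python
def information_failures_per_task(annotations):
    """Combine the two miscommunication types into Allen's (2000b)
    'information failure' measure, per task. `annotations` maps a task id
    to the list of labels assigned to that task's events."""
    return {
        task: sum(1 for a in labels
                  if a in ("execution_error", "non_understanding"))
        for task, labels in annotations.items()
    }

# Illustrative (not the study's data): task 1 has one wrong turn and two
# clarification requests, giving three information failures.
example = {1: ["execution_error", "non_understanding", "non_understanding", "ok"]}
print(information_failures_per_task(example))   # {1: 3}
```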
6. RESULTS
This section introduces and justifies the analysis approach used and reports the results of the
quantitative and qualitative analysis of the dialogues between users/instructors and
‘robots’/followers.
6.1 Analysis Approach
One-way ANOVA for independent groups was performed, the factor being the pair
configurations (MM, MF, FM, FF). The efficiency of the interaction was determined using the
following measures: the time taken and the number of words, turns and instructions per task.
The effectiveness of the interaction was established by measuring the rates of
miscommunication. Component analysis of the instruction units was undertaken to provide
detail on the granularity and types of interaction, which are also important in determining
effectiveness and efficiency. Finally, the match rates between the ‘robot’/follower responses
and the user/instructor instructions were used as an indicator of alignment between partners.
Two-way ANOVA was also undertaken, the factors being user and ‘robot’ gender. The results of
the one- and two-way ANOVA were consistent for all three variables considered (time taken, the
number of instructions per task, and rates of miscommunication). The high-level analysis and
data for the two-way ANOVA are presented in Appendix A.
The paper reports the results of the one-way ANOVA because this analysis emphasises (or
‘foregrounds’) the effect of the interaction of role/gender, which is expressed as the factor of
group configuration (all possible combinations of role and gender).
Particular caution was exercised with respect to the assumptions for the parametric tests. For
all three variables (i.e., time, instructions per task and rates of miscommunication), the shapes of
the distributions were examined before performing the ANOVA. For the group of n = 5, some of
the histograms did not look particularly ‘normal’, but the assumptions were not grossly violated,
since there were no signs of outliers in the boxplots and no ‘lumps’ or large gaps in the
distributions. As such, the data are not inconsistent with being drawn from a normally-distributed population. Secondly, Levene’s test was used to ascertain equal variances between
groups. Finally, the most ‘conservative’ (i.e., the lowest risk for type I error) post hoc test – the
Scheffé test for pairwise multiple comparisons – was used to identify the levels of significance
for specific differences between groups.
To provide additional assurance of the suitability of adopting a parametric test, a non-parametric test was used as a comparison. The Kruskal-Wallis test – the non-parametric
equivalent to one-way ANOVA, based on the ranks of scores – was performed. The Mann-Whitney test was used for post-hoc analysis. The results of the Kruskal-Wallis test were along
the same lines as the parametric ANOVAs that were undertaken. The results in terms of pairwise differences were also supported by the Mann-Whitney test. The results of the non-parametric tests are given in Appendix B for completeness.
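For readers wishing to reproduce this style of analysis, the omnibus tests map directly onto standard SciPy calls. The arrays below are invented for illustration and are not the study’s data; SciPy has no built-in Scheffé test, so only the omnibus tests and one pairwise comparison are shown.

```python
import numpy as np
from scipy import stats

# Illustrative per-pair mean completion times in seconds (invented values)
ff = np.array([430.0, 410.0, 505.0, 390.0, 360.0])
fm = np.array([300.0, 280.0, 340.0, 310.0, 270.0, 330.0, 315.0])
mf = np.array([400.0, 455.0, 380.0, 420.0, 430.0, 395.0, 410.0, 385.0])
mm = np.array([370.0, 340.0, 410.0, 355.0, 390.0, 360.0, 380.0, 355.0])

print(stats.levene(ff, fm, mf, mm))       # homogeneity of variances
print(stats.f_oneway(ff, fm, mf, mm))     # one-way ANOVA (omnibus)
print(stats.kruskal(ff, fm, mf, mm))      # Kruskal-Wallis (non-parametric)
print(stats.mannwhitneyu(fm, ff))         # example pairwise follow-up
```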
6.2 Time taken per task
The results associated with the average time taken to complete each task suggest that pair
configuration has a significant impact on the speed with which the pairs completed each task,
F(3,24) = 4.038, p = 0.019. The post-hoc test indicates that statistically-reliable differences are
found between the FM and FF pairs (p = 0.05) and between the FM and MF pairs (p = 0.05). In
particular, FM pairs were significantly quicker (306 seconds) – by almost two minutes – than FF
and MF pairs (425 seconds and 409 seconds, respectively). Figure 4 shows the average
completion time per task for the four pair configurations. The means and standard deviations
are included in Table VIII. The additional two-way ANOVA analysis and non-parametric tests
presented in Appendices A and B support the assertion that this is a pair effect.
Figure 4: Average completion time per task (completion time in seconds) for the four pair configurations.
6.3 Number of words, turns and instructions
In order to further explore communication efficiency, the number of words, turns and
instructions required to complete each task were recorded. Comparisons of the number of
words and turns (by users/instructors, ‘robots’/followers and totals per pair) showed no reliable
differences. However, analysis of the mean number of instructions that users/instructors
provided revealed an effect of pair configuration, F(3,23) = 3.771, p = 0.025. The largest
difference, which provided the greatest contribution to the effect, was found between FM and
FF pairs (p = 0.03), with the former using on average 40% fewer instructions to correctly reach
the destination than the latter. The mean number of instructions per task and standard
deviations for each pair configuration are shown in Table VIII.
In brief, it seems that all interlocutors, irrespective of role and gender, were equally ‘talkative’
and claimed conversational ground at similar rates. However, female users/instructors in FF
pairs seemed less efficient in the use of route instructions.
Table VIII: The means and standard deviations for the three variables – time, number of
instructions and miscommunication (number of execution errors and non-understandings) – per
task, for the four pair configurations.
Pair            Time per Task (s)        Number of Instructions per Task   Miscommunication per Task
Configuration   Mean       SD            Mean       SD                     Mean       SD
FF              424.97     67.21         12.340     4.080                  2.180      1.443
FM              306.19     61.27          7.380     2.264                  0.785      0.533
MF              409.15     65.99          8.787     2.454                  1.234      0.671
MM              370.14     73.85          8.095     1.959                  0.809      0.539
6.4 Frequency of miscommunication
As previously noted, a combined measure of the two types of miscommunication ((i) execution
errors and (ii) ‘robot’ turns that were tagged as expressing non-understanding
miscommunication) was used in this study as a measure of effectiveness. The one-way ANOVA
revealed a significant effect (F (3, 23) = 3.628, p = 0.028) with respect to this combined measure
of miscommunication. Post hoc analyses using the Scheffé criterion indicated that the average
number of errors and non-understandings per task was higher in the FF condition (M = 2.18, SD
= 1.44) than in the FM condition (p = 0.035) (M = 0.78, SD = 0.53). Marginal specific differences
(p = 0.08) were also found between FF and MM (M = 0.8, SD = 0.53) pairs. These results suggest
that ‘robots’ in FF pairs were almost three times more likely to fail to understand and execute
instructions than male ‘robots’ paired with users of either gender.
The rates of miscommunication are summarised in Table VIII.
6.5 Instruction types and granularity
The corpus of utterances contained 1,460 single instructions. Primary component analysis of
the instructions revealed that the biggest single type of instruction was action prescriptions
without landmarks (47% of the utterances). 53% of the instruction corpus contained a reference
to a location or a path entity. In particular, users/instructors employed instructions that
included location references in 19.4% of the instruction instances. Pathway references
accounted for 18.7% of instruction instances. Finally, destination references with action
constituted the first instruction (stating the destination) whereas destination references without
action were used without exception as final instructions and formed 15% of all instructions.
Figure 5 shows the distribution of the instruction types in the corpus.
Figure 5: Overall distribution of instruction types.
Comparing the distribution of instruction types across pair configurations yielded a reliable
difference (χ2(3)=29.601, p<0.001), showing that users/instructors in MF pairs tend to use
considerably fewer simple action prescriptions than users/instructors in the other pair
configurations. Only 35% of their instructions were action-only descriptions as opposed to 50%
for the other pairs. They also used more location references (27% versus 16%). The results of
the analysis are schematically and numerically presented in Figure 6 and Table IX.
Figure 6: Schematic presentation of the use of each instruction category across pair configurations.
Table IX: Percentages showing the use of each instruction category across pair configurations.
Pair            Action    Action +   Action +   Action +      No action +   No action +   No action +
Configuration   Only      Location   Pathway    Destination   Location      Pathway       Destination
FF              50.57%    15.52%     19.83%     6.03%         0.29%         0.00%         7.76%
FM              51.29%    18.06%     15.16%     5.48%         0.32%         0.65%         9.03%
MF              35.05%    26.80%     20.62%     9.79%         0.77%         0.00%         6.96%
MM              51.45%    15.22%     17.39%     6.76%         0.24%         0.72%         8.21%
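The χ² comparisons reported in this section can be reproduced on a contingency table of raw counts; the counts below are invented for illustration (the paper reports percentages, as in Table IX, not the underlying frequencies).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative contingency table: rows are pair configurations (FF, FM, MF, MM),
# columns are instruction categories (e.g. action-only vs. landmark-including).
counts = np.array([
    [176, 172],   # FF
    [159, 151],   # FM
    [136, 252],   # MF
    [213, 201],   # MM
])
chi2, p, dof, expected = chi2_contingency(counts)
print(chi2, dof, p)   # dof = 3 for a 4 x 2 table, as in the tests reported above
```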
The analysis revealed an association between pair configuration and level of granularity of the
instructions provided (χ2 = 9.674, df= 3, p=0.02). Using the categories of ‘instructions with two
components’ and ‘instructions with three or more components’, inspection of the frequencies
showed that users/instructors in MF pairs were more likely to provide more detailed and explicit
information (see Figure 7).
Interestingly, approximately the same frequencies were observed across the other
configurations, but ‘robots’/followers in FF pairs seemed to be the least capable of dealing with
reference resolution problems, under-specification and missing boundary information, as
indicated by the elevated miscommunication rates (reported in Table VIII).
Figure 7: Frequency of instructions with two, and three or more, components for the four gender pairings.
6.6 ‘Robot’/follower responses
Two points relating to gender differences can be identified from the ‘robot’/follower response
data. The first concerns the ‘match’/‘no match’ rates – the extent to which the ‘robot’/follower
responses either do or do not match the linguistic content of the previous instruction (see Table
V for examples). The data suggest that an association exists between the pair configuration and
‘match’/‘no match’ rates (χ2 = 15.148, df = 3, p = 0.002), with FF pairs most likely to use non-matching responses (see Table X).
The second point relates to the use of landmarks in responses. Female participants were found
to use more reference-based descriptions than males. Though no inference was possible
regarding the use of landmark references by ‘robot’/followers across the pair configurations
because of the large inter-subject variability that existed, it is interesting to note that female
‘robot’/followers in MF pairs used relatively more references than those in FF pairs. This may be
reflective of the earlier reported finding that male instructors in MF pairs used the largest
number of landmark references.
Table X: ‘No match’ rates between user/instructor instruction and ‘robot’/follower response for
different pair configurations.
Pair Type   Percentage of ‘no match’ ‘robot’/follower responses
FF          48.45
FM          39.32
MF          33.68
MM          39.42
7. DISCUSSION
Although gender differences in user interface design and use are of great interest to researchers
and developers alike, the interaction design process usually excludes gender considerations. As
a result, even today ‘the user’ remains genderless (Bardzell, 2010). This study helps to address
this gap and can be placed within the new subfield of HCI, termed ‘Gender HCI’ (Beckwith et al.,
2006), which focuses on the differences in how males and females interact with ‘gender-neutral’
systems and, by taking gender issues into account, how systems can be designed to be equally
effective for both men and women (Fern et al., 2010).
Research in this area includes the pioneering work by Czerwinski and her colleagues (Czerwinski
et al., 2002). Their approach was first to identify gender differences in Virtual Reality (VR)
navigation and then to find solutions to offset these differences in display and VR world design
(by provision of larger displays and wider views). Another example is the Gender HCI project of
the EUSES consortium, which uncovered gender differences in end-user programming in terms
of confidence and feature use and proposed solutions for the design of programming
environments (Beckwith, 2007). Similarly, Fern et al. (2010) showed differences between male
and female users, and the relation between their strategies and success in a debugging task.
The position held in this body of research is that software design determines how well female
problem solvers can make use of the software. Understanding how gender influences
strategies, behaviours and success is the first step towards design that promotes successful
behaviours and strategies by users of both genders.
Along the same lines, the study reported in this paper contributes to ‘Gender HCI’ by detecting
gender differences in the novel domains of Human-Robot Interaction (HRI) and spoken dialogue
systems, which are prime examples of collaborative/goal-oriented interaction between humans
and computer systems. As noted in section 2, there is a significant amount of research on
spatial cognition and language, a considerable part of which has focused on the investigation of
gender differences. The novelty of the current study, however, is that it has examined gender
differences using the dialogue paradigm in a naturalistic but carefully controlled spatial setting.
Most existing research identifies male superiority in a range of spatial activities and domains,
leading to the prediction that all-male pairs would outperform all other groups and that all-female pairs would be the least successful. Similarly, it might be expected that pairs with a male
in either user/instructor or ‘robot’/follower role (i.e., MF or FM pairs) would show more efficient
interactions than FF pairs. The study reported in this paper, however, reveals a more complex
pattern of results.
As anticipated, gender influences task performance and communication. However, the findings
suggest that it is the interaction – the combination of gender and role – that has the most
significant impact. In particular, in this study, pairs of female users/instructors and male
‘robots’/followers (i.e., FM pairs) are associated with the fastest and most accurate completion
of tasks. Female users/instructors needed to give fewer instructions, but only when the person
following them was a male. Male ‘robots’/followers in this pair configuration are associated
with the lowest rates of execution errors and non-understandings.
Whereas females in FM pairs were involved in the most efficient communication, when paired
with female ‘robots’/followers they failed to produce similar results. In FF pairs, tasks took
longer, female users/instructors gave more instructions and female ‘robots’/followers faced
greater difficulty in understanding and executing instructions. MF pairs were also significantly
slower than FM pairs in completing the tasks. These results do not, though, imply male
superiority in direction interpretation and following since female ‘robots’/followers in MF pairs
performed equally well in terms of mean number of instructions and were almost as ‘error-prone’ as male ‘robots’/followers paired with male users/instructors.
While this analysis in terms of performance-related measures identifies a picture in which FM
pairs were the most successful and FF pairs the least, the dialogue-based analysis refines this
view and illustrates how MF pairs achieved successful communication through alignment of
spatial descriptions.
In terms of instruction type (action-only versus action + reference to environmental feature), MF
pairs used considerably fewer action-only instructions and incorporated a greater number of
landmark references (i.e., action + reference to environmental feature) compared to the other
groups. Though there is ample evidence that females use landmarks as a strategy to find and
describe a route, there must be a different explanation given that the instructor in this pair type
was male. In this study, male users/instructors included significantly more landmark references
only when interacting with a female as ‘robot’/follower. The explanation proposed here is that
the male users/instructors adapted their own linguistic choices to match the needs of the
female ‘robots’/followers, by incorporating more landmark references compared to when they
were interacting with male ‘robots’/followers. Indeed, lexical alignment between partners in
MF pairs was the highest among all pair types. This fits with studies that show that speakers
adapt their utterances according to the perceived needs, characteristics and spatial capabilities
of their partners (Sacks et al., 1974; Schober 1993; 2009). Purely spatial instructions, although
simpler in form, are mostly underspecified and ambiguous (Tenbrink, 2007), whereas landmark
references provide cues for (re-) orientation and are used to solve or prevent navigation
problems (Michon and Denis, 2001). Users/instructors in MF pairs did not rely as frequently on
purely spatial instructions, avoiding a source of potential miscommunication. Male
users/instructors in MF pairs also employed the highest number of delimiters, thus decreasing
ambiguity in their instructions and facilitating way-finding. On the other hand, users/instructors
in all other pair configurations used a greater number of simple spatial instructions, and also
provided instructions at a similar level of granularity. This may be because female
users/instructors did not adapt as well as male users/instructors to the needs of their female
partners. This inference is further supported by the low rates of alignment of ‘robot’/follower
responses to instructions in FF pairs. If this interpretation of the findings is correct, it raises
questions around how male users/instructors were able to perceive their partner’s needs within
a very unusual communication situation of (albeit simulated) human-robot interaction. This
presents opportunities for further experimental investigation.
The fact that differences exist between how people provide instructions to humans compared to
artificial agents in similar contexts is not counter-intuitive. However, the dimensions and extent
of these differences merit additional in-depth research. Comparing the corpus collected in this
study to similar corpora provides interesting insights into the subject. Studies that have used
the same classification of instructions (action only, action + reference to environmental feature,
etc.) across a variety of experiments and conditions report that simple action prescriptions do
not exceed 19% of all instructions given (Denis 1997; Daniel and Denis, 2001, 1998). In Muller
and Prévot (2009), the rate is even lower (5%). The common factor in all these studies,
however, is that the ‘follower’ is a human. When the follower is a simulated robot, the
proportion of action-only instructions rises – to, for example, 31% in the study by Tenbrink
(2007) – suggesting that action-only instructions are less common when produced as part of
navigation tasks with human followers. A likely reason for this is that people are generally
naive about the linguistic and functional abilities of a robot, so they tend to employ a higher
proportion of simple action-based descriptions that are not anchored on visually-recognised
landmarks (see also studies by Moratz and Fischer (2000); Moratz et al. (2001)).
This section concludes with a key recommendation that has to do with adaptability of the
dialogue manager of the system. In this study, females were not found to use landmark-based
spatial descriptions, although this has been described in other research as their default way-finding and instruction-giving strategy (in non-interactive navigation tasks). Nor was it found
that gender alone predicts whether and how compound descriptions (that is, descriptions with
high granularity) will be employed. However, the findings do highlight the importance of the
‘input-output matching’ of spatial descriptions produced by user/instructor and ‘robot’/follower
as a precondition for stable and successful communication. That is, although the agents initially
start by using different spatial descriptions, as the dialogue progresses the most frequently used
words become increasingly likely to be reused, inhibiting the other competing expressions. The
process of input-output matching is rapid, often occurring in under 15 turns (i.e., soon after
completion of the first task in the experiment). This phenomenon is of immediate practical
concern to the design of human-computer dialogue systems and has implications for handling
both user-generated and system-generated responses. Corpus-collection studies are essential
for building the grammar of the dialogue manager of the system and, as the work presented in
this paper exemplifies, they need to be naturalistic as well as being tuned towards the future
application. In deployment of the system, the dialogue manager is initially equipped with this
grammar of expressions (for instance, a grammar containing the appropriate schemata of
landmark-based, simple action-based and compound descriptions), all of them equally likely to
be used by either the system or the user. As the dialogue unfolds, the findings from this study
suggest that the dialogue manager should record and monitor the content and structure of
the user’s responses so that it is able to gradually narrow down the grammar to the preferred
expressions. This could contribute to the accuracy of the spatial language understanding
component and, possibly, bring us closer to what makes human-human communication and
collaboration successful. Moreover, dialogue systems, like robots, are also destined for long-term interaction with the user. Hence, the adaptation occurring within a single interaction
should be extended to adaptation between interactions to provide more stable and aligned
dialogues.
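By way of illustration, the sketch below (added here for clarity; it is not the implementation used in this work) shows, in Python, one way a dialogue manager could realise the narrowing described above: every description schema starts with an equal weight, and the weights shift towards the forms the user actually reuses. The schema labels and the uniform starting weights are illustrative assumptions.

# Illustrative sketch only: a dialogue manager component that starts with all
# description schemata equally weighted and gradually prefers the forms the
# user actually reuses. The labels below are assumed, not taken from the system.
from collections import Counter

SCHEMATA = ["landmark_based", "simple_action", "compound"]

class AdaptiveGrammar:
    def __init__(self, schemata=SCHEMATA):
        # Uniform prior: every schema is initially equally likely.
        self.counts = Counter({s: 1 for s in schemata})

    def observe(self, schema):
        """Record the schema of the latest user utterance."""
        self.counts[schema] += 1

    def weights(self):
        """Current preference over schemata, usable to rank parses of user
        input or to choose the form of system-generated route descriptions."""
        total = sum(self.counts.values())
        return {s: c / total for s, c in self.counts.items()}

# Example: after a few landmark-based user turns, landmark descriptions dominate.
grammar = AdaptiveGrammar()
for schema in ["landmark_based", "landmark_based", "compound", "landmark_based"]:
    grammar.observe(schema)
print(grammar.weights())

In deployment, the same record of preferred expressions could also be carried over between sessions, extending the within-interaction adaptation to the between-interaction adaptation argued for above.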
8. CONCLUSION
This study has identified pronounced gender-specific differences in the domain of dialogue-based navigation of a robot system. Previous research from diverse fields predicts male
superiority in both roles of route interpretation and production, such that all-male pairs would
outperform all other groups and all-female pairs would be the least successful. This, however,
was only partially supported by our study in which ‘mixed’ pairs exceeded or matched the
performance of all-male pairs. In particular, the results in sections 6.2 to 6.6 outline intricate
patterns showing that female users/instructors with male ‘robots’/followers were the most
successful. Male users/instructors with female ‘robots’/followers achieved their strong
performance by taking advantage of a particular interactive mechanism, aligning their
expressions to those of their partner. Male users/instructors adapted to the ‘needs’ of their
female partners by adjusting the use of landmark references, highlighting the fact that the
language one produces in monologue is different from language in dialogue.
The results do not challenge previous studies, but complement them by suggesting that gender
differences in accurate way-finding or direction-giving can be mitigated when females interact
with males, either as instructors or followers. That is to say, if there exists a female
‘disadvantage’ it seems to disappear through mechanisms that emerge naturally in the
interaction with males. This observation holds practical significance for the development of
dialogue systems, as it points to the existence of dialogue features that equally benefit users of
both genders. Because dialogue has been relatively neglected as a research paradigm, such interactive
mechanisms have received comparatively little attention. The outcomes of this study raise
questions that present rich opportunities for further experimental investigation. In particular, a
next step is to pinpoint the dialogue features and strategies that relate to improved
performance and communication and test them in isolation in a follow-up controlled dialogue
study. As suggested above, a tentative hypothesis readily emerges from our current results: the
coordinated use of landmark references that was observed in MF pairs could be the key to why
they outperformed FF pairs and matched, in many respects, the performance of the other pair
configurations. In this study, participants successfully coordinated in the presence of
uncertainties arising from language and the environment. Thus, the element of interactive
clarification becomes significant for successful communication and merits further investigation.
This study also illustrates a valid methodology to assess the range of linguistic options that users
are likely to employ in spatial Human-Robot Interaction and shows how the interplay of gender
and role affects the content of the instructions. It identifies user patterns of adaptation; for
instance, users appear to prefer to give short and incremental instructions, in contrast to
strategies used in human/human spatial communication. The study showed a benefit deriving
from partner alignment in choice of words – a strategy that was influenced by role and gender.
Overall, we contend that these observations can serve to inform the requirements analysis and
design of human-computer dialogue systems. From a wider perspective, these insights may also
be useful for researchers and designers to better understand how spatial information should be
displayed or communicated by systems and how the availability and presentation of such
information may change the behaviour and experience of users of different gender.
REFERENCES
Allen, G. L. (1997). From knowledge to words to wayfinding: Issues in the production and comprehension of route directions. In
Hirtle, S. & Frank, A. (eds.), Spatial Information Theory: A theoretical Basis for GIS. Berlin: Springer-Verlag, p.363-372.
Allen, G. L. (2000a). Men and women, maps and minds: Cognitive bases of sex-related differences in reading and interpreting maps.
In O'Nuallain, S. (ed.), Spatial Cognition: Foundations and Applications. Amsterdam: John Benjamins, p.3-18.
Allen, G. L. (2000b). Principles and practices for communicating route knowledge. Applied Cognitive Psychology, 14(4), p.333–359.
Bardzell, S. (2010). Feminist HCI: taking stock and outlining an agenda for design. In Proceedings of the 28th International
Conference on Human Factors in Computing Systems (CHI '10). ACM, New York, USA, p. 1301-1310.
Barkowsky T., Knauff M., Ligozat G. & Montello, D. R. (eds.) (2007). Spatial Cognition V: Reasoning, Action, Interaction, Lecture
Notes in Computer Science, Berlin: Springer.
Beatty, W. & Troster, A. (1987). Gender differences in geographical knowledge. Sex Roles, 16(11), p.565-89.
Beckwith, L. (2007). Gender HCI Issues in End-User Programming, Ph.D. Thesis, Oregon State University.
Beckwith, L., Burnett, M., Grigoreanu, V. & Wiedenbeck, S. (2006). HCI: What about the software? Computer, p.83–87.
Brennan, S. E. & Clark, H. H. (1996). Conceptual pacts and lexical choice in conversation. Journal of Experimental Psychology:
Learning, Memory and Cognition, 22 (6), p.1482–1493.
Chen, C., Chang, W. & Chang, W. (2009). Gender differences in relation to wayfinding strategies, navigational support design, and
wayfinding task difficulty. Journal of Environmental Psychology, 29, p.220–226.
Clark, H. H. (1996). Using language. New York: Cambridge University Press.
Coluccia, E., Bosco, A. & Brandimonte, M. A. (2007). The role of visuo-spatial working memory in map drawing. Psychological
Research, 71, p.359–372.
Coluccia, E. & Losue, G. (2004). Gender differences in spatial orientation: a review. Journal of Environmental Psychology, 24(3),
p.329–340.
Coluccia, E., Losue, G. & Brandimonte, M. A. (2007). The relationship between map drawing and spatial orientation abilities: a study
of gender differences. Journal of Environmental Psychology, 27, p.135–244.
Costa A., Pickering, M. J. & Sorace, A. (2008). Alignment in second language dialogue. Language and Cognitive Processes, 23(4),
p.528–556.
Czerwinski, M., Tan, D. S. & Robertson, G. G. (2002). Women take a wider view. In Proceedings of the SIGCHI Conference on
Human Factors in Computing Systems: Changing Our World, Changing Ourselves (Minneapolis, Minnesota, USA, April 20-25, 2002). CHI '02. ACM, New York, NY, p.195-202. DOI= http://doi.acm.org/10.1145/503376.503412.
Daniel, M. P. & Denis, M. (1998). Spatial descriptions as navigational aids: A cognitive analysis of route directions.
Kognitionswissenschaft, 7, p.45-52.
Daniel, M. P. & Denis, M. (2004). The production of route directions: Investigating conditions that favour conciseness in spatial
discourse. Applied Cognitive Psychology, 18, p.57–75.
Fern, X., Komireddy, C., Grigoreanu, V. & Burnett, M. (2010). Mining problem-solving strategies from HCI data. ACM Trans.
Computer-Human Interaction 17, 1, p. 1-22.
Filipi, A. & Wales, R. (2009). Situated analysis of what prompts shift in the motion verbs come and go in a map task. In Coventry,
K.R., Tenbrink, T., & Bateman, J.A. (eds.), Spatial Language and Dialogue. Oxford: Oxford University Press, p.56-70.
Fischer, K. (2007). The Role of Users' Concepts of the Robot in Human-Robot Spatial Instruction. In Barkowsky T., Knauff M.,
Ligozat G. & Montello D.R. (eds.), Spatial Cognition V: Reasoning, Action, Interaction, Lecture Notes in Computer Science.
Berlin: Springer, p.76-89.
Forlizzi, J., Barley, W. C. & Seder, T. (2010). Where should I turn?: Moving from individual to collaborative navigation strategies to
inform the interaction design of future navigation systems. In Proceedings of the 28th international Conference on Human
Factors in Computing Systems (Atlanta, Georgia, USA, April 10 - 15, 2010). CHI '10. ACM, New York, NY, p.1261-1270.
DOI= http://doi.acm.org/10.1145/1753326.1753516.
Gabsdil, M. (2003). Clarification in spoken dialogue systems. In Proceedings of the 2003 AAAI Spring Symposium Workshop on
Natural Language Generation in Spoken and Written Dialogue. Stanford, USA.
Garrod, S. & Anderson, A. (1987). Saying what you mean in dialogue: A study in conceptual and semantic co-ordination. Cognition,
27, p.181–218.
Golding, J. M., Graesser, A. C. & Hauselt, J. (1996). The process of answering direction-giving questions when someone is lost on a
university campus: The role of pragmatics. Applied Cognitive Psychology, 10, p.23-29.
Hirst, G., McRoy, S., Heeman, P., Edmonds, P. & Horton, D. (1994). Repairing conversational misunderstandings and non-understandings. Speech Communication, 15, 3-4 (Dec. 1994), p.213-229.
Jonsson, I., Harris, H. & Nass, C. (2008). How accurate must an in-car information system be?: Consequences of accurate and
inaccurate information in cars. In Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in
Computing Systems (Florence, Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, p.1665-1674. DOI=
http://doi.acm.org/10.1145/1357054.1357315.
Lawton, C. A. (1994). Gender differences in wayfinding strategies: Relationship to spatial ability and spatial anxiety. Sex Roles, 30,
p.765-779.
Denis, M. (1997). The description of routes: A cognitive approach to the production of spatial discourse. Current Psychology of
Cognition, 16(4), p.409-458.
Michon, P.E. & Denis, M. (2001). When and why are visual landmarks used in giving directions? In: Montello, D.R. (ed.), Spatial
Information Theory. Berlin: Springer, p.400–414.
Montello, D. R., Lovelace, K. L., Golledge, R. G. & Self, C. M. (1999). Sex-related differences and similarities in geographic and
environmental spatial abilities. Annals of the Association of American Geographers, 89(3), p.515–534.
Moratz, R. & Fischer, K. (2000). Cognitively adequate modelling of spatial reference in human-robot interaction. In 12th IEEE
International Conference on Tools with Artificial Intelligence. Vancouver, British Columbia, Canada, 13-15 November.
Moratz, R., Fischer, K & Tenbrink, T. (2001). Cognitive modeling of spatial reference for human-robot interaction. International
Journal on Artificial Intelligence Tools, 10 (4), p.589-611.
Muller, P. & Prévot, L. (2009). Grounding information in route explanation dialogues. In Coventry, K.R., Tenbrink, T. & Bateman,
J.A., (eds.), Spatial Language and Dialogue. Oxford: Oxford University Press, p.166-176.
Newcombe, N., Bandura, M. M. & Taylor, D. G. (1983). Sex differences in spatial ability and spatial activities. Sex Roles, 9, p.377-386.
Pickering, M. & Garrod, S. (2004). The interactive alignment model. Behavioural and Brain Sciences, 27 (2), p.169-189.
Sacks, H., Schegloff, E.A. & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation.
Language, 50, p.696–735.
Schober, M. F. (1993). Spatial perspective-taking in conversation. Cognition, 47, p.1–24.
Schober, M. F. (2009). Spatial dialogue between partners with mismatched abilities. In Coventry, K.R., Tenbrink, T. & Bateman,
J.A., (eds.), Spatial Language and Dialogue. Oxford: Oxford University Press, p.23-39.
Tenbrink, T. & Hui, S. (2007). Negotiating spatial goals with a wheelchair. In Proceedings of the 8th SIGdial Workshop on
Discourse and Dialogue, Antwerp, Belgium, 1-2 September, p.103-110.
Vanetti, E. J. & Allen, G. L. (1988). Communicating environmental knowledge: The impact of verbal and spatial abilities on the
production and comprehension of route directions. Environment and Behavior, 20, p.667-682.
Ward, S. L., Newcombe, N. & Overton, W. F. (1986). Turn left at the church, or three miles north: A study of direction giving and sex
differences. Environment and Behavior, 18(2), p.192–213.
APPENDIX A: TWO-WAY ANALYSIS OF VARIANCE
This section presents the results of the two-way ANOVA, performed along with the one-way
ANOVA presented in the main body of the paper, which examined the effect of gender and role
of participant on the three dependent variables: (i) time taken per task; (ii) number of
instructions per task; and (iii) miscommunication per task. The between-participants factors
were: (i) user gender (female users vs male users); and (ii) ‘robot’ gender (female ‘robots’ vs
male ‘robots’).
As the data and analysis that follow show, the two-way ANOVA revealed significant interaction
effects for the instruction and miscommunication variables (two out of the three variables in the
experiment). There was also a main effect of ‘robot’ gender for the third variable, time per task
(male ‘robots’ were faster than female ‘robots’). However, when simple effects are significant,
the second step is to examine the error bar charts. It became apparent from the plots that only
groups with male ‘robots’ paired with female users were significantly different from the other
groups. This meant that the results of the one-way ANOVA were replicated by the two-way ANOVA for all three variables. The findings and analysis in relation to each of the three
dependent variables will now be presented. The raw data is also provided at the end of the
appendix (see Table XXI).
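As an aid to reproducibility, the sketch below (not the analysis tool used in the study) shows how a two-way ANOVA with Type III sums of squares could be run in Python on the raw data of Table XXI; the column names, the file name and the handful of illustrative rows are assumptions for the example.

# A sketch (not the tool used in the study) of the two-way ANOVA reported below,
# assuming the raw data of Table XXI are in a DataFrame with columns
# 'user', 'robot' ('F'/'M') and 'time' (mean seconds per task).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# df = pd.read_csv("table_xxi.csv")   # hypothetical file holding Table XXI
df = pd.DataFrame({                    # a few illustrative rows taken from Table XXI
    "user":  ["F", "F", "M", "M", "F", "F", "M", "M"],
    "robot": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "time":  [357, 425, 334.4, 350.4, 285.17, 273.5, 346, 408],
})

# Sum-to-zero contrasts yield Type III sums of squares, i.e. the layout of the
# between-subjects tables reported below.
model = smf.ols("time ~ C(user, Sum) * C(robot, Sum)", data=df).fit()
print(anova_lm(model, typ=3))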
A.1 Time per Task
The two-way analysis of variance revealed a main effect of ‘robot’ gender (F(1,24) = 9.225, p =
0.006), which indicated that the mean time per task was significantly lower for male ‘robots’ (M
= 340.3 seconds per task, SD = 73.66) than female ‘robots’ (M = 415.23 seconds per task, SD =
64.11). The main effect of user gender and the user gender X ‘robot’ gender interaction were
not found to be significant. This suggests that only ‘robot’ gender was related to completion
time. The summary analysis is given in Table XI and the detailed two-way ANOVA table of between-subjects effects is given in Table XII.
Table XI: Mean time taken per task and Standard Deviations for all conditions.
User    Robot    Mean time per task    Std. Deviation    Number of pairings
F       F        424.9733              67.21813          5
F       M        306.1905              61.27250          7
F       Total    355.6833              86.20870          12
M       F        409.1542              65.99631          8
M       M        370.1458              73.85026          8
M       Total    389.6500              70.59377          16
Total   F        415.2385              64.11688          13
Total   M        340.3000              73.66592          15
Total   Total    375.0929              78.03486          28
Table XII: Time per task – two-way ANOVA table showing tests of between-subjects effects.
Source            Type III Sum of Squares    df    Mean Square    F         Sig.
Corrected Model   55150.243a                 3     18383.414      4.038     .019
Intercept         3848314.805                1     3848314.805    845.283   .000
user              3908.349                   1     3908.349       .858      .363
robot             41996.727                  1     41996.727      9.225     .006
user * robot      10734.415                  1     10734.415      2.358     .138
Error             109264.636                 24    4552.693
Total              4103865.120               28
Corrected Total   164414.879                 27
a. R Squared = .335 (Adjusted R Squared = .252)
A.2 Number of Instructions per Task
The two-way ANOVA yielded a main effect of ‘robot’ gender (F(1, 24) = 4.376, p = 0.047), such that female ‘robots’ (M = 10.15, SD = 3.5) required a higher number of instructions per task than male ‘robots’ (M = 8.24, SD = 2.79) (see Tables XIII and XIV). However, a significant interaction effect was also observed (F(1, 24) = 5.195, p = 0.032), revealing large differences between FM pairs (M = 7.38 instructions per task, SD = 2.26) and FF pairs (M = 12.34, SD = 4.08). This result suggests that FM pairs required fewer instructions to complete a task than FF pairs (see Table XV).
The presence of the interaction effect qualifies the main effect, since there was
no effect of ‘robot’ gender when the users were male. The interaction was further investigated
with t-tests (see Table XVI) which confirmed that the number of instructions that female users
produced depended on the gender of the addressee (t(10)= 2.714, p = 0.02).
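For illustration, the simple-effects comparison reported in Table XVI can be reproduced with an independent-samples t-test on the instruction counts of the female user/instructor pairs; the sketch below uses SciPy and the per-pair values from Table XXI.

# Sketch of the simple-effects t-test in Table XVI (female users/instructors only),
# using the per-pair instruction counts from Table XXI.
from scipy import stats

ff_instructions = [9.1667, 18.8333, 11.5, 13.4, 8.8]              # FF pairs
fm_instructions = [9.8333, 8.5, 6, 3.8333, 7, 10.1667, 6.3333]    # FM pairs

# Levene's test checks the equal-variance assumption before the t-test.
print(stats.levene(ff_instructions, fm_instructions))
print(stats.ttest_ind(ff_instructions, fm_instructions, equal_var=True))
# Expected: t ≈ 2.71, p ≈ 0.02, as reported above.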
Table XIII: Mean number of instructions per task and Standard Deviations for all conditions.
User    Robot    Mean         Std. Deviation    Number of Pairs
F       F        12.340000    4.0802642         5
F       M        7.380952     2.2642845         7
F       Total    9.447222     3.9206127         12
M       F        8.787500     2.4542763         8
M       M        9.000000     3.1370798         8
M       Total    8.893750     2.7231577         16
Total   F        10.153846    3.5070178         13
Total   M        8.244444     2.7958775         15
Total   Total    9.130952     3.2341787         28
Table XIV: Number of instructions per task – two-way ANOVA table showing tests of between-subjects effects.
Source            Type III Sum of Squares    df    Mean Square    F         Sig.
Corrected Model   74.008a                    3     24.669         2.841     .059
Intercept         2373.057                   1     2373.057       273.277   .000
user              6.305                      1     6.305          .726      .403
robot             38.002                     1     38.002         4.376     .047
user * robot      45.112                     1     45.112         5.195     .032
Error             208.409                    24    8.684
Total              2616.898                  28
Corrected Total   282.418                    27
a. R Squared = .262 (Adjusted R Squared = .170)
Table XV: Mean number of instructions per task and Standard Deviations for male and female
‘robot’ interactions in the female user/instructor condition.
Robot    N    Mean         Std. Deviation    Std. Error Mean
F        5    12.340000    4.0802642         1.8247496
M        7    7.380952     2.2642845         .8558191
Table XVI: T-test table showing the analysis of simple effects to determine the differences
between female ‘robots’ and male ‘robots’ in the female user/instructor condition.
Instructions – Levene's Test for Equality of Variances: F = 1.480, Sig. = .252

t-test for Equality of Means    t        df       Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed         2.714    10       .022               4.9590476          1.8269987                .8882408        9.0298545
Equal variances not assumed     2.460    5.767    .051               4.9590476          2.0154745                -.0212916       9.9393868
A.3 Number of Miscommunication Instances per Task
There was a main effect of ‘robot’ gender (F(1,24) = 3.933, p = 0.059), suggesting that female ‘robots’ (M = 1.6, SD = 1.08) are more prone to miscommunication than male ‘robots’ (M = 0.98, SD = 0.89) (see Tables XVII and XVIII). The user gender X ‘robot’ gender interaction was found to be marginally significant (F(1,24) = 3.209, p = 0.086) and showed differences between FM pairs (M = 0.78 errors/non-understandings per task, SD = 0.53) and FF pairs (M = 2.18, SD = 1.44) (see Table XIX).
Analyses of the simple effects using t-tests were performed to explore the interaction effect (see
Table XX). The t-tests confirmed that the main effect should be cautiously interpreted, as both
male and female ‘robots’ were equally error-prone when paired with male users/instructors. On
the other hand, female ‘robots’ when paired with female users/instructors seem to be three
times more likely to fail to understand/execute an instruction than male ‘robots’ in FM pairs
(t(10)= 2.376, p = 0.039).
Table XVII: Mean number of miscommunications per task and Standard Deviations for all
conditions.
User    Robot    Mean        Std. Deviation    N
F       F        2.180000    1.4436836         5
F       M        .785714     .5332837          7
F       Total    1.366667    1.1951924         12
M       F        1.237500    .6717514          8
M       M        1.166667    1.1270132         8
M       Total    1.202083    .8970296          16
Total   F        1.600000    1.0889172         13
Total   M        .988889     .8919985          15
Total   Total    1.272619    1.0177864         28
Table XVIII: Number of miscommunications per task – two-way ANOVA table showing tests of
between-subjects effects.
Source            Type III Sum of Squares    df    Mean Square    F        Sig.
Corrected Model   5.876a                     3     1.959          2.128    .123
Intercept         48.638                     1     48.638         52.836   .000
user              .532                       1     .532           .578     .455
robot             3.621                      1     3.621          3.933    .059
user * robot      2.954                      1     2.954          3.209    .086
Error             22.093                     24    .921
Total              73.317                    28
Corrected Total   27.969                     27
a. R Squared = .210 (Adjusted R Squared = .111)
Table XIX: Mean number of miscommunications per task and Standard Deviations for male and
female ‘robot’ interactions in the female user/instructor condition
Robot    N    Mean        Std. Deviation    Std. Error Mean
F        5    2.180000    1.4436836         .6456349
M        7    .785714     .5332837          .2015623
Table XX: T-test table showing the analysis of simple effects to determine the differences
between female ‘robot’ and male ‘robot’ in the female user/instructor condition.
Miscommunication – Levene's Test for Equality of Variances: F = 3.814, Sig. = .079

t-test for Equality of Means    t        df       Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI Lower    95% CI Upper
Equal variances assumed         2.376    10       .039               1.3942857          .5868046                 .0868037        2.7017678
Equal variances not assumed     2.061    4.787    .097               1.3942857          .6763666                 -.3678529       3.1564243
Table XXI: Raw Data
Case    User    Robot    Time per task (secs)    Number of Instructions per task    Number of Miscommunication instances per task
(User and Robot coded 1: Female, 2: Male)
1       1       1        357        9.1667      0.5
2       1       1        425        18.8333     4.3333
3       1       1        376.67     11.5        2.6667
4       1       1        529.8      13.4        1.4
5       1       1        436.4      8.8         2
6       2       1        334.4      6           1.8
7       2       1        350.4      8.4         0.2
8       2       1        519.83     11          2
9       2       1        460.2      8.4         1.2
10      2       1        434.83     12.5        2
11      2       1        368.4      11          1.2
12      2       1        448.33     6.3333      0.5
13      2       1        356.83     6.6667      1
14      1       2        285.17     9.8333      0.6667
15      1       2        273.5      8.5         1
16      1       2        398.5      6           1
17      1       2        216.5      3.8333      0.3333
18      1       2        306.5      7           1.6667
19      1       2        371.33     10.1667     0.8333
20      1       2        291.83     6.3333      0
21      2       2        346        10          1.8
22      2       2        408        7           1
23      2       2        277.17     11          0.5
24      2       2        396.33     15.3333     3.6667
25      2       2        278.5      5           0.8333
26      2       2        422.33     7.6667      1
27      2       2        340.83     8           0.3333
28      2       2        492        8           0.2
APPENDIX B: NON-PARAMETRIC ANALYSIS
A Kruskal-Wallis one-way ANOVA – the non-parametric equivalent to one-way ANOVA which
transforms the initial data to their ranks before performing the ANOVA – was performed on the
four groups (FF, FM, MF, MM). The Mann-Whitney test was used to perform pairwise
comparisons. The analyses were performed for the three dependent variables: (i) time taken
per task; (ii) number of instructions per task; and (iii) miscommunication per task. The raw data
is provided in Table XXI.
The results of the Kruskal-Wallis test were along the same lines as the parametric ANOVAs:
there were significant differences for the variable time per task (p = 0.035) and marginally significant
differences for instructions per task (p = 0.069) and miscommunication (p = 0.08). The ‘elevated’ p
values were anticipated since non-parametric tests are less powerful than parametric tests. The
previous results in terms of pair-wise differences were also supported by the Mann-Whitney
test. The findings and analysis in relation to each of the three dependent variables will now be
presented.
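A minimal sketch of this non-parametric procedure, using SciPy rather than the tool used for the original analysis, is given below; the four group lists are the per-pair completion times from Table XXI, grouped by pair configuration (user gender first).

# Sketch of the non-parametric analysis for time per task, grouping the
# per-pair values from Table XXI by pair configuration.
from scipy import stats

ff = [357, 425, 376.67, 529.8, 436.4]
fm = [285.17, 273.5, 398.5, 216.5, 306.5, 371.33, 291.83]
mf = [334.4, 350.4, 519.83, 460.2, 434.83, 368.4, 448.33, 356.83]
mm = [346, 408, 277.17, 396.33, 278.5, 422.33, 340.83, 492]

# Omnibus Kruskal-Wallis test across the four configurations ...
print(stats.kruskal(ff, fm, mf, mm))
# ... followed by a pairwise Mann-Whitney comparison (here FM vs FF).
print(stats.mannwhitneyu(fm, ff, alternative="two-sided"))
# Expected: U ≈ 3.0, p ≈ 0.02 (cf. Table XXIII).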
B.1 Time Taken per Task
The Kruskal-Wallis one-way ANOVA identified significant differences in the time taken per task between the different pair configurations (χ2 = 8.629, p = 0.035), with mean ranks of 20.20 for FF
pairs, 7.71 for FM pairs, 17.88 for MF pairs and 13.50 for MM pairs (see Table XXII). As such, FF
pairs had the longest completion times whereas FM pairs had the shortest.
Table XXII: Output of the Kruskal-Wallis test.
Pair Configuration    N    Mean Rank
FF                    5    20.20
FM                    7    7.71
MF                    8    17.88
MM                    8    13.50

Time: Chi-Square = 8.629, df = 3, Asymp. Sig. = .035
Pairwise comparisons using the Mann-Whitney test revealed significant differences between FM
and FF pairs (p = 0.019, U = 3.000, z = -2.355) and between FM and MF pairs (p = 0.021, U = 8.000, z = -2.315), suggesting that pairs consisting of female users/instructors and male ‘robots’/followers
were associated with faster completion times (see Tables XXIII and XXIV).
Table XXIII: Output of the Mann-Whitney test for FM and FF pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FF                    5     9.40         47.00
FM                    7     4.43         31.00
Total                 12

Time: Mann-Whitney U = 3.000, Wilcoxon W = 31.000, Z = -2.355, Asymp. Sig. (2-tailed) = .019, Exact Sig. [2*(1-tailed Sig.)] = .018
Table XXIV: Output of the Mann-Whitney test for FM and MF pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FM                    7     5.14         36.00
MF                    8     10.50        84.00
Total                 15

Time: Mann-Whitney U = 8.000, Wilcoxon W = 36.000, Z = -2.315, Asymp. Sig. (2-tailed) = .021, Exact Sig. [2*(1-tailed Sig.)] = .021
B.2 Number of Instructions per Task
The Kruskal-Wallis one-way ANOVA showed a marginally significant difference between the pair configurations (χ2 = 7.105, p = 0.069). The mean ranks were 22.00 for FF pairs, 10.21 for FM pairs, 14.00 for MF pairs and 12.07 for MM pairs (see Table XXV).
Table XXV: Output of Kruskal-Wallis test.
Pair Configuration    N     Mean Rank
FF                    5     22.00
FM                    7     10.21
MF                    8     14.00
MM                    7     12.07
Total                 27

Instructions: Chi-Square = 7.105, df = 3, Asymp. Sig. = .069
Pairwise comparisons revealed differences between FF and the FM and MM configurations (p = 0.028 and U = 4.000 in both comparisons; z = -2.192 and z = -2.196, respectively) and marginal differences between FF and MF (p = 0.056, U = 7.000, z = -1.908). These results suggest that the combination of gender and role
influences the number of instructions necessary to complete the task (see Tables XXVI to XXVIII).
An outlier was detected in the MM group (case 24) and removed prior to the analysis.
Table XXVI: Output of the Mann-Whitney test for FF and FM pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FF                    5     9.20         46.00
FM                    7     4.57         32.00
Total                 12

Instructions: Mann-Whitney U = 4.000, Wilcoxon W = 32.000, Z = -2.192, Asymp. Sig. (2-tailed) = .028, Exact Sig. [2*(1-tailed Sig.)] = .030
Table XXVII: Output of the Mann-Whitney test for FF and MF pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FF                    5     9.60         48.00
MF                    8     5.38         43.00
Total                 13

Instructions: Mann-Whitney U = 7.000, Wilcoxon W = 43.000, Z = -1.908, Asymp. Sig. (2-tailed) = .056, Exact Sig. [2*(1-tailed Sig.)] = .065
Table XXVIII: Output of the Mann-Whitney test for FF and MM pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FF                    5     9.20         46.00
MM                    7     4.57         32.00
Total                 12

Instructions: Mann-Whitney U = 4.000, Wilcoxon W = 32.000, Z = -2.196, Asymp. Sig. (2-tailed) = .028, Exact Sig. [2*(1-tailed Sig.)] = .030
B.3 Number of Miscommunication Instances per Task
The Kruskal-Wallis one-way ANOVA yielded χ2 = 6.756 with an associated probability value of p = 0.08. The groups differed marginally on the miscommunication measure, with FF pairs having the most misunderstanding problems (mean ranks were 20.60 for FF pairs, 10.43 for FM pairs, 16.00 for MF pairs and 10.57 for MM pairs) (see Table XXIX). The results of the Kruskal-Wallis analysis
were in line with the parametric ANOVAs reported in the main body of the paper. The
‘elevated’ p values were anticipated since non-parametric tests are less powerful than
parametric tests.
Table XXIX: Output of the Kruskal-Wallis test.
Pair Configuration    N     Mean Rank
FF                    5     20.60
FM                    7     10.43
MF                    8     16.00
MM                    7     10.57
Total                 27

Miscommunication: Chi-Square = 6.756, df = 3, Asymp. Sig. = .080
The Mann-Whitney test revealed marginally significant differences between FF and FM pairs (p =
0.06, U= 6.000, z = -1.871), and between FF and MM pairs (p = 0.05, U= 5.500, z = -1.956),
suggesting that role and gender had an impact on frequency of miscommunication (see Tables
XXX and XXXI). An outlier was detected in the MM group (case 24) and removed prior to the
analysis. The other results related to pairwise differences in miscommunication reported in the
main body of the paper were also supported by the Mann-Whitney test.
Table XXX: Output of the Mann-Whitney test for FF and FM pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FF                    5     8.80         44.00
FM                    7     4.86         34.00
Total                 12

Miscommunication: Mann-Whitney U = 6.000, Wilcoxon W = 34.000, Z = -1.871, Asymp. Sig. (2-tailed) = .061, Exact Sig. [2*(1-tailed Sig.)] = .073
Table XXXI: Output of the Mann-Whitney test for FF and MM pair configurations.
Pair Configuration    N     Mean Rank    Sum of Ranks
FF                    5     8.90         44.50
MM                    7     4.79         33.50
Total                 12

Miscommunication: Mann-Whitney U = 5.500, Wilcoxon W = 33.500, Z = -1.956, Asymp. Sig. (2-tailed) = .051, Exact Sig. [2*(1-tailed Sig.)] = .048
PRIOR PUBLICATION STATEMENT
Koulouri and Lauria’s most closely related prior papers (or concurrently submitted papers) have
focused on how users interact with ‘robots’ in a navigation task (studying the management of
miscommunication, use of spatial descriptions, linguistic resources, experimental design
methodologies, etc.). Though in the same domain as this work, this submission to TOCHI has a
very different focus that has not been looked at in any of their other papers: the submission’s
unique contribution is the analysis of gender differences in spatial navigation dialogues to
investigate HCI/HRI (Human/Robot Interactions) for route instructions. Chen and Macredie have
published papers that analyse interaction/use patterns based on a number of individual
differences (including gender) in a range of domain areas (most notably Hypermedia Learning
Systems), but not in the analysis of route instructions/route navigation. There is therefore no
direct overlap between this submission to TOCHI and any of their published or concurrently
submitted papers.