Mining For Psycho-Social Dimensions through Socio-Linguistics

Sociotechnical Behavior Mining: From Data to Decisions? Papers from the 2015 AAAI Spring Symposium
Peggy Wu, Christopher Miller, Tammy Ott, Sonja Schmer-Galunder, Jeff Rye
Smart Information Flow Technologies
PWu@sift.info; CMiller@sift.info; TOtt@sift.info; SGalunder@sift.info; JRye@sift.info
can cause survey fatigue among participants. During debriefs, some of our study subjects questioned the validity
of the answers they provided, despite their best intentions
and efforts to provide accurate data. A data collection tool
and validated analysis methods that utilize naturally occurring behaviors will alleviate survey and reporting burden.
Such a tool would not only be useful in research, but we
believe that it could be used to facilitate the self-monitoring of psychosocial dimensions through automated
analysis of communications or narrative self-reports such
as journals.
Abstract
Communication is social by nature, and reveals psychosocial dimensions about an actor’s perceptions of themselves
and others. While grammar and spell-check can help polish
the presentation of communication, they do not reflect the
way that a message will be received in a particular social
space. A means to analyze the communication for actor beliefs can help the author and others understand the underlying social climate and message that is being transmitted.
NASA has identified the need to monitor individual behavioral health and team dynamics as crucial to ensuring high
performance and mission success. We describe an application that integrates theories from sociolinguistics with natural language processing techniques to successfully detect
individual moods, attitudes, and team dynamics relevant to
long duration exploration class missions. The methods were
used to analyze data gathered from human subject experiments at three diverse analog studies, with results showing
high correlation with subject self-reports and third party observations. We discuss preliminary results and implications
for the tool’s potential widespread use.
Virtually all team performance and psychosocial problems
manifest themselves in “transactions”—interaction and
communications between team members. Salas et al.
(2007, p. 189) define a team as “two or more individuals
who interact socially and adaptively, have shared or common goals, and hold meaningful task interdependences.”
Communication behaviors can be a rich data source for
identifying and evaluating team health indicators. Communication, with all its nuances, may be even more important
for long duration space flights (Stuster 2010) especially
under the circumstances of social monotony, possible discrepancies in cultural assumptions, and delayed communications with ground crews. Therefore, our approach examines observable communication behavior to detect and assess factors affecting team dynamics and individual emotions.
Introduction and Motivation
Future astronauts will work in a unique environment in
which they are placed in multicultural teams and are socially isolated and confined to a small environment for an
extended period of time, all while subject to constant monitoring and scrutiny. To evaluate relevant psychosocial
states at both the individual and team levels, an accurate,
objective, repeatable, efficient, accepted and minimally intrusive means of data collection and assessment is needed. In this domain and others in behavioral health, researchers depend heavily on surveys. This reliance on introspection and surveys has many pitfalls for
both participant compliance and accuracy, especially in the
context of long duration missions. Survey data are subject
to a participant’s memory, biases, vigilance, personality
(e.g. some participants are simply not very reflective), and
Communication behavior observations are inherently nonintrusive, represent a large portion of a person’s social and
work relationships, and can be correlated with data from
other measures (e.g. circadian desynchronization, extended
wakefulness, work overload, task performance) to arrive at
a holistic picture of serious performance threats and situational contexts where they occur. We developed such a tool
under the NASA funded project called ADASTRA—
Automated Detection of Attitudes and States through
Transaction Recordings Analysis. The goal of ADASTRA
Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
sition of an act might start off as nominal, but we draw inferences from the transaction itself. When imbalance is
perceived in an interaction, we try to explain it by adjusting our assumptions about the relationship or imposition—
or the character and knowledge of the participants.
is to create minimally intrusive, valid and efficient assessment methods to identify and track key individual and
team psychosocial states using communications data
streams and written introspection by leveraging analytic
methods based in socio-linguistics and sociology theory.
ADASTRA extracts individual and team psychological
dimensions using spoken and written behaviors rather than
surveys. The advantages of this method are twofold: firstly
it allows unobtrusive detection to reduce subject survey
burden and secondly, it enables in-depth data-mining to
identify factors not identified a priori in surveys. ADASTRA techniques have been performed on data collected
from multiple missions in three analog space exploration
environments, as well as with historical spaceflight data.
Our results provide empirical evidence of some hypothesized key threats and indicators for team and individual
psychological health; they also uncover correlations that
were not previously studied or anticipated. Below, we describe the ADASTRA system, the analog environments in
which data was collected, and some of the results that support the validity of this analysis method.
Brown and Levinson’s politeness model is a subjective
framework that inspired our algorithms for detecting the
difference between expected and exhibited redress behaviors. This differential can then be used to infer power difference and social distance between interlocutors. When
combined with other measures such as attitudes and affect,
we begin to arrive at a computationally tractable description of the complex landscape of social relationships and
their changes over time.
Related Work
Over the past decade and a half, there has been a veritable
explosion in social science research that capitalizes on the
fact that more and more of our lives and activities are either conducted online or recorded via audio, visual and/or
textual means. While many techniques exist to derive emotion from audio (e.g. voice prosody) and video (e.g. facial
expression recognition), we believe a strictly semantic approach is needed for those data sources that contain only
text (e.g. written journals, blogs, emails, instant messaging,
and other text correspondence). ADASTRA focuses on
three primary techniques to derive insight from text:
• LIWC—Linguistic Inquiry and Word Count (Tausczik
and Pennebaker 2010) is perhaps the best known of a family of software approaches that simply count the frequency
(and sometimes the position) of words that a researcher
targets. There are a variety of emerging empirical and theoretical approaches linking specific words and word classes
to psychological, behavioral, and biological phenomena.
• LSA—Latent Semantic Analysis (Landauer et al 1998)
goes beyond LIWC by examining not just the occurrence
of specific words, but the occurrence and relative position
of words and their semantic equivalents. Records can then
be characterized in terms of the concepts they contain and
the proximity of those concepts—for example, the frequency of “jihad” and similar/related terms to positive
emotional terms might serve to characterize the news articles in a nation’s press.
• Etiquette EngineTM—Our own work (e.g., (Miller et al
2010a; 2010b)) focused specifically on the appropriate use
of politeness terms (broadly defined) as a measure for interpersonal relationships. We have explored the use of variants of both LSA and LIWC, but have been guided in their
use by the role that politeness plays in managing and signaling relationships as expressed in the Brown and Levinson theory. This has the unique strength of guiding cross-
Theoretical Background
Much of our work has centered on a computational model
of the role and interpretation of politeness in human interactions, based on the work of (Brown and Levinson 1987).
Brown and Levinson proposed that the function of politeness is to redress the face threat inherent in social interaction. Goffman hypothesized that individuals are motivated
by positive face (i.e. the desire to be seen as a valuable
member of the group) or negative face (i.e. the desire to be
autonomous) (Goffman 1967). In Brown and Levinson’s
model, social interaction invariably threatens either or both
of these aspects of face, and measures must be taken to
maintain the social status quo. Brown and Levinson believe that politeness usage is one of these measures. The
degree of face threat that needs to be addressed is a function of the power and social distance (roughly, familiarity)
of the interlocutors, plus the degree of imposition of the interaction. We make use of polite “redressive strategies” to
offset face threat. If less redressive value is used by the
speaker than the listener deems necessary, the interaction
will be perceived as rude; if more, then “overly polite”—
but the value of both the threat and the redress is based on
the perceptions of the individual, which are personally and
culturally informed. This explains how the same utterance
can be polite or rude depending on context—and how one
individual can intend an act as nominal while another sees
it as rude.
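
The comparison between expected and exhibited redress can be sketched in a few lines. This is a minimal illustration, not the Etiquette Engine itself: it assumes a simple additive combination of power, social distance, and imposition, and the numeric scales, the `Interaction` type, and the `tolerance` band are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One speech act as perceived by a single observer (hypothetical scales)."""
    power: float       # hearer's power over the speaker
    distance: float    # social distance (roughly, unfamiliarity)
    imposition: float  # degree of imposition of the act
    redress: float     # redressive value the speaker actually used


def face_threat(x: Interaction) -> float:
    """Face threat as an additive function of power, distance, and imposition."""
    return x.power + x.distance + x.imposition


def perceived_politeness(x: Interaction, tolerance: float = 0.5) -> str:
    """Compare redress used against the redress this observer expects."""
    imbalance = x.redress - face_threat(x)
    if imbalance < -tolerance:
        return "rude"
    if imbalance > tolerance:
        return "overly polite"
    return "nominal"


# The same utterance (redress 2.0) addressed to a peer and to a commander.
to_peer = Interaction(power=0.0, distance=1.0, imposition=1.0, redress=2.0)
to_commander = Interaction(power=2.0, distance=1.0, imposition=1.0, redress=2.0)
print(perceived_politeness(to_peer), perceived_politeness(to_commander))
```

Because every input is a perception, two observers can feed different values into the same function and disagree about whether the act was rude.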
A final aspect of the Brown and Levinson model important
to our work is that politeness perception is a cognitive process. Our perceptions of social relationships and the impo-
cultural and cross-language interpretations of politeness
usages, as we demonstrated for ISS crew in video records
in (Rubino et al, 2010).
Next, we describe how these techniques are used in the
ADASTRA tool.
The ADASTRA Toolset
The goal of ADASTRA is to utilize naturally occurring
textual data (e.g. journal entries, conversation transcripts,
written communications) to derive individual and team
psychological dimensions relevant to spaceflight. We believe that using observed behaviors as a data source can
yield more insight, and perhaps more accurate insight, than
self-reports and surveys. Figure 1 depicts the major components of the ADASTRA system. In addition to the textual discourse, we added components that analyze free text,
such as journal/blog entries, using extensions of the LIWC
and LSA techniques. We applied these methods to naturalistic data observed from spaceflight analog facilities.
days and carry out science research tasks as well as spaceflight simulations. We collected journal and survey data, as
well as transcribed speech and text chat from a total of 16
crew members.
In summary, we collected a corpus of written and transcribed data from 45 subjects across three analogs.
Next, we summarize our analytic methods for the different
types of data collected from the analogs.
We performed studies at three different ground-based facilities where various aspects of spaceflight were simulated.
They include:
Analysis Methods
Bedrest—This facility’s primary focus is to study the effects of microgravity on human physiology. Participants
undergo 14 days of intake protocols, and are then confined
to bed rest for 70 days, followed by 14 days of recuperation and post-treatment protocols. They maintain a 6-degree head-down angle for all activities during the 70-day
bed rest period. Participants are monitored by a human
24/7 to ensure compliance. We collected journal and sur-
In our data exploration and through the process of adapting
our methods to three different analogs, analog subjects, and
data types, we expanded on known analysis and discovered
previously unknown but significant analysis methods that
address NASA’s concerns about team and individual psychological health. Below, we provide an overview of a
subset of methods used.
Power Difference Network—Based on the Etiquette Engine
and Brown and Levinson’s theory, this algorithm uses
conversational transcripts to produce a snapshot of the
power hierarchy among actors, similar to an organizational
chart. For validation, these results were compared with a
crew’s organizational structure. Individual crew members
were assigned the roles of commander, engineers, or mission specialists, with the commander as the leader of the
group. Crew members also completed surveys regarding
their perceived power difference throughout the mission.
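
One way to picture how a power hierarchy could fall out of transcript data is sketched below. It is an illustrative simplification rather than the ADASTRA algorithm: it assumes each utterance has already been given a numeric redress score (a hypothetical input), reads redress directed at someone as deference toward them, and ignores imposition and social distance.

```python
from collections import defaultdict


def power_ranking(utterances):
    """Rank actors from most to least inferred power.

    `utterances` is an iterable of (speaker, hearer, redress) tuples, where
    `redress` is a hypothetical numeric politeness score for one utterance.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for speaker, hearer, redress in utterances:
        totals[(speaker, hearer)] += redress
        counts[(speaker, hearer)] += 1

    # Net deference = mean redress received minus mean redress given.
    score = defaultdict(float)
    for (speaker, hearer), total in totals.items():
        mean_redress = total / counts[(speaker, hearer)]
        score[hearer] += mean_redress
        score[speaker] -= mean_redress

    return sorted(score, key=lambda actor: score[actor], reverse=True)


# Invented example data for a three-person crew.
sample = [
    ("engineer", "commander", 3.0), ("commander", "engineer", 1.0),
    ("specialist", "commander", 2.5), ("commander", "specialist", 1.0),
    ("specialist", "engineer", 2.0), ("engineer", "specialist", 1.5),
]
print(power_ranking(sample))
```

A ranking like this could then be compared against the assigned crew roles, which is the kind of validation described above.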
Salience and Trend—This is a collection of keyword category counts. Trending topic frequency over time can provide useful indicators of emergent topics such as the unfolding of specific events or concerns (e.g. the emergence
of concerns over a leg injury in a subject). The method also
serves as a preliminary step to assist in down-selecting valence and other specialized techniques described below.
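
As a rough sketch of keyword category counting over time, the snippet below computes, per journal entry, the share of tokens that fall in each category. The category word lists, the tokenizer, and the sample entries are assumptions made for illustration, not the categories used in the study.

```python
import re
from collections import Counter, defaultdict

# Hypothetical keyword categories.
CATEGORIES = {
    "injury": {"leg", "pain", "sore", "injury"},
    "food": {"meal", "food", "dinner", "lunch"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_counts(text, categories=CATEGORIES):
    """Count how many tokens of `text` fall in each keyword category."""
    counts = Counter()
    for token in tokenize(text):
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return counts


def trend(entries, categories=CATEGORIES):
    """Map each category to a list of (day, share of tokens) pairs.

    `entries` is a list of (day, text) tuples.
    """
    series = defaultdict(list)
    for day, text in sorted(entries):
        total = len(tokenize(text)) or 1  # avoid division by zero
        counts = category_counts(text, categories)
        for name in categories:
            series[name].append((day, counts[name] / total))
    return dict(series)


entries = [
    (1, "Dinner was good"),
    (2, "My leg is sore and the leg pain is worse"),
]
print(trend(entries)["injury"])
```

Normalizing by entry length keeps a long entry from looking like a spike in every topic at once.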
Figure 1 Basic Architecture of ADASTRA
vey data from a total of 18 bedrest subjects.
HI-SEAS—The Hawaii Space Exploration Analog is a
long-duration Mars simulation located in the barren landscape of Mauna Loa, HI, at an elevation of approximately
8000ft. In each mission, six crew members are confined to
the isolated habitat for four months. They perform science
research projects inside the habitat and conduct two to
three Extravehicular Activities (EVAs) in the form of outdoor geological surveys while wearing prototype space
suits. We collected journal and survey data as well as crew
to mission support written communications from a total of
11 crew members.
HERA—The Human Exploration Research Analog is a
Mars mission simulation based in Houston at the Johnson
Space Center. It contains three primary modules where
four crew members per mission are confined for seven
or present-tense verbs respectively. This measure is useful
as it may be combined with others, such as sentiment
towards past, present or future, to arrive at a measure of
happiness versus meaningfulness (Baumeister et al 2013).
• A significant negative correlation between the use of
terms about physical state (both our own defined category
and words derived from Tausczik and Pennebaker’s work)
and survey ratings of physical state (rs=-.273, p<.001).
Thus, an increase in subjects’ use of terms related to
their physical state generally indicated that they reported feeling worse. We do not believe this finding generalizes to the claim that increased mention of any topic is automatically correlated with negativity towards that topic. However, it is consistent
with the psychological phenomenon that bad events tend to
“stand out” over good ones. Baumeister et al. (2001) explain that negative emotions and events, ranging from bad
social relationships to physical trauma, have a higher impact
on individuals and necessitate more reflection and processing across a number of domains, possibly because one
is motivated to avoid future negative events. Note that the
data source used is a journal where subjects are explicitly
asked to reflect on their day.
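
The correlations reported in these findings are Spearman rank correlations between a per-entry term proportion and a survey score. A minimal pure-Python sketch of that statistic follows; the function names are ours, the sample numbers are invented rather than taken from the study, and p-values are not computed.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Invented data: share of negative-emotion terms per entry vs. a negativity score.
term_share = [0.01, 0.02, 0.03, 0.05, 0.04]
negativity = [10, 12, 15, 20, 15]
print(round(spearman_rho(term_share, negativity), 3))
```

Working on ranks rather than raw values makes the measure robust to the skewed distributions that word proportions typically have.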
Valence - closed vocabulary Latent Semantic Analysis
(LSA)—This provides trends of general mood as well as
topic-specific sentiment (e.g. attitude regarding habitat,
food). Some topics, such as general emotion, are based on
work by Tausczik and Pennebaker (2010) and Pennebaker (2011), while others are manually generated based on
the language-use specific to the source data (e.g. names of
places, people, technical their semantic distance to the affective norms of English words (ANEW 2014). These results were compared with the Positive Affect Negative Affect Schedule (PANAS), as well as survey questions.
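
To make the idea of a per-entry valence score concrete, here is a deliberately simplified sketch. It swaps the LSA semantic-distance step for a direct lexicon lookup, and the valence numbers are invented for illustration (they are not ANEW values); only the 1 to 9 scale mirrors how such norms are usually reported.

```python
import re

# Hypothetical valence norms on a 1 (negative) to 9 (positive) scale.
# These numbers are made up for the example and are not taken from ANEW.
VALENCE = {"happy": 8.0, "good": 7.5, "tired": 3.0, "pain": 2.0, "lonely": 2.5}
NEUTRAL = 5.0


def entry_valence(text, norms=VALENCE):
    """Mean valence of the scored words in one entry; neutral if none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scored = [norms[token] for token in tokens if token in norms]
    return sum(scored) / len(scored) if scored else NEUTRAL


print(entry_valence("Felt happy and good today"))
print(entry_valence("Tired, lonely, in pain"))
```

A semantic-space approach would additionally give partial credit to words that are merely close to the scored vocabulary, which a plain lookup cannot do.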
Specialized LSA: past/present/future—This collects and
analyzes a subject’s use of words associated with the past,
present, and future to provide insight into an individual’s
temporal focus. These results were correlated with survey
questions that immediately followed journal writing.
Specialized LSA: self vs. others—This presents a subject’s
use of words associated with him/herself as opposed to
others, which provides insight into an individual’s focus
and introspection. These results were correlated with survey questions that immediately followed journal writing.
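
Both specialized measures reduce to the share of an entry's words that belong to a focus category. The sketch below shows that calculation under stated assumptions: the word lists are short hypothetical stand-ins, and simple token matching replaces whatever semantic expansion the full method applies.

```python
import re

# Hypothetical, deliberately tiny word lists for each focus category.
FOCUS_WORDS = {
    "past": {"was", "were", "had", "did", "yesterday"},
    "present": {"is", "am", "are", "now", "today"},
    "future": {"will", "tomorrow", "soon", "going"},
    "self": {"i", "me", "my", "myself"},
    "others": {"we", "they", "he", "she", "them", "us"},
}


def focus_proportions(text, lexicon=FOCUS_WORDS):
    """Share of tokens in `text` that fall in each focus category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty entries
    return {
        name: sum(token in words for token in tokens) / total
        for name, words in lexicon.items()
    }


shares = focus_proportions("I was tired yesterday but today I am fine")
print(shares)
```

Tracking these shares per entry gives a time series that can be set against the survey answers collected right after journaling.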
Results
While our approach is proving powerful at finding general
trends across subjects, we believe that its greatest contribution comes from its ability to track and identify cognitive,
attitudinal, and emotional trends within an individual over
time or relative to others. The degree to which our analyses
provide accurate data for individuals is difficult to validate
statistically, but we present several interesting ways to gain
individual insights into the emotions and attitudes of individual crew members that will aid in both individual and
team psychological support.
• First, there were marked individual differences in word
count per entry and in emotional content and ratings. While
it is statistically true to say that most subjects showed a
slow downward trend in PANAS positivity scores and in
use of positive emotional terms over time, this is not uniformly true. For example, one subject showed almost no
variation in his PANAS scores. Two others showed much
more dramatic declines, while yet another actually shows a
probable rise in PANAS positivity scores and positive
emotion term usage.
As of the time of this writing, we are in the process of
collecting, transcribing, and formatting data for analysis.
Below, we present results using a subset of bedrest journaling and survey data. In general, the data have supported
the validation of our methods; we discuss a select
set of findings.
• A significant positive correlation between the proportional use of negative emotional terms (Spearman’s rho
rs=.187, p<.001), anger terms (rs=.179, p<.001), and anxiety terms (rs=.160, p<.001) (as based on (Pennebaker 2011))
and PANAS negativity scores. This confirms that when
subjects rate themselves as having more negative mood,
their journal writing reflects this.
• A significant negative correlation between our
past/present survey question (indicating perceived past focus) and use of past verb forms (rs=-.314, p<.001) and a significant positive correlation between our survey question
and use of present (rs=.169, p<.001) verb forms. This confirms that when subjects see themselves as more past- or
present-focused, they use a higher proportion of past tense
References
Salas, E., Stagl, K. C., Burke, C. S., and Goodwin, G. 2007. Fostering team
effectiveness in organizations: toward an integrative framework.
Nebr. Symp. Motiv. Paper, 52:185–243.
Stuster, J. 2010. Behavioral Issues Associated with Long-Duration Space Expeditions: Review and Analysis of Astronaut
Journals. Experiment 01-E104 (Journals): Final Report. Accessed Nov 2010,
http://ston.jsc.nasa.gov/collections/TRS/_techrep/TM-2010216130.pdf
Brown, P. & Levinson, S. 1987. Politeness: Some Universals in
Language Usage. Cambridge Univ. Press: Cambridge, UK.
Goffman, E. 1967. Interaction Ritual. Chicago: Aldine.
Tausczik, Y. & Pennebaker, J. 2010. The Psychological Meaning
of Words: LIWC and Computerized Text Analysis Methods.
Journal of Language and Social Psychology, Vol. 29(1), 24-54.
Landauer, T., Foltz, P. & Laham, D. 1998. Introduction to Latent
Semantic Analysis. Discourse Processes 25, 259–284.
Miller C., Ott T., Wu, P. and Vakili, V. 2010a. In Blanchard, E.
and Dalhousie, D. (Eds.) Handbook of Research in Culturally
Aware Information Technology: Perspectives and Models. IGI;
Hershey, PA., pp. 387-411.
Miller, C., Schmer-Galunder, S. & Rye, J. 2010b. Politeness in
Social Networks: Using Verbal Behaviors to Assess Socially-Accorded Regard. In IEEE Second International Conference on
Social Computing, Aug 20-22, Minneapolis, MN, pp. 540-545.
David, E., Rubino, C., Keeton, K., Miller, C. and Patterson, H.
2010. An Examination of Cross-cultural Interactions aboard the
International Space Station. NASA Technical Report prepared by
WYLE Scientific, Sept. 24.
Pennebaker, J. 2011. The Secret Life of Pronouns. Bloomsbury Press:
New York.
ANEW dictionary. Accessed Oct 2014:
http://personal.stevens.edu/~rchen/readings/anew.pdf
Baumeister, Roy F., Vohs, Kathleen D., Aaker, Jennifer L., and
Garbinsky, Emily N. (2013). Some key differences between a
happy life and a meaningful life. Journal of Positive Psychology
DOI: 10.1080/17439760.2013.830764
Baumeister Roy F., Bratslavsky Ellen, Vohs Kathleen D.,
Finkenauer Catrin, 2001. Bad is Stronger Than Good. Review of
General Psychology Vol 5. No. 4 323-370. Accessed Oct 2014:
http://assets.csom.umn.edu/assets/71516.pdf
Figure 2 A depiction of general positivity and
negativity of journal entries for one individual using LSA analysis
• LSA Valence analyses can be calculated for individual
subjects over time and can be used to provide an indication
of emotional state (as inferred from the journal entries).
Figure 2 provides a daily computed valence score for a
subject, along with two representative journal entries for
high and low valence points. LSA sentiment analyses give
us a finer-grained sense of what individual subjects are
feeling good and bad about by comparing the overall valence scores for their entries to the topics that correlate
with those scores. This analysis can be extended to be topic-specific, so we can derive the LSA valence for food, exercise, habitat environment, etc. over time.
Conclusion and Discussion
ADASTRA can act as a monitoring tool to “listen in” and
alert when something has gone, or seemingly might go, awry. It
can be highly customized to individuals and can provide an
objective third person perspective for corroborating with
other subjective opinions.
Sudden and unanticipated shifts in power, social dynamics,
and individual general emotions and sentiment can indicate
potential problems while helping to identify positive events
that are otherwise not obvious. A nonintrusive method of
detecting them can help make the connections between
precipitating events and changes in moods and attitudes. We
believe that self-monitoring might be an ideal application
for such a tool to help an individual increase self-awareness. It can also inform an author about how a message might be perceived by others before the message is
sent. ADASTRA can enable the objective self-monitoring
of psycho-social dimensions, physical wellbeing, and perceived workload, all of which in turn help individuals improve their own abilities to recognize reasons and catalysts
for changes in individual moods and team climate.
Acknowledgments
The above work was sponsored in part by the U.S. Office
of Naval Research under contract # N00014-09-C-0264
and NASA’s Human Research Program under contract
#NNX12AB40G. We would like to thank our ONR program managers Dr. Martin Kruger and Ms. Maya Rubeiz
for the opportunity to participate in the 2013 Empire Challenge military exercise at Ft. Huachuca, AZ. We would also like to thank our NASA sponsors Lauren Leveton,
Laura Bollweg, Brandon Vessey, Holly Patterson, the BHP
element, and the subjects and staff at the various spaceflight
analog facilities for their oversight, direction, and
support.