>> Eric Horvitz: It's an honor to have hosted Jamie Pennebaker
in this second round as being a visiting researcher at Microsoft
Research. He came last year for a brief period of time and this
year he's been here for a little bit longer. So he's had a
little bit more time to dig in on some interesting projects
along his lines of interest. Jamie is the Regents Centennial
Professor of Psychology at UT Austin. I should say University
of Texas at Austin, in case people don't know what UT is. I
guess there are also some Toronto people in the audience here at
Microsoft Research.
He's a social psychologist. He studies how natural language
reflects people's social and psychological states. He explores
the links between words that we would think just fly by, removed
from our studies, function words, stop words; their importance
in language and how they can be studied to give us insights
about personality, intelligence, thinking styles, social
relationships, status, power, deception and other emotion-related
behaviors. And he's a highly cited social scientist.
And his most recent book is "The Secret Life of Pronouns: What
Our Words Say About Us." And I have to say, since meeting
Jamie, hearing his talks and reading his work, you might notice
that my emails changed a bit in terms of power relationships
implied. I'm just teasing.
But go, Jamie.
>> James Pennebaker: Okay, thank you.
[applause.]
>> James Pennebaker: Thank you. By the way, there are a bunch
of seats up here. So oh, yeah, I was going to do this. My
first slide -- oh, that's right, I'm not using any slides.
What! I know, this is going to be a shock. What I want it to
be is more of a discussion and to discuss today about the nature
of real world language. It's important to appreciate that I'm
coming at the nature of language from a completely different
perspective than a computer scientist. And very often if you've
ever talked with a psychologist, a research psychologist, you
may have had this experience that the two of you talk past each
other and each of you are thinking: What in the world is this
person talking about? Don't they understand anything? And let
me assure you the psychologist is thinking the same thing.
[laughter.]
>> James Pennebaker: What I would like to do is give you a
sense of my perspective as a social psychologist. And having
spent now a lot of time with computer scientists I have a good
appreciation of your perspective. And one of the central
differences in our perspectives is you have been trained to
think about trying to think about the real world in terms of
what accounts for the most variance. You're great at
classifying. Your machine learning is one of the most exciting
tools around.
However, what machine learning does, it doesn't really tell you
why a person is using language or doing what they're doing. It
is not a device aimed at understanding why people do what they
do. I come from the perspective of trying to figure out what
makes people tick. And I don't care about how much variance I'm
accounting for. In fact, the real advance, I think, is this
melding of these two professions, both trying to understand what
makes a person the way they are and what's driving them, but
also coming at it from: How can we optimize this?
And I think that there should be more of an interaction between
computer scientists and social psychologists because there's
just a natural connection. Also I say social psychologists
because that's what I am, but there are other types of
psychologists as well, cognitive psychologists and others.
So, a little bit of background. A social psychologist is
somebody who studies interactions, group behaviors and other
things like that. When I began my career, I was interested in
the mind-body issues. Why do people get sick when they do? How
do they perceive their bodies and so forth?
I've done a number of studies on this and I was putting together
a book. It occurred to me: You know, I ought to just give
people a general questionnaire to find out who reports physical
symptoms. And this was a really different type of project for
me. I got together three students of mine, and I said: I want
us to come up with a questionnaire. You can ask -- the idea is,
ask people whatever you want. What do you think would be
interesting to know about people. So we started generating
questions. We asked questions about what they ate, what their
relationships with other people, how they got along with their
parents.
One of the students in my group said: Oh, how about this? Did
you ever have a traumatic sexual experience prior to the age of
17? And I thought: Sure, that sounds good. So we stuck that
on.
[chuckling.]
>> James Pennebaker: Then we passed this out to about 800
students and we found that about 15 percent reported having had
a traumatic sexual experience prior to the age of 17. Now, here
is what was interesting. This was a 12-page questionnaire.
This is back when you actually handed out pieces of paper.
That one question was related to every health problem we had on
the questionnaire.
At about this time I was contacted by a magazine that was very
popular at the time, Psychology Today. They were doing a study,
an article on physical symptoms and they called me because
that's what I was doing research on. They were going to do a
survey in the magazine that people would mail back.
So you can see, this is in the old days. This was not a random
sample. But they got 24,000 people responding to this
questionnaire. One of the questions I had on the questionnaire
and that one was the sexual trauma question.
That we found 22 percent of women, 11 percent of men reporting
having had a traumatic sexual experience prior to the age of 17.
And those who endorsed it were two to three times more likely to
have been hospitalized for any cause the previous year. They
are more likely to have been diagnosed with high blood pressure,
cancer, colds, flus, everything that we asked.
And I started to become fascinated by this. Why is it that
people who report this question have so many health problems?
So we started doing more surveys. What we found was, it wasn't
a sexual trauma per se. It was having any kind of major
upheaval in your life that you kept secret. That's one thing
about a traumatic sexual experience. Unlike most other traumas,
almost everybody who had had one kept it secret. And this
started to make me wonder: Does holding a secret, a major
secret, is that a major -- is it possible that that is kind of
the toxic agent?
In fact, we did some surveys finding that if you found two
people, one who had had the same kind of trauma who both said it
was equally traumatic and one that had confided in others and
the others hadn't, that person who had not confided it was far
more likely to get sick.
Then this led me to the next part of my career where we had
people come in the lab and had them divulge secrets. We had
them write about emotional upheavals. So we would do these
studies. We bring people in. They are randomly assigned to
write about one of two conditions. One condition, we asked them
to write about the most traumatic upsetting experience of their
lives; and the other half were asked to write about superficial
topics. We had them write for four days, 15 minutes each. It
was a powerful experiment. These students wrote about horrific
stories and what we discovered was those people who were asked
to write about traumatic experiences in the months afterwards
went to the student health center at about half the rate as
people in our control conditions.
And this study, that first study was published in the mid-'80s.
Then we started to do some other studies, and another lab
started to do other studies. By now, there's probably between
three and four hundred studies that have been published using
this method.
It's called the expressive writing method. It has been shown to
be linked to improvements in physical health, mental health.
People do better in college afterwards. They do -- it's shown
to have a profound effect on people's emotional state, their
relationships with others, and so forth.
But the question that started to bug me in the early '90s was:
Why does writing make such a difference? And there had been a
number of studies trying to test various models. They all kind
of came out. And one thing that occurred to me was, wouldn't it
be interesting just to go in and to analyze what they had
actually written? Can you tell who is going to benefit from
writing by just looking at the way they put their story
together?
Now, at the time the first thing I did was to rely on judges.
Nowadays I would have used some kind of crowdsourcing, but this
crowdsourcing was basically some graduate students in clinical
psychology, a few clinicians. We had them read these stories.
Some of our projects, we would get the people whose health
improved. We called that group X. We have another group of
people who wrote and whose health did not improve. We
called that group Y. I would say give them a group of essays
from group X and group Y and ask what's the difference? Well,
the thing about this is, if you ask people to read traumatic
stories, what it does is it depresses them. So here we
discovered a technique that first of all was not very reliable.
That is, judges didn't agree very well. It took forever. It
was expensive. And it depressed the people who were doing it.
[chuckling.]
>> James Pennebaker: This is not an efficient way to build a
career. So it occurred to me, I had taken FORTRAN in college.
I've always been interested in at least low level programming.
It occurred to me, maybe we need a computer program that can
analyze language. Initially I started calling friends around
the country asking them: Who has a program like this? Because
it seems obvious that there should be such a program. And I
couldn't find anybody. So working with one of my graduate
students who actually had been a computer science major as an
undergraduate we built a program called LIWC, Linguistic Inquiry
and Word Count, a remarkably stupid program by your standards.
I hope you are horrified, how simple this program was. The idea
was I wanted to come up with a method by which to go into any
given text and calculate the percentage of words that were
positive emotion words, negative emotion words, anger words,
sadness words, et cetera, but also cognitive words, words that
told us that people were thinking in certain kinds of ways.
Also words that would get at specific topics. And there were
certain theories that were hot at the time. I wanted to be sure
to capture those.
Then we would get groups of judges to generate words. We get
words from thesauruses and elsewhere. We come up with a list of
words and each word was judged by multiple people. It was a
human-based system.
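[Editor's illustration, not part of the talk: the dictionary-based counting described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the actual LIWC program. The category names and the tiny word lists are hypothetical stand-ins, since the talk does not give the real dictionaries.]

```python
import re
from collections import Counter

# Hypothetical mini-dictionaries for illustration only; the real LIWC
# word lists were built by human judges and are not given in the talk.
CATEGORIES = {
    "positive_emotion": {"joy", "love", "happy", "happiness"},
    "negative_emotion": {"sad", "angry", "hate", "hurt"},
    "cognitive": {"because", "cause", "effect", "understand", "realize", "know"},
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def category_percentages(text, categories=CATEGORIES):
    """Return, for each category, the percentage of all words in the
    text that appear in that category's word list."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        return {name: 0.0 for name in categories}
    counts = Counter()
    for word in words:
        for name, vocab in categories.items():
            if word in vocab:
                counts[name] += 1
    return {name: 100.0 * counts[name] / total for name in categories}
```

[For the four-word text "joy love sad because", this sketch reports 50 percent positive emotion words, 25 percent negative emotion words and 25 percent cognitive words. Note that it counts "happy" in "I'm not happy" as a positive emotion word, which matches the point about negations made a little later in the talk.]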
What we discovered -- oh, and so long as we were doing this, we
also threw in other words including various common things like
pronouns, prepositions, articles and so forth.
And what we discovered early on was there were certain classes
of words that seemed to predict health improvement. One was we
found that, for example, use of positive emotion words. The
more positive emotion words a person used in their writing, the
more their health improved. So if they had had a traumatic
experience and they could still use words like joy, love,
happiness, they were much better off than if they didn't. And
ironically, even if they said "I'm not happy, this experience
gives me no joy," saying you're not happy is a predictor of
getting better as opposed to staying sad. In other words, if a
person is not happy, they are still thinking along this
dimension of happiness. Negations, by the way, are very
interesting, but we are not going to get into that today.
Life's too short.
Another dimension we found on the first run is that cognitive
words -- and cognitive words included words like "because, cause
and effect," causal words; insight words: "Understand, realize,
know," et cetera. That people who increased in their use of
these cognitive words over time benefited, whereas people who
did not increase over time did not. And the effect sizes for
the cognitive words was much, much bigger than the emotion
words. And using some other analyses a little bit later, this
is when I briefly got into the LSA, the Latent Semantic Analysis
world.
What we also discovered was that certain classes of pronouns
predicted improvement. And basically, what we found was that
people who flipped back and forth between using first person
singular pronouns: "I, me," and "my;" and other pronouns like
"he, she, they, you," even "we," and going back and forth
between those in no particular order was associated with
improved health. Whereas people who wrote using the pronouns in
the same way across those four days did not improve. And the
effect sizes were quite big. And these were averaging across
multiple studies.
>> Audience: Did you ever discover why?
>> James Pennebaker: I'm going to tell you. The reason is --
great question!
[laughter.]
>> James Pennebaker: How about this. What we think it was was
changing in perspective. It's a marker of perspective switch.
It's very similar to what a therapist would do. A therapist
would do this. If you went in, if I went into a therapist and I
said, "You know, I'm having problems with my relationship with
my wife. She does this, she does this, she does this, she does
that." The therapist would say, "Stop. How about you? How do
you feel?"
If I went in and I say to him, "I'm having trouble with my wife
and I feel this, I feel that, I feel that," the therapist would
say, "Stop. Tell me about your wife. What's your wife's
perspective?"
What this is doing, people who naturally can do this are the
people who seem to benefit more quickly.
Now, about this time I was also, it was a very interesting time
because the mid-'90s were a time -- mid to late '90s were a time
when the Internet came of age. All of a sudden I had access to
data that I've never had before in digital form. I would start
downloading. Every night I would come home and just download
data.
One of my favorite places was AOL. Again, for some of you this
has meaning, for some of you. And for others, wow, that sounds
like history. I read about that somewhere.
[laughter.]
>> James Pennebaker: There were these AOL chat groups that were
fabulous. In any case, I would sit and analyze the data using
my computer program. I became interested in sex differences,
because we all have an intuitive sense of how men and women
talk. And I would download this and I would analyze it. And it
came out backwards from everything I thought. And I did what
all good scientists do. I ignored it because it can't be true.
It was a fluke.
[chuckling.]
>> James Pennebaker: In the end, a few weeks later I would do
another download. I would analyze it. It came out backwards.
Flukes. These things happen.
I did this several times before I started to think: You know,
maybe I should just line up all of my studies and see how men
and women are behaving. And what I discovered were some really
striking differences.
Now, before I tell you those differences, another side issue. I
was on my way to give a talk somewhere and I had just bought a
book by George Miller called "The Science of Words."
"Science of Words" was interesting because he made a distinction
in it that I'd never heard of. By the way, I have no
linguistics training. I don't understand linguistics. And I
tried, by God I did try. But it just was not a field that I
could ever wrap my head around. But Miller was interesting
because he started to break words down in ways that made sense.
And one was this profound difference between what he called
content words and function words. Content words are essentially
nouns, regular verbs, and adjectives and most adverbs. These
are content. When you're having a substantive conversation with
somebody, these are the words that your brain is processing.
The average person in this room has a vocabulary of between
50,000 and 100,000 words, of which -- 50 and 100,000 words, all
of which are content words. Except for a small group of other
words called function words. Function words are pronouns, "I,
me, you, he, she, it;" prepositions, "to, of, for." Articles,
"a, an, the." Conjunctions, "and, or, but." Negations, "no,
not, never." And selected adverbs that don't have any direct
reference like "so, very, really," words such as that.
What's interesting about all of these is that these function
words, there's only 500 in English. And no matter what other
languages you speak, you have function words as well and they
account for a minor, very small part of vocabulary. And what's
interesting is even though there's very few of these function
words, they account for over half of the words that we use.
What I've said so far, 55 percent of the words I've said so far
-- I have been counting --
[laughter.]
>> James Pennebaker: -- have been function words. Now, in the
next few minutes you're going to start listening to what I'm
saying and you're going to start paying attention to the
function words and you're going to give up very quickly because
your brain can't process them. They come in, first of all, in
terms of speed, they come in at priming speed. Secondly, if you
start paying attention to these function words, you can't hear
what I'm saying. This is what's so interesting. These words
are there for everybody. Why are they there? Well, they are
shortcuts and they're also words that shape what I'm saying in
some very interesting ways that I'll tell you in just a second.
And the other issue is function words and content words are
processed in the brain very differently. So, for example, there
are two areas that we've long known to be related to language,
and they are particularly interesting in looking at brain
damage. One, the two areas are for almost everybody in this
room in the left side of your brain, even if you're left-handed.
And the one that's about right here in the temporal lobe is
called Wernicke's area and the one that's right up here is in
the frontal lobe and is called Broca's area.
If there's damage to the frontal lobe, and Broca's area is
damaged, the way the person speaks is really slow and
hesitatingly. So, hesitating. If I asked a person with Broca's
damage to describe this up here, the person might say: Hmm ...
podium, wire, computer screen ...
And it would be a pretty boring conversation. And also with a
great deal of hesitancy. And the person also would be clearly
lacking in social skills, which is interesting in terms of,
because the frontal lobe is associated with that.
>> Audience: Do you know if the order in which objects are
named in someone with damage to Broca's area is in an order that
would reflect some sort of correct syntactic construction, such
as inanimate things?
>> James Pennebaker: That's a great question. I don't know. I
have no idea. Wow, excellent question.
Damage to Wernicke's area is interesting because historically
what is -- it's generally described as a word salad. If you
listen to the salad, here's what it would be. So if I asked a
person with Wernicke's damage to describe this, they would say
much like this: Oh, yeah, sure, this is right here and next to
that is this and you can see that, and right here is this ...
All of the words are function words. Now, what's interesting,
function words are social. They require social skills to use.
So, for example, let's say there's a piece of paper in the room
and I come in and I'm the first one in the room. I pick up the
piece of paper and it says "I'm not here. I'll be right back."
Does that note make sense? Well, yeah, kind of. But on another
level it makes absolutely no sense because it's on the floor.
It might have just blown in from somewhere else. And it doesn't
make sense because "I," who is I? "Will"? "Will" implies
future tense. When was it written? "Be back." "Back," does
that mean back here? Where was the note actually from?
In other words, these words are words that only have meaning
between the speaker and listener at one particular point in
time, in a particular location. And even if we found the author
of that note a year from now and we gave the note to the author
and asked: What does this mean? The author might say: I have
no idea.
[chuckling.]
>> James Pennebaker: Now, here's what's interesting. As you
start looking at these function words, they start to tell us who
people are. It was interesting, it was only in these function
words that we were finding differences between males and
females. Males and females generally, from downloading things,
we are by and large talking about similar topics. Not
completely, but similar. But their use of these function words
was profoundly different. So, for example, I'm not even going
to ask you this because it will just embarrass you. And if it
makes you feel any better, I've embarrassed people at Stanford,
at Harvard, at Cambridge, at junior colleges. The people who
have been best have been the people in old age homes which is
actually I find both interesting and a little disturbing.
[chuckling.]
>> James Pennebaker: Although it's good for me.
[laughter.]
>> James Pennebaker: But the point is, so here would be the
test I would give you. Who uses the word "I" more, men or
women? The answer is women, overwhelmingly, across almost all
contexts. Who uses "we" words more, men or women? Of course,
you all know it's women, but of course it is not because I
wouldn't be asking it. There is no difference.
Articles, "a, an, the," you would guess women, but you have no
idea. In fact, it's men overwhelmingly. Who uses emotion words
more? You would probably say women. You would be wrong.
There's no difference.
Cognitive words, words like "because, cause, effect," words such
as that. You would think men. Again you would be wrong. It's
women. And who uses social words more, "he, she, they, we," et
cetera? You would guess women, and thank God, you got one
right.
[laughter.]
>> James Pennebaker: And the mistakes you would make are
exactly the ones I made when I started this as well.
And then starting to go in and analyze this more, it's important
to know that function words tell us where people are paying
attention. So pronouns tell us where people are paying
attention. Now, most people would think that the reason people
think men use "I" words more is because they think men are
self-centered, they are arrogant, et cetera. Well, they might be
self-centered and arrogant, but that doesn't mean that's where
they are paying attention. "I" means you are looking inward;
it's a self focus. Women are more self-focused than men. And
as we've done studies and others have now as well. Anything
that causes you to pay attention to self, to your body, boosts
the rate of your using the word "I." For example, people who are
physically sick use "I" much more than people who are not
physically sick. People who are depressed use "I" words more
than people who are not depressed.
And "we" words are interesting. There's no difference there
because there's two types of "we" words. One "we" word type is
you and me together. Holding hands, it's beautiful. And women
actually do use that kind of "we" more. The other kind of "we"
is the "we" which means, it's the kind of "we" I use with my
graduate students. Well, guys, we need to analyze this data
differently.
[laughter.]
>> James Pennebaker: And that does not mean we are going to
hold hands analyzing the data together. It's essentially a
shortcut for "you."
[laughter.]
>> James Pennebaker: And males use that kind of "we" much more.
And if you cannot determine who the referent is, it's probably a
male. It's also interesting that when you can't figure out who
the referent is in "we," the person is being deceptive. What
you can do is actually look at politicians. Politicians use
"we" all the time. And the more the politician uses "we" the
more you should be nervous.
[chuckling.]
>> James Pennebaker: And let's see. Cognitive words are always
fun. Cognitive words are interesting. Well, let's do cognitive
and articles. Males use articles at much higher rates. "A,
an," and "the." And the reason is, you only use articles in
front of, or almost always in terms of concrete objects or
things. So if you are talking about carburetors or a certain
computer chip, you are going to use lots of articles. However,
if you are talking about more social things, you don't use
articles. And in fact, that's the big difference. Males
generally talk more about objects and things and women tend to
talk more about people. That's why females tend to use social
words more.
What's interesting is, when you're talking about people, people
are far more complex than objects and things. We like to think
rocket science is the most complex thing. That's ridiculous.
Human beings are far more complex than a rocket. You can get
the best scientists in the world and predict how this group of
ten randomly selected people are going to behave in the next
room over the next three weeks? You can't do it. Whereas you
can do that with really good scientists predicting a rocket.
The point is that when males, when people are talking about
relationships, they have to use much more cognitive words. Why
is it that Joe is interested in Sally, when everybody knows that
she is a terrible person for him? And you're starting to use
"believe, think, understand, cause and effect." Whereas when
you're talking about carburetors: Carburetor broken.
[laughter.]
>> James Pennebaker: So the point is that men and women are
using language different. Now, I want to make it clear: I
don't really care about differences between men and women. I'm
using this more as an example of how we can start to think about
these words as reflecting thinking.
Now, my real interest over the last several years has been to
look at the nature of language and especially function words
across three domains. The first is essentially personality
dimensions. The second is situational dimensions and the third
is looking at it in terms of social processes. I'll talk about
each of these briefly.
The first is trying to understand the differences, or how we can
get a sense of people's personalities by the way they use words.
We started this actually, it wasn't a personality project but it
was actually the first study that I did that was trying to get
at the sense of who people are. And this was looking at who
commits suicide. And we were looking at poets. Poets commit
suicide at incredibly high rates. It's probably the most
dangerous profession on earth if you're a published poet.
So what we did was we got a small group of suicide, poets who
committed suicide and a control group of poets that we matched
in terms of when they were born, age, sex, education, and so
forth. The study was only about 16 people, as I recall, so
eight in each condition. And we analyzed their poetry. I was
honestly, because this is before I had really gotten into any of
this research. I was thinking: This will be easy because the
people who are going to commit suicide are going to talk more
about death, they are going to be more depressed and so forth.
The only difference we found was the use of the word "I;" that
suicidal poets used the word "I" at much higher rates than
non-suicidal poets. They didn't differ in terms of negative emotion
words. They didn't differ in terms of death-related words
either.
So it was quite interesting that this one dimension, this one
feature, the use of "I" was so different. It's important to
appreciate, "I" words are fairly high base rate words. In
natural conversation, and probably even in poetry, "I" is the
most common word used by people, which is probably around 4
percent of all the words we use.
I then got involved working with some clinicians and we did a
series of studies looking at people who were clinically
depressed who previously had been depressed and people who were
not depressed, looking at how they wrote essays on coming to
college. Again, the people who were depressed used "I" words
the most. People who were not depressed used "I" words the
least. And people who had been depressed but were not now were
somewhere in between.
We've also looked at this as -- I have been called several times
to do an analysis of a person in terms of did they really commit
suicide? And the most interesting case was the analysis of an
Australian explorer, Henry Hellyer who died mysteriously in the
1840s. The big question was: Did he commit suicide or was he
murdered? And a lot of his writings were available.
What was interesting was by looking at his writings what we
found was in the years and months as he got closer to death, his
use of "I" spiked just prior to his death, which suggests very
strongly that it would probably be suicide.
Now, we've also looked at this in terms of other individual
differences. Let's see, what are some ... some others. One
thing that we have been doing a lot of work on recently is
thinking style. Can you get a sense of how people think? We
recently -- so recently we did this project where we analyzed
the admissions essays for people who have been accepted into the
University of Texas. This is, this was 25,000 students. Each
person had to write two essays. And what we did was to go
through and these were people a few years back so we can track
their grade point average over four years.
What we did, we went through and we looked at these eight
dimensions of function words. And what we were able to do is to
come up with -- the psychometrics of function words are
gorgeous. I spent a lot of time in the world of psychometrics.
And articles and prepositions are positively correlated with
each other.
>> Audience: What are psychometrics?
>> James Pennebaker: I'm sorry?
>> Audience: What are psychometrics?
>> James Pennebaker: Psychometrics means when you're coming up
with a scale, you want to find out is this scale internally
consistent? In other words, does each part of this scale
measure the construct that you're trying to get at?
So an SAT test, for example. You would hope that all 40
questions, let's say we're looking at the math part. All 40
questions are at least somewhat related to the other questions.
And that each question is related to the sum of all the other
questions, because you're trying to get at this coherent
construct of intelligence. And the same thing is true for
personality, and so forth. So what we find is that articles and
prepositions are both positively correlated with each other.
They go in the same directions. The more you use articles, the
more you use prepositions.
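[Editor's illustration, not part of the talk: the idea that "each question is related to the sum of all the other questions" can be sketched as an item-rest correlation. This is a minimal sketch, assuming the scores are stored as one row per person and one column per item. The function names are hypothetical and this is not presented as the analysis the speaker actually ran.]

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers.
    Assumes neither list is constant (otherwise the denominator is zero)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def item_rest_correlations(scores):
    """For each item (column), correlate it with the sum of all the
    other items. `scores` is a list of rows, one row per person."""
    n_items = len(scores[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson(item, rest))
    return result
```

[If every item tracks the same underlying construct, all of these correlations come out positive. An item that measures something else, or that needs reverse scoring, shows up with a correlation near zero or below it.]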
These other six dimensions that we're focusing on, personal
pronouns, impersonal pronouns, adverbs, auxiliary verbs, et
cetera, those are all internally correlated with each other as
well. And those six are negatively correlated with articles and
prepositions. Meaning we can turn these, you know, we can
reverse score all of these. And all of them are correlated with
each other. They are all getting the same general concept of
what I would call the thinking style.
And we're calling this thinking style a categorical dynamic
index. And this categorical dynamic index means the higher you
score on it, that is the more you use articles and prepositions,
the less you use these other things, the more you are
categorizing the world. The more you are looking at the world
in a logical, formal, hierarchical way.
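[Editor's illustration, not part of the talk: one simple way to turn the description above into a number is to add the rates of articles and prepositions and subtract the rates of the other function-word categories. This is a minimal sketch, not the published scoring formula. The category keys are hypothetical, and the talk names only some of the six negatively scored categories, so including "negation" in that list is an assumption.]

```python
# Categories that raise the score, as described in the talk.
CATEGORICAL = ("article", "preposition")

# Categories that lower the score. The talk names personal pronouns,
# impersonal pronouns, adverbs, auxiliary verbs and conjunctions;
# "negation" is an assumption added to make up the sixth.
DYNAMIC = (
    "personal_pronoun",
    "impersonal_pronoun",
    "adverb",
    "auxiliary_verb",
    "conjunction",
    "negation",
)

def cdi_sketch(rates):
    """Return a rough categorical-dynamic score from a dict that maps
    category name to the percentage of words in that category.
    Missing categories are treated as 0. Higher means more categorical."""
    categorical = sum(rates.get(name, 0.0) for name in CATEGORICAL)
    dynamic = sum(rates.get(name, 0.0) for name in DYNAMIC)
    return categorical - dynamic
```

[With this sketch, an essay at 8 percent articles, 14 percent prepositions, 6 percent personal pronouns and 7 percent auxiliary verbs scores 9, while a story-like essay heavy on pronouns and auxiliary verbs scores well below zero.]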
At the other end, people who are using lots of pronouns,
auxiliary verbs, conjunctions and so forth, these people are
looking at the world in much more of a narrative way. They are
telling stories. It's really quite interesting. So you can go
in and pull out the top scorers and you read these essays. And
the essays are not frankly very interesting.
[chuckling.]
>> James Pennebaker: I want to come to -- there are three
reasons I want to attend the University of Texas. The first is
their computer science department is very, holds great promise
in my interests, blah-blah-blah.
And then you pull out the bottom ones of this dimension and it
will say: The reason I have been interested in computer science
was because when I was five years old my family moved from
Oklahoma to Tennessee and on the way, this happened ... in other
words, they're telling a story and you're thinking: Well, what
a great story! Then you look and see how these people do in
college. And the correlation between the CDI score and four-year
GPA is about a point two correlation. And a point two
correlation predicting four years later is pretty remarkable.
So the higher, the more formal, the more you're using these
categorical words, the better you do in college. Telling
stories, you're screwed.
[laughter.]
>> James Pennebaker: And this is true, by the way, if you're in
fine arts, if you're in physics, if you're in -- it makes no
difference what field you are. The higher you are in
categorical thinking the better you do. And one reason is all
the tests, whether you're in fine arts, English, any area of
science, whatever. They are asking you to analyze, to think
logically. They are not asking you: For this test, could you
tell me a story about protons?
[laughter.]
>> James Pennebaker: So this is what American and probably
world education is, is to think in logical, formal ways. Now,
what's interesting is CDI is also highly stable. In other
words, it's almost a personality dimension.
And recently we have been using this as a marker of using it for
author identification. And with one of my graduate students
we've just published an article on the discovery of a, or it has
been discovered a long time but it deals with a book that has
been probably thought to have been by Shakespeare. Scholars
have been arguing about this particular play for the last 300
years. So Shakespeare died in the early 1600s. One hundred
years later a guy by the name of Lewis Theobald in London claimed
he had discovered three manuscripts written by Shakespeare and
that he now had put them together and come up with, had a new
play called "Double Falsehood."
In fact, the story matched one that they knew existed that had
been presented in the early 1600s. So ever since the early
1700s, scholars have been fighting whether or not this was a
real play by Shakespeare or not. So we simply went in and
analyzed all of Shakespeare's work, all of Theobald's work, and
there's another guy, John Fletcher, who had cowritten a couple
of plays with Shakespeare.
And using CDI and also -- we used both machine learning and some
other methods, actually used multiple methods. All of them
showed this play really does have the fingerprint of
Shakespeare. And there's a little bit of fingerprint of John
Fletcher in the last two acts, which actually is very similar to
two of the other plays that we knew that Fletcher and
Shakespeare collaborated on.
The point is, even using this kind of metric, this categorical
thinking metric is interesting because it is telling us
differences between people.
Now, personality is one issue. It's also interesting to look at
how people's language changes as a function of events in their
lives. Can we tell what is going on in your life by the way
you're using language? In fact, we can get a sense. We know,
for example, that the older you get, there are certain changes
in the way you use words. So, for example, as you get older,
and again this goes opposite of what I think most people here
would have predicted. Certainly it's different from what I
predicted. The older you get, the more people use positive
emotion words and the less they use negative emotion words. The
older you get, the more you use future tense and the less you
use past tense.
Yes?
>> Audience: Is it the fact that those people survive? Or --
>> James Pennebaker: No, that's a good question. Is it just
the survival? I think not. You know, our samples are
sufficiently large and diverse. Oh, and I can tell you why the
answer is no, because we also analyzed text from authors,
including Shakespeare, who wrote over a large part of their
lives. They showed the same pattern. So as they got older they
used more positive emotion words, fewer negative emotion words,
fewer "I" words, more "we" words. More cognitively complex
words as you get older.
Which kind of makes sense. When you're young, you know, you're
in your teens, your 20s, your emotional life is going up and
down like crazy. As you get older, you know, it goes like this.
Hey, it's all great!
[laughter.]
>> James Pennebaker: Kids are gone, yeah, this is -- life's
simple.
>> Audience: Can you see people getting stuck and not moving in
that direction?
>> James Pennebaker: Of course, I'm sure there are -- you know,
the nature of our statistics, we don't look at that, but of
course there are. So all of this, please understand I'm
speaking in generalizations. So in everything I've told you,
most of the people are going to, the majority of people are
probably going the way that I'm telling you, but there are
always some that are not. Maybe the ones who aren't changing,
they are going to die soon.
[laughter.]
>> James Pennebaker: I don't know.
We've also been interested in looking at specific events. For
example, we looked at 9/11. We had about a thousand bloggers
who were heavy users of livejournal.com. And we had these
people, to be in our study they had to post on average at least
every other day. Now, what we were able to track is what
happened to this sample on 9/11. And many of the things you
would expect. So immediately afterwards there's a drop in
positive emotion words and increase in negative emotion words.
What's interesting is how long do emotions last? What we found
was once 9/11 hit -- by the way, we had data for four months,
two months before 9/11, two months afterwards. So we have a
really stable baseline. What happened is, immediately after
9/11 you get a big increase in negative emotions and it takes
about 11 days before they get back to baseline.
Positive emotion words were very different. Positive emotion
words, there's a big drop, and it comes up very quickly. And by
day four, they are at baseline. And day five, they are above
baseline. And they stay above baseline for the next two months.
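[Editor's sketch, not part of the talk: one simple way to measure
how long a daily word rate takes to return to its pre-event
baseline. The baseline here is taken as the mean of the pre-event
days, which is an assumption, and the daily values are invented
for illustration; they are not the study's data.]

```python
from statistics import mean


def days_to_baseline(daily_rates, event_index, spike_up=True):
    """Days after the event until the rate is back at the baseline.

    The baseline is the mean of all days before event_index.
    Returns None if the series never gets back within the data.
    """
    baseline = mean(daily_rates[:event_index])
    for offset, rate in enumerate(daily_rates[event_index:]):
        recovered = rate <= baseline if spike_up else rate >= baseline
        if recovered:
            return offset
    return None


# Invented daily negative-emotion rates (percent of words).
# The first four days are the pre-event baseline period.
negative = [2.0, 2.1, 1.9, 2.0, 3.5, 3.0, 2.6, 2.2, 1.9]
print(days_to_baseline(negative, event_index=4))

# Invented positive-emotion rates: a drop, then a quick recovery.
positive = [4.0, 4.1, 3.9, 4.0, 3.0, 3.6, 4.2, 4.3]
print(days_to_baseline(positive, event_index=4, spike_up=False))
```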
An interesting irony. 9/11, and I think this is true for many
major events, is associated with long-term positive emotions.
Why would that be? Because it brings the culture together.
It's a social phenomenon. So one thing we looked at was the use
of the word "we." What you find at 9/11 is, there's a huge drop
in "I." Remember, I told you "I" was associated with
depression. People are not depressed. There's a difference
between depression and being sad. There is a drop in "I" and a
huge increase in "we." This increase in "we", it went up and
then within a couple weeks it came back down, but it didn't come
down to baseline. In fact, over the next two months it remained
above baseline.
So upheavals are interesting because they do foster social
interactions with others. We've also been interested in how you
can use language to track threat. So, for example, we looked
at, we have been interested in can you predict if a leader is
going to go to war? There have been lots of leaders that have
attacked others. One of the interesting ones was George Bush
and going to war with Iraq. Now, what we did was to analyze all
his press conferences, just his language in press conferences.
And Bush, unlike presidents before and after him, spoke to the
press a huge amount, which from a data perspective is fabulous.
And what you find with him is his use of "I" words were pretty
constant up until August of 2002. All of a sudden his use of
"I" words dropped from about 4 percent to 2 percent. That's, by
the way, that's a huge, huge drop. It stayed at that level
until after we went to war about eight months later in I guess
it was February or March of 2003. Then it gradually came back.
Now, I became curious. By the way, I didn't tell you about
deception. When people lie, they tend to stop using "I." They
start distancing themselves. By the way, we went and looked at
other leaders who have threatened and gone to war. We found
this same general pattern. There's a drop in "I" and I think
part of it is that if a leader is thinking seriously of
attacking, they are holding their cards. They don't want to
convey what they're doing.
Now, we've done this same thing in looking at the tweets of the
Boston bomber, the one who has just been convicted. What we
found with him was he was using Twitter; he used Twitter at a
very good rate. And his rates of using "I" is about, I think it
was around 7 or 8 percent. In October, about mid-October
his use of "I" words in all his tweets dropped to a very low
level, and they stayed that way all the way until the actual
bombing. I would submit that that is probably when he made a
decision to join his brother in the bombing.
The point is this: It's interesting to start using these kinds
of signals to get a sense of people's psychological state.
One other, let me make another kind of side note here. In the
computer science world, there is a huge interest in sentiment
analysis. And sentiment analysis is generally thought of in terms
of emotions. The idea that you can tell a person or a group's
emotional state by looking at the use of emotion words. This
makes sense on one level in that it seems like that would, which
is just common sense. But it is not, it doesn't make sense from
a psychologist's perspective because language did not evolve to
express emotions through words. Because emotions are expressed
often through actions, tone of voice, and so forth. Emotions
are also expressed through these function words. So, for
example, if I'm angry at you, I may not say the word "angry" at
all. I can read your email and I can tell you're pissed even
though you don't say the word "angry." It's leaking out through
these other dimensions. In fact, we do a pretty good job at
identifying emotional state. In fact, we do just about as well
and sometimes better identifying the emotional state by looking
at function words as opposed to just emotion words.
The two types of information are really important. So sentiment
analysis really is more than just, or should be a lot more than
just emotion words.
Yeah?
>> Audience: So you mentioned when people are lying, they start
to use "I" a lot less. But you also mentioned way back in the
beginning that this keeping a traumatic experience secret was a
marked indicator of future health problems, bad outcomes.
>> James Pennebaker: Of health problems.
>> Audience: And people who use "I" more often tend to have
those bad anecdotes. So those two --
>> James Pennebaker: Well, we're juggling different issues
here. When a person is actively holding back something, holding
a big secret and not telling another person, they will use "I"
less. In other words, this is what's so intriguing is there's a
lot of work on the links between self-deception and other
deception. One model right now is that they are really kind of
the same things.
So if I've had a major upheaval and I don't want you to know
about it, our interactions are going to get stilted because I'm
not going to say much about me. I'm also going to be thinking
about that event quite a bit and I'm not going to be paying much
attention to you.
And this fits in with, there's now been dozens of studies
looking at the nature of deception in language. And across
multiple studies, the general pattern -- and I should tell you,
there's some interesting weirdness about deception because
there's so many different types of deception. Deception writing
a Yelp review is different from deception in terms of
interpersonal -- me lying to you or vice versa.
In the moral world me lying to you and also the criminal
investigation world, deception markers are low use of "I."
Another one is lack of complexity in language. One of the
dimensions of complexity are what we call exclusive words.
Words like "except, without, but." It turns out that when
you're lying, it's almost impossible to use those words because
these exclusive words are making a distinction between what is
in a category and what is not in the category. This is in it,
but not that. And if I ask you: What were you doing last
Saturday night? And you did something that was illegal and I'm
the policeman asking you, you're going to say: Oh, hmm, well, I
did this and I did this and then my friend, he did this and I
... and then we went here and we did that.
If I'm telling the truth I'll say: Well, I went to the store.
I was going to get this, but then I realized I forgot this, but
then I did this, but then I didn't do that.
In other words, I'm saying what I did do and didn't do. And if
you're lying, it just becomes way too complex to say what you
didn't do because you didn't do any of it.
[chuckling.]
>> Audience: [indiscernible]
>> James Pennebaker: Exactly. Now, this brings us to this last
issue about, these are some of the situational factors. One of
the issues that's also interesting is looking at social factors.
For example, can we look at social relationships through the
nature of language and function words? And one of the things I
have been intrigued with is the nature of status. Who's got
higher versus lower status? And one thing that we've discovered
is the relative use of "I" as a really powerful marker. In any
interaction, the person who's the higher status uses the word
"I" less. The opposite of what you think. And the person who
is lower status uses the word "I" more. And I started, we first
did this project with emails. Then we did it with experiments
where we would manipulate status. Other ones where we just had
people come and talk. Then we analyzed letters that we had
access to in terms of people in the military, high versus low
rank and so forth.
And you can take this to the bank. I was looking at our very
first project with email and I was thinking: That's
interesting. I bet it's not true for me, though, because I love
everybody, you know? Everybody is equal.
So I looked at my data and it was just like everybody else's.
And what happened is, an undergraduate would write me: Dear Dr.
Pennebaker, I'm so-and-so and I was wondering if I could meet
you because I would like to talk about so and so.
And I would write back: Dear student, thank you so much for
your email. How about next Tuesday? That seems like that would
work.
Then I look at my email to the dean: Dear Dean, I'm Jamie
Pennebaker and I want to know if I can do this and I can do
that.
[laughter.]
>> James Pennebaker: And then the Dean writes back: Dear
Jamie, thank you so much for your email.
[chuckling.]
>> James Pennebaker: What's interesting about this is that
nobody is putting anybody down. This is not some kind of nasty
stuff. This is the language of power and status in our culture.
Now, here is an interesting issue. I've talked about this
finding to groups everywhere. And Americans often come up to me
afterwards and say: Wow, I didn't realize I was using language
this way. I'm going to stop using "I" in my emails. I always
think: Well, that's kind of interesting. Then I spoke to a
group and there was a guy who was from China and he came up and
he said: Wow, that was a great talk. I now realize I need to
start using "I" words more in my emails. And I was thinking:
This guy gets it! Because if someone, if I'm the high status
person and somebody else is low status person and they send me
an email and they are trying to say "I've got a lot of power!"
That kind of puts me off. If they come to me and they are
basically being more genuine, that works because that's what a
natural interaction is.
So it's very interesting how we as a culture interpret these
findings. You wouldn't believe how many emails I get where
people will not use "I" once. And then we'll have a PS at the
bottom: It took me 30 minutes to write this email because, you
know, I wanted to not use "I."
Also, by the way, if you feel self-conscious talking to me,
forget it. I can't hear it. Send it to me, send me what you
want to say by text and I'll put it in my computer program.
[laughter.]
>> James Pennebaker: But I can't hear it. Yeah?
>> Audience: Is there a difference between spoken communication
and written communication?
>> James Pennebaker: It's very similar. There are, of course,
some differences, but in terms of the general patterns,
everything holds up pretty much the same.
>> Audience: Okay.
>> James Pennebaker: Yeah?
>> Audience: You mentioned different culture. I'm wondering if
the [indiscernible] is not only with the interpretation
[indiscernible]
>> James Pennebaker: Okay. So we've done cultural studies and
we're finding the same general patterns in terms of status,
depression, kind of the major themes we've looked at. Sex
differences. They hold up in every culture we've looked at so
far. And we've looked at it in a lot of different ways. So our
computer program, LIWC, we've got it in about 15 different
languages. And the languages that are most disparate from
English that we've looked at have been Chinese; Hungarian, which
is disparate from every damn language that there is.
[chuckling.]
>> James Pennebaker: And we've done all the European languages,
Russian, but the patterns hold up markedly similarly.
>> Audience: I expect that article [indiscernible] would be
different. Some languages don't have that [indiscernible].
>> James Pennebaker: That's right.
>> Audience: -- [indiscernible] is similar.
>> James Pennebaker: That's right. However, so several
languages still have articles, but every language can
distinguish between "the cup" from "a cup." But different
languages bounce around that. So in Chinese, if you say "cup,"
it means a cup. So there's, you don't need "a." But if you
meant that cup, there's some word for that or this particular
cup. So they all exist. But it's true, every culture has some
interesting subtle differences in function words that I think
says a lot about the culture itself.
Let me go through two more things really quickly. I've gone on
already much longer than I usually do. In terms of these social
dynamics, a second issue beyond status is trying to get a sense
of can you get a sense of how two people are connecting with one
another? This is a [choretonic] kind of issue as well. The
question is, how do you know if two people are clicking? They
are connecting? Well, one thing we've come up with is a metric
that we call language style matching or LSM. We essentially
calculate a score in terms of how similar the two people are
using pronouns, prepositions, articles, conjunctions, and so
forth. You come up with a very simple metric. And what you
find is that the more the two are matching, the more they're on
the same page.
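[Editor's sketch, not part of the talk: a minimal illustration of
a style matching score over the four categories named above. The
per-category formula, one minus the absolute difference divided
by the sum plus a small constant, follows the form commonly given
in published work on language style matching; the talk itself
does not state it. The word lists are tiny hypothetical
stand-ins.]

```python
import re

# Hypothetical toy word lists; real dictionaries are far larger.
FUNCTION_CATEGORIES = {
    "pronoun": {"i", "me", "my", "we", "you", "he", "she", "it", "they"},
    "preposition": {"of", "in", "to", "on", "from", "with", "for", "at"},
    "article": {"a", "an", "the"},
    "conjunction": {"and", "but", "or", "because", "so"},
}


def category_rates(text):
    """Percent of words in each function-word category."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {
        name: 100.0 * sum(word in vocab for word in words) / total
        for name, vocab in FUNCTION_CATEGORIES.items()
    }


def style_matching(text_a, text_b):
    """Average per-category similarity; 1.0 means identical rates."""
    rates_a, rates_b = category_rates(text_a), category_rates(text_b)
    scores = [
        1 - abs(rates_a[c] - rates_b[c]) / (rates_a[c] + rates_b[c] + 0.0001)
        for c in FUNCTION_CATEGORIES
    ]
    return sum(scores) / len(scores)


a = "I went to the store and I got a cake for you"
b = "I ran to the park and I saw a dog with you"
c = "Stop. Go home now."
print(style_matching(a, b), style_matching(a, c))
```

[The two speakers who use function words at the same rates score
near one; the pair with nothing in common scores near zero.]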
So, for example, we did an analysis of speed dating. And speed
dating is glorious --
[chuckling.]
>> James Pennebaker: -- from a researcher's perspective. Where
we've analyzed the language of the two people. Now, what we
find is that we can predict who will go out on a subsequent date
at rates higher than the people themselves. And you're
thinking: Well, that's not possible. Well, it is possible
because there's a lot of people who one person says "Yeah, this
would be great" and the other person is saying "Are you
kidding?" They would have low style matching. It is only when
the two people both are on the same page.
We've also looked at a separate study with 86 dating couples
among college freshmen. Freshmen dating couples are great to
study because they are really unstable.
[laughter.]
>> James Pennebaker: You know, basically you need variance.
And we gave these -- to be in our study they had to do instant
messaging, to IM quite a bit. And they had to agree to give us
ten days, or nine days of their IMs. What we found was the
greater their style matching, the more likely they were to be
together three months later. In fact, those who were above the
mean, above the average, 80 percent were still dating three
months later. Those that were below the mean, only 50 percent
were still dating. Now, so that's the style matching work.
We've also been using this in terms of small groups. So I teach
a giant introductory online class course of about, around 1500
students. And we break the students every day into small
groups, or every other day. And every time they are in a
different group. One thing we can do is we can assess the
degree to which we can do language style matching across them
and find out the degree to which they are on the same page.
What we are trying now to figure out is can we manipulate the
group? Because we are monitoring the group constantly and we
can now give the group feedback in terms of: Are you all on the
same page? You're not paying attention to one another. And
other ways of manipulating this. Yes?
>> Audience: When it comes to the groups that are working in a
cohort together or a couple that might stay together, what does
it say about not just the matching but the nature of ...
>> James Pennebaker: Well, that's our problem. Our groups are
not -- we don't allow them to stay together. Each time is very
different.
>> Audience: How about the couples? In other words, if I were
examining "I" words, function words, it is not just style
matching. I can imagine -- [indiscernible] on both sides, or in
terms of [indiscernible] on both sides.
>> James Pennebaker: With the couples, it doesn't really
matter. If both of them are not using "I" words, that's fine.
They are both probably geeks talking about this program that
they are working on. But that's --
[chuckling.]
>> James Pennebaker: But that's beautiful. Then there are
other people, one who is really self-obsessed and the other is
self-obsessed, that's beautiful.
>> Audience: But no signal, no signal given about the nature of
it?
>> James Pennebaker: No, no.
>> Audience: [speaker away from microphone.]
>> James Pennebaker: And by the way, these are kind of first
analyses. So that's the, that's group data. There was one
other issue. If I had an overhead, I wouldn't have to ... oh,
yeah, yeah, yeah. This is starting to look at communities. So
looking at say blogging communities. What can we learn about
people in terms of their social dynamics there? And this
actually gets at this a little bit. One of my former students,
Cindy Chung, did this dissertation that was just lovely. What
she did was to look at diet.com. She was interested in diet
blogs. One thing that is cool about diet.com is that people
post their weights whenever they post. So what you can do is
track them over time to see who loses weight. And she's
interested in can you tell the language of weight losers versus
those who don't lose weight?
And in terms of the actual language, the more personal their
writing is, the more likely they are to lose weight. But there
was one thing that predicted weight loss far better than
anything and that was the comments. Now, I would have
predicted, most of my colleagues in social psychology would have
predicted that the more comments you receive, the more likely
you are to lose weight. Turns out it has virtually no effect.
Instead, it's the more comments you make on other people's
blogs, the more you lose weight. It is kind of the difference
between giving and receiving. That giving aid and support seems
to be much healthier than just receiving it. And there are all
sorts of interesting theories that might explain this. One of
them is it's, it suggests a greater commitment to the community,
a greater perhaps commitment to weight loss as well.
Now, what I've done is given you a huge bunch of results in a
very, very short time. And the time that I've spent here at MSR
has been, honestly, fabulous. Because I've spoken with a number
of you and the kind of work that's being done here is
breathtaking. From a social psychologist's perspective, I think you
can start to see why I'm so excited talking with people because
you guys are, first of all, have access to data that is
unbelievable. And the questions you are addressing are
fascinating. And many, perhaps most are at their very core
deeply social psychological questions. And this is why I think
there's just a natural affinity between our fields. I would
urge you to go outside the computer science community. God
knows I love the computer science community, but also go out and
have lunch with a social psychologist.
So I'm going to stop here and open it up to questions.
[applause.]
>> James Pennebaker: Thank you. Yeah?
>> Audience: Yes. With the freshmen dating study, I was
reminded actually about an experience I just had this morning
with a woman I ran into who said she should ping my wife and she
doesn't have a computer science background. And I'm sure she
picked up "ping" from her husband who is a computer scientist
and this is CS terminology.
>> James Pennebaker: Right.
>> Audience: Those freshmen, did you have any way to factor out
the fact that some of them may have spent enough time together
that they were starting to talk like each other?
>> James Pennebaker: Funny you should ask. One of the things
I'm interested in, this is what I called my linguistic drone
project, which is -- that's right, you all have drones! We can
work together.
Can you identify a person by the way they use function words?
In terms of who they are hanging around with and where they are
in the country? So what we were able to do with this project,
because most of these people went to high schools in the State
of Texas. And so what we were able to do is, I picked eight
high schools that were fairly similar in terms of social class
and analyzed any given group, and to see if I could, how well I
could pick a particular person or group of people here, and
would they be classified correctly. And I did better than
chance. Not great, but ... which tells me that the people
within a high school are probably being taught in similar ways.
And they probably talk together and adopt a similar kind of
language.
The signal is not very strong, but there is something there.
Yeah?
>> Audience: Along the trends you're talking about, the trends
were very subtle and very surprising to us. Do you have any
sense of whether writers, fiction writers are good at creating a
voice that isn't their own? Are they doing that mostly through
content words or do they have an intuitive way of understanding
and way to leverage beyond what other folks can do?
>> James Pennebaker: It's funny. I gave a talk at the
University of Toronto and somebody asked that same question. I
remember answering because my wife is an author. I basically
said: Oh, well, writers, you know, clearly can get into the
heads of ... then that night I went back and on my computer I
actually had some data. Turns out I was wrong. Most writers
are really good at conveying characters through content, but
they are not as good through function words. And we've done
analyses, there's some authors who are actually very good at
conveying the tone of the opposite sex. But in general most
writers make their opposite sex characters speak like their own
gender. Also we've done some interesting work on TED Talks that
have been translated and we find that the language of the TED
Talk reflects the translator more than it does the actual
speaker. Yes?
>> Audience: So a lot of people clearly kind of pick up on this
power of changing language. Like if I use "I" less will I
convey more status? And that kind of makes sense from a social
standpoint, where my language is communicating something to you.
Does it work on an internal part? If I use "I" less, am I less
depressed?
>> James Pennebaker: Yeah, yeah, yeah. No, we've done several
studies on this because it's an important question. Does
language drive an emotional state? We've done these studies and
it's pretty easy to change people's language, by the way. So
we'll have questionnaires and we'll have open-ended questions
saying, "Tell us about your first year in college. Here's what
some other people say."
And all the examples are "we" or in the other condition all in
"I." Then they write "What do you think?" And people will
write the way the other people do.
It has no effect on anything psychological that we've measured.
So if you are depressed, changing your language won't make a
difference. However, I think language can serve, it's more like
a speedometer. So it's reflecting and it can serve as a cue
about how you're doing.
So, for example, I've always thought of this kind of experiment,
I've never done it. I know if I sent you into a room with a
small group and I say "Okay, when you're in there, don't use the
word 'I.' Use words like 'we' and so forth." I don't think you
will be more or less likely to become a leader.
However, if I said beforehand, "When you go in there, I want you
to become the leader. Really work at trying to do this."
If I told you to do that, you would do much better than chance
at becoming a leader. And your language would change in line
with what a leader's is. So it is really the psychological
state is driving the language, I think.
>> Audience: I'm curious how you got people to share secrets.
And how, did you promise they wouldn't be -- if you share a
secret that it then would be analyzed? I'm curious about that.
>> James Pennebaker: So here is the way these studies work.
You know, there's not many times in our lives where you are
given the opportunity to really let go and explore your deepest
emotions and thoughts. So they come into the lab and they talk
to me or maybe a graduate student. And I say: Okay, in this
study we are asking people to write about their deepest thoughts
and feelings and the most upsetting experiences in their lives.
What I would like to have you do is write about this and really
let go and explore your deepest thoughts and feelings. Your
name will never be linked with what you're writing. We won't
share your writing with others except researchers. It will be
analyzed along with your questionnaires. But we really are
interested in what makes you tick.
And I would say that over the last -- actually, I haven't done
an expressive writing study in a few years, but I'm sure I ran
several thousand people back from the beginning. And I can
think of one person who did not participate. Virtually
everybody participates. It is kind of like being on an airplane
where all of a sudden you get in this weird, you're sitting next
to somebody and you're both talking about secrets.
[chuckling.]
>> Audience: Also sort of on the vein that obviously this is
incredibly intellectually interesting. How do we think about
bringing it into our own lives. My roommate is dating this girl
for four years. Do I tell him to go look at her emails and --
[overlapping speech.]
[chuckling.]
>> Audience: What do you think, we can use this in our lives in
interesting ways that will actually impact it?
>> James Pennebaker: I think the place to start using it in
interesting ways is to try it yourself. And I do not think
writing every day for the rest of your life is healthy. In
fact, I think it's unhealthy. I view writing as kind of a life
course correction. That every now and then -- and I do this
myself every few months, every year or so. Sometimes I'll just
sit down, and sit down and write for a day or two, three days
for 15 minutes a day or 20, whatever, exploring my deepest
thoughts and feelings in terms of what's going on in my life
right now. I think it's a really powerful way to put things in
order. Because we don't usually do that. You know, often we
might have someone we're close to and we can talk to them about
some topics, but there are some topics you can't talk about.
Uh-huh?
>> Audience: So you talked about parts of language which
generally is quite stable. But language in general sort of
evolves over time. There are certain elements of language that
you want "we" -- and there are big constructive elements like
the Internet, right? People, two people sort of behave
differently in the way they express things differently. So some
of the things that we explored in one of the projects that I was
involved in earlier is that when you think about proper nouns
and certain attributes, for example people who like, I don't
know, The Daily Show or Friends, right? In the early 2000s had
people who had [indiscernible] but now people with the same
personality would like certain other things. They would not
like this show because it has moved on, they have moved on in
some sense.
>> James Pennebaker: What you are describing is the distinction
between content and style. Content and function words. I don't
care what you like, your function words are going to be pretty
similar. If you look at the evolution of language, function
words are really similar to Shakespeare's time. If you look at
his content words, geez, that's why it's so hard to read him.
But the reality is, and this is true over the lifetime, and this
gets into some of the issues that I have been doing here at MSR,
which is: Can you identify personality through things like
Twitter or particularly I'm interested in search terms. What
you are suggesting is search terms are going to change over the
course of, you would think even two years. The data I have been
analyzing, man, that's not true at all. There are some features
that do change, but others are really remarkably similar. If we
looked at them over ten years or 20 years, yeah, there are going
to be some changes. But I would bet that the people who are
interested in celebrities today are going to be interested in
celebrities 20 years from now.
>> Audience: This is a related question, right? When we try to
look at how people look at things and sort of the words they use
on Facebook or Twitter, versus the words they use for
search, they are very different things because when it comes to
Twitter and Facebook there is also an element of portraying a
particular persona, right? You don't use the words that you
will sort of use when you are searching. And you can sort of
see that there are certain types of topics which never get searched.
>> James Pennebaker: Right, exactly. But the function words
are going to correlate at least 0.3 with each other.
The personality is the same, whether they are doing one or the
other.
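The cross-platform claim above can be sketched in code. This is a
minimal illustration, not the method used in the research
described: the FUNCTION_WORDS list is a tiny hypothetical stand-in
for a real function-word dictionary, and the function names are
assumptions made for the example. It computes each person's
function-word rate in a text sample; the rates from two sources
(say, posts and search queries) can then be correlated across
people.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption
# for illustration; real dictionaries are far larger).
FUNCTION_WORDS = {
    "i", "me", "my", "we", "you", "the", "a", "an",
    "of", "to", "in", "and", "but", "not",
}

def function_word_rate(text):
    """Fraction of tokens in `text` that are function words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in FUNCTION_WORDS)
    return hits / len(tokens)

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / (var_x * var_y) ** 0.5
```

With one list of per-person rates from each source, in the same
person order, pearson(rates_a, rates_b) gives the kind of
correlation being described.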
>> Audience: As an extension to the evolution question, I think
Twitter has a lot of misspellings, a lot of errors that actually
can stick and become the way people talk. And I'm wondering how
messy the input data is and can you learn anything from this
messiness?
>> James Pennebaker: This is a statistical question. If you
are looking at one million people, who cares? You know, we can
handle error. Error, in terms of finding truth, is a function
of sample size or number of data points. If you're looking at a
particular person or a small group of people, then these are
things that start to make a difference. But once we move to
scale, that's not something that I personally lose any sleep
over, because I've analyzed data that was really dirty and
really clean, the same data set, to see whether my conclusions
would be substantively different. And I've never found that to
be the case.
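The sample-size point can be illustrated with a small simulation.
This is a sketch under stated assumptions, not an analysis from
the talk: it assumes the "dirt" in the data is random, zero-mean,
and independent across people, and the true_rate and noise_sd
values are made up for the example. Under those assumptions the
standard error of the average shrinks with the square root of the
number of people.

```python
import random
import statistics

def estimate_rate(n_people, true_rate=0.05, noise_sd=0.03, seed=0):
    """Average of n noisy per-person measurements of a word-use rate.

    true_rate and noise_sd are illustrative assumptions.
    """
    rng = random.Random(seed)
    samples = [true_rate + rng.gauss(0, noise_sd) for _ in range(n_people)]
    return statistics.fmean(samples)

def standard_error(n_people, noise_sd=0.03):
    """Standard error of the mean for independent, zero-mean noise."""
    return noise_sd / n_people ** 0.5

if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, round(estimate_rate(n), 5), round(standard_error(n), 5))
```

Systematic error (for example, a misspelling that one group uses
consistently) does not average out this way, which is why the
independence assumption matters.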
Uh-huh?
>> Audience: You talked a bit about the role of writing, but
when people do this expressive writing is it important that it
be shared? That they think it is going to be shared?
>> James Pennebaker: No, no.
>> Audience: And are verbal confessions, like what you do with a
psychologist in treatment, in therapy, sort of the same role as
writing, or distinctly different?
>> James Pennebaker: I think they are very similar. The
difference is when you are sharing with somebody else, you are --
there is a high threat there. So even talking to a therapist,
there's some things that a lot of people won't talk about. It's
just too threatening. If you can completely trust this person
you're talking to, whether it's a close friend or a therapist or
whatever, then it probably can be as good as writing, as long as
that person is supportive of you and is accepting of you.
That's the beauty of writing. Writing is, you are -- the
audience is you. When we do our research, we make it clear that
they'll never get feedback; they'll never be linked with what
they are writing. We did studies, we had people write on a
magic pad. You know, those things that you had as a kid, you
write it and you lift it up and it disappears? And we get the
same effects writing on a magic pad as writing on paper that you
turn in. So I think the real active ingredient here is
translating the experience into language.
>> Audience: With politicians, you said you did an analysis of
them, but a lot of them have speech writers. So do their
personalities still come across? Or is it the speech writer's
personality coming in?
>> James Pennebaker: It's a little bit of both. That's why I
prefer press conferences that are not scripted.
>> Audience: Uh-huh.
>> James Pennebaker: Or at least not as scripted. And you
know, it's also interesting. Most good speech writers try to
mimic the personality of the person.
I've got one very quick story. John Kerry ran for president
against Bush in 2004. If you remember him, he was a stick. He
was formal and he just couldn't connect with people. And I read
in the New York Times that his handlers were aware of this. So
what they were encouraging him to do was to use more "we" words
in his speech. I read that and I thought: Oh, Jesus, this guy
is toast!
Now, it's interesting because use of "we" by a politician is a
marker of being emotionally distant. So here
his people are telling him: Okay, now let's try to be really
distant! And he could do it.
[laughter.]
>> James Pennebaker: But what was also interesting was that
language really maps a person. He was rigid and kind of
controlled, and his language was the same. It's kind of part of
the package deal. What we find is in terms of nonverbals and
verbals, they tend to go together. And had he spoken using "I"
words in that same way, would it have made a difference? Beats
me.
>> Audience: A question, Jamie, and we can [indiscernible] see
hands if we ask any more [indiscernible]
This is an extension of our conversation. It gets at the why.
And maybe, I know this is speculation because you have annexed a
really nice finding, finding how trauma in someone is cathartic
and useful in things including health streams in the future.
Why? What is your set of best guesses to speculate why? And
how do we study that to come to the answers?
>> James Pennebaker: Okay. So one of the big problems in
science is that we have all been trained to think that there is
a single answer that is best, a particular theory that is better
than others. Writing is interesting because it is a cascade of
phenomena that drive it. And this would be one thing that
because there have been enough studies now we know some of the
reasons it occurs.
First of all, you have had an upsetting experience, a traumatic
experience. Think about what's happening. You obsess about it.
You think about it. You're walking down the street. You're
thinking about it. You're talking to a friend and you're
sometimes thinking about it then as well. You don't have
working memory. You don't have free memory to devote to the
conversation. You don't have free memory to devote to reading
and studying the way you usually do. And, by the way, people
who have had a traumatic experience have more problems in their
relationships, they do more poorly in school, they have
difficulty sleeping, and on and on and on.
We know that after writing, people have greater working memory.
There have been several studies showing greater working memory
immediately afterwards. We also know after writing they sleep
better. Sleep is one of the most powerful markers of health and
it's also linked to immune function. There have been now a
number of studies showing that writing is associated with
enhanced immune function. And given our MD in the room, immune
function is a dirty concept because the immune system is so
complex. The fact is that sleep is good for you. We also know
that sleep, sleep disruption is associated with depression. So
here we now know greater working memory, people sleep better.
The other issue is that we also know that writing about
something helps to organize the experience by putting it into
words. Think about what happens when you have a traumatic
experience and you are not able to talk to somebody about it or
you can't write about it. What happens is you're
walking down the street and you're thinking: Oh, I should have
done this! Then you start thinking about this feature of it.
You go a little bit further and you think: Oh, God, I felt so
guilty about this. And then you start ... but you don't put it
all together. What writing does is it forces putting things
together in a coherent, meaningful way.
One other issue is that merely writing about it helps to
acknowledge that it occurred. A number of people who have a
traumatic experience work to not think about it, to pretend it
didn't happen. And just labeling it makes a difference. There
has been some nice work using fMRI data at UCLA showing that the
mere labeling of an experience is associated with a change.
What's so interesting about all of this is that it's also
associated with social changes. So, for example, we've done a
couple of studies where we give people beepers. Not beepers,
it's a device that I developed with one of my graduate students
called the EAR, the Electronically Activated Recorder. It used
to be an actual tape recorder. Then it became a digital
recorder. Now it's just a cell phone. It comes on for 30
seconds, goes off for 12 minutes. We ask people to wear it for
several days. In our writing studies we would have them wear
this for two days before they were assigned a condition. And
again a month after the writing.
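The sampling schedule just described (on for 30 seconds, off for
12 minutes) can be turned into rough arithmetic. This is a
back-of-the-envelope sketch: the on/off durations come from the
talk, while the number of hours the device is worn per day is an
assumption chosen by whoever runs the calculation.

```python
ON_SECONDS = 30          # recording window, as stated in the talk
OFF_SECONDS = 12 * 60    # silent gap, as stated in the talk

def ear_schedule(hours_worn):
    """Return (number of complete snippets, total minutes recorded).

    hours_worn is an assumption supplied by the caller.
    """
    cycle = ON_SECONDS + OFF_SECONDS
    snippets = int(hours_worn * 3600 // cycle)
    minutes_recorded = snippets * ON_SECONDS / 60
    return snippets, minutes_recorded

def duty_cycle():
    """Fraction of wear time that is actually recorded."""
    return ON_SECONDS / (ON_SECONDS + OFF_SECONDS)

if __name__ == "__main__":
    print(ear_schedule(24))   # full-day wear, for illustration
    print(duty_cycle())
```

On this schedule only about 4 percent of wear time is captured,
so a few days of wear yields a sparse sample of daily life rather
than a continuous record.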
What we found was that people who were in the experimental
condition who wrote about trauma spent more time talking to
others afterwards, laughing more. They were more socially
engaged. And here was the cool part, from my perspective. We
gave them a million questionnaires and we asked them how, tell
me about your life, how much you talk with others, and so forth.
We get no differences in questionnaires. People don't see that
their lives have changed, because self-reports are
self-theories. People are working on what they think their life
is. And we are now seeing objectively that there are these
changes. So this is why there's not a single answer: writing is
bringing about all of these changes, and each one on its own is
often hard to measure, hard to conceptualize, hard for even the
person themselves to see. But I think that's the complex answer.
>> Audience: Okay. We'll have time for you to talk to Jamie
right after. So thanks very much for your time.
[applause.]