>> Meredith Ringel Morris: All right. So thanks for coming. It's my pleasure to introduce
Winter Mason. Winter is a faculty member actually in the business school at Stevens Institute
of Technology in New Jersey where his research focuses on what we call computational social
science. Before he joined Stevens Institute he worked at Yahoo's research lab in New York City,
which many of us are very familiar with, for several years and his background before that is in
psychology and cognitive science. That's what he did his doctoral degree in. So we're excited
to have Winter come talk to us today about some of his exciting work on the emergence of
conventions in online social networks like Twitter. Thanks.
>> Winter Mason: Thank you. Thank you Meri, and thanks for having me. I love this work. I
love talking about it. I'm very excited about it and I hope that you all get as excited about it as I
am. So this is work with Farshad Kooti who was kind of the lead graduate student on this, you
know, did a lot of the heavy lifting. He's now at USC. Haeryun Yang, Meeyoung Cha are out at
KAIST and Krishna Gummadi’s at the Max Planck Institute in Saarbrücken. And so the topic of this
research is social conventions and so a social convention is essentially a behavior -- let me get
the exact quote. "Shared beliefs in the social group about how things should be done in a given
context." Right? So this is a nice classic example. There are two variations on the same social
convention which is greeting, right, so in some cultures the standard way is to, you know, offer
your hand. In others it's to bow, but there's a lot of other social conventions that you can
think of. You know, how much you tip a waiter, right, which side of the road you drive on, how
you order author names on a paper. All of these are just social conventions that have been
established. And sometimes the social conventions are actually top-down, so some
government institution says hey, it's a lot better for everybody if we all decide to drive on the
right side of the road than the left side of the road and so you have these kind of institutional
determinants, but in most conventions you're actually talking about something that comes
from the bottom up and just emerges from social interactions. An example that, and what we
are focusing on is linguistic conventions. So, you know, there's a bunch of different ways of
saying hello. You might say hello, howdy, good day, yo and these linguistic conventions even
more so than just sort of the general class of social conventions tend to be emerging from the
bottom up, right? It comes from the way the people talk to each other and people inventing
different variations on the same sort of mode of social interaction. But the process by which
this is, this happens that social conventions actually emerge is not very well understood. In the
past it's been studied within small laboratory conditions where, you know, maybe there's some
game set up and laboratory participants are playing some game in which some sort of
conventional behavior has to be established. There's some sort of theoretical or agent-based
models that sort of posit that there is some sort of like economic gain involved in terms of
there's a reward if you actually coordinate on using the same convention type of thing, but in
terms of real empirical work where you're studying convention emerging, there hasn't been a
lot done, mostly because the opportunity hasn't been there. You know, people, especially
when you're talking about linguistic conventions, it's just people talking and saying things. How
are you going to find out who's saying what to whom? Well conveniently, we have Twitter,
right, and so on Twitter there is, there's actually a convention that's just a general linguistic
convention that we use in our academic writing, which is how do you quote somebody and
attribute the quote to that person. Now, you know, the conventions within academic writing
for that are pretty well established, right? We’ve got the block quotes. We got the citation.
We've got the bibliography in the back et cetera, right? But on Twitter there was no such,
when it first got started, there was no such sort of rule of thumb about how do you actually
quote somebody and attribute the quote to the original source. And so these days, of course,
this is called re-tweeting and in fact it's been institutionalized in that Twitter has a re-tweet
button that you click and it automatically reposts that person, takes that person's post and
distributes it out to the people that are following you. When Twitter began though this did not
exist at all and if you wanted to copy another person's post and attribute it to them, you had to
literally cut and paste their post, stick it into the comment box and then write some additional
information to indicate where you had gotten that tweet. And so in this work what we're going
to focus on is sort of what are the origins of the different variations? How did people kind of
invent the different variations for doing this act of reposting and attributing to the original
source? Were these sort of kind of initial adopters sort of inventors different from the typical
user in any meaningful way, and how did certain conventions end up becoming widespread and
established in the Twitter community? So as I indicated there's, we chose this convention and
we chose Twitter for a particular reason which is that the data was there and provided a lot of
useful features. For instance, on Twitter these information sharing channels are explicit, right,
you have the following links. You know who's getting information from whom in the context of
Twitter, right? The re-tweeting convention, the reason why we chose that is because it is
actually specific to Twitter so people aren't going to be using this exact thing where they're
talking about, for instance, at usernames, which is a very Twitter-specific thing. They're not going
to be doing this in a context outside of Twitter, and so this means since we have a lot of data on
Twitter, this kind of historical data set of tweets, this means that basically all of the uses of this
are also going to be contained within Twitter. People aren't going to be using it outside of
Twitter because it doesn't make sense to use it outside of Twitter, and that means we actually
have information about how it was used. So the data set that we're using is this near complete
data set from March 2006, which is basically when Twitter began, to September 2009. This
encompasses 54 million users, 1.9 billion tweets and 1.7 billion follow links. The follow links are
a snapshot of the network, so the way that this data was actually obtained was my colleagues
at the Max Planck Institute in 2009, it was possible to query a single Twitter user by their
integer ID and get all of their tweets, all of their tweets basically, right? And so they just went
through all of the integers for all of the, up to the maximum integer ID that existed in Twitter at
the time and got all of the tweets and all of the follow links at that time. So the crawler I
think actually went, took about a month and so they got this data set and this is really excellent
because as I was saying, you know, we've got all of the uses of this convention, so we really can
at a very micro level see how these different variations are being invented, how they are being
spread and shared because we have the follower links and we feel pretty confident that nearly
all of the uses and nearly all of the instances of these different variations are actually available
in our data set because it's unlikely that people are going out to outside blogs. Now, one caveat
on this, towards the end of the data set there was a time when people started saying oh,
writing guides about how to use Twitter and these guides about Twitter included things about
what is the right way to re-tweet something, but that's much towards the end of our data set
and mostly what I'll be talking about today is the stuff that was at the beginning of the data set.
So the variations that we looked at, there are actually many more than I'm presenting and that
I'll be talking about today, but what we did in order to try to get a representative sample is we
chose four that were the most popular across our entire data set and then three others that
were sort of in the middle ground range that had been used a reasonable number of times but
sort of, around average number of times, number of uses, right? And so we, you know, how we
determined whether something was a re-tweet, it is possible that there are ways of doing this
that we did not capture in this study, right, because the way that we looked for it was instances
where somebody said token at username and then some text or some text token at username,
right? And then we took each of these and verified that indeed they kind of matched up with
some previous tweets with somebody that they were following so we could see that it was
actually, you know, being used for this function of reposting and attributing to the source. And
so that's how we got our collection of tokens to determine what was, what people were using
to indicate that they were re-tweeting. And also going forward when I talk about adoption, I'm
really talking about a very, very lightweight definition of adoption unless I explicitly say
otherwise, which is simply that you've used this variation at least once. This measure of
adoption really just means you have, you know that they are aware of it and that they've used
it at least, and that’s at least some form of endorsement. I think that, you know, we've done
some other work that I won't really talk about where we varied this threshold and by and large
these results hold true regardless of kind of what you count as how many uses is required to be
a real adoption. Okay. Origins, early adopters, majority acceptance. I'm going to talk about
origins now. So basically what was the first time that these were used? This is the timeline and
the very first one was via in March of ‘07. This is just under six months from when at username
was first introduced in Twitter. So Twitter first started the year prior and then about six months
later people started using at username to indicate that they were referencing something and
Twitter incorporated that into their system so that there was some automatic links between at
username and the actual user profiles and then about six months after that you have the first
use of somebody actually reposting and attributing to the original source. And the way that
they did that is with the word via. About seven months later we had HT which is borrowed
from the blogs. I'll talk about this independently. I'll just walk through these examples. But the
important point is that the last one that we look at is a year and a half after the first one that
appears. Yeah?
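The matching rule and the lightweight adoption measure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the token spellings, the use of U+267A for the recycle symbol, the regular expression, and the function names are all assumptions, and the verification step against earlier tweets from followed users is left out.

```python
import re
from datetime import date

# Hypothetical spellings for the seven variations mentioned in the talk.
# Keys are lower-cased tokens as they might appear in a tweet.
VARIATIONS = {
    "retweeting": "retweeting",
    "retweet": "retweet",
    "r/t": "R/T",
    "rt": "RT",
    "ht": "HT",
    "via": "via",
    "\u267a": "recycle",  # assumed code point for the recycle symbol
}

# Longest tokens first so "retweeting" is tried before "retweet" and "rt".
_alternatives = "|".join(
    re.escape(t) for t in sorted(VARIATIONS, key=len, reverse=True)
)
# "token @username": the token must not be the tail of a longer word.
PATTERN = re.compile(rf"(?<!\w)({_alternatives})[:\s]*@(\w+)", re.IGNORECASE)


def find_attribution(text):
    """Return (variation, username) for the first 'token @username', else None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return VARIATIONS[m.group(1).lower()], m.group(2)


def first_uses(tweets):
    """Map each variation to the earliest date it appears.

    tweets: iterable of (user, date, text) tuples.
    """
    first = {}
    for _user, day, text in tweets:
        hit = find_attribution(text)
        if hit is None:
            continue
        variation = hit[0]
        if variation not in first or day < first[variation]:
            first[variation] = day
    return first


def adopters(tweets, variation):
    """Lightweight adoption: every user who used the variation at least once."""
    users = set()
    for user, _day, text in tweets:
        hit = find_attribution(text)
        if hit is not None and hit[0] == variation:
            users.add(user)
    return users
```

Because both forms mentioned in the talk (token, username, text and text, token, username) contain the same "token @username" core, a single search covers them; a stricter version would also check that the quoted text matches an earlier tweet by the named user.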
>>: So was the via thing kind of more a snowballing kind of the thing that happened or was it
independent of the [indiscernible] network?
>> Winter Mason: No I don't think I talk about this so I will answer your question now and not
put it off till later which is we see, you know, we looked at what is the probability in what you
would call independent adoption, which is to say that somebody uses this without us ever
having observed one of the people that they follow having used it prior. And you do see a lot more
of that with via and this makes sense because it's a natural language word. It's what you would
kind of naturally, independently use. We actually see a lot of that also with RT, but that turns
out to be because of this data in the later part of the data set, so if you were restricted to the
early part then you see a lot more of this being strongly based on exposure rather than this
independent adoption. So this is the first use of via and I'm trying to think this actually might
be a slight error actually because this is the first use of via, but I don't think it's an actual re-
tweet. We have another one that is an actual re-tweet where it's -- because this is where the
guy got information and that guy is -- you know, he got that information elsewhere and I'll talk
about sort of that different usage, exact copying versus just kind of a modified I
got your information from somewhere. But there is, but basically also in March we see the first
one where it's an actual copy from the previous tweet and as I was saying it makes sense it
started from natural language. HT, this is the first use of HT. HT was actually borrowed from
the blog community, so in the blogs if you were citing another person’s blog you'd say hat
tip to so-and-so, and so HT was sort of a shortcut that was used in the blogs and people that
were bloggers were using Twitter and so they came over and used the same kind of convention
for indicating reposting. So it was six months after via appeared that you see the first
Twitter-specific variation and so this is the full word re-tweet and it's actually like, you know, reposting
this entire thing from this person that's talking about the standup show, you know, and it's
really -- one thing that's nice about it, about re-tweet for the people that use it is that it is
actually specific to the Twitter community. So people, you know, if you're using re-tweet, if you
said re-tweet to somebody that had never heard of Twitter before, they would have no idea
what you are talking about. So if you use re-tweet then that's sort of is a signal that you are
part of the community and you know what's going on and you're like I'm on Twitter and I am a
Twitter user and I tweet, you know. There's also re-tweeting which appeared about the same
time and kind of has the same pattern of usage and in fact the very first usage of RT was pretty
clearly an adaptation to the constraints that Twitter imposed which is to say 140 characters. So
this guy, T David, actually had used the word, the full word re-tweet previously and the full
word re-tweeting previously, you know, and he was, so that was sort of what his convention
normally was but then this tweet is exactly 140 characters. So in order to meet that
constraint he had to cut down in various places and, you know, the way that he decided to cut
down was to shorten re-tweet into RT. And so it's sort of like it's got this nice aspect of it’s still
re-tweet and so it's still like part of Twitter but now it's like shorter and fits these constraints.
Towards the end, the last variation, or the latest variation that we look at is this recycle symbol
and something that's interesting about the recycle symbol is okay, first off it's actually as you
can guess, it's actually kind of difficult to reproduce, because it's Unicode and you have to know
the right Unicode combination order in order to make it appear in Twitter. Or you can cut and
paste and then if you cut and paste then you can use it. This actually got a reasonable amount
of traffic. Like I said these were, this was one of the ones that were selected from the average
population, but the reason why that is, or a large reason why that is is because its first usages
were actually by the founders of Twitter, Biz Stone and Evan Williams, right, and in fact they
explicitly in the URL and in one of the early tweets they used this, Evan Williams is like hey guys
we should use this recycle symbol as a way of indicating re-tweeting because it's only one
character, right? So it had this really like explicit sort of top-down kind of hey guys we should
use this advocacy sort of thing. But if any of you use Twitter you probably haven't seen this
recycle symbol very much for that usage and you'll see in fact that it did not end up being quite
popular. So with all, so with inventions of technology, there's a lot, there's often this question
of what are the original users like? What are the early adopters like? There's -- yeah?
>>: [indiscernible] one question I had [indiscernible] do you have an idea of where the different
[indiscernible] remember a client [indiscernible] recycle item and so you kind of push the
button and [indiscernible] one of those, it would be interesting to see when these clients
implement that sort of mechanism for pushing the adoption of x or y…
>> Winter Mason: And we do not have the data. We have thought about that and we don't
have the data. We have some data about the clients that people were using but that doesn't
actually tell us about whether that was the default for the client. I mean, so we have some data
going that way, but that, as far as we can tell or it appears based on our data that those sorts of
kind of the beginnings of the institutionalization of the conventions didn't happen until basically
2009. But we don't have something, we don't have all the clients and we don't know when
TweetDeck said okay, we're going to start using this. Okay. So with inventions or technological
innovations there's always this question about who are the innovators, who’s the bellwether?
With social conventions there isn't really a lot of understanding of that, like who are the people
who are actually inventing these things, that are coming up with the new, you know, whazzup?
Right, all right, that one came from Budweiser probably, but like these different ways of
saying hello, where did good day come from and why is that really popular in Australia. Like
who were the people to really come up with these things first? And in these results it turns out
to be kind of to some degree biased by the fact that this is happening in the context of a
technological innovation which is Twitter. So this is a word cloud. What we did is we took a
random set of people of the same, I think there was 1000 people, randomly and then 1000
people who were the first to use these variations. And we looked at their profile bios and in
this word cloud the farther to the right a word is, the more often it was used by the early adopters.
The farther to the left, the more often it was used by random users, and the size of the word is the
frequency of the usage overall. And so what you can see is that these, I'll use a pointer, you
have these really nice clear, you know, media social web, geek, developer, technology,
marketing, these guys that were early to adopt these variations were actually, represent
characteristics that you see in early adopters of technology. I mean these are guys that are kind
of on the edge and they are trying out things and in fact, we see that also in these things in
terms of, you know, their likelihood of using these just little features that you can have on
Twitter with respect to, you know, do they have a bio? Do they have a profile picture? Have
they changed the background on their Twitter profile? Have they added this sort of
information in terms of their profile? And you see that these early adopters are way more
likely to have used these features than the random user. So there's this real clear signal that
these early guys are very kind of, they are actually the type of people that are seeking out new
things and trying out the new things and really being really innovative across the board.
>>: Is this compressed across the three years of data that you have, the word chart that you
showed?
>> Winter Mason: So these, this, yes. The profiles come from the time of our crawl. So this is
what those profiles were like at the end of that 2009.
>>: So I mean I was generally wondering because this is generally aggregated over time, so I
was wondering if there's the effect of these geeky people actually being on Twitter even the
early part in the initial days and then like some of the people on the left side of the spectrum
joining later and could that have happened?
>> Winter Mason: We did do matching on age. So matching on how long they had been on
Twitter. I don't think that that is reflected in this word cloud, but I know that we looked at
those comparisons, for instance, on this one, we looked at that for the -- I don't think these
numbers reflect that but I know that we looked at it and it was basically the same.
>>: [indiscernible] early adoption and you personalized it by when they joined or when they
first to use via or whatever?
>> Winter Mason: That's right.
>>: I see and how do you, what [indiscernible]
>> Winter Mason: We literally for any given, for all of the convention variations that we looked
at if you were one of the first thousand people to use it, you know, yeah.
>>: Do they normalize it for amount of use? I mean is it the reflection of a person or sheer
amount of use?
>> Winter Mason: Oh, I see what you're saying. So you mean like…
>>: [indiscernible] I might just have happened to have seen it. It's more likely that I had seen
RT and therefore I am willing to use RT even though it's not really, even though it might be a
little bit conflated, as you can imagine someone who uses it all the time that only talks about
love and life or whatever… Does that make sense?
>> Winter Mason: Yes. I mean when you're looking at the early on users they are actually more
active than the typical user. In fact, one of the things I was going to say is that you see also that
they have more followers, right, which is another indication of activity. So I think that, so these
guys were definitely more engaged in Twitter at the time without a doubt. Is that why they are
more likely to use the variation? It's unclear. I mean that seems kind of it's like, I think that you
can fairly say that they are very involved and that is at least correlated with the fact that they
were the first to use it, you know, but these are the guys that are doing it kind of thing.
>>: Did you guys also look at the exposure size for these [indiscernible]
>> Winter Mason: And I'll talk about that. So we also in addition to kind of asking who are
these guys, we kind of wanted to see what is the pattern by which these variations spread. And
so we sort of created this diffusion network where, you know, as you would expect, an adopter
is a node in the graph, and there's a link from A to B if A was exposed to the variation by B,
meaning A is following B and A adopts it after B uses it, so we can assume that that person kind
of was exposed. Surely there's noise in that assumption but, you know. And so when you look
at the first 500 adopters of re-tweet you see, you know, as people use it, a node gets
added to the thing and the ties represent the diffusion network. So I'll walk through that
visualization again. And so one-person used it and it took a long time for other people to use it
and then it kind of blows up and you kind of get -- this is a force directed layout and so you sort
of see you've got one general kind of cluster in the middle. When you look at the same thing
for RT, you've got the one guy that used it first in that instance that I told you, and actually
there's quite a bit of time before the next person uses it although that's immediately followed
by another one. And once it starts being used, you start to see sort of different communities
using it.
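The diffusion network described in this turn can be sketched as below. This is a minimal illustration under stated assumptions: `follows` and `first_use` are hypothetical inputs (a follow snapshot and each adopter's first-use time), and, as the speaker notes, treating "follows B and adopted after B" as exposure is a noisy assumption.

```python
def diffusion_edges(follows, first_use):
    """Build the exposure edges of a diffusion network.

    follows:   dict mapping a user to the set of users they follow
               (hypothetical snapshot of the follow graph).
    first_use: dict mapping each adopter to the time of their first use
               (any comparable values, e.g. datetimes or integers).

    Returns a set of (a, b) pairs meaning "a follows b, and a adopted
    after b did", i.e. a is assumed to have been exposed by b.
    """
    edges = set()
    for a, t_a in first_use.items():
        for b in follows.get(a, ()):
            t_b = first_use.get(b)
            if t_b is not None and t_b < t_a:
                edges.add((a, b))
    return edges


def independent_adopters(follows, first_use):
    """Adopters with no observed prior exposure from anyone they follow."""
    exposed = {a for a, _b in diffusion_edges(follows, first_use)}
    return set(first_use) - exposed
```

The second function corresponds to the "independent adoption" mentioned earlier in the talk: an adopter none of whose followed accounts had used the variation before them.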
>>: So was the second person that used RT someone who followed the first guy who used it, or
had the second person also independently invented it?
>> Winter Mason: Yeah. And in fact I was saying that it seems like the credit for it kind of being
shared and becoming more popular probably belongs more to the second guy than to the first.
But you see somebody that followed the first guy use RT, so there's some exposure, there's
some usage there that is getting kind of transmitted from the first guy too. But the thing that
we think is kind of interesting about this is that you see kind of these separate, more separate
clusters than you did in the re-tweet variation and there's some question -- and in fact if you
look at via it's got the same sort of thing similar to re-tweet where it kind of seems more like
one cluster. And one hypothesis that we don't really have an answer for, but one
hypothesis is that it's possible that RT kind of in the initial stages was bridging different
communities and therefore had kind of a wider exposure in terms of the eventual people that it
could reach. Again we don't have any kind of conclusive results on that yet, but yeah. So in our
early adopters, basically even amongst the first 500 or a thousand people, the average number
of exposures, you know, for each of these variations was between about 3 to about 6. So in
other words, before one of these early adopters would use this variation they would have to
see it 3 times or 4 times or 5 or 6 times. The clustering coefficient, this is for the diffusion
graph, right, so if you just take the diffusion graph of the first 500 people and you look at the
clustering within that, it's actually relatively high. Having .23, .3 that's actually relatively high.
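The talk does not say which clustering coefficient was computed. As one plausible reading, the sketch below computes the average local clustering coefficient after treating the diffusion edges as undirected; both the definition and the function name are assumptions.

```python
from itertools import combinations


def average_clustering(edges):
    """Average local clustering coefficient of an undirected graph.

    edges: iterable of (a, b) pairs; direction is ignored (an assumption,
    since the diffusion graph in the talk is directed).
    Nodes with fewer than two neighbours contribute a coefficient of 0.
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        return 0.0

    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count pairs of neighbours that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors)
```

A tree-like graph scores 0 under this measure, which is why values around .23 to .3 stand out against the sparse, tree-like cascades usually reported for URLs.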
And then we looked at this measure of criticality which is to say if you pick out one-person and
assume that, you know, everybody that is following them and not following anybody else that
uses it will not be exposed, what user could you pull out of the graph that is going to affect the
most people. And so basically the criticality is really low in this, so for one of the variations the
most critical user, you pull out one person and only half a percent of people wouldn't have
been exposed if you pull out that person. In one of them it's almost 5%, but still it's only 5% of
the total adopters, you know, amongst these early adopters. And so, you know, that kind of
suggests that there really isn't any single critical user, any single user who is critical to the
process of diffusing this, these variations, right? And this is really worth commenting because
this is actually very different from what you see when you look at, for instance, the diffusion of
URLs, and other studies of the diffusion of technologies. In this case a lot of the times you see
these very critical users where if you take that person out, a lot of people downstream will
never have been exposed to it, hypothetically in this model. And so, and the diffusion networks
of URLs tend to be very sparse. They tend to be very treelike, you know, and so it seems that
there's at least the suggestion that the process by which these conventions, these social
conventions are being spread is actually different than the process by which things like URLs are
spread. And so it's actually like a different social phenomenon. It's not just information sharing.
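The criticality measure described above can be sketched as follows. This is a simplified reading, not the authors' implementation: it only counts adopters whose single exposure source is the removed user and does not propagate the effect further downstream, and the input format (exposure edges as `(a, b)` pairs meaning "a was exposed by b") is an assumption.

```python
from collections import Counter


def criticality(edges, adopters):
    """Fraction of adopters who would lose their only exposure per user.

    edges:    iterable of (a, b) pairs meaning a was exposed by b.
    adopters: collection of all adopters under consideration.

    Returns a dict mapping each adopter u to the share of adopters whose
    sole observed exposure came from u.
    """
    sources = {}
    for a, b in edges:
        sources.setdefault(a, set()).add(b)

    sole_source = Counter()
    for srcs in sources.values():
        if len(srcs) == 1:
            sole_source[next(iter(srcs))] += 1

    n = len(adopters)
    if n == 0:
        return {}
    return {u: sole_source[u] / n for u in adopters}


def most_critical(edges, adopters):
    """Return (user, criticality) for the single most critical user."""
    scores = criticality(edges, adopters)
    return max(scores.items(), key=lambda kv: kv[1])
```

Under this reading, a value of half a percent means that removing the most critical user leaves only 0.5% of the early adopters with no remaining exposure.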
So, you know, probably if you are a Twitter user you have some foreshadowing of what the
majority acceptance of these different variations are like and I've alluded to some of them. But
basically, so this graph is, note, it's on a log scale and it's the number of new adopters per week.
So this is not cumulative; this is per week the number of adopters. And you see here, you
know, this is where via started, this green line is where via started. And you see that it actually
gets pretty big. It gets very popular towards the end. Oh and just to be clear this is in May
2008, for the re-tweet variation there were one hundred new adopters per week, just to be
clear. And the thing to note is that at this end the other thing to note is that up here, you
know, RT is getting towards the end of the data set, RT is getting what is that one hundred
thousand new users per week? Via is getting maybe let's say 50,000, right, maybe actually
probably one hundred, not even 50,000. Maybe 15,000, so that's the importance of the log
scale is my emphasis here, right? And the next highest one is only getting about 1000 per
week. So I just want to emphasize that. We do this so that you can see the differences, but if
you look at this on a not log scale, all the rest have kind of disappeared or are completely
swamped by RT and via. So some points that, the variations have, you know, these different
growth rates, so via started kind of was growing very slowly at the beginning and actually -- I'll
remove that very quickly. This HT also had this sort of slow growth. RT had this very rapid
growth. So did re-tweet and re-tweeting. They both had this very rapid growth. And then at
the end you see, you know, these are still accelerating rapidly, right, and they still have this very
positive trend. Re-tweet has sort of flattened out in terms of the new users and you actually
see re-tweeting declining. So they have different kind of growth patterns throughout the time.
And yes, I've already said only two variations became dominant. And so one of the things that
this sort of suggests is that it might actually be hard to predict what would become dominant in
the end just from this initial growth data. You have, via started earlier, way earlier than
anything else and yeah, it became very popular towards the end, but it had this very slow
growth rate. Re-tweet and re-tweeting had this very rapid growth just like RT, but RT continued
to dominate and continued to have that growth. So this actually makes the problem of saying
okay. How do we determine what is going to be the finally dominant variation? How do we
determine what that's going to be? One of the ways that we looked at that was to look at the
probability of switching from one variation to another, so this is for all of the variations that we
looked at across the entire data set. And so you see that RT has this very high, these include
the self loop, so if you, given that you've used RT you have basically a 76% probability of using RT
again. If you use re-tweeting, you have a 43% probability of the next one that you use being RT.
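Switching probabilities of this kind can be estimated by counting consecutive uses per user and normalizing each row. The sketch below is an illustration only: the input format (one chronological list of variations per user) and the function name are assumptions.

```python
from collections import defaultdict


def transition_probabilities(sequences):
    """Estimate P(next variation | previous variation), self loops included.

    sequences: iterable of lists, each the chronological sequence of
               variations one user employed (hypothetical input format).

    Returns a nested dict: probs[previous][next] -> probability.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        # Pair every use with the use that immediately follows it.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1

    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {nxt: c / total for nxt, c in row.items()}
    return probs
```

The self-loop entry `probs["RT"]["RT"]` is the quantity quoted as roughly 76% in the talk, and `probs["retweeting"]["RT"]` is the one quoted as 43%.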
And so you see across all of these variations there's a pretty high tendency to reuse the
variation that you've used before. That's true kind of across-the-board. You also see sort of
affinities among the variations, so re-tweet and re-tweeting are way more likely to go to RT
than they are to go to via. Recycle is way more likely to go to via than it is to go to RT. But this
data is, so I think this is sort of interesting in terms of telling the story of where people are sort
of converting to when you go from one to another, which one are you likely to end up at. But
this is somewhat complicated by the fact that this is aggregated across the entire data set. So
when we look at it from the kind of, we look at these slices. Starting in May 2008, this is 4
months after RT was introduced, but at this point all the variations have roughly equal numbers
of new adopters per week, so they are sort of in the same growth phase and we eliminated, we
are not looking at the HT, R/T and recycle because those are the ones that were sort of, you
know, not very popular, not as popular as these 4, so we kind of just focused on these 4 for this
time series analysis. And so in the beginning you see, you know, there's not as much -- via has a
high tendency for reuse just like it does across the entire data set, but in the early on if you
used re-tweeting you were really likely to use re-tweeting again. If you use re-tweet you were
really likely to use re-tweet again, maybe you'll go over to re-tweeting, but these paths up to RT
aren't really there presumably because there wasn't yet a lot of exposure to RT. As you go
forward, let's see where is it? So after 4 months, so if you look at this, the self cycle for RT, as
you go forward, you know, maybe around July it jumps up from around .5 to .6, .55 to .65, but
then in September it's up to .82, so now there's this sort of stickiness to RT that is suddenly
there. People are way more likely to keep using it at this point. And then in the following
month, we see that, so here we see that it's around, the probability of transitioning from
re-tweet or re-tweeting to RT is like .07, .08, but suddenly in October you start to see this
movement where people that are using it are kind of more likely to go to RT. And as you go
forward that number only goes up. And so by the sort of end at this time series analysis, you
see the path from via to via is lower, but the path from via to RT, from all of these to RT has
gone way up and so this is basically we're seeing through these, this time series of transition
probabilities this eventual convergence on RT as the real convention and this is people, you
know, as the transition probability goes up through time people are really kind of funneling into
RT. And so one of the consequences of this is that as the popular variations get into, in order to
be popular they have to sort of hit the more usual user, so if you look at this. This is a CDF of
the in degree, the number of followers of the users of each of these variations and you see that
the in degree of the followers of not RT and via, all of the rest is pretty high. These are the
people that are active, that have a lot of followers on Twitter. They are using a lot. If you look
at the number of tweets they have a very similar pattern, same as the in degree. But RT and via,
you know, these very popular ones have this order of magnitude less number of followers for
the users and this is because they have moved out of the really active, I've got 1000 followers
and I'm a social media guru type people, to the I'm just a typical user of Twitter and I'm
following my family and maybe one celebrity type people. So whether this is, this is not exactly
a causal statement but this is, you know, a real clear kind of -- this really is in line with this
two-step theory where, you know, initially you've got the core people, right, and then once it
actually breaks outside of that core you're starting to hit these peripheral people and these
peripheral people have a lower in degree and a lower number of tweets.
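A minimal sketch, not from the talk, of how the transition probabilities described earlier in this turn might be estimated. It assumes each user's usage is available as an ordered list of variation labels; the function name and the toy sequences are hypothetical. Running it separately on each month's data would give a time series like the one described.

```python
from collections import Counter, defaultdict

def transition_probabilities(user_sequences):
    """Estimate P(next variation | current variation) from per-user usage sequences."""
    counts = defaultdict(Counter)
    for seq in user_sequences:
        # Count each consecutive pair (current use, next use) within one user.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs = {}
    for current, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[current] = {nxt: n / total for nxt, n in nxt_counts.items()}
    return probs

# Hypothetical toy data: one ordered list of variations per user.
sequences = [
    ["re-tweeting", "re-tweeting", "RT", "RT"],
    ["via", "via", "RT"],
    ["RT", "RT", "RT"],
]
probs = transition_probabilities(sequences)
print(probs["RT"]["RT"])   # self-loop ("stickiness") of RT
print(probs["via"]["RT"])  # share of via uses followed by RT
```

The self-loop entry, such as `probs["RT"]["RT"]`, corresponds to the "stickiness" mentioned above.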
>>: What's your just…
>> Winter Mason: I'm not making causal statements [laughter].
>>: No, I'm not asking for a causal statement. I'm asking for your, I want to know what your
intuition is as to why those are the ones that got to the periphery.
>> Winter Mason: So that is the…
>>: So are these sheer numbers again? These were just the most likely?
>> Winter Mason: I think it is because -- yeah. Well I think it's because these variations had
features that made them very attractive to users. And in fact kind of the, that is exactly the
question that we are really interested in answering is why is it that RT and via became popular
when the others didn't? And so one way that we went about trying to do that was saying okay.
If we can predict what a single user is going to adopt, if we can figure out what are the features
that are important for a single user to adopt a variation, then, and apply that, right, across the
entire set of users, you know, maybe we can determine what is the causal factor, what is at
least a likely factor for why RT became the one that was popular. So basically the model that
we're doing is saying okay, this guy has been exposed to some different variations from
different people that Bob is following and Bob has to decide what variation he's going to use
when he wants to re-tweet one of these tweets. And so we've got a lot of different information
about Bob that we might be able to use about this, so we know how many times he's been
exposed to a variation. We know how many people, independent people have exposed him to
that variation. We know stuff about who his friends are, how many followers he has, how active he
is and we know sort of when he is doing this action. And so we actually categorized these
features into 3 different general categories. We've got personal features which are just about
Bob. We've got social features which are about, you know, who he's following and what he's
been exposed to. And then we have these global features that are just about kind of when
things started and I'll talk about that. So an example of a personal feature is just the number of
followers that he has. We know that, like I said, RT and via, you know, have, the users of those
tend to have this across the entire data set tend to have a lower in degree, so, you know,
maybe there's some information there. In the social features, the key ones that I'm going to
talk about are the number of exposures and the number of adopter friends and so this is, you
know, if I have -- it's the difference between, I have 3 different people that used RT once versus
I have one friend that used RT 3 times. And so this is the probability of adoption given the
number of adopter friends. So it actually gets pretty noisy pretty quickly depending on kind of
-- so it's smoother for RT and via because we have more data there because more people used
it, so you're more likely to have somebody that has 30, 25 friends that adopted it. And recycle,
it gets really noisy really early because it's a relatively small data set. But you can see that it's
got this at least somewhat interesting shape where, you know, for all of them except maybe
these kind of low data instances, there is this sort of peak somewhere where there seemed to
be like if you have this many adopter friends, that's sort of the maximum probability of
adoption and less than that or more than that it is less. And I think that's kind of an interesting
result.
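A minimal sketch, not from the talk, of the distinction between the two social features: number of exposures versus number of adopter friends. It assumes a user's timeline is available as (friend, variation) pairs; the function name and the sample data are hypothetical.

```python
def exposure_features(seen_tweets, variation):
    """seen_tweets: list of (friend, variation) pairs the user has seen."""
    friends = [friend for friend, v in seen_tweets if v == variation]
    return {
        "exposures": len(friends),             # every time the variation was seen
        "adopter_friends": len(set(friends)),  # distinct friends who used it
    }

# Three different friends used RT once each ...
three_friends = [("ann", "RT"), ("bob", "RT"), ("cy", "RT")]
# ... versus one friend who used RT three times.
one_friend = [("ann", "RT"), ("ann", "RT"), ("ann", "RT")]

print(exposure_features(three_friends, "RT"))
print(exposure_features(one_friend, "RT"))
```

Both timelines have the same number of exposures and differ only in the number of adopter friends, which is the contrast drawn above.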
>>: Question. I would totally think [indiscernible] upwards [indiscernible]
>> Winter Mason: So I think that it's, there's basically like as the, at some point you are going
to be like that it doesn't matter how many people use it; I'm not going to use it, right?
[laughter]
>>: I want to know what property it is for being stubborn. [laughter] that's awesome.
>> Winter Mason: Yeah. And then we just look at the general adoption date. So when did you
actually adopt it? When was this introduced? So this is the total set of features that we used,
so we also use number of tweets, the number of URLs, when they joined and where they said
they were located if that was available. And so what we do is we calculate the information gain
for each feature, so given that you have some baseline prediction, how much information do
you actually get by adding this feature into the model, essentially? And so we, this is the
change in entropy given the feature. And what we see is that this is the kind of the top-ranked
features and so a couple of things I want to point out. First off, the number of exposures is
more predictive than the number of adopter friends, right? So it doesn't matter how many
different friends use it. What matters is how many times you've seen it. This is nice because
this conforms to the notion in the literature on social conventions which is to say that you,
when you want to know whether something is normative, the cue that you have is whether
you've seen it a lot. You mimic behavior that you see a lot if you, if it's a normative behavior, if
it's something that you are conforming to. And also geography is really not important, so
where they are located never even shows up in this. It's just not significant at all. And then of
course the best feature is sort of the date. This sort of makes sense. It's really kind of like okay.
If you are at the end of our data set, if you are, you know, the first time they use it is in July
2009, you are way, way more likely to use RT just on the basis of, you know, how many people
have used it in the past. So this is actually, the fact that that's the most predictive is sort of
expected and also a little disappointing because it's not really useful or informative. This sort of
highlights one of the limitations in this study, which is the fact that we are only looking at one
variation that happened one time on Twitter. It's really excellent because we have all this great
data as to when people use it and it's really great microlevel data, but in the end it boils down
to something that is like a very sophisticated case study, because there's only this one instance
that we're looking at of a social convention forming. So why is it that RT was introduced when
it was versus when via was? We can hypothesize but we don't actually have sort of empirical
data about why that is the case, right? And so that's sort of like, you know, in my opinion a key
limitation to this. Nonetheless I think that because we have this really interesting data and
because it is, it really is talking about this convention and we know a lot about it, I think there's
still a lot of value that we are getting out of it. So we also used a bunch of different classifiers,
Bayesian models, boosting, decision trees trying to find out what would kind of give us the best
prediction accuracy in terms of what somebody, which variation somebody is going to adopt,
and we found that kind of just as you would expect bagging different classifiers is a good
ensemble. Yay. And then we did feature selection to see if we could find the best subset of
features and so this is the accuracy that we got, which you can see right here and basically our
-- I should've said the, what we're doing here is saying given that somebody adopts, can we
predict whether or not they will adopt this variation?
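A minimal sketch, not from the talk, of information gain as described earlier in this turn: the reduction in entropy of the adopted variation once a feature is known. It assumes the feature has already been discretized into a small number of values; the function names and toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy given the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy data: which variation each of four users adopted.
adopted = ["RT", "RT", "via", "via"]
# A feature that separates the classes perfectly (e.g. a coarse adoption date).
date_bucket = ["late", "late", "early", "early"]
# A feature that carries no information about the class.
location = ["a", "b", "a", "b"]

print(information_gain(date_bucket, adopted))
print(information_gain(location, adopted))
```

Ranking features by this quantity gives an ordering of the kind described above, with the caveat that continuous features need a binning choice first.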
>>: Adopting a variation means it's the very first one they used, or did they ever use it just
once, or the one they end up using at the end of the [indiscernible]
>> Winter Mason: No. It's if they use it. And so it's based on…
>>: So I could be an adopter of all of those?
>> Winter Mason: Correct. Because basically we're focusing on a single act of adopting, a
single usage. We're saying okay. We know that you used a variation here. Can we predict
which variation it's going to be? That's the prediction problem. And so it's, you know, did you
use, so given that I know that you used a variation, can I say whether or not you used the recycle
icon? And we do that with 99.9% accuracy and does anyone want to guess why?
>>: Because hardly anyone uses it.
>> Winter Mason: Exactly [laughter]. Right. Because these are highly imbalanced classes. The
probability of using it, well, I'll talk about that in a minute, is, you know, the probability of using
the recycle icon is about 99%, right, so being able to predict it with 99% accuracy is nothing --
I'm sorry. The probability of not using the recycle icon is 99%, right?
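A minimal sketch, not from the talk, of why accuracy is uninformative on highly imbalanced classes: a predictor that always guesses the majority class already scores the majority share. The 1% positive rate below mirrors the rough figure quoted above; the data itself is made up.

```python
def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the more common of two classes (0 or 1)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    return max(positives, negatives) / len(labels)

# Hypothetical data: 1 = used the recycle icon, 0 = used some other variation.
labels = [1] * 10 + [0] * 990
baseline = majority_baseline_accuracy(labels)
print(baseline)
```

Any classifier has to be judged against this baseline rather than against 50%.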
>>: So does that mean that if there was someone who used via and then switched to RT for
once, but after that they went back via, you consider it as 2 different transitions?
>> Winter Mason: Yes. That's correct. Yeah. And we did. We have actually looked a lot at the
kind of switching behavior and we've kind of classified users into 3 different types. One is the
pure monogamist, so they only use one variation ever. The other is the serial monogamist, so
you use one for a while and then you switch and use another for a while. And then you have
the polygamist who just kind of like uses whatever, you know. [laughter]
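A minimal sketch, not from the talk, of one way the three user types could be told apart from a user's ordered list of variations. The exact rule used in the study is not given, so the rule here is an assumption: a serial monogamist uses each variation in a single contiguous run and never returns to an earlier one.

```python
def classify_user(sequence):
    """Classify a user from their ordered list of variation uses (assumed rule)."""
    if not sequence:
        raise ValueError("sequence must contain at least one use")
    distinct = set(sequence)
    if len(distinct) == 1:
        return "pure monogamist"
    # Number of times the variation changes between consecutive uses.
    switches = sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    # Exactly one contiguous run per variation means no going back.
    if switches == len(distinct) - 1:
        return "serial monogamist"
    return "polygamist"

print(classify_user(["RT", "RT", "RT"]))
print(classify_user(["via", "via", "RT", "RT"]))
print(classify_user(["via", "RT", "via", "RT"]))
```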
>>: I did have one thought earlier is there any way to weed out if they do re-tweet and then I
re-tweet you? And then so I'm using your convention but only because, you know what I
mean? Maybe it doesn't happen that often. I don't know.
>> Winter Mason: It happens, but what you see is like RT, like again, this is in the time when
you had to copy and paste the entire thing. And so you see people, you do see a higher
propensity to use the variation of the person that you are following or that you are re-tweeting,
but it's actually really a sticky issue and very hard to tease apart. So okay. How do you deal
with unbalanced classes? There's cost functions that you can put into the learning algorithms
and, you know, these sorts of things. What we decided to do is just say okay. Let's just focus --
let's just artificially make the classes balanced so that we can see. Because one of the things
that we are really interested in, the goal of this prediction exercise is to a large extent to figure
out what features are useful. And so okay, let's just force the class to be balanced and then
here you get these sorts of accuracies that you would kind of be more likely to expect. So we are
still actually doing, even on the balanced data set, we are still actually doing pretty well with
the recycle icon which is good to know. We have a harder time, you know, predicting these and
remember the baseline here is 50%, so we're not really doing too great with re-tweet and
re-tweeting. And in fact when you look at this -- so one of the things, you know, we're using sort
of followers and who you are following and who you are getting these things from and so one
of the key things to this is defining what the tie is. Maybe, for instance, if we only look at the
people that you have strong ties with in some way, maybe that would be a better signal. And it
turns out that it's a little bit better for some of them, you know, for most of them but it's not
like crucial. So these 2 lines here are when we increase the number of kind of mentions
between people and then this is the mention graph and you see that these tend to dominate.
But the normal which is just the follower links, you know, there's not a whole lot of boost that
we get from that.
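A minimal sketch, not from the talk, of the balancing step mentioned earlier in this answer. The talk does not say how the classes were balanced; random downsampling of the larger class is an assumption here, and the function name and data are hypothetical.

```python
import random

def balance_by_downsampling(examples, labels, seed=0):
    """Return a subset with equally many 0-labeled and 1-labeled examples."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(pos), len(neg))
    # Keep k randomly chosen examples from each class, then shuffle the order.
    keep = rng.sample(pos, k) + rng.sample(neg, k)
    rng.shuffle(keep)
    return [examples[i] for i in keep], [labels[i] for i in keep]

# Hypothetical data: 10 recycle-icon uses among 1000 adoptions.
labels = [1] * 10 + [0] * 990
examples = list(range(len(labels)))
balanced_x, balanced_y = balance_by_downsampling(examples, labels)
print(len(balanced_y), sum(balanced_y))
```

On the balanced subset the majority-class baseline is 50%, which matches the baseline quoted above.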
>>: [inaudible]
>> Winter Mason: Well it's yeah. I think that it's, if it were the mention graph plus the
following graph intersection of it, that might be different. So wrapping up. Basically, you know,
these conventions emerged in this very nice organic bottom up manner and that's something
that we found particularly interesting. That's the reason why we wanted to study it. We see
that these, you know, early adopters by a lot of different metrics that we looked at, even ones
that I didn't present today, you know they really are core members of the community. They are
tweeting a lot. They're using mentions a lot. They're the first to adopt these different profile
features on Twitter. They're using these tech words in their profiles. They are clearly this type
of person. These social conventions are spreading through these, and this is something that I
think is really key, is that these conventions are spreading through these dense and clustered
networks and there's not really any critical user and this is different from how other
information spreads. And so I really think that that speaks to the importance of thinking about
conventions and these variations in a way different than just information transmission. And
then of course as they got popular, they reached outside of this core community. One thing
that is kind of interesting is that the final kind of reach of these variations of RT and via doesn't
seem to be related to when they started or like how early they started or the rate of growth at
the beginning. And so this sort of makes it more difficult a proposition to try to understand
why RT and via became the ones that ultimately dominated. We did see that nearly all of the
adopters for all of these variations had been exposed to the convention through their friends
on Twitter and so this is clearly, you know, there is some normative aspect. There is some sort
of social, this really is a social convention because people are sort of doing what they've been
exposed to. Okay this slide is okay. We see that even with, even when users adopted multiple
variations, they tended to stick to one. We actually see more of the pure monogamist and the
serial monogamist than we do the polygamist. Of course, as you would expect the longer
variations because of the ecological constraints were more likely to be abandoned and our
hypothesis is that the reason why, kind of the theory that we had was that the reason why
these like RT and via became popular, and this is speculation, but kind of from the basis of what
we've seen, we feel like it comes from, you know, there's this community aspect to it. So you
sort of, there's this normative kind of this convention, like so this conventional thing. You have
these exposures and that's really important, but there's also these sort of features of the
variations themselves that are important, be it that they are from natural language or that they
are, you know, relevant to the community of Twitter or that they are short, because
they fit into the constraints of the Twitter ecosystem. Okay. Thank you. [applause].
>>: So you started with this like Japanese person [indiscernible] how did the story that why
that particular Japanese-style [indiscernible] could be because it was more advantageous to not
shake somebody's hand because of transmitting the disease, for example. So in this case of
theory, I'm thinking that you mention here the length of the re-tweet icon or whatever it is, it's
one of the different, you know, environmental traits that these, did you think that these mean
more…
>> Winter Mason: Absolutely.
>>: So one of them is length and another one is readability and so RT [indiscernible] then via
and via is a word [inaudible] theory don't understand what RT is and another one is
ease-of-use. If it was [indiscernible] the recycle button would survive, but it was hard to use. So I
wonder if you will add to your model things like that likes [indiscernible] for example, the one
arrived the shortest one or the ones with the highest readability RT and then [indiscernible] R
slash T came to be afterwards but it didn't take off.
>> Winter Mason: And also I think that HT to a large extent has the same readability as RT and
it's the same length. They are very like, if you just look at them as characters they are very
similar, you know. And so understanding the difference between HT and RT is also very
confusing. And so we thought about this, but a lot of these features are really kind of difficult
to quantify and so we wanted to just like focus on these more easily quantifiable features first.
>>: As you were doing this did you guys ever consider that if re-tweet and re-tweeting and RT
were really the same I mean from a social convention and that RT, because re-tweeting
happened -- they all have the exact [indiscernible] re-tweeting, oh let's go shorter, re-tweet,
same curve. Let's go a little shorter. RT. So from a social convention standpoint, couldn't they
be considered the same and then they just for practicality.
>>: [indiscernible]
>> Winter Mason: That's right. That's what I was about to say. I was trying to…
>>: RT came before R/T.
>> Winter Mason: No, no, no. Yes. But also re-tweet came before re-tweeting.
>>: Can you go back?
>> Winter Mason: Yes. Let me get to the -- yeah. So re-tweet and then… Re-tweet and then
re-tweeting.
>>: [indiscernible] take off. The one that you had there just for a second. Right there. So
re-tweeting actually even though it may not have happened the initial one first, it actually takes
off. I'm just curious.
>> Winter Mason: In fact, also, where is it?
>>: I'm just curious if you guys considered…
>> Winter Mason: Also. As I kind of said here, you know, you have this… No. Here you have
this clear like these guys, you know, they really are related through this transition probability.
So I definitely -- they are very much the same and in fact they are used by the same kinds of
people. But yeah. And in fact I think that's important and I think I totally agree that these are
sort of -- you couldn't get to RT without going through these.
>>: But they do play different relationship -- this just some personal -- I see that via is typically
used when you are not quoting the person but you are just saying I heard this from Scott and
I'm going to read through it for
>>: It's still used [indiscernible]
>>: Right [indiscernible] you have a conversation that you had whereas someone mentioned
something and I tweet about it and I say [indiscernible] even though I'm not actually saying
what she posted.
>>: Also [indiscernible] is quite popular these days so [indiscernible]
>> Winter Mason: I need to move it over to your. Bigger. Nope that did not work for some
reason. Okay. Hold on. There we go. So basically, so [laughter]
>>: [indiscernible] awesome [laughter]
>> Winter Mason: So basically what we have here is we took a set of messages and looked at
the message that came before it. We tried to find the message that came before it and kind of
the minimum edit distance, so basically the distance to the closest previous one is really kind of,
you don't really see a lot of difference. Sort of like definitely not between RT and via when you
are talking about these segments of about 20 characters. And so to me that suggests that even
though currently I agree we see a lot more of that where via and MT is a modified tweet and via
is used for information that maybe you didn't even get through Twitter. You got it through that
person in a conference room, right? Even though that is maybe how it's turned into now, I
don't think that that -- it doesn't seem in our data that that's actually how it was used initially.
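A minimal sketch, not from the talk, of the minimum-edit-distance comparison described in this answer: for a given message, find the earlier message it most closely copies. Levenshtein distance is assumed as the edit distance; the function names and sample messages are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]

def closest_previous(message, earlier_messages):
    """Return the earlier message with the smallest edit distance to message."""
    return min(earlier_messages, key=lambda m: edit_distance(message, m))

earlier = ["great talk on conventions", "lunch was fine"]
source = closest_previous("great talk on convention", earlier)
print(source, edit_distance("great talk on convention", source))
```

A small distance to the closest earlier message suggests a near-verbatim copy, while a large one suggests paraphrase or information that did not come from a tweet at all.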
>>: Was via used more at the end? That's kind of how I remember it that people tweeted it at
the end.
>> Winter Mason: Oh. So we look at that. Yeah. So the actual position. So we looked at that
and it's actually not as consistent as you would expect. You know, I observed the same thing in
my networks, but when you look at the data it's just not that consistent.
>> Meredith Ringel Morris: Thank you.
>> Winter Mason: Thank you very much. [applause]