>> Bill Dolan: Well, I'm really thrilled today to introduce Alan Ritter. Looking around the room, I think probably everybody knows him already from one of his two summers here first with Sumit and then in our group.
And Alan is a student of Oren Etzioni, a fifth-year grad student in the computer science department at UW. He's done a lot of really amazing work in a bunch of different areas, all loosely centered around natural language understanding, particularly working with big, noisy data, often from social media. I guess not exclusively, but a lot of your work has focused on data like Twitter. You've built a lot of tools and released a toolkit for doing things like part-of-speech tagging on big, noisy data, and done a lot of interesting annotation of that kind of social media data.
Today Alan is going to be talking about modeling conversations in social media.
I'll let him describe what that is. But this is really Alan's work. I want to emphasize that even though there are other names on it, I think you're the one who drove this, and I think it's pretty interesting. So I'll turn it over to you.
>> Alan Ritter: Thanks so much, Bill. Okay. So my name is Alan Ritter. I'm going to talk about modeling conversations in social media. Like Bill mentioned, this is sort of joint work with Colin Cherry, who is at NRC now, and Bill.
Okay. So recently there's been sort of an explosion in the number of users that are having short, informal conversations in social media websites like Facebook and Twitter. And this basically presents a huge amount of conversational data.
Sort of like a larger volume than has been previously available.
So I think this basically opens up new opportunities for data-driven modeling of conversations. And so these are conversations that can be on any topic and, again, they're sort of informal.
So in this work I'm going to focus specifically on Twitter conversations. And so
I'd like to start out just by making it totally clear what I mean by that. So many of the posts on Twitter look something like this where the user is just announcing some kind of information to their followers.
And so by themselves these aren't really conversational, I wouldn't say.
However, about 20 percent of the posts are actually in response to some other post. And these tend to form these sort of short conversations.
These are similar to the kinds of conversations you'd see on Facebook and other social networking sites as well. So what's the motivation for this? Why should we study conversations that are happening in social media? So I think there's sort of two different things that are motivating our work here. So the first would be this idea of learning conversational agents from data. So just given that there's this huge volume of conversations that are available in electronic format, there's this interesting goal, I think, of trying to learn a conversational agent just strictly from data.
And so the second thing I think that's interesting here is building new user experiences for users of social networking sites. And so people are having conversations on these websites and so maybe by using conversation models we can build sort of new user experiences. Like maybe we can do a better job of doing predictive text input or maybe people want summarizations of conversations if they're really long.
Or maybe we can even do something like detect flirting and display it, you know, which of your friends have been sort of flirting with each other or something like that.
Okay. So in this talk I'm going to talk about two different conversation modeling tasks. So in the first part of the talk I'm going to talk about learning to generate replies to status messages in Twitter, and in the second part I'll talk about automatically inducing dialogue acts in Twitter conversations.
So jumping right in, so the task here is given an arbitrary Twitter status message we want to be able to generate an appropriate response to it. And so in order to do this, we're going to make use of millions of naturally occurring conversations to learn a model of responses.
Okay. So how might we even go about doing this? Well, from looking at a few conversations, we noticed that there's often some interesting relationships between the words in the status message and in the response. So, for example, in this sentence pair the word "it" in the response is clearly referring to the soup in the status, and there's an interesting relationship between smells and looks, and between gorgeous and delicious.
So these kinds of naturally occurring parallelisms led us to ask whether it might be possible to directly translate the status message into an appropriate response using techniques from statistical machine translation.
Okay. So why should SMT even work at this task? So I think it's worth pointing out up front that translation and conversation are really two different things. And so in particular, in conversations the source and target sentences aren't semantically equivalent to each other, as they are in translation.
So I'm not claiming that we're going to be able to learn a really deep model of semantics here to have intellectual conversations but I think we can actually do a pretty good job of learning these high frequency response patterns.
For example, if we see a phrase like "I am" in the status message, we might see "you are" in the response, or a phrase like "airport" might trigger a phrase like "safe flight" in the response.
And this is really just a first step towards learning conversational models from data. So I think there's a couple of advantages to adapting existing techniques to this problem. So people have put a lot of effort into these machine translation
algorithms so they have good performance. They scale to large datasets, which I think is important for modeling conversations.
And they're also based on probabilistic models, so they're going to give us a probability distribution over possible responses, and that's going to make it easier to integrate them into various end-task applications in a relatively principled way.
So I think there's a couple of interesting applications of this. So one might be language generation and dialogue systems, and so currently most dialogue systems rely on either canned responses or templates to generate output to present to the user. But the idea here is maybe by taking the user's utterance into account, in addition to the dialogue state, we can generate more natural and varied discourse.
So the second and maybe more immediate application would be something like conversationally aware predictive text entry. So, right, so imagining that someone has just sent you a text message and you're typing a response to it using some noisy input mechanism, I think we can actually do a better job of predicting what you're trying to type by taking the message you've just received into account. So, for example, if someone texts you I am feeling sick we should be able to do a pretty good job of predicting how you might respond without even seeing any input from the user.
Okay. So we crawled the Twitter public API and gathered about 1.3 million conversations. And so I think it's relatively easy to get more data here, too. And
I think some people are actually working on that. But one thing I want to point out about this data is that there's no need for conversational disentanglement.
So, for example, in IRC chat, there's multiple different threads of conversation happening simultaneously. And it's not always obvious which post is responding to which whereas in the Twitter data it's part of the data. So each post is providing a link to the post it's in response to.
Okay. So like I mentioned, we're proposing to adapt statistical machine translation to this response generation task. And so in SMT the task is: given some foreign text, we want to translate it into English. And in order to learn a model to do this, we have access to large parallel corpora of paired foreign and English sentences.
So I think our situation is actually pretty similar. So our goal is given an arbitrary user utterance we want to generate an appropriate response. In order to do this, we have access to large corpora of naturally occurring conversations.
Okay. So at a high level, how phrase-based translation works is that we first segment the input sentence into phrases, and then translate each phrase in the input into a phrase in the response, potentially with some reordering. And so in order to generate a good response, we want both good translations at the phrase level. So here we have "who wants to" translating as "want to," "dinner" translating as "yum," "come over" translating as "be there," and so on.
But we also want a response that has a high score according to the language model. And this just sort of ensures that it looks like a fluent response.
Okay. So in a little bit more detail. The responses are scored using a log linear model with features based on the phrase translation probabilities and language model. And we use the Moses decoder which performs a beam search to find the best response according to the model.
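To make the scoring a bit more concrete, here is a minimal sketch of how a candidate response might be scored under a log-linear combination of a phrase-translation feature and a language-model feature. The function names, the `lm.log_prob` interface, and the weight keys are all illustrative assumptions, not the actual Moses feature set.

```python
import math

# Hypothetical sketch of log-linear response scoring, assuming a phrase_table
# dict mapping (source_phrase, target_phrase) -> p(target | source) and an LM
# object exposing log_prob(); both interfaces are made up for illustration.
def score_response(segmentation, response_tokens, phrase_table, lm, weights):
    """segmentation: list of (source_phrase, target_phrase) pairs used."""
    # Phrase-translation feature: sum of log p(target | source) over used pairs.
    translation_score = sum(
        math.log(phrase_table[(src, tgt)]) for src, tgt in segmentation)
    # Language-model feature: log probability of the response string.
    lm_score = lm.log_prob(response_tokens)
    # Log-linear combination; in practice the weights are tuned, and the decoder
    # searches over segmentations and responses to maximize this score.
    return weights["translation"] * translation_score + weights["lm"] * lm_score
```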
Okay. So this all sounds well and good, but we ran into a couple of challenges in adapting SMT out of the box to conversations. And so there's a wide variety of different reasons why this is more difficult than translating between languages, but I think the main point to take away here is that in conversations the source and target sentences aren't semantically equivalent to each other. So that just makes the problem more difficult.
Okay. So the first issue we ran into is that the most strongly associated phrase pairs in the data are really just identical phrases. So without doing anything to discourage this, we just learn a model which parrots back the input status. And this is just one example of that.
And so here you can see it's almost repeated except that instead of it smells gorgeous you see you smell gorgeous. I mean, this is kind of okay, but I don't think this is really a good response. I don't know. This is pretty typical for what happens.
So in order to deal with this, we did two different things. The first is we filtered out phrase pairs where one is a substring of the other, and we also introduced a feature which penalizes similar phrase pairs. This seems to take care of this problem pretty well.
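As a rough illustration of those two fixes, here is a small sketch; the substring filter matches what was just described, while the similarity feature uses an off-the-shelf string-similarity ratio as a stand-in for whatever measure was actually used.

```python
from difflib import SequenceMatcher

def keep_pair(source_phrase, target_phrase):
    # Drop phrase pairs where one side is a substring of the other, which
    # otherwise just teaches the model to parrot the input status back.
    return (source_phrase not in target_phrase
            and target_phrase not in source_phrase)

def similarity_penalty(source_phrase, target_phrase):
    # Feature that grows with lexical similarity, so the decoder is discouraged
    # from echoing the status message; the exact measure here is illustrative.
    return SequenceMatcher(None, source_phrase, target_phrase).ratio()
```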
Okay. So the next issue we ran into is that word alignment doesn't really work very well in these conversations. So word alignment is typically the first step in the machine translation pipeline, and it's used for extracting phrase pairs from the sentence pairs. So we tried running GIZA++ on our conversation data and noticed it produced very poor quality alignments. And again there's a number of reasons for this, but I think it's going to be easiest to explain by showing some examples.
So this is one example where word alignment is relatively easy. And so as a person looking at this, I think you can see a pretty clear word alignment that makes a lot of sense. So, for example, I here is aligning to you, get aligns to get, off aligns to off, and what time aligns to at five. If all the sentence pairs in our data looked something like this, I think word alignment would actually work pretty well.
So the problem is we often see sentence pairs which look more like this, where there's a large number of words in the source sentence that are unaligned in the target, and we have this large phrase pair which we can't decompose any further into more fine-grained alignments in a reasonable way.
And so these difficult cases tend to confuse the IBM word alignment models, and this just leads to really poor quality alignments.
Okay. So in order to deal with this, we basically just don't use word alignment.
Instead we generate all possible phrase pairs for each sentence pair.
For example, if we have this sentence pair I'm feeling sick and hope you feel better, we're going to generate phrase pairs that look something like this. And so some of these are going to be useful. For example, we get feeling sick and feel better. But then a lot of them are not going to be very useful. And in general we're going to generate on the order of M times N phrase pairs for each sentence pair. So this is going to produce a huge phrase table and most of it is going to be kind of garbage.
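Here is a minimal sketch of that exhaustive phrase-pair extraction step, with an assumed maximum phrase length just to keep the table from blowing up even further; the cap is an illustrative choice, not something stated in the talk.

```python
def all_phrase_pairs(status_tokens, response_tokens, max_len=3):
    # Enumerate all contiguous phrases up to max_len words on each side and
    # pair every source phrase with every target phrase (roughly M x N pairs).
    def phrases(tokens):
        return [" ".join(tokens[i:i + n])
                for n in range(1, max_len + 1)
                for i in range(len(tokens) - n + 1)]
    return [(s, t)
            for s in phrases(status_tokens)
            for t in phrases(response_tokens)]

# e.g. all_phrase_pairs("i'm feeling sick".split(), "hope you feel better".split())
# yields useful pairs like ("feeling sick", "feel better") alongside many useless ones.
```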
So we need some way to prune it down to a manageable size. So in order to do that, we're using Fisher's exact test, which is a standard statistical test. So basically for each phrase pair it looks at four different numbers. So it looks at the number of times we see the source and target phrase together in a sentence pair. The number of times we see the source but not the target. And vice versa.
And in addition to the number of sentences pairs where we see neither the source phrase nor the target phrase.
And then it just computes the probability of observing this table or one that's more extreme but consistent with the marginals assuming a model of independence between the phrase pairs.
And so basically it's just measuring how strongly associated they are. And this is similar to other statistical tests like chi-squared, except it produces accurate p-values even when the expected counts are really small. This is really important in our case because most of these counts are very small. So we often see phrase pairs that only appear together once in the data.
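For concreteness, here is a sketch of how the test could be computed per phrase pair using the standard SciPy routine; the one-sided alternative shown is an assumption about how the ranking is set up.

```python
from scipy.stats import fisher_exact

def association_p_value(n_both, n_src_only, n_tgt_only, n_neither):
    # 2x2 contingency table over sentence pairs: contains both phrases,
    # only the source phrase, only the target phrase, or neither.
    table = [[n_both, n_src_only],
             [n_tgt_only, n_neither]]
    # A small p-value means the phrases co-occur far more often than chance,
    # and it stays accurate even when the raw counts are tiny.
    _, p_value = fisher_exact(table, alternative="greater")
    return p_value

# Rank all candidate phrase pairs by p-value (smallest first) and keep only the
# most strongly associated ones as the phrase table.
```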
Right. So we basically just compute the statistic for all the phrase pairs, and then keep the 5 million which have the highest rank and just use these as our phrase table. And so I think this actually produces pretty good phrase table entries. So here I'm showing a few of the highest ranked ones. You can see we're getting things like sick translating as feel better, interview translating as good luck, my dad translating as your dad, and so on.
So at this point I've pretty much described our MT approach to generating responses. So now I'm just going to quickly mention an information retrieval baseline that we compare against. And so here the idea is to take the input status message and find the most similar conversation in the data and then just return the response that's associated with that.
And so we're looking at two different IR-based approaches. So in the first, we're measuring similarity between the input status message and the statuses in the data, and in the second we're directly measuring vector space similarity between the status message and the responses.
So these are just sort of two different approaches to how you might do this.
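A rough sketch of both IR baselines might look like the following; the tf-idf vector space is an illustrative choice for measuring similarity, not necessarily the representation used in the actual system.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def ir_response(input_status, statuses, responses, match_against="status"):
    # Match the input either against past statuses (and return the paired
    # response) or directly against past responses.
    corpus = statuses if match_against == "status" else responses
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(corpus + [input_status])
    similarities = cosine_similarity(matrix[-1], matrix[:-1])
    best = similarities.argmax()
    # Either way, the reply we return is a response observed in the data.
    return responses[best]
```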
Okay. So to compare different approaches to response generation, we performed a human evaluation using annotators from Mechanical Turk. And we basically did a series of pairwise experiments between systems, where we randomly sampled 200 status messages, generated replies using the two different systems, and asked the Mechanical Turk users which of the two responses was better.
Okay. So here I'm showing the results from the evaluation. And there's a few more results in the paper. But I think these are the important ones. So basically each row of this table is representing one experiment that's comparing two systems. System A and system B. And then in the third column, I'm showing the fraction of cases where the majority of Mechanical Turk users preferred the result that was generated by system A.
And I've bolded the winning system in each case. And then also there's a sort of -- I'm showing a measure of agreement between the annotators here as well.
Okay. So just to summarize: the MT system is outperforming both of the IR systems. But when we compare it to actual human responses from the data, of course, it loses. And I think this is kind of what you'd expect. But what I think is kind of interesting here is that in about 15 percent of the cases, we're generating replies that the human annotators actually think are better than the actual human responses. So I think it's worth digging in a little bit to see what these responses look like that are preferred over the actual human responses from the data.
And so in some cases I think we're actually just generating a really good response. So, for example, in this first row here the status is "I want to go a bonfire right about now," and the response we generate is "that does sound fun bonfire, I want to go." I think this is a really good response.
In other cases, for example the third row, our response kind of makes sense. It's on topic, and even though it maybe doesn't fully make sense, when you compare it to the actual human response it seems to be more coherent. I don't know.
Okay. So I actually have a demo that I can show here, too. And so this is online.
It's linked off my home page so you're welcome to go and play around with it if you're interested. But basically I have a few examples here that I found that work pretty well. But you can type in whatever you want here.
So, for example, if we look at you know who wants to get some lunch, it's generating I want to get me some chicken. And then down at the bottom here you can actually see where each of these phrases are coming from. So I want to comes from who wants to, like words 0 through 2 in the input.
Get me some comes from get some. And then chicken comes from lunch. Right? So there's other examples here as well. For example, I'm feeling sick translates to feel better soon. And you're welcome to go on here and play around with it if you like.
Okay.
>>: The training --
>> Alan Ritter: Yes, it was about 1.3 million conversations. And then we actually gathered, I think, another one million conversations or so to have a bigger language model.
>>: Training meaning --
>> Alan Ritter: Right. For the language model actually we just collected responses so they didn't necessarily need to be pairs. But, yeah. So, actually, if there's any other questions, this is a good time, because I was about to sort of move on to the next task.
Okay now I'll move on and talk about automatically inducing dialogue acts in
Twitter conversations. And so here the goal is to basically take a conversation and tag it with labels which basically describe for each post what role it's playing in the conversation.
And these are traditionally referred to as dialogue acts or speech acts and they provide sort of a shallow semantic representation of the conversation. So, for example, this particular conversation you could say it starts out with a status message followed by a comment and then a thank you. And you can imagine sort of a lot of different conversations on different topics which follow the same general structure.
So dialogue acts have been shown to be useful in a number of different applications, from conversational agents to dialogue systems, and also dialogue summarization and detecting flirting. So I think they're a useful thing to study in general.
And so traditionally people have approached this problem by first gathering corpora of conversations. And most previous work has focused on speech corpora, for example, telephone conversations or recorded meetings. And then, of course, you need to design annotation guidelines, which determine the set of dialogue acts that you're annotating with, and then you can have some annotators go through and manually annotate the data.
So when we look at Internet conversations, there's actually been a little bit of previous work on modeling dialogue acts in Internet conversations. But here there's actually a lot more variety, I think, in the style of conversations people are having.
So people are having conversations in e-mail, on Internet forums, IRC and now
Facebook and Twitter are really popular. I think it's worth emphasizing here that the tags that people have designed for speech corpora aren't always appropriate for these Internet conversations.
So, for example, they're missing things like meeting requests and status posts, but then they have other tags like backchannel, disruption, and floor grabber, which aren't necessarily appropriate.
And so I think really each of these different styles of conversation kind of needs its own set of dialogue acts. And this is kind of what motivated us to look at unsupervised approaches for inducing dialogue acts. So what do we need to look at in order to determine what's the right set of dialogue acts to use? So, first of all, there's often predictable transitions between dialogue acts.
For example, a status is likely to be followed by a comment, or if we see a question, it's pretty likely that it's going to be followed by an answer. And then, of course, we need to look at words in the post to try and figure out what dialogue act they should belong to. But there are also these other words which don't really tell us anything about the dialogue act but are specific to the topic of conversation; they're sort of sprinkled around the different dialogue acts. So in looking at different ways to model dialogue acts in an unsupervised way, we noticed some work on content modeling from the summarization community.
And so here the goal is to model the order of events that are reported in news articles. So, for example, in an article about earthquakes, it's likely to start out by talking about the location and time of the quake, and then maybe it will mention its magnitude on the Richter scale and talk about damages and injuries that resulted. So there's sort of this predictable sequential structure of how people report events. So in order to model this they use the sentence level hidden
Markov model where the hidden states are emitting whole sentences, and they learn the parameters in an unsupervised way using Viterbi EM.
So as a first approach to modeling dialogue acts, we can basically just adapt this content modeling approach directly and just rename the hidden states as acts and have each act generate one of these Twitter posts in a conversation.
So we want our models to discover a set of dialogue acts that provides a nice semantic representation of conversations; we don't want it to just cluster together posts that are all on the same topic, self-transition with high probability, and partition the set of conversations into topics. And so when we apply the content modeling approach directly to these Twitter conversations, this is actually what tends to happen. This is one example of a few posts from one of these topic-focused clusters. You can see these are all posts about food.
They don't have any kind of underlying dialogue act in common. So this is actually a big problem.
So when we noticed this our goal basically became to separate out words in the post to those which indicate the dialogue act and those which are specific to the topic of the conversation. And in order to do this, we're applying an LDA-style model where each word in the post is generated from one of three different sources.
So it could be generated based on the dialogue act. It could come from the topic of the conversation, or it could just come from sort of a general English vocabulary, which is a flexible way to model stop words.
And this is similar to some recent models that have been proposed in summarization. Okay. Just to remind you, the content model is a sentence level
HMM where each hidden state is emitting a bag of words. What we're proposing to add to this are a set of hidden variables, one for each word which basically just determines the source from which it's drawn.
So it could be coming either from the dialogue act, from a multinomial which is specific to the conversation and represents its topic, or from this general English multinomial.
Okay. So for inference we use collapsed Gibbs sampling where we sample each of the hidden variables in turn, conditioned on all the others and integrating out parameters.
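As a heavily simplified illustration, here is what one collapsed Gibbs update for a single word's source indicator could look like, with the dialogue-act and topic assignments held fixed and the multinomial parameters integrated out. The count bookkeeping, the priors, and the assumption of a single shared source-choice distribution are all simplifications of the actual model, and the HMM resampling step is omitted entirely.

```python
import numpy as np

ALPHA = 1.0   # Dirichlet prior on the source-choice distribution (illustrative)
BETA = 0.1    # Dirichlet prior on each word distribution (illustrative)
SOURCES = ("act", "topic", "general")

def sample_source(word, act, topic, counts, vocab_size, rng=np.random):
    # counts["source"][s]        : tokens currently assigned to source s
    # counts["act"][(act, word)] : word counts emitted by this dialogue act
    # counts["topic"][(topic, word)], counts["general"][word] : likewise
    # counts["act_total"][act], counts["topic_total"][topic],
    # counts["general_total"]    : total tokens per emitting distribution
    # All counts are assumed to already exclude the token being resampled.
    keys = {"act": (act, word), "topic": (topic, word), "general": word}
    totals = {"act": counts["act_total"][act],
              "topic": counts["topic_total"][topic],
              "general": counts["general_total"]}
    probs = []
    for s in SOURCES:
        choice = counts["source"][s] + ALPHA
        emit = (counts[s].get(keys[s], 0) + BETA) / (totals[s] + BETA * vocab_size)
        probs.append(choice * emit)
    probs = np.array(probs)
    probs /= probs.sum()
    return SOURCES[rng.choice(len(SOURCES), p=probs)]
```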
So one issue we ran into here is that there are a lot of hyperparameters that we need to set. So rather than just setting these heuristically, we used a slice sampling approach where we sample the hyperparameters like hidden variables. And this improved performance a little bit, I think.
So now I'm going to sort of walk through a qualitative evaluation where we're visualizing the parameters that the models learned to sort of show what dialogue acts it's discovered and we also did a more quantitative evaluation in the paper but I'm just going to skip over that in the interests of time here.
Okay. So what I'm showing here is a visualization of the transition matrix from the HMM, and so here there's an arrow drawn between two dialogue acts, if the probability of transition is higher than a threshold.
And I've added labels to them which are just sort of my interpretation of the meaning of the act.
Okay. So here I'm showing a word cloud for one of the dialogue acts. And what's going on here is the size of each word is proportional to its probability given the act. So for this particular act you can see the words I and my have high probability.
We're also seeing words like going and get, and I think getting is in there, too, and then temporal words like today, tonight, tomorrow, night, and so on.
And so I think this really just represents a user talking about what they're currently doing, which is kind of the standard status post you'd expect to see on
Twitter.
Okay. So in response to that, we often see a question. And so here you can see the question mark and WH question words, and what's also kind of interesting is
the word you here is very prominent in contrast to the previous post where we saw I.
Okay. So here's another kind of question. And this is one that begins a conversation. And so, again, you can see the question mark and question words but then we're also seeing words like does anyone know why, if, who, how, and so on.
And so I think you can just imagine this is a user broadcasting a question to their followers. So it's sort of like a question to a group of people. And then there's also this reference broadcast state. And I should mention that as part of the preprocessing for the data, we replaced all the user names and URLs with these special tokens, and then you're also seeing the word RT here which has special significance on Twitter. It stands for retweet. It basically means you're reposting another Twitter user's post. This is a user sharing some interesting information that they found.
And so in response to that, we often see this reaction state where the exclamation mark has high probability and we see words like thanks, ha-hah,
LOL. I think it's pretty clear this is representing a user responding to some interesting or funny information they've just seen.
>>: [inaudible].
>> Alan Ritter: Huh?
>>: SMI.
>> Alan Ritter: I replaced a number of different emoticons. It's sort of like the smiley --
>>: Okay.
>> Alan Ritter: Okay. So I'd like to take just a couple of minutes and talk about some other Twitter work I'm doing. This here doesn't have to do with conversations but it's focused on information extraction.
And so, right, I think it's worth asking why you would want to do information extraction on Twitter. What can you get there that you can't get in news articles or some other source, for example? But I think what's interesting about Twitter here is that these are really short messages that are easy for people to write, and people are writing tweets on mobile devices, so they often provide the most up-to-date information about events that are actually taking place in the world.
And what's also interesting is that there's a lot of users talking about the same events. So this provides kind of a natural measure of how interesting or important something is.
Okay. So, of course, this is kind of a double-edged sword, because there are so many redundant and irrelevant messages that it can easily lead to information overload.
So I think there's a pretty strong motivation to have some kind of automatic text processing techniques that extract and aggregate together the most important information. And so, of course, people are already doing this to some extent.
So, for example, if you go and look at the Twitter website, they show these trending topics, which are basically just short phrases that are relatively frequent in the current stream of tweets. So I think the way you can look at what we're doing here is as trying to move beyond these trends and extract some sort of more structured representation of events that can allow interesting queries or visualizations of the data.
Okay. So in order to make this a little bit more clear, I'd like to just show a quick demo. And so what we're trying to do here is to automatically extract a calendar of popular events that are coming up in the near future.
And so what we do actually is pretty simple. We just extract named entities from tweets and this is being done sort of continuously in real time. And this is using a named entity recognizer which we've trained on some in-domain annotated
Twitter data. We found we sort of needed to do this because when you apply named entity recognizers to Twitter out of the box, they don't work very well. So then the second thing we're doing is to automatically extract and resolve temporal expressions. So, for example, if we see a phrase like "next Friday" we can actually figure out which calendar day that's referring to, because we have the timestamp from when the tweet was generated. Then we count the number of times each entity co-occurs with each date and plot the highest-ranked entities on the calendar.
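A minimal sketch of that aggregation step is below; it assumes the entities and temporal expressions have already been extracted per tweet, and the toy resolver handles only a single pattern just to show how the tweet's timestamp anchors the reference date.

```python
from collections import Counter
from datetime import timedelta

def resolve(expression, tweet_date):
    # Toy resolver: a real system handles many more temporal expressions.
    if expression == "tomorrow":
        return tweet_date + timedelta(days=1)
    return None

def build_calendar(tweets):
    # tweets: iterable of (entities, temporal_expressions, tweet_date) triples,
    # where tweet_date is a datetime.date taken from the tweet's timestamp.
    calendar = Counter()
    for entities, expressions, tweet_date in tweets:
        for expression in expressions:
            day = resolve(expression, tweet_date)
            if day is None:
                continue
            for entity in entities:
                calendar[(day, entity)] += 1
    # Rank entities per day by count to fill in each calendar cell.
    return calendar
```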
So I can just pop that open right now. And again this is linked off my Web page.
You're welcome to go play around with it. So today is September 28th. And so you can see a lot of people are talking about the Kindle Fire, which is this new
Kindle device which Amazon is announcing today.
And it's also Rosh HaShanah which I think is the Jewish new year. I'm sorry if
I'm mispronouncing that. But I think what's more interesting is to look at events that are happening in the future.
So, for example, tomorrow you can see people are talking about is National
Coffee Day. And we can actually click on this to sort of drill down and get a little bit more detail. And so here I'm just showing tweets that mention National Coffee
Day in addition to September 29th.
And you can see there's all these restaurants that have free coffee, like Krispy
Kreme and stuff like that.
>>: So you go to Web -- active before to see that --
>> Alan Ritter: Yeah, that was one that I was actually going to go down to. So, right, you can see people are talking about the iPhone announcement and Tim
Cook, who is the new Apple CEO. If you click down you can drill down.
Here I'm showing entities that co-occur frequently with Tim Cook in reference to that date, and we're also trying to extract some event words, so you can see like words like hold, unveil, host, announce, unveiled. This kind of gives us a summary of the event.
I don't know. Yeah, but anyway this is online. You're welcome to play around with it. Okay. So just to wrap up now: we proposed applying statistical machine translation as an approach to generating responses to status messages in Twitter, and I talked about a couple of the challenges involved in doing that and presented some initial solutions to a couple of these.
And then I also talked about some work on unsupervised induction of dialogue acts, and I talked really quickly about this work on adapting NLP tools and information extraction to Twitter and showed this calendar demo. Thanks.
[applause].
>>: I have a question about the Turk task, comparing against the human one. How did you phrase the test? Were you asking them to determine which ones were human, or were you saying which ones were the more --
>> Alan Ritter: That's a great question. So it is kind of a -- I should say first it's an ill-defined task, like which one is better. I think I said something like it should be on the same topic and it should also make sense in response to it. So I was a little bit vague.
>>: Because if they both made sense --
>> Alan Ritter: Exactly right. So they could either both be good or they could both be bad, in which case it's -- so that's why I think the inter-annotator agreement is a little bit lower than you would expect for, like, a corpus annotation task. But each experiment is done on like 200 messages, so we're averaging over a lot of different judgments. But this is a great question. And I think -- so it's an ambiguous task, too, but I think this is actually something that Mechanical Turk users are good at.
>>: Absolutely. You could be explicit, especially for the human one, the one where you put the negative, and say which one is coming from the human and which was generated by a machine.
>> Alan Ritter: Right.
>>: That 15 percent might be more -- it might tell you a slightly different story.
>> Alan Ritter: Yeah.
>>: But I think the way you ask --
>>: There are similar challenges when you look at MT evaluation, when both -- both system-generated outputs are low quality, so you're looking at terrible output and asked to judge: is that even -- I can't relate to either of them there.
>>: Actually I was going to ask something similar, so it depends on the goal of the task. I thought it was interesting how you pointed out it's not just -- you're not just trying to come up with something that can chat but, for instance, something that could do better speech recognition and so forth. So in that case it seems like the evaluation would be some likelihood measure of the correct answer in your model. Did you look at that?
>> Alan Ritter: We haven't done that. I'm interested in doing that in future work, like looking at end-task applications. Something like word error rate or likelihood for that would probably make sense. But, yeah, I'd love to do that as future work. But, yeah, we haven't yet.
>>: So for the Turk experiment, what task would you say that's measuring for?
>> Alan Ritter: Right. I guess that's a good question. I mean, I guess it's just measuring how often we can generate a response which sounds reasonable.
>>: Like a chat? Chat box?
>> Alan Ritter: Yeah, it's more measuring for like a chatbot agent kind of scenario.
>>: Because that would even be a different question. If you -- because that could be a model: just have one and then just ask the question, is this a reasonable response to this or not, as opposed to a comparison. That's a different number which is also interesting; it might be like, oh, yeah, 50 percent of the things are reasonable responses.
>> Alan Ritter: But I still think that's useful for the predictive text input response scenario. Because you could imagine like, okay, maybe we just want to show you a whole response. Be like do you want to respond to this, you know?
>>: It's very likely that the more human it sounds, the more likely the correct answer is to score high, like with the other models. I'm sure the measures would all be correlated, but it seems nice that you can do a quantitative evaluation without having to involve Turkers, if you just set it up like that.
>> Alan Ritter: That's a good point. It's something that I'm interested in doing.
>>: The calendar -- have you noticed anything surprising in the predictive capability, are you able to predict the future?
>> Alan Ritter: Oh, yeah, right. So --
>>: Copy public lists or reporting on Twitter?
>> Alan Ritter: That's something we've thought about a little bit, can you predict how frequent something is going to be on the actual day maybe using features of sentiment or the language that people are using about a particular event. Maybe that's more informative than just the frequencies alone or even looking at the time series of does it look like this is -- based on how far it is in the future and how many people are talking about it now.
>>: You haven't seen it -- the end of the world accurately predicted.
>> Alan Ritter: Not accurately. [laughter] Actually, a couple of months ago I gave a talk at Twitter, and I think the next day was like Judgment Day or something like that; there was this big thing that people were talking about, yeah.
>>: But that turned out to be wrong.
>> Alan Ritter: Yeah.
>>: [laughter].
>>: Still coming.
>> Bill Dolan: All right. Thanks.
[applause]