>> Andres Monroy-Hernandez: Welcome to this presentation. I have the pleasure to introduce Axel Schulz. He is a PhD student and research associate in the Telecooperation Lab at the Technical University of Darmstadt. Is that how you pronounce it? He also works at SAP Research. I met him at ICWSM earlier this year, in the summer, and I thought he was doing really interesting work analyzing Twitter using machine learning and other methods to understand small scale incidents like car accidents and so on. And I think that he is using a lot of data from Twitter, so that's also connected to some of their work [indiscernible]. So.

>> Axel Schulz: Thanks. First, thank you for inviting me here to share my research with all of you. The topic for this talk will be microblogging during small scale incidents. As said before, my name is Axel Schulz. I work part-time at the Technical University of Darmstadt and I am also employed by SAP, which is, more or less, famous for its business software. SAP has a program where PhD students stay at SAP and work with the university, and that is what I am doing; I am in the last months of my thesis. I just have to write it all down now, but I think most of you know that writing it down is not that easy. You want to write the next paper and keep on researching, especially if you have an interesting research topic. But everything has to come to an end. So much about myself.

First I want to tell you something about Darmstadt. Darmstadt is located near Frankfurt in the Rhine-Main area, where a lot of computer science and research companies are located. The university itself is the only technical university in our state; not in Germany, but in Hesse. We have two main campuses, one for the social sciences and the other one for computer science and the technical sciences, with over 25,000 students. Around 2,000 students are studying computer science. SAP founded a research lab there in 2006 because of the location close to the university, with a lot of collaborations with the technical university. So that is why SAP is there.

At SAP I am working in the HCI Research Group. We are doing human-computer interaction research focused on smart interactions and flexible collaboration, using for example tabletop surfaces for crisis management applications; we can talk about this later if you like. Another topic is information exploration, exploring large datasets, and the main topic currently is context-awareness. Bringing context-awareness to business applications is very important, especially for SAP, and we look at how to infer the context and to find the information that the user really needs. My research is also related to this, because finding the relevant information in the large amounts of information that is out there is, more or less, what I am doing.

On the other hand, the Telecooperation Lab, led by Professor Max Mühlhäuser, also works on HCI topics. We are doing cooperation research in the peer-to-peer networking area, smart urban networks, and smart sensing, as well as interaction topics like Talk'n'Touch or tangible interaction. We are also involved in smart privacy and trust research in the so-called CASED research center, a special department focusing on privacy and trust issues. So this is where I am from and the background of both of my employers. Now let's have a look at my talk. What is the motivation of this talk?
As decision makers, for example here in Seattle, you have to decide what to do when an incident occurs. For example, if there is a fire in a building or a car crash happens, then county emergency management staff get information that is provided by on-site rescue squads. They communicate by radio, telling them: there is a fire, there are two persons injured, we need more rescue squads or anything else to help. On the other hand we have traditional emergency management systems that are very specialized for decision making in crisis management. All of the information that is reported via 911 calls is published in these systems, and decision makers know how to handle all of these information sources. So they have their own situational picture of what is going on out there.

On the other hand we have bystanders, citizens that are reporting information about what is going on in the city. For example, they publish microblogs about neighborhood topics, or they publish microblogs that are related to incidents: on the one hand to incidents that are already known to the decision makers, and on the other hand to previously unknown incidents. And currently we have the situation that this additional information is not usable for decision making, which results in a fragmented situational picture. Because if a tweet shares information about an incident that was not reported before by the on-site rescue squads, then the decision maker misses important information.

So, summing up: currently, valuable user-generated content is not used for decision making in emergency management. Because if you look at social media, it is completely unstructured. We have microblogs with 140 characters, we have Facebook posts, we have many other different types of shared information. Also, we have large amounts of information, and for real-time decision making in crisis management there is absolutely no time to look at all of the tweets that are out there. So this is the situation that we have.

My vision, and the vision of my PhD thesis, is to identify the relevant information in the large amounts of information that is out there, and to make it usable for decision making in crisis management. The result is that a decision maker gets an enhanced situational picture, because he has new information that was not there before. So, to sum up: before, user-generated content was not usable, and with the things that I present here we make user-generated content usable for decision making in crisis management.

From this we derive three research questions. The first one is how to classify user-generated content: if we take events, for example, we have to classify user-generated content based on the spatial, the temporal, and the thematic dimension. Second, we have to identify the relevant information: which information is the information that I need? And third, how do we get more information out of user-generated content? For example, if we have multiple microblogs talking about the same incident, then we should aggregate this information into one event, so the decision maker only has to click one event cluster and sees that there are five microblogs related to this event.
During this talk we will focus on everyday small scale incidents like car crashes, fires in buildings, or shootings that are happening every day. They have a limited spatial and temporal extent: a car crash may be over after thirty minutes, and only a few people are affected. Compared to large scale events, the amount of available information is rather low, because for example only one or two tweets are published per small scale incident, which makes it much more difficult to detect the important information that is out there.

The agenda for this talk, after the motivation and vision that we have seen before: I want to introduce to you a general pipeline for processing microblogs for incident detection. I want to show you how microblogs are preprocessed, then how we classify and how we refine our classifiers for detecting incident types. And then I want to show you what we can do with the information that is out there based on our classifiers; we can do a lot of interesting things.

As the first step, I want to show you a very general overview of what a pipeline for incident detection looks like. This is the pipeline that I developed for my PhD thesis. The first step is that we collect different information about the things that are going on. We assume that humans act as so-called soft sensors: they share information as bystanders and use their smartphones, for example, to share texts, videos, and pictures about the ongoing incident. The problem here is that we don't know which person to trust; we have credibility issues. People just say that there is an incident, but they don't tell where the incident is. So we have a lot of problems when this information comes in. Furthermore, we might have some semi-structured information, for example from specialized mobile applications that were developed in the so-called participatory sensing area. We ourselves developed the so-called Incident Reporter app, a specialized mobile application that can be used by people to share incident reports. We are also working on noise mapping: for example, you walk around a city and collect noise measurements, and this helps us to infer noise sources, which is very important for [inaudible] management to understand when something is loud and why something is loud in the city. On the other hand, as mentioned before, we have very unstructured information like social media, and in my talk I am going to focus on microblogs as one example.

As the next step, when all this information comes in, we preprocess it in an automatic preprocessing and filtering step. In this case the definition of an event is important. An event can be defined as something that happens at a particular time and place. So we have three dimensions: the type of the incident or event, the spatial extent, and the temporal extent. This brings us back to our initial question: for differentiating certain events, we have to identify the spatial, temporal, and thematic dimension of each tweet to differentiate noise from the relevant information. This is done in the automatic preprocessing and filtering step. In the next step, when all the information is collected and inferred, we can use machine learning to classify this user-generated content, for example for differentiating incident types. In our case we differentiate three incident types, which are fire, shooting, and car crash, plus a no-incident class. A sketch of how these stages fit together follows below.
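[Editor's illustration: a minimal sketch of how the stages of such a pipeline could be wired together. This is not code from the presented system; all names and types are hypothetical, and each stage is a stub that the later sections of the talk fill in.]

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative skeleton of the incident detection pipeline; hypothetical names.

@dataclass
class Report:
    text: str                                   # raw microblog text
    created_at: str                             # creation timestamp
    features: dict = field(default_factory=dict)
    label: Optional[str] = None                 # "fire", "shooting", "car crash", "no incident"

def preprocess(report: Report) -> Report:
    """Infer the thematic, temporal, and spatial dimensions (later slides)."""
    return report

def classify(report: Report) -> Report:
    """Predict the incident type with a trained classifier (later slides)."""
    report.label = report.label or "no incident"
    return report

def aggregate(reports: list) -> list:
    """Cluster reports that refer to the same real-world event (later slides)."""
    return [[r] for r in reports]               # placeholder: one cluster per report

def virtual_sensor(raw_reports: list) -> list:
    """The 'virtual sensor': collect -> preprocess/filter -> classify -> aggregate."""
    processed = [classify(preprocess(r)) for r in raw_reports]
    incidents = [r for r in processed if r.label != "no incident"]
    return aggregate(incidents)
```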
In this case, good training data is needed, and to obtain good training data we can count on the crowd, which is called crowdsourcing. Crowdsourcing means that the crowd can provide labeled information: we can use the crowd to label tweets for training our initial classifier. The crowd might also be used for relabeling wrong information, because the classifier is not 100% [inaudible]; we can use crowdsourcing for refining it. But crowdsourcing is also limited, because during time-critical situations we don't have the time to call on the crowd to label the information; that is not possible. So our approach is to combine the wisdom of the crowd with the power of the algorithms, for example machine learning.

As the last step in our pipeline, we can provision the now structured information through a so-called virtual sensor. We call it a virtual sensor because you can just say: deploy a sensor in the city of Seattle that detects small scale incidents. The sensor collects the information that is out there, processes it, and presents it as a structured information base. In our case, this structured information base might help us to improve the situational picture of the decision maker.

So this was the general pipeline, and we now have a closer look at each of these steps. The first step is how to thematically, temporally, and spatially classify user-generated content. Take this tweet: "RT @people onoe friday afternoon heavy traffic accident on I90 right lane closed". These are parts of tweets that are really out there, but I constructed some additional parts because it is easier to show the things that are happening based on a pre-constructed example.

The first thing we do is preprocess this tweet with textual processing. First we remove the retweet mention, we remove the @-mentions like the person addressed here, and we remove special characters that might be present. Then we resolve abbreviations: for example, the "onoe" can be resolved to "oh no". People use different abbreviations for different things, and this way we can use the real words behind these abbreviations. In the next step, we annotate and replace spatial mentions. For example, the I90 is a spatial mention, and we replace it with the common @LOC token. This is important because if we want to infer textual similarity, then comparing I90 to I80 or I75, which might also appear, is not really possible. But using a common token like @LOC we are able to get some textual similarity. The same holds for temporal mentions: we detect "Friday afternoon" in our tweet as a temporal mention and replace it with a @DATE token, and we can do the same with time mentions that may be present in microblogs. Afterwards we apply standard text processing steps like the Stanford lemmatization function and the POS tagger. The POS tagger is important because in our case we found out that only nouns are valuable for incident detection; for other use cases, other word categories might be beneficial. A toy version of these steps is sketched below.
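[Editor's illustration: a toy Python version of this preprocessing. The abbreviation dictionary and the spatial and temporal patterns are deliberately minimal stand-ins for the real components (abbreviation resolution, NER-based location detection, temporal tagging), and lemmatization and POS filtering are omitted.]

```python
import re

ABBREVIATIONS = {"onoe": "oh no"}                   # toy abbreviation dictionary
LOCATION_PATTERN = re.compile(r"\bI\d{1,3}\b")      # toy rule: I90, I75, ...
DATE_PATTERN = re.compile(r"\b(monday|tuesday|wednesday|thursday|friday|"
                          r"saturday|sunday)( (morning|afternoon|evening|night))?\b",
                          re.IGNORECASE)

def preprocess(tweet: str) -> str:
    text = re.sub(r"\bRT\b", "", tweet)             # remove retweet marker
    text = re.sub(r"@\w+", "", text)                # remove @-mentions
    text = re.sub(r"[^\w\s]", " ", text)            # strip special characters
    words = [ABBREVIATIONS.get(w.lower(), w) for w in text.split()]
    text = " ".join(words)                          # resolve abbreviations
    text = DATE_PATTERN.sub("@DATE", text)          # replace temporal mentions
    text = LOCATION_PATTERN.sub("@LOC", text)       # replace spatial mentions
    return re.sub(r"\s+", " ", text).strip()

print(preprocess("RT @people onoe friday afternoon heavy traffic accident on I90 right lane closed"))
# -> "oh no @DATE heavy traffic accident on @LOC right lane closed"
```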
As mentioned before, text similarity is not always sufficient. For example, if we have "@LOC mention traffic accident @LOC lane closed" and "car collision @LOC", then we find textual similarity based on the @LOC token, but the textual similarity based on the rest of the tweets is not high. On the other hand, we might detect some higher level concepts behind the words: for example, "accident" and "collision" are somehow related. In our case we use Linked Open Data, which is a source of a lot of interlinked information, extracted from Wikipedia for example and annotated with categories and types, to infer more common concepts. For example, "accident" and "collision" share the category "accidents", and if we use this more general category, then we can detect not only textual similarity but also conceptual similarity between two tweets. This is done using the so-called FeGeLOD framework; FeGeLOD stands for feature generation based on Linked Open Data. In this framework, DBpedia Spotlight is used for detecting entities like traffic, accident, I90, and lane. There are, I think, more than 50 links that are extracted; here, for example, the types "accident" and "road" and the category "causes of death" are extracted. We can use these higher level concepts for finding conceptual similarity to other tweets. So our idea is to make use of the named entities, and the higher level concepts behind these named entities, later on in machine learning as additional features.

As mentioned before, temporal filtering is also important, and it is not obvious why. In our case we assume that the creation date of a tweet is not necessarily the time when the incident occurred. For example, I can say: my brother was involved in a car accident last year. And this is not relevant for us now. So what we do is extract, or try to extract, the real incident date based on the content of the tweet. As shown before, we also use this mechanism to replace the temporal concepts in tweets. For this we use the HeidelTime temporal tagger, which was originally developed for Wikipedia, and we customized it to be usable on microblogs. If we have a tweet that was created on Tuesday, the 19th of February, with the temporal mention "Friday afternoon", then we can infer a time like the last Friday, the Friday before this Tuesday, and say the incident was most likely on that Friday and not on the Tuesday when the tweet was created. Also, as shown before, we can use this approach to annotate the message and replace the temporal mention. The date arithmetic behind this simple case is sketched below.
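[Editor's illustration: for the weekday example just given, the resolution boils down to simple calendar arithmetic. This is only a toy stand-in for HeidelTime, which handles far richer temporal expressions.]

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

def previous_weekday(created: date, weekday_name: str) -> date:
    """Map a weekday mention to the most recent such day before the creation date."""
    target = WEEKDAYS.index(weekday_name.lower())
    delta = (created.weekday() - target) % 7
    return created - timedelta(days=delta or 7)    # never the creation day itself

# Tweet created on Tuesday, 19 February 2013, mentioning "Friday afternoon":
print(previous_weekday(date(2013, 2, 19), "friday"))   # -> 2013-02-15
```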
There is another important step: spatial filtering is necessary because we want to know where the incident occurred. When we use microblogs, we have the major issue that only around 1% to 2% of all tweets are explicitly geotagged, so a lot of tweets cannot directly be used for our use case. No simple approaches are applicable, like using the IP address of the user, because we don't have this information, or just using the Twitter search API, because the search API is also incomplete and error prone. When inferring the locations of tweets, we have to find good mechanisms to know where the incident occurred, or where the user was when he sent the tweet. And if we work on this research topic, then we have to cope with toponym resolution.

Toponym resolution means that we have location information in tweets in the form of proper names in text, and we have to disambiguate them. Here we differentiate two problems. The first is the geo/geo disambiguation problem: for example, if we extract the proper name "Paris" from a text, then we don't know which city is meant, because there are 23 cities named Paris in the USA alone. So which city is referred to by the user in the tweet "the car crash happened in Paris last year"? We don't know. Furthermore, we have to cope with the geo/non-geo disambiguation problem: if a proper name like "Vienna" is used, it can refer to a city, but it can also refer to a person's name.

This whole topic of spatial processing is one of my main research topics, so I want to show you how this is done in our research. I find this very interesting, because only a few microblogs have geotags, and inferring geotags for microblogs is a very interesting research problem. We found that microblogs come with a lot of spatial mentions; we call these spatial indicators. For example, we have the tweet message itself: the message might contain toponyms, or it might be a check-in like we have in Foursquare, a check-in at a certain location. There we have a link; if we follow the link, we know at which venue the person checked in. The user may also have filled in the location field of his user profile to state his own location, and he can enter a website. We can use, for example, the top-level domain to infer which country the user is most likely from, or we can use the time zone. For all of these spatial indicators, different means of processing are necessary. What we do is combine the information that we can extract from one tweet, which is a lot, to infer the location where the tweet was sent.

The general idea is that for every spatial indicator we can extract a polygon in the world. Here is one polygon, say, for France; then we have another one, maybe for Paris, and another one for a small district of Paris. What we do is intersect all of these polygons, and the resulting polygon with the greatest height in the end is the winner. For example, if we have three polygons and intersect them, then we have a resulting polygon with height 3, which is more than other polygons that might only have height 1. So you can say this is the location that is most likely to be the location where the tweet was sent.

Summing up this approach: we extract spatial indicators from microblogs and map them to polygons of the world, which we built using SQL Server spatial extensions. We built up a large collection of polygons describing administrative areas at six levels of accuracy. For example, we have Manhattan as a polygon, we have the states, we have the USA, and we have the time zones. All these polygons are in our database, and then we can infer which polygons might be related to France or to the city of Paris. Then we assign a height to each polygon. The height allows us to model the certainty of certain spatial indicators. For example, if we know that a person checked in at a certain place, then this location is very accurate. If we have the time zone, which is a very large polygon, then we know this spatial indicator is very inaccurate. A small sketch of this stacking idea follows below.
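[Editor's illustration: a small sketch of the stacking idea using Shapely. The indicator polygons and their heights are invented, and the brute-force subset enumeration is only practical because a single tweet yields few indicators.]

```python
from itertools import combinations
from shapely.geometry import box

# Hypothetical indicator polygons with heights (certainty weights):
indicators = [
    (box(2.0, 48.0, 3.0, 49.5), 1.0),   # coarse polygon, e.g. time zone / profile country
    (box(2.2, 48.7, 2.5, 49.0), 2.0),   # city-level polygon, e.g. toponym in the text
    (box(2.3, 48.8, 2.4, 48.9), 3.0),   # venue-level polygon, e.g. a check-in
]

def best_area(polygons):
    """Return the intersection region with the largest stacked height."""
    best, best_height = None, 0.0
    # Try every subset, largest first; keep the non-empty intersection
    # with the maximal total height.
    for k in range(len(polygons), 0, -1):
        for subset in combinations(polygons, k):
            height = sum(h for _, h in subset)
            if height <= best_height:
                continue
            region = subset[0][0]
            for poly, _ in subset[1:]:
                region = region.intersection(poly)
            if not region.is_empty:
                best, best_height = region, height
    return best, best_height

region, height = best_area(indicators)
print(height, region.centroid)   # highest stack -> most likely sending location
```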
Also, we might get quality measures from external services as confidence scores. For example, if we extract locations for the city name Paris and obtain 20 results, then these APIs might provide us quality or confidence scores for each estimate. And, as I said before, we stack the resulting polygons on each other, so we have 3D shapes with a height, and the highest area of the intersection of our polygons is our final estimate of the location.

So, coming back to our example: we have this tweet with a check-in where a city is mentioned, and we have the user profile. We extract different spatial indicators using different APIs: for example, we call the Foursquare web page, we use DBpedia Spotlight for annotating the entities, we use a service called GeoNames for resolving location mentions in the text and the location field, and so on. Then we infer the coordinates that are related to these spatial indicators and map them to polygons. Then we assign quality measures: as said before, a Foursquare check-in gets a very high quality, while the time zone gets a much lower quality. Then we stack all of the resulting polygons on each other, and the highest polygon is our assumed position, with a certain confidence. This is the whole approach. I did not mention this before, but if you have questions, just stop me; otherwise later on we may have too much to discuss. And we evaluated this approach with 1 million tweets. Yes?

>>: It seems like another clue you could use is the fact that people can only move so fast. So if I tweeted an hour ago, not that long ago, I can't be that far away. Could that be another clue?

>> Axel Schulz: Yes, we could infer so-called mobility patterns of people. But the important point here is that we make use of only one tweet by one person; we don't need all the other tweets. It's not that easy to collect tweets. We can collect tweets of certain user profiles, we can say give us the tweets of this user, but doing this for all the users that are around is not feasible in the end. I know your idea and it's very good, and it is done in research, but for real-time detection in the end it's not applicable, I would say.

As I said, we evaluated our approach with 1 million tweets, with the device location as ground truth, and we found that with our approach we are able to estimate the real location of a tweet with a median accuracy of 30km, which is quite good on the city level. We could do this for around 92% of all tweets contained in our set. When we talk about incident detection, you might directly say that 30km is not enough: knowing that the incident occurred somewhere in Seattle is nice, but we want to know at which intersection the incident occurred. But it's important to mention that 30km is much better than knowing nothing about the location. So this is the first step, to infer the city where something might have happened. Then we have to cope with the problem of street-level geolocalization, because we want to know where the incident actually occurred.

>>: [indiscernible] the tweets that you trained on all have geo tags [indiscernible]. Have you compared the set of features that are in those tweets with non-geotagged tweets? Because you might expect people who know that their tweets are being geotagged not to mention locations.
>> Axel Schulz: Actually, no, because how would you get a ground truth for these tweets?

>>: Well, I'm just saying you could compare a set of tweets that have been geotagged and say, okay, ten of them have the location ...

>> Axel Schulz: I know, I know what you mean. I haven't tried this before. I will check this; it would be interesting to see if the things that people report about differ, yeah. I will write this down later. Another question? No.

So street-level geolocalization is the next step. We retrained a special model for Seattle using the Stanford Named Entity Recognition toolkit to identify named entities, and then we geocoded these named entities on the street or building level. For example, if we have this tweet, we can detect I90 as a spatial mention, as a proper name, and then we can geocode the location of the I90, which is quite easy because there are a lot of APIs around that give us the location or the polygon surrounding this highway. On the other hand, we can use this approach to detect named entities and replace them with a [indiscernible] location mention. For Seattle we were able to detect location mentions with an accuracy of around 90%, which was because we trained a model specifically for Seattle. I don't know how the same model might perform on different cities. There are similarities between cities, but unlike the more general approach that I showed before to infer the city, such a street-level approach has to be developed for every city on its own, because the ways people talk about locations might differ across cities.

So for the automatic preprocessing and filtering step, we have seen how to temporally and spatially classify user-generated content and how to thematically preprocess it. In the next step we want to know what the actual type of the incident is. This happens in the automatic classification step for incident types. Yes?

>>: Before you move on, I have a quick question. You said you removed all [indiscernible]?

>> Axel Schulz: Yes.

>>: Did you look at [indiscernible]? Because you might expect, and actually I've done some stuff in my own work that shows, that incident related tweets are much less likely to have mentions of other people in them. So that might actually be valuable information. Just the fact that you are talking about a certain other person might mean you're not talking about an event.

>> Axel Schulz: Okay, yes. [indiscernible]

>>: Or perhaps the mention of a specific thing, like the Seattle PD or something like that.

>> Axel Schulz: Yes. No, actually we don't use those features, but we will now come to the next slide and then we can discuss these ideas. Because what we do is extract features for our machine learning problem; we have to transform our set of tweets into features that might be valuable for machine learning. In our case we experimented with word unigrams and character n-grams, because the Twitter guys told us that character n-grams are most valuable for doing machine learning, which is not true for our set, but maybe for their datasets. We used syntactic features like the number of words, the number of characters, the number of "!", and the number of "?". We experimented with sentiment features. We used text similarity scores, we used the spatial and temporal features that we extracted before, and we used the Linked Open Data features, that is, the concepts. And, as she said, maybe we could extend this with the absence of @-mentions or anything else; it would be nice to test this. A rough sketch of such a feature setup follows below.
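[Editor's illustration: a toy scikit-learn version of such a feature setup, combining TF-IDF word n-grams with a few syntactic counts and a linear SVM. The talk's experiments used a different toolkit and large hand-labeled datasets; the tweets and labels below are invented.]

```python
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

tweets = ["huge fire downtown @LOC", "car collision on @LOC lane closed",
          "great coffee this morning", "shots fired near @LOC"]
labels = ["fire", "crash", "none", "shooting"]

def syntactic_features(texts):
    # number of words, number of characters, number of "!", number of "?"
    return csr_matrix([[len(t.split()), len(t), t.count("!"), t.count("?")]
                       for t in texts], dtype=float)

vectorizer = TfidfVectorizer(ngram_range=(1, 3))        # word 1- to 3-grams
X = hstack([vectorizer.fit_transform(tweets), syntactic_features(tweets)])
clf = LinearSVC().fit(X, labels)

test = ["two cars crashed on @LOC"]
X_test = hstack([vectorizer.transform(test), syntactic_features(test)])
print(clf.predict(X_test))    # predicted incident type for the new tweet
```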
>>: [indiscernible] Retweet attributions might not be as useful, but mentions of specific emergency management organizations, or [indiscernible] directing something at someone, might mean it's not about that incident.

>> Axel Schulz: Yes. But if we would use, for example, the mention of a certain emergency management organization as a feature, it could result in the problem that our model overfits to this mention.

>>: [indiscernible] the city is so big anyway, so maybe it ...

>> Axel Schulz: Yes, you could try this out. [indiscernible]

>>: [indiscernible]

>> Axel Schulz: Yes. We tested all the different combinations of these features with different machine learning methods, using the [indiscernible] Meka toolkit and support vector machines, and evaluated our approach with a training dataset collected in the city center of Seattle and a test set from the city center of Memphis, Tennessee. Here are the numbers: the first set consists of, I think, 1,200 tweets, and the test set consists of, I think, 1,608 tweets that are related to certain incident types or not related to incidents at all. We found that the best combination of features is to use word n-grams up to trigrams plus POS filtering (only nouns are valuable in our case), plus the concept features, syntactic features, and TF-IDF scores. With this we can achieve an accuracy of around 82% on the test set.

>>: What did you say is LOD?

>> Axel Schulz: We add the concepts as Linked Open Data features, so for example the concepts "accident" or "road" are added. That 82% is quite a good score, I would say. The dataset is not that large, but as mentioned before, it is not easy to find good incident related tweets and to label them. It's no fun to collect 6 million tweets and to identify the 100, 200, 300 tweets that are valuable for incident detection. So, yes sir?

>>: Do you have some measure of how often each type was classified correctly [indiscernible]? You still get like 70%, maybe more, because your test dataset is 70% [indiscernible]. So I'm just curious, other than the car accidents, do you have some rough indicator of how many of the fires were classified correctly?

>> Axel Schulz: How many are there in the real world?

>>: [indiscernible] If your machine learning just said everything is one class, [indiscernible] you would get 75% or something ...

>> Axel Schulz: I had a baseline [indiscernible]

>>: No, no. I'm just curious, so [indiscernible].

>>: A [indiscernible] I think is what you want.

>>: Yeah.

>> Axel Schulz: Okay, I don't have it here, but we can maybe look [indiscernible]

>>: I was generally curious if [indiscernible]

>> Axel Schulz: It was. The confusion metrics looked good, but the issue was that mostly the decision was between incident and not an incident, not between car crash and fire. There is other research we are currently doing, because we found out that around 90% of all incident related tweets indicate only one incident type. But for the 10% of tweets that state two types, for example a car crash where the car burns, this is a different problem. In this case this kind of classifier is not applicable, so we are working with multi-label learning to infer multiple types.
Multi-label learning is also beneficial because if we know that there is an incident, we might infer whether the tweet also states something about injuries. So we can detect the incident and find out that there are no injuries, or that there are two persons injured. With multi-label learning you can do a lot of other things. Yes?

>>: Take the car crash where the car is on fire. Do you count that as two incidents or one incident?

>> Axel Schulz: No, in this case the classifier decides for one; most likely for the car crash, I would say.

>>: [indiscernible] is that something that might be coming in, like streaming, or is it ...

>> Axel Schulz: Whoa, whoa, whoa, let me come back to it. [indiscernible] We used the search API, because with the search API we get a sufficient sample for the cities. For the training set we collected around 6 million tweets in December 2012.

>>: Just geo samples?

>> Axel Schulz: No, not just geo ...

>>: Sort of like keyword samples, or how did ...

>> Axel Schulz: We did keyword sampling based on incident related keywords, which are [indiscernible], and then we selected another sample set and manually annotated it, which [indiscernible] around, I don't know, around 40,000 tweets.

>>: So the proportion of the [indiscernible] in your datasets is probably higher than [indiscernible].

>> Axel Schulz: Because of the keyword search, yes. [indiscernible]

>>: [indiscernible]

>> Axel Schulz: Okay. These are the results for the simple classifier. The first question that comes to mind, I would say, is: are these results transferable to a different city? Can we use the same classifier for a different city? I prepared these results two weeks ago to show you how our classifier performs. I collected 90,000 tweets in New York City over one hour and applied our classifier that was trained for Seattle. In this one-hour period I could detect only 15 incident related tweets where the probability that an incident is mentioned was higher than 75%. So, 15 tweets in one hour for a city like New York City seems fine, I would say. And here is a proof: one tweet stating there's a fire in apartment 3A and so on, which was sent by, I don't know, maybe an official emergency management organization. It was detected by our classifier that was trained for Seattle. So the classifier performs well on a different city without being adapted to New York City. But it also found this tweet: "AlGoesHard causing fires on Instagram". Yeah, it might be related to a fire incident, but it was wrongly identified as incident related, and this shows us that refinement of the classifier is needed before using it on a different dataset.

This is what we are currently experimenting with: we use the crowd to reclassify tweets that are created in a different city, relabeling these incident related tweets. For this we first developed a web application as a prototype, listing a lot of different incidents, for example a motor vehicle accident on the freeway at a certain place, showing the tweets and the pictures extracted from tweets related to it, and then users can go to the platform and vote whether this is correct or not. Currently I am extending this so that users can also say what the real type of the incident mentioned in the tweet is. We can use this mechanism to refine our classifier: I did this with 693 instances and retrained the model, and the accuracy improved for a training set with 1,700 instances. So retraining is very valuable in our case; combining the power of the algorithms with the wisdom of the crowd should be done. A toy version of this retraining loop is sketched below.
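[Editor's illustration: a toy version of the refinement step, simply retraining on the union of the original labels and the crowd-corrected labels from the new city. All data is invented.]

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

original = [("huge fire downtown", "fire"), ("two cars collided", "crash"),
            ("nice sunny day", "none")]
# Crowd votes on tweets the Seattle model flagged in New York City:
crowd_corrected = [("causing fires on instagram", "none"),   # false positive, relabeled
                   ("fire in apartment 3a", "fire")]         # confirmed true positive

texts, labels = zip(*(original + crowd_corrected))
vectorizer = TfidfVectorizer()
model = LinearSVC().fit(vectorizer.fit_transform(texts), labels)
print(model.predict(vectorizer.transform(["another fire on instagram"])))
```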
So, what we have seen: we can classify incident types with machine learning, a classifier trained for one city can be applied to a different city, and I have shown you how a model can be refined for a different city. The model might also be refined for different incident types, because we currently cover only three very generic incident types.

Finally, I want to show you how to infer new information based on these pre-classified tweets. The first idea is that we aggregate incident related tweets into events based on the spatial, temporal, and thematic dimension. In our case we used a plain, simple rule-based aggregation algorithm that works this way: if two people report the same event type, like a car crash, at roughly the same location, for example within a 50m radius, and within the same temporal extent, for example 30 minutes, we assume it must be the same car crash. It could be that one car crash happened over here and another crash happened right there, but in our case we assume it is the same event. So we can aggregate microblogs and the information around them along these three dimensions into one event, clustering all the information. All the results I will show you next are based on this approach, and a sketch of the rule follows below.
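[Editor's illustration: a toy version of the rule-based aggregation, merging reports of the same incident type within 50m and 30 minutes into one event. Coordinates are treated as planar metres for simplicity; real code would use geodesic distance.]

```python
from datetime import datetime, timedelta
from math import hypot

RADIUS_M = 50.0
WINDOW = timedelta(minutes=30)

def aggregate(reports):
    """reports: list of (incident_type, (x_m, y_m), datetime). Returns event clusters."""
    events = []   # each event is a list of reports
    for report in reports:
        rtype, (x, y), t = report
        for event in events:
            etype, (ex, ey), et = event[0]
            if (rtype == etype and hypot(x - ex, y - ey) <= RADIUS_M
                    and abs(t - et) <= WINDOW):
                event.append(report)
                break
        else:
            events.append([report])
    return events

reports = [
    ("car crash", (0.0, 0.0), datetime(2013, 3, 1, 12, 0)),
    ("car crash", (30.0, 10.0), datetime(2013, 3, 1, 12, 20)),   # same event: merged
    ("fire", (25.0, 5.0), datetime(2013, 3, 1, 12, 5)),          # different type
]
print(len(aggregate(reports)))   # -> 2 events
```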
For the evaluation, we used a tweet dataset collected in March. We applied our classifier to 600,000 unfiltered tweets and could detect 347 incidents. This is also a very good point to explain why we used tweets from Seattle, which is not that obvious: we used the Seattle real-time fire calls, which can be obtained from data.seattle.gov. This is why I like Seattle. Seattle is the only city in the world that publishes incident information on a webpage, where everyone can get it, 15 minutes after the incident occurs, or after the information is in the emergency management system. We collected 141 real-world incidents for the same time period. So now we have the incidents that were detected in the tweets and the incidents that were reported in the real world, and we can compare them to each other.

First we analyzed the user types: which users are reporting about incidents? We differentiated five user groups: emergency management organizations, organizations not related to emergencies, traffic reporters, journalists, and citizens as individual users. In a graph showing the number of users and the number of tweets for every event type and every user group, we found that most of the tweets are shared by official organizations like emergency management organizations or other organizations, which is 56%. But also 33% of all tweets were shared by individuals. Interestingly, more tweets are shared for shootings; this might be because people are somehow more affected by shootings. In our data there was a shooting at a library at the university here in Seattle, and, okay, you haven't heard of this, but lots of people shared information about that shooting.

>>: [indiscernible]

>> Axel Schulz: Yeah. And so we had more tweets around this shooting which happened in the city. We also found out that when individual users contribute, they contribute only one or at most two tweets per incident, which is also very important, as I mentioned in the beginning of the talk. Yes?

>>: About the 30-some percent that individual users shared: do you have a sense of how many of those contain mentions of organizations?

>> Axel Schulz: No.

>>: [indiscernible] some of the tweets [indiscernible] they might be retweeting the emergency organization, even though ...

>> Axel Schulz: Yes, yes, I know what you mean. We haven't checked [indiscernible] official sources, but we checked, and this is the next slide, who is reporting first, which [indiscernible] covers this problem. We found that for 65% of all incidents, organizational users reported first, and only 23% of all incidents were first reported by individual users. This does not mean that this is correct in general, because using the Twitter search API we get a biased sample; these numbers are valid for our dataset, maybe not for the real world. But what is interesting to see in our case is that if individual users report about incidents, then they report much faster: in our case, 24 minutes faster compared to the other user categories. This means that if we could detect tweets posted by individual users, then we would get more recent and, hopefully, more valuable information.

As mentioned before, we also analyzed the correlation between real-world incidents and the incidents detected in our tweet set. Using a manual comparison based on the three dimensions, we found that we could detect 81% of all car accidents that are in the Seattle [indiscernible] set, 68% of the fire incidents, and 75% of the shooting incidents, summing up to 73% of all real-world incidents, which is quite good. So we can say we are detecting three quarters of all the incidents that are in the emergency management system just by using microblogs.

So, finally, I have shown you that there is a variety of individual users reporting about incidents, and these users report faster than official sources. Also, the correlation between the information in tweets and real-world incidents is quite high. So I can say that incident detection is valuable and can contribute to the situational awareness of decision makers.

Summing up my whole talk: I started with the first research question, how to thematically, temporally, and spatially classify user-generated content, and I have shown you how to preprocess and filter microblogs based on the spatial, temporal, and thematic dimensions. I showed you how to identify the relevant information using machine learning techniques and how to refine an existing classifier. And I have shown you how to infer new information, like aggregating single tweets into events, and how to assess the value of microblogs for emergency management using our dataset. So I conclude that small scale incident detection is feasible; we can do this. And hopefully I have also shown you that it is valuable for decision makers. I hope that someone continues this work; if not me, maybe you are interested in this. Or maybe we can collaborate on this; I think it is really valuable and really interesting.
Especially to see whether the things that we learned from Seattle are applicable to different cities. And I have heard different people talking about this; I have talked to decision makers in New York City and they want such an application that makes user-generated content usable. So this research topic is highly interesting and a lot of things can be done. Thank you very much for your attention.

[applause]

Yes?

>>: Do you think you could set up any kind of incentive to get people to tweet more often about incidents like this?

>> Axel Schulz: Um ...

>>: Here's a suggestion for one. Maybe if you publish all the incident tweets in a central location that anyone could come to, I might feel proud that one of mine showed up there.

>> Axel Schulz: Okay, it could be done. I'm not sure if this is possible with the existing social networks. I think it could be done with specialized applications that were developed for incident reporting. The idea is nice and it could be realized, but how do you identify the user that sent a microblog? It may be that the user does not want the rest to know that he is the one. But I can follow your idea, your vision.

>>: I can also read your response.

>>: [indiscernible]

>> Axel Schulz: This is what we are also investigating, to understand the [indiscernible] aspect of these [indiscernible]. We want to find out how to make incident management applications viral: when are people using this kind of application, and when are they sharing incident related microblogs? In most cases the answer is plain altruism of the people. Maybe you can incentivize people to send in valuable information, but it shouldn't be the case that people have to be incentivized.

>>: So, one of the things we talked about centers around how we classify tweets into overlapping events. You made a comment that if you have an event of the same type, at the same location, and at the same time, you just assume it's the same event. And given the time intervals and space intervals involved, you might see all the events in a city and say, okay, this is an accident in Seattle. Have you given any thought to being finer grained than that? Or, alternatively, to doing event detection where events are more common? Say, I don't know, [indiscernible] a police officer or [indiscernible] a red light, and these people are at the same red light or the same bridge crossing. That kind of thing, where it is much more ambiguous which event corresponds to which report, and whether or not your system can be extended to handle that.

>> Axel Schulz: So the question is whether I have looked at this research?

>>: Yeah. Whether you have looked at this problem as an extension to your [indiscernible].

>> Axel Schulz: I think the main problem is the classifier part. Extracting the temporal and the spatial dimensions can be done, but building an automatic classifier for this kind of event type may be very difficult. I don't know; what you want to do is very general. For building such a classifier, you should think of the refinement approach I presented: an interactive classifier that starts with some general types and then becomes more complex the more labels it gets. So it gets new labels, and from a very high level type I can extract more fine-grained types, and through all these iterations this might be possible.
I can imagine such a system. The main problem is the labeling effort: you need a lot of people to relabel things, and I don't know if this is even possible.

>>: So, and this is sort of a larger question: say this system turns out to be really accurate and it gets implemented, and decision makers in Seattle are using it actively in [indiscernible] situations. Have you thought at all about the social consequences of that, with some events or areas being over reported and some under reported? Because in your dataset you could look at this very easily with the events that you didn't identify: do they come from specific areas that have specific demographic characteristics?

>> Axel Schulz: Well, I haven't looked at this, but it is also very interesting. I'm not a social scientist, guys, so these questions are more or less out of my scope. I just investigated some situational features, so the main question remains: are these tweets valuable in the end? We just say these tweets state some incident types and some temporal and spatial mentions about what is written there.

>>: So in your test you didn't look at whether the events that you did capture had specific characteristics or were from specific areas?

>> Axel Schulz: No, not from specific areas. We just looked at the content based on these three dimensions.

>>: [indiscernible]

>> Axel Schulz: [indiscernible] This would be very interesting. The next paper, for next year. Okay.

>>: Are you familiar with Tweak the Tweet?

>> Axel Schulz: Yes.

>>: That sounds like a great example of [indiscernible] how to reclassify tweets through translation.

>> Axel Schulz: Tweak the Tweet. The main problem is that you have to train the crowd to use this special [indiscernible].

>>: [indiscernible]

>> Axel Schulz: Yes, but ...

>>: [indiscernible] But yeah, I was thinking about Tweak the Tweet at the beginning of the presentation ...

>> Axel Schulz: But my idea is: if people would use a special [indiscernible] syntax, then they would also use a specialized mobile application that was designed for the same purpose. And this is much more valuable, because we get a lot of the information that is needed; we can ask the people to provide us more information. I have seen the value of Tweak the Tweet, but as I said, if I have to train the crowd, then I can use much more sophisticated mechanisms, and finding this information without training the crowd is, yeah, a quite different problem. But yes, I am familiar with it.

>>: So you talked about ambiguity around temporal mentions.

>> Axel Schulz: Yeah.

>>: Temporal mentions or spatial mentions, excuse me, spatial mentions. So Paris could be [indiscernible], and it didn't seem like you were handling ambiguity over abbreviations and temporal mentions. For example, EMS could be Emergency Management Services or Eastern Mountain Sports, and that has a very different impact on this particular problem. Also, as you dive into ambiguity, you end up with segmentation problems where you don't know if, say, Eastern Mountain Sports refers to "Eastern Mountain" and "sports", or "Eastern Mountain Sports", or "Eastern" and "Mountain sports".

>> Axel Schulz: Yes.

>>: So I am curious how you decide on the version to use. Do you just take the top abbreviation translation? And if you did get multiple versions, how would you use them?

>> Axel Schulz: Actually, that's the advantage of this approach.
As I said, if [indiscernible] of a text has several spatial meanings, or meanings in the real world, like New York City and NYC [indiscernible] city or whatever else, then we can use the polygon for each alternative.

>>: Right, so this is how you deal with the other types of ambiguity, over topics and over ...

>> Axel Schulz: [indiscernible] the question is not related to the spatial but to ...

>>: How do you do the same kinds of things for topic and time?

>> Axel Schulz: For topics, [indiscernible] we tried this with the common concepts. If for NYC we detect, for example, several concepts, one related to a city and the other one related to some movie or whatever, then we use both concepts for our training problem. We hope that these features, in combination with the other features, help us to differentiate what's valuable and what's not. But you are absolutely right, this is also ...

>>: So you sort of take a [indiscernible] and weigh them based on [indiscernible].

>> Axel Schulz: Yes, but actually I think this is one source of errors, and for the temporal mentions it is not that easy.

>>: Sure. [indiscernible]

>> Axel Schulz: Yes, much more difficult in this case.

>>: [indiscernible] But also it was interesting when you were talking about geo [indiscernible]

>> Axel Schulz: No, we haven't investigated this in our [indiscernible] case, but there is the Syria case: a lot of people are reporting from outside and just retweeting. This is a very important aspect to cope with, but in this case I think different means are necessary. You have to find some reputation model for users; you have to look at which users are more likely from Syria and somehow affected, and which are from outside. We can do a lot of things with this on the user level. Yeah.

>> Andres Monroy-Hernandez: Well, let's thank the speaker, if you will hear me out for a few minutes here.

[applause]