What’s Happening Now? Entertainment

Author: Matthew Muscat
Supervisor: Dr. Joel Azzopardi
Co-Supervisor: Mr. Charlie Abela
Abstract
Every day, people exchange data back and forth over the internet, describing their ongoing
activities, their plans for the weekend and anything else that comes to mind. Most data on the
internet consists of unstructured text containing links to more structured information. To our
knowledge, there is no automatic solution that identifies and extracts entertainment events from
multiple sources. Hence, the problem is to find a way to use this data to extract useful
information such as events (Orlando et al. 2013).
In this research, we propose a solution for the automated detection and extraction of entertainment
events, and the representation of their details in a structured format. The proposed method is
divided into a number of steps. Firstly, through supervised event classification, the system
determines whether each item retrieved from various RSS feeds (e.g. local press news sites)
describes an entertainment event or not. Secondly, the system annotates the documents classified as
entertainment events using different named-entity recognisers (NERs) included in GATE pipelines, in
order to extract named entities (such as Event Date, Location, Participants and Organisations
Involved). Moreover, we eliminate ambiguous dates and resolve temporal expressions in the extracted
event details. Furthermore, this event data is compared to previously extracted events and
information aggregation is performed. Information aggregation is the coalescing of event data from
multiple news reports that are detected as referring to the same event.
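The temporal resolution and aggregation steps can be illustrated with a minimal Python sketch. It is not the system's implementation: the `Event` fields, the helper names `resolve_date`, `same_event` and `aggregate`, the small set of supported expressions, and the matching rule (same resolved date and same location) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_date(expression: str, published: date) -> Optional[date]:
    """Resolve a simple relative expression against the report's publication date."""
    text = expression.strip().lower()
    if text == "today":
        return published
    if text == "tomorrow":
        return published + timedelta(days=1)
    if text in WEEKDAYS:
        # Assumption: a bare weekday means its first occurrence strictly
        # after the publication date.
        delta = (WEEKDAYS.index(text) - published.weekday() - 1) % 7 + 1
        return published + timedelta(days=delta)
    return None  # unresolved; a fuller temporal tagger would be needed


@dataclass
class Event:
    title: str
    when: date
    location: str
    participants: set = field(default_factory=set)
    sources: set = field(default_factory=set)


def same_event(a: Event, b: Event) -> bool:
    # Hypothetical matching rule: identical date and case-insensitive location.
    return a.when == b.when and a.location.casefold() == b.location.casefold()


def aggregate(events):
    """Coalesce reports that refer to the same event into one record."""
    merged = []
    for ev in events:
        for known in merged:
            if same_event(known, ev):
                known.participants |= ev.participants
                known.sources |= ev.sources
                break
        else:
            # No match found: start a new record, copying the sets so the
            # input events are not mutated by later merges.
            merged.append(Event(ev.title, ev.when, ev.location,
                                set(ev.participants), set(ev.sources)))
    return merged
```

Under these assumptions, two reports from different feeds that share a date and a location collapse into one record that carries the union of their participants and sources.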
Finally, these event details need to be represented. For extensibility, the system uses the RDF
model to represent these events semantically. We also showcase an RDF API that allows others to
perform free-text searches on the entertainment events extracted by the system. In addition, we
store these entertainment events as database tuples so that the information can be used by the web
interface (front-end UI). This web interface provides a way to search for entertainment events by
their details using our simple RESTful API.
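As an illustration of what representing an event in RDF can look like, the sketch below serialises one event as N-Triples using only the Python standard library. The abstract does not state which vocabulary or serialisation the system uses, so the schema.org terms, the `http://example.org/event/` namespace and the function names `literal` and `event_to_ntriples` are assumptions.

```python
from datetime import date

SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
BASE = "http://example.org/event/"  # hypothetical namespace for event IRIs


def literal(value: str, datatype: str = "") -> str:
    """Format an N-Triples literal, escaping backslashes, quotes and newlines."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


def event_to_ntriples(event_id: str, name: str, start: date, location: str) -> str:
    """Return one event as N-Triples lines (event_id is assumed to be IRI-safe)."""
    subject = f"<{BASE}{event_id}>"
    triples = [
        (RDF_TYPE, f"<{SCHEMA}Event>"),
        (SCHEMA + "name", literal(name)),
        (SCHEMA + "startDate", literal(start.isoformat(), XSD_DATE)),
        (SCHEMA + "location", literal(location)),
    ]
    return "\n".join(f"{subject} <{pred}> {obj} ." for pred, obj in triples)


print(event_to_ntriples("42", "Jazz Night", date(2014, 5, 17), "Valletta"))
```

Each event becomes a small set of subject-predicate-object statements, so further properties (participants, organisations involved) could be added as extra triples without changing the existing ones.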
In this prototype, we mainly focused on the detection and extraction of entertainment events in
Malta. Nevertheless, such a system can be extended to retrieve and extract entertainment events
for other countries. The results of our evaluation were very promising.