WP3: Real-time Large-scale Data Analysis

This WP aims at contributing towards building an integrated platform for semantic
urban computing by developing tools for large-scale data analysis in real time. The
targeted tools implement low-level analysis algorithms that, together with the
outcomes of WP4 and WP5, will encompass the technology needed to realise the
SERENDIPITI software framework. Specifically, multimodal analysis will be
performed on the output of physical sensors, such as CCTV cameras, combined with
information extracted from online and social communities such as blogs, Twitter,
Flickr and YouTube. According to the use case described earlier, for effective
navigation of a user around a new place it is critical to provide the user with
“real-time” intelligence on varied aspects of an environment, such as traffic and
weather conditions. In addition to interpreting these real-time changes, it is also
critical to provide users with alternative solutions that closely correspond to
“human cognition”. To achieve this ambitious objective, several innovative research
methodologies extending beyond the state of the art will be developed. The general
objectives of this WP are listed below:
• large-scale information discovery from online resources and physical
environments in real time
• real-time multimodal data analysis of varied information sources such as the real
world (e.g. CCTV) and online resources (e.g. blogs, YouTube)
• multimodal data synchronisation of information from the physical and online worlds
• extraction of knowledge regarding an environment (e.g. a city such as London,
Milan or Dublin) by analysing and indexing cartographs
The information provided by this WP will be used in WP4 and WP5. Using the results
of the WP3 analysis algorithms, suitable descriptors and other appropriate description
models will be developed. Potential description schemes will be proposed to the
appropriate standardisation bodies within WP7 “Spreading excellence”, specifically
in activity A7.3.
The WP consists of the following integrative activities and will be coordinated by
QMUL.
A3.1 Distributed intelligent sensing
Leader: QMUL, Participants:
This activity is dedicated to gathering and filtering all information measured by a
dense multimodal sensor network, together with information extracted from online and
social communities such as blogs, Twitter, Flickr and YouTube. The sensor network
covers a wide range of modalities, such as CCTV cameras, simple state/change
indicators taking a binary (on/off) value (e.g. door open/shut in large indoor city
scenarios), measurements of continuous variables (e.g. temperature and noise level)
and, where available, complex sensors from meteorological stations.
Activity A3.1 will deal with the system intelligence in its most elementary form. The
approach is hierarchical as in the overall system. Analysis is first confined to specific
amounts of data, e.g., the data sensed at the moment when the analysis is conducted.
Then the resulting information is combined with information from the past to build a
second hierarchical analysis level. Finally, distributed information from several sites
is put together to infer the output information of the distributed sensing subsystem.
From a technical point of view, the main objective of this activity is to devise and
apply techniques to transform raw data, in the form of multiple time series and online
sources, into meaningful information using a hierarchical approach. For single sensors
this will simply be a binary state variable as a function of time. For more complex
sensors, such as video and audio recorders, this could be a hypothesis identifying a
highlight happening at particular space-time coordinates.
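The three hierarchical levels described above can be sketched as follows. This is an illustrative Python sketch only: the threshold, window length and fusion rule are hypothetical placeholders, not the project's actual algorithms.

```python
# Level 1: reduce a raw sensor time series to a binary state variable.
def to_binary_state(samples, threshold):
    """Map raw measurements to a 0/1 state per time step."""
    return [1 if s > threshold else 0 for s in samples]

# Level 2: combine the current state with information from the past.
def site_alert(states, window=3):
    """Raise a site-level alert when the sensor has been active for a
    full window of consecutive time steps."""
    return [int(sum(states[max(0, t - window + 1):t + 1]) == window)
            for t in range(len(states))]

# Level 3: fuse distributed information from several sites.
def fuse_sites(alerts_per_site):
    """Infer a system-level event if any site raises an alert."""
    return [int(any(step)) for step in zip(*alerts_per_site)]

noise = [30, 32, 71, 75, 78, 40]      # e.g. noise level in dB (illustrative)
states = to_binary_state(noise, 60)    # → [0, 0, 1, 1, 1, 0]
alerts = site_alert(states)            # → [0, 0, 0, 0, 1, 0]
```

Each level discards detail while keeping the information needed by the level above, which is what keeps the distributed subsystem's communication load low.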
A3.2 – Cartograph Analysis of a City
Leader: QMUL, Participants:
This activity focuses on the analysis of city maps and generates an index of
“interesting aspects” of a city. The interesting aspects could vary from a street to a
national monument, and this information will be used as prior knowledge by the
SERENDIPITI platform. The cartograph analysis will exploit the availability of GPS
information for precisely localising the movement of users within the city (within a
few metres; e.g. Google Latitude provides 3 m precision in motion and 0.5 m precision
at rest). This activity consists of the following sub-activities:
• A3.2.1 – Indexing cartograph
• A3.2.2 – Mapping GPS from user to the Cartograph
• A3.2.3 – Real-time tracking of user movement
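The kernel of sub-activity A3.2.2 might look like the following sketch, which snaps a user's GPS fix to the nearest entry of the cartograph index. The point-of-interest names, coordinates and nearest-neighbour rule are illustrative assumptions, not project data.

```python
import math

# Hypothetical cartograph index: "interesting aspect" -> (lat, lon).
POI_INDEX = {
    "Tower Bridge": (51.5055, -0.0754),
    "Trafalgar Square": (51.5080, -0.1281),
}

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def nearest_poi(gps_fix):
    """Map a GPS fix onto the cartograph: return the closest indexed POI."""
    return min(POI_INDEX, key=lambda name: haversine_m(gps_fix, POI_INDEX[name]))

nearest_poi((51.5052, -0.0760))   # a fix a few metres from Tower Bridge
```

With the few-metre GPS precision cited above, a simple nearest-neighbour lookup of this kind is already enough to disambiguate most street-level points of interest.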
A3.3 – Traffic and Crowd Monitoring from Real-Time Multimedia sources
Leader: QMUL, Participants:
In this activity, research methodologies will be developed for detecting “semantic
events”, such as traffic jams and accidents, and for monitoring crowd behaviour.
These tools will enable machines to mimic human-like cognition in predicting the
future consequences of a specific event by continuously monitoring past events.
CCTV cameras present in the city will be used as a primary source of information for
the real-time extraction of semantic events. The activity consists of the following
sub-activities:

• A3.3.1 – Tracking and Tracing
• A3.3.2 – Traffic monitoring for semantic events
• A3.3.3 – Crowd behaviour monitoring
• A3.3.4 – Real-time reasoning following a semantic event
• A3.3.5 – Interpretation of consequences of a semantic event
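As a minimal sketch of semantic-event detection in the spirit of A3.3.2, a “traffic jam” event could be declared when the mean vehicle speed estimated from consecutive CCTV frames stays below a threshold for some time. The speed values, threshold and duration are hypothetical; the actual detectors will be the subject of the research.

```python
def detect_traffic_jam(speeds_kmh, threshold=10.0, min_duration=3):
    """Return the time indices at which a jam event is declared: the
    estimated mean speed has stayed below `threshold` for `min_duration`
    consecutive samples. Continuously monitoring the run length is what
    lets past observations inform the current decision."""
    events, run = [], 0
    for t, v in enumerate(speeds_kmh):
        run = run + 1 if v < threshold else 0
        if run == min_duration:
            events.append(t)
    return events

detect_traffic_jam([40, 35, 8, 7, 6, 5, 30])   # → [4]
```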
A3.4 – Sequential, anytime and on-line learning for real-time semantic analysis
Leader: QMUL, Participants:
This activity deals with algorithmic issues central to machine learning and knowledge
discovery. One objective is to provide a coherent perspective on resource constrained
algorithms that are fundamentally designed to handle limited bandwidth, limited
computing and storage capabilities, limited battery power, and specific real-time
network-communication protocols. In the targeted application scenario the process
generating the data is not strictly stationary. In many cases there is a need to extract
knowledge from a continuous stream of data; examples include Twitter records and
24-hour CCTV streams. Such sources are called data streams. Learning from data
streams is an incremental activity that requires incremental algorithms that take drift
into account.
Important properties of such algorithms are that they incrementally incorporate new
data as it arrives, that they are able to cope with dynamic environments and deal with
changes in the data-generating distribution, and that they process examples in
constant time and memory. The goal of this activity is to adapt sequential learning,
anytime learning, real-time learning and online learning from data streams and
related topics in order to achieve processing in real time.
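The constant-time, constant-memory requirement above can be illustrated with a toy incremental estimator that incorporates each example as it arrives and resets when a simple drift test fires. The drift test here (a sample far from the running mean) is a deliberately crude placeholder for a proper detector such as ADWIN or DDM; all parameter values are hypothetical.

```python
class DriftAwareMean:
    """Incremental mean estimator with a naive drift reset.
    Processes each example in O(1) time and O(1) memory."""

    def __init__(self, tolerance=10.0):
        self.n, self.mean, self.tolerance = 0, 0.0, tolerance

    def update(self, x):
        # Crude drift check: a sample far from the running mean is taken
        # as evidence that the data-generating distribution has changed,
        # so the model forgets the past and restarts.
        if self.n > 0 and abs(x - self.mean) > self.tolerance:
            self.n, self.mean = 0, 0.0
        self.n += 1
        self.mean += (x - self.mean) / self.n   # incremental (Welford-style) mean
        return self.mean

model = DriftAwareMean()
for x in [5, 6, 5, 50, 52]:    # distribution shifts from ~5 to ~50
    model.update(x)            # final mean tracks the new regime (51.0)
```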
The objective is to aid the user by providing additional information extracted from
city events. At the technical level, this activity focuses on developing intelligent tools
for extracting named entities, which could include the name of a person, the name of
a place, a national monument, etc. The tools developed in this activity will exploit
online resources such as Wikipedia, GeoNames and DBpedia to provide the user with
useful (or user-preferred) statistics. In addition to the extraction of named entities,
this activity will also focus on the semantic categorisation of named entities by
tagging them with semantic categories. This knowledge will subsequently be made
available to image analysis tools as prior knowledge to aid real-time image
classification using kernel-based or biologically inspired algorithms. In addition, this
activity will extract a list of “to-be-experienced” aspects of a city and make dynamic
recommendations available when the user is in the vicinity of these places. This
activity will collaborate closely with A3.6 for interfacing with the user.
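The semantic categorisation step can be sketched as a gazetteer lookup. In the project the gazetteer would be built from Wikipedia, GeoNames and DBpedia; the tiny dictionary and category labels below are hypothetical stand-ins for illustration.

```python
# Hypothetical gazetteer mapping named entities to semantic categories.
GAZETTEER = {
    "London": "PLACE",
    "Milan": "PLACE",
    "Big Ben": "MONUMENT",
    "Alan Turing": "PERSON",
}

def categorise(entities):
    """Tag each extracted named entity with a semantic category,
    falling back to UNKNOWN when the gazetteer has no entry."""
    return {e: GAZETTEER.get(e, "UNKNOWN") for e in entities}

categorise(["London", "Big Ben", "Foo"])
# → {"London": "PLACE", "Big Ben": "MONUMENT", "Foo": "UNKNOWN"}
```

The resulting entity-to-category map is exactly the kind of prior knowledge the paragraph above proposes handing to the image analysis tools.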
A3.5 – Real-time coding and streaming
Leader: QMUL, Participants:
Real-time multimedia streaming from different physical and online sources is an
important aspect of the SERENDIPITI platform. The provisioning of multimedia
services over heterogeneous environments such as SERENDIPITI is still a challenge,
especially if the multimedia services have to be consumed anywhere, at any time and
on any type of device. Scalable media coding efficiently provides the adaptation of
multimedia resources to changing usage environments and to profile information
(user and terminal profiles). It provides self-adaptable, information-rich media that
can accommodate streaming and interaction functionalities.
This activity embraces three different areas of work: fully scalable media coding;
streaming of scalable media coding in heterogeneous environments; and Quality of
Experience of multimedia coding. In this context, the objectives of the activity can be
identified as:
• The provision of a comprehensive evaluation report on the use of scalable
media coding and multiple description coding techniques, together with the
state of the art in scalable media coding.
• Selection of the appropriate algorithms and mechanisms for scalable and
adaptive media stream coding.
• Definition of the framework for interworking between the application
(encoding/decoding) modules and the media transport layer, in order to make
use of the adaptive media transport capabilities and flow control capabilities of
the selected transport protocols.
• Implementation of a scalable coding/decoding scheme to be used for real-time
transmission of media streams over congestion-aware and adaptive flow
control protocols.
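The adaptation idea behind scalable coding can be sketched as follows: a scalable stream consists of a base layer plus enhancement layers, and the receiver keeps only the layers its usage environment can sustain. The layer names, bitrates and profile fields below are illustrative assumptions, not a specific codec's parameters.

```python
# Hypothetical scalable stream: (layer name, cumulative bitrate kbit/s, resolution),
# ordered from the base layer upwards.
LAYERS = [
    ("base", 400, "320x240"),
    ("enh1", 1200, "640x480"),
    ("enh2", 4000, "1280x720"),
]

def select_layers(bandwidth_kbps, max_height):
    """Adapt the stream to a usage-environment profile: keep the highest
    set of layers whose cumulative bitrate fits the available bandwidth
    and whose resolution fits the terminal profile."""
    chosen = []
    for name, rate, res in LAYERS:
        if rate <= bandwidth_kbps and int(res.split("x")[1]) <= max_height:
            chosen.append(name)
        else:
            break   # layers are ordered; higher layers depend on lower ones
    return chosen

select_layers(1500, 480)   # mobile terminal on a mid-rate link → ["base", "enh1"]
```

Because adaptation is a layer-dropping decision rather than a re-encode, it can be performed in real time anywhere along the delivery path.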
A3.6 – Real-time Multimodal Data Synchronisation
Leader: QMUL, Participants:
This activity will collaborate closely with the previously mentioned activities in
aligning and synchronising information obtained from multiple modalities and
multiple sources with respect to time and geo-spatial preferences. The outcome of this
activity will (partially) reflect real-time dynamic updates on events such as traffic and
crowd conditions, thereby enabling the suggestion of alternative routes to the same
location according to user preferences or user models.
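The time-based part of this alignment can be sketched minimally: events reported by two modalities (say, a CCTV detector and a Twitter stream) are paired when their timestamps fall within a tolerance. Event formats, labels and the tolerance value are hypothetical.

```python
def align(events_a, events_b, tolerance_s=60):
    """Pair (timestamp, label) events from two sources whose timestamps
    differ by at most `tolerance_s` seconds; a geo-spatial constraint
    would be applied analogously on coordinates."""
    pairs = []
    for ta, la in events_a:
        for tb, lb in events_b:
            if abs(ta - tb) <= tolerance_s:
                pairs.append((la, lb))
    return pairs

cctv = [(1000, "crowd_surge")]
tweets = [(1030, "concert exit"), (5000, "unrelated")]
align(cctv, tweets)   # → [("crowd_surge", "concert exit")]
```

The quadratic pairing above is only for clarity; with sorted streams the same alignment runs in a single linear merge pass.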
WP3: Real-time Large-scale Data Analysis
Workpackage number: 3
Workpackage title: Real-time Large-scale Data Analysis
Start date or starting event: M1
Activity type1: RTD
Participant number:
Participant short name:
Person-months per participant:
Objectives
WP3 aims at contributing towards building an integrated platform for semantic urban computing by developing
tools for large-scale data analysis in real-time. The targeted tools implement low-level analysis algorithms that
together with the outcomes of WP4 and WP5 will encompass the technology needed to realise the
SERENDIPITI software framework.
Description of work (broken down in different tasks + role of Partners)
To meet the above objectives, the following activities have been identified in the context of WP3:
A3.1 Distributed intelligent sensing
This activity is dedicated to gathering and filtering all information measured by a dense multimodal sensor
network, together with information extracted from online and social communities such as blogs, Twitter, Flickr
and YouTube. The sensor network covers a wide range of modalities, such as CCTV cameras, simple state/change
indicators taking a binary (on/off) value (e.g. door open/shut in large indoor city scenarios), measurements of
continuous variables (e.g. temperature and noise level) and, where available, complex sensors from
meteorological stations.
A3.2 – Cartograph Analysis of a City
This activity focuses on the analysis of city maps and generates an index of “interesting aspects” of a city.
• A3.2.1 – Indexing cartograph
• A3.2.2 – Mapping GPS from user to the Cartograph
• A3.2.3 – Real-time tracking of user movement
A3.3 – Traffic and Crowd Monitoring from Real-Time Multimedia sources
• A3.3.1 – Tracking and Tracing
• A3.3.2 – Traffic monitoring for semantic events
• A3.3.3 – Crowd behaviour monitoring
• A3.3.4 – Real-time reasoning following a semantic event
• A3.3.5 – Interpretation of consequences of a semantic event

1 Please indicate one activity per work package:
RTD = Research and technological development (including any activities to prepare for the dissemination and/or
exploitation of project results, and coordination activities); DEM = Demonstration; MGT = Management of the
consortium; OTHER = Other specific activities, if applicable in this call.
A3.4 – Sequential, anytime and on-line learning for Real-time semantic analysis
The main objective of this activity is to provide a coherent perspective on resource-constrained algorithms that
are fundamentally designed to handle limited bandwidth, limited computing and storage capabilities, limited
battery power and specific real-time network-communication protocols. In the targeted application scenario the
process generating the data is not strictly stationary. In this activity, research will focus on developing
intelligent tools for extracting named entities, which could include the name of a person, the name of a place, a
national monument, etc. The tools developed in this activity will exploit online resources such as Wikipedia,
GeoNames and DBpedia to provide the user with useful (or user-preferred) statistics.
A3.5 – Real-time coding and streaming
This activity embraces three different areas of work:
• Fully scalable media coding
• Real-time streaming of scalable media coding
• Quality of Experience of multimedia coding
A3.6 – Real-time Multimodal Data Synchronisation
This activity will collaborate closely with the previously mentioned activities in aligning and synchronising
information obtained from multiple modalities and multiple sources with respect to time and geo-spatial
preferences. The outcome of this activity will (partially) reflect real-time dynamic updates on events such as
traffic and crowd conditions, thereby enabling the suggestion of alternative routes to the same location according
to user preferences or user models.
Deliverables (brief description) + month of delivery
D3.1 – Real-time analysis algorithms for Cross-Media Analysis and Annotation (M18)
D3.2 – State-of-the-art report on current multimodal techniques (M12)
D3.3 – Evaluation of real-time scalable and multiple description coding in heterogeneous environment (M24)
D3.4 – Report on multimodal data synchronisation (M30)
Milestones (brief description) + month of delivery
MS3.1 – Initial report on traffic and crowd monitoring in real-time multimedia analysis (M12)
MS3.2 – Real-time knowledge extraction by analysing and indexing cartographs (M16)
MS3.3 – Prototype of scalable multimedia content streaming framework (M30)