Title: Adaptive Feature Selection and Statistical Modeling for Content-based Video Indexing and Retrieval (NEW)

Supervisors: Dr Gao Sheng and Dr Sun Qibin, Institute for Infocomm Research (I2R)

Description:

Retrieving text from the Web is relatively easy with search engines such as Google and Yahoo. Mining video, audio, and images, however, remains an unsolved problem. To make video retrieval as easy as text retrieval, the first step is to analyze the video content (i.e., visual, audio, and text) and then build a model to represent it. This project has two tasks.

1. Adaptive feature selection using machine learning

In TRECVID, one task is to detect semantic features, such as indoor/outdoor, face, and people, in a video sequence. An intuitive way to improve detection performance is to fuse several sources of information. Although much work has been done, most of it is heuristic. Lessons from computer vision, image retrieval, and video analysis show that no single feature works well in all situations. For example, sometimes a color histogram works well, and sometimes texture or motion works better. In this task we will explore and develop machine learning techniques that adaptively select the features most suitable for detecting a specific semantic concept.
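To make the idea of per-concept feature selection concrete, here is a minimal sketch, not the method the project will develop. It assumes each shot is described by several feature groups (for example color, texture, motion) and that binary labels exist for each concept. The function names and the use of a Fisher score as the ranking criterion are illustrative assumptions.

```python
import numpy as np


def fisher_score(x, y):
    """Mean Fisher score over the dimensions of one feature group.

    x: (n_samples, n_dims) feature matrix for one group (hypothetical input).
    y: (n_samples,) binary labels, 1 = concept present, 0 = absent.
    """
    pos, neg = x[y == 1], x[y == 0]
    # Squared distance between class means, per dimension.
    between = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    # Sum of within-class variances; small constant avoids division by zero.
    within = pos.var(axis=0) + neg.var(axis=0) + 1e-12
    return float(np.mean(between / within))


def select_group_per_concept(groups, labels):
    """Pick, for every concept, the feature group with the highest score.

    groups: dict mapping group name -> (n_samples, n_dims) array.
    labels: dict mapping concept name -> (n_samples,) binary array.
    Returns a dict mapping concept name -> selected group name.
    """
    return {
        concept: max(groups, key=lambda g: fisher_score(groups[g], y))
        for concept, y in labels.items()
    }
```

Calling `select_group_per_concept` with a dictionary of feature groups and a dictionary of concept labels returns one group name per concept, so different concepts can end up with different features. A learned selection scheme would replace the fixed Fisher criterion with something trained on the data.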

2. Learning a statistical model to represent the relations among the concepts

If these semantic concepts are treated as metadata, as in XML, the next step is to model their spatial and temporal relations so that higher-level concepts (e.g., events) can be derived and inferred from the video sequence. Some work has been done in this direction, but much remains to be done for realistic data such as TRECVID. Examples include developing temporal models and using sequential decision theory to improve semantic detection by applying the temporal constraints of the video sequence.
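As one illustration of how a temporal constraint can improve per-shot detection, the sketch below smooths a sequence of detector scores with a two-state hidden Markov model decoded by the Viterbi algorithm. This is an assumed example rather than the model the project will build: the function name, the `stay` parameter, and the use of detector probabilities as emission scores are all simplifying assumptions.

```python
import numpy as np


def viterbi_smooth(probs, stay=0.9):
    """Decode the most likely present/absent sequence for one concept.

    probs: per-shot detector outputs, read as P(concept present), length >= 1.
    stay:  assumed probability that the concept state does not change
           between consecutive shots (the temporal constraint).
    Returns an integer array of 0/1 decisions, one per shot.
    """
    probs = np.clip(np.asarray(probs, dtype=float), 1e-6, 1 - 1e-6)
    # Emission log-scores for state 0 (absent) and state 1 (present).
    # Using detector probabilities here is an approximation.
    emit = np.log(np.stack([1 - probs, probs], axis=1))
    trans = np.log(np.array([[stay, 1 - stay], [1 - stay, stay]]))

    n_shots = len(probs)
    score = emit[0] + np.log(0.5)  # uniform prior over the first state
    back = np.zeros((n_shots, 2), dtype=int)
    for t in range(1, n_shots):
        cand = score[:, None] + trans  # cand[i, j]: move from state i to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + emit[t]

    # Trace the best path backwards from the best final state.
    path = np.zeros(n_shots, dtype=int)
    path[-1] = int(score.argmax())
    for t in range(n_shots - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With `stay` close to 1, an isolated low score between confident detections is overridden by its neighbors, whereas thresholding each shot independently would flip the decision. Sequential decision methods and richer temporal models go well beyond this two-state case.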

The two tasks in this project cover a broad range of knowledge, including video analysis, image analysis, computer vision, audio and text analysis, machine learning, and pattern recognition.
