A Discriminative Model of Motion and Cross Ratio for
View-Invariant Action Recognition
ABSTRACT:
Action recognition is important for many applications such as video
surveillance and human–computer interaction, and view-invariant action
recognition remains an active and difficult problem in this field. In this paper, a
new discriminative model is proposed for video-based view-invariant action
recognition. In this model, motion patterns and view invariants are fused to
combine invariance with distinctiveness. We address a series of issues, including
interest-point detection in image sequences, motion feature extraction and
description, and view-invariant computation. First, motion detection is used to
extract motion information from videos, which is much more efficient than
traditional methods based on background modeling and tracking. Second, for
feature representation, we extract a variety of statistical information from motion,
together with a view-invariant feature based on the cross ratio. Last, in action
modeling, we apply a discriminative probabilistic model, the hidden conditional
random field, to model motion patterns and view invariants, so that the statistics
of motion and the projective invariance of the cross ratio are fused in one
framework. Experimental results demonstrate that our method improves the
ability to distinguish different categories of actions and is highly robust to view
change in real circumstances.
EXISTING SYSTEM:
Human action recognition in video sequences is one of the most important and
challenging problems in computer vision; it aims to build a mapping between
dynamic image information and semantic understanding. While the analysis of
human action tries to discover the underlying patterns of human motion in image
data, it is also very useful in many real applications such as intelligent video
surveillance, content-based image retrieval, and event detection.
Human action recognition involves a series of problems, including image data
acquisition, robust feature extraction and representation, training classifiers with
high discriminative capability, and other problems that arise when a practical
system is deployed.
DISADVANTAGES OF EXISTING SYSTEM:
In previous work on view-invariant action recognition, it is difficult to obtain
accurate trajectories of moving objects because of noise caused by self-occlusions.
Appearance-based methods such as the scale-invariant feature transform (SIFT)
are not well suited to motion analysis, since appearance cues such as color, gray
level, and texture are not stable across neighboring frames of dynamic scenes.
Existing background modeling methods, such as the Gaussian mixture model, are
not effective enough for accurate human behavior analysis. For example, some
traditional methods do not work well in the presence of shadows, lighting
changes, and, particularly, view changes in real scenes.
PROPOSED SYSTEM:
A compact framework is proposed for view-invariant action recognition from the
perspective of motion information in image sequences, in which motion patterns
and view invariants are encapsulated in one model, resulting in a complementary
fusion of two aspects of human actions.
In this paper, we describe our method in three phases. In the first stage, we
introduce a motion detection method to detect interest points in the space and
time domain, which facilitates both optical flow extraction in an expected local
area and computation of the view invariant, the cross ratio, from those detected
interest points. In the second stage, the oriented optical flow feature is
represented by oriented histogram projection to capture statistical motion
information. In the third stage, the optical flow and view-invariant features are
fused together in a discriminative model.
ADVANTAGES OF PROPOSED SYSTEM:
The proposed framework of action recognition has shown good integration and
achieved expected performance on some challenging databases.
MODULES:
1. Optical flow in local area
2. View Invariants extraction
3. Optical flow feature expression
4. Discriminative model
5. Action Recognition
MODULE DESCRIPTION:
Optical flow in local area
Since appearance-based features such as Harris corners, the histogram of oriented
gradients (HOG), SIFT, Gabor, and shape depend heavily on the stability of
image processing, they fail to accurately recognize different kinds of actions
because of the non-rigid nature of the human body and other disturbances in real
applications. Therefore, in our method, after detecting interest points in videos,
we extract motion features from the neighboring area around the interest points
and build a representation of the statistical properties of the local image area.
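The idea of measuring motion statistics only in a small patch around each interest point can be sketched as follows. This is a minimal stand-in using simple temporal differencing; the function name, patch size, and the choice of mean absolute difference as the statistic are illustrative assumptions, not the paper's exact descriptor.

```python
import numpy as np

def local_motion_feature(prev_frame, curr_frame, point, half=4):
    """Temporal-difference energy in a (2*half+1)^2 patch around an
    interest point (a proxy for local motion strength)."""
    y, x = point
    h, w = curr_frame.shape
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    diff = (curr_frame[y0:y1, x0:x1].astype(float)
            - prev_frame[y0:y1, x0:x1].astype(float))
    return np.abs(diff).mean()

# Example: a bright blob moving one pixel to the right
prev = np.zeros((32, 32)); prev[10:14, 10:14] = 255.0
curr = np.zeros((32, 32)); curr[10:14, 11:15] = 255.0
energy_at_blob = local_motion_feature(prev, curr, (12, 12))
energy_elsewhere = local_motion_feature(prev, curr, (28, 28))
print(energy_at_blob > energy_elsewhere)  # motion energy peaks at the blob
```

Restricting the computation to patches around detected interest points is what makes this cheaper than full-frame background modeling.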
View Invariants extraction
Geometric invariants capture invariant information about a geometric
configuration under a certain class of transformations; group theory gives us a
theoretical foundation for constructing invariants [34]. Since they can be
measured directly from images without knowing the orientation and position of
the camera, they have been widely used in object recognition to tackle the
problem of projective distortion caused by viewpoint variations.
In view-invariant action recognition, traditional model-based methods evaluate
the fit between image points and predefined 3-D models. However, it is difficult
to detect qualified image points that satisfy the specific geometric configuration
required to obtain the desired invariants.
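The cross ratio of four collinear points is the classic projective invariant the paper relies on. The sketch below uses one common parameterization, (AC/BC)/(AD/BD), and checks numerically that the value survives an arbitrary homography (the specific matrix H is an assumption for illustration):

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross ratio (AC/BC)/(AD/BD) of four collinear 2-D points.
    Invariant under projective transformations."""
    dist = lambda p, q: np.hypot(*(np.asarray(p, float) - np.asarray(q, float)))
    return (dist(a, c) / dist(b, c)) / (dist(a, d) / dist(b, d))

# Four collinear points and the same points under a projective map
pts = [(0, 0), (1, 1), (2, 2), (4, 4)]
H = np.array([[2.0, 0.3, 1.0],
              [0.1, 1.5, -2.0],
              [0.01, 0.02, 1.0]])  # an arbitrary homography (assumed)

def apply_h(p):
    v = H @ np.array([p[0], p[1], 1.0])
    return (v[0] / v[2], v[1] / v[2])

cr1 = cross_ratio(*pts)
cr2 = cross_ratio(*[apply_h(p) for p in pts])
print(abs(cr1 - cr2) < 1e-6)  # True: the cross ratio survives the homography
```

Because the value is unchanged by any homography, it can be computed directly from image points without knowing the camera pose, which is exactly the property exploited for view invariance.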
Optical flow feature expression
Optical flow takes the form of a 2-D vector representing image pixel velocity in
the x- and y-directions. The beginning and ending points of the optical flow
vector correspond to the displacement of image pixels. There are mainly two
kinds of methods for extracting optical flow from images. The first is the
feature-based method, which computes matching scores of feature points between
neighboring frames and takes the displacements of the matched points as the start
and end points of the optical flow vectors. However, due to the instability of
image edges and the large displacement of the moving human body, the
calculated optical flow can hardly exhibit the real movement of the human body.
The second is the gradient-based method, which is widely used in computer
vision tasks. Gradient-based methods assume that the gray level in a local area of
the image is relatively stable between adjacent frames. More importantly, by
calculating the image gradient and optimizing a cost function iteratively, we can
obtain a dense optical flow field.
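Once a dense flow field is available, the "oriented histogram projection" step amounts to binning flow directions weighted by magnitude. The sketch below shows one plausible reading of that step; the bin count and magnitude weighting are assumptions, and the flow field itself would come from a gradient-based estimator in practice.

```python
import numpy as np

def oriented_flow_histogram(flow, bins=8):
    """Project a dense flow field (H, W, 2) onto an orientation histogram,
    weighting each pixel by its flow magnitude, then normalize."""
    u, v = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(u, v)
    ang = np.mod(np.arctan2(v, u), 2 * np.pi)  # angles in [0, 2*pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    total = hist.sum()
    return hist / total if total > 0 else hist

# A field moving uniformly to the right concentrates in the first bin
flow = np.zeros((4, 4, 2)); flow[..., 0] = 1.0
h = oriented_flow_histogram(flow)
print(np.argmax(h))  # 0
```

The normalized histogram is a compact per-frame statistic of motion direction, which is what the discriminative model consumes downstream.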
Discriminative model
In this module, we extract many key points that are informative for recognition
despite the presence of noise. Since many points are detected, a non-maxima
suppression method is used to select the relatively stable interest points as a
representation of the current frame, which gives much better performance,
particularly for periodic motion patterns.
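Non-maxima suppression over a detector response map can be sketched as below. This is a generic NMS routine, not the paper's exact procedure; the neighborhood radius and threshold are illustrative assumptions.

```python
import numpy as np

def nonmax_suppress(response, radius=1, thresh=0.0):
    """Keep only points whose response is the unique maximum within a
    (2*radius+1)^2 neighborhood and above a threshold."""
    h, w = response.shape
    keep = []
    for y in range(h):
        for x in range(w):
            r = response[y, x]
            if r <= thresh:
                continue
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch = response[y0:y1, x0:x1]
            if r >= patch.max() and (patch == r).sum() == 1:
                keep.append((y, x))
    return keep

resp = np.zeros((5, 5)); resp[2, 2] = 3.0; resp[2, 3] = 1.0
print(nonmax_suppress(resp))  # [(2, 2)]: the weaker neighbor is suppressed
```

Thinning the detections this way keeps only the locally strongest responses, which stabilizes the frame representation across a periodic motion cycle.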
Action Recognition
Up to now, we have obtained the motion feature description and view invariants
of the interest points. The remaining problem is how to model the temporal
information in the sequential data. Unlike object classification in static images,
action recognition must take into account the temporal dependence and pace of
an action.
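Before temporal modeling, the two feature streams must be fused into one observation vector per frame. The sketch below shows one illustrative fusion (concatenating the motion histogram with simple statistics of the frame's cross-ratio values); the paper feeds such per-frame vectors to a hidden conditional random field, which is not reproduced here.

```python
import numpy as np

def fuse_frame_features(motion_hist, cross_ratios):
    """Concatenate a frame's motion histogram with mean/std of its
    cross-ratio invariants into one observation vector (illustrative)."""
    cr = np.asarray(cross_ratios, float)
    cr_stats = np.array([cr.mean(), cr.std()]) if cr.size else np.zeros(2)
    return np.concatenate([np.asarray(motion_hist, float), cr_stats])

# A toy sequence: one fused observation vector per frame
seq = [fuse_frame_features([0.5, 0.5], [1.4, 1.6]),
       fuse_frame_features([1.0, 0.0], [1.5])]
print([v.shape for v in seq])  # [(4,), (4,)]
```

Any sequence model that accepts per-frame real-valued observations could then capture the temporal dependence and pace of the action from such a sequence.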
HARDWARE REQUIREMENTS:
• SYSTEM : Pentium IV 2.4 GHz
• HARD DISK : 40 GB
• FLOPPY DRIVE : 1.44 MB
• MONITOR : 15 VGA colour
• MOUSE : Logitech
• RAM : 256 MB
• KEYBOARD : 110 keys enhanced
SOFTWARE REQUIREMENTS:
• Operating system : Windows XP Professional
• Front End : Microsoft Visual Studio .NET 2008
• Coding Language : C#.NET 2008
REFERENCE:
Kaiqi Huang, Yeying Zhang, and Tieniu Tan, "A Discriminative Model of Motion
and Cross Ratio for View-Invariant Action Recognition," IEEE Transactions on
Image Processing, vol. 21, no. 4, April 2012.