View-invariant action recognition based on Artificial Neural Networks

ABSTRACT:
In this paper, a novel view-invariant action recognition method based on neural
network representation and recognition is proposed. The novel representation of
action videos is based on learning spatially related human body posture prototypes
using Self-Organizing Maps (SOMs). Fuzzy distances from the human body posture
prototypes are used to produce a time-invariant action representation. Multilayer
perceptrons are used for action classification. The algorithm is trained using data
from a multi-camera setup, and an arbitrary number of cameras can be used to
recognize actions within a Bayesian framework. The proposed method can also be
applied, without modification, to videos depicting interactions between humans.
The use of information captured from different viewing angles leads to high
classification performance. The proposed method is the first to be tested in such
challenging experimental setups, which demonstrates its ability to deal with most
of the open issues in action recognition.
EXISTING SYSTEM:
The recognition of human activities has a wide range of promising applications,
such as smart surveillance, perceptual interfaces, and the interpretation of sport
events. Although there has been much work on human motion analysis over the past
two decades, activity understanding remains challenging. In terms of higher-level
analysis, previous studies generally fall into two major categories of approaches;
the first characterizes the spatiotemporal distribution generated by the motion in
its continuum.
DISADVANTAGES OF EXISTING SYSTEM:
Action recognition methods suffer from many drawbacks in practice, which
include
(1) the inability to cope with incremental recognition problems;
(2) the requirement of an intensive training stage to obtain good performance;
(3) the inability to recognize multiple simultaneous actions; and
(4) the difficulty of performing recognition frame by frame.
PROPOSED SYSTEM:
In this paper, a novel view-invariant action recognition method based on neural
network representation and recognition is proposed.
The main contributions of this paper are: a) the use of Self-Organizing Maps
(SOMs) to identify the basic posture prototypes of all actions; b) the use of
cumulative fuzzy distances from the SOM to achieve time-invariant action
representations; c) the use of a Bayesian framework to combine the recognition
results produced for each camera; and d) the solution of the camera viewing angle
identification problem using combined neural networks. A minimal sketch of this
pipeline follows.
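To make these contributions concrete, the following is a minimal sketch in Python
(an illustration only; the project build itself specifies C# .NET, and all grid
sizes, parameter values, and helper names here are assumptions for exposition, not
the authors' implementation): a small Self-Organizing Map learns posture
prototypes, cumulative fuzzy memberships yield a fixed-length, duration-invariant
action vector, a multilayer perceptron classifies it, and per-camera class
posteriors are fused with a product (naive Bayes) rule.

```python
# Minimal sketch of the pipeline (illustrative shapes and parameters).
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_som(postures, grid=(5, 5), iters=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Learn posture prototypes with a small SOM.
    `postures` is (n_frames, d): vectorized binary silhouettes."""
    rng = np.random.default_rng(seed)
    units = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])])
    W = postures[rng.integers(0, len(postures), len(units))].astype(float)
    for t in range(iters):
        x = postures[rng.integers(len(postures))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))       # best-matching unit
        lr = lr0 * np.exp(-t / iters)                     # decaying learning rate
        sigma = sigma0 * np.exp(-t / iters)               # shrinking neighborhood
        h = np.exp(-((units - units[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
        W += lr * h[:, None] * (x - W)                    # pull neighbors toward x
    return W

def fuzzy_action_vector(frames, prototypes, m=2.0):
    """Cumulative fuzzy memberships to the prototypes give a fixed-length,
    duration-invariant representation of one action video."""
    d = np.linalg.norm(frames[:, None, :] - prototypes[None, :, :], axis=2) + 1e-9
    u = 1.0 / d ** (2.0 / (m - 1.0))                      # fuzzy-c-means style
    u /= u.sum(axis=1, keepdims=True)                     # memberships per frame
    v = u.sum(axis=0)
    return v / v.sum()                                    # normalize over time

def fuse_cameras(posteriors):
    """Combine per-camera class posteriors with a product (naive Bayes) rule."""
    p = np.prod(np.stack(posteriors), axis=0)
    return p / p.sum()

# Usage sketch (hypothetical data): one fuzzy vector per training video feeds
# an MLP; at test time the per-camera posteriors are fused and argmax'ed.
#   clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000)
#   clf.fit(X_train, y_train)              # X_train: (n_videos, n_units)
#   p = fuse_cameras([clf.predict_proba(v[None, :])[0] for v in cam_vectors])
#   action = p.argmax()
```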
ADVANTAGES OF PROPOSED SYSTEM:
The goal is to establish an effective action recognition method based on the
analysis of spatiotemporal silhouettes measured during the activities. The
underlying idea is that spatiotemporal variations of human silhouettes encode not
only spatial information about body poses at certain instants, but also dynamic
information about global body motion and the motions of local body parts. It
appears feasible to use features obtained from space-time shapes to explore the
action properties. In contrast to feature tracking, extracting space-time shapes
is also easier to implement with current vision technologies, especially in the
case of stationary cameras. The proposed method has several desirable properties:
a) it is easy to comprehend and implement, without requiring explicit feature
tracking or complex probabilistic modeling of motion patterns; b) being based on
binary silhouette analysis, it naturally avoids problems arising in most previous
methods, e.g., unreliable 2-D or 3-D tracking and expensive, sensitive optical
flow computation; and c) it obtains good results on a large and challenging
database and exhibits considerable robustness.
MODULES:
o Pre-processing Module
o Segmentation
o Action Recognition Module
o Action Detection Module
o Output Module
MODULE DESCRIPTION:
Pre-Processing Module:
This is the first module. It converts the input video into images: frames are
extracted from the video and then enhanced (e.g., noise removal).
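A minimal sketch of this step using OpenCV in Python (the file name and blur
kernel size are illustrative assumptions):

```python
# Extract frames from the input video and apply simple noise removal.
import cv2

def extract_frames(video_path="input.avi"):
    frames = []
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:                                   # end of video
            break
        frame = cv2.GaussianBlur(frame, (5, 5), 0)   # basic noise removal
        frames.append(frame)
    cap.release()
    return frames
```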
Segmentation:
Edges are detected in the extracted frames. Each frame is first converted to
grayscale, and edges are then detected using the Canny edge detection algorithm.
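A minimal sketch of the segmentation step (the Canny thresholds are illustrative
assumptions):

```python
# Convert a frame to grayscale and detect edges with the Canny algorithm.
import cv2

def segment(frame, low=100, high=200):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # grayscale conversion
    return cv2.Canny(gray, low, high)                # binary edge map
```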
Action Recognition:
The detected edge maps are stored in files. This module identifies the human
action using these stored files.
Action Detection:
The input video is compared against the stored files. This module performs the
main processing of the project.
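A minimal sketch of this store-and-compare logic across the two modules (the
in-memory "file" dictionary and the mean-edge-image difference score are
illustrative assumptions; in practice the edge maps would be written to disk,
e.g. with np.save):

```python
# Store edge maps per action, then score an input video against them.
import numpy as np

def store_action(name, edge_maps, db):
    db[name] = np.stack(edge_maps).astype(float)          # (n_frames, h, w)

def detect_action(input_edges, db):
    x = np.stack(input_edges).astype(float).mean(axis=0)  # mean edge image
    scores = {name: np.abs(ref.mean(axis=0) - x).sum()    # pixelwise difference
              for name, ref in db.items()}
    return min(scores, key=scores.get)                    # best-matching action
```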
Output:
The output is reported using the action files stored by the Action Recognition
module.
Input Design:
* Action images as input.
* Test walking sequences.
For this experiment, we use 10 walking sequences captured in various scenarios in
front of non-uniform backgrounds. Each reference action type is matched against
these walking sequences to find the best-matching action type.
LPP Calculation:
Locality Preserving Projections (LPP) implicitly emphasize the natural clusters
in the data, a property that is ideal for classification tasks.
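A minimal sketch of an LPP projection in Python (the neighborhood size k, the
heat-kernel width t, and the regularization term are illustrative assumptions):

```python
# Locality Preserving Projections: build a k-NN graph with heat-kernel weights,
# then solve the generalized eigenproblem X' L X a = lambda X' D X a.
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, k=5, t=1.0):
    """X: (n_samples, d). Returns a (d, n_components) projection matrix."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    W = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]              # k nearest neighbors
    for i in range(n):
        W[i, idx[i]] = np.exp(-d2[i, idx[i]] / t)         # heat-kernel weights
    W = np.maximum(W, W.T)                                # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                             # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])           # regularized for eigh
    vals, vecs = eigh(A, B)                               # ascending eigenvalues
    return vecs[:, :n_components]                         # smallest -> projection
```

Data are projected as Y = X @ P; the eigenvectors with the smallest eigenvalues
best preserve local neighborhoods, which is what emphasizes the natural clusters.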
Dimension Reduction:
This step examines the relationship between the number of reduced dimensions and
the recognition rates.
Activity Classification
The performance of our method with respect to challenging factors such as
viewpoints, different clothes and motion styles. Action recognition can be solved
through measuring motion similarities between the reference motion patterns and
test samples in the low-dimensional embedding space.
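A minimal sketch of matching in the embedding space (the mean-embedding distance
used here is one simple illustrative similarity; other sequence distances could
replace it):

```python
# Assign the label of the nearest reference motion pattern in the
# low-dimensional embedding space; P is an LPP projection (see above).
import numpy as np

def classify(test_seq, ref_seqs, labels, P):
    z = (test_seq @ P).mean(axis=0)            # embed and average the test clip
    dists = [np.linalg.norm(z - (r @ P).mean(axis=0)) for r in ref_seqs]
    return labels[int(np.argmin(dists))]       # best-matching action type
```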
Test Case:
Given inputs:
* Example images as input
* Store action images to files
* Compare different action images
Expected Output:
* An effective action recognition method based on the analysis of spatiotemporal
silhouettes, with the desirable properties described under "Advantages of
Proposed System" above.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
• System: Pentium IV, 2.4 GHz
• Hard disk: 40 GB
• Floppy drive: 1.44 MB
• Monitor: 15" VGA colour
• Mouse: Logitech
• RAM: 256 MB
• Keyboard: 110-key enhanced
SOFTWARE REQUIREMENTS:
• Operating system: Windows XP Professional
• Front end: Microsoft Visual Studio .NET 2008
• Coding language: C# .NET
REFERENCE:
A. Iosifidis, A. Tefas, and I. Pitas, "View-Invariant Action Recognition Based on
Artificial Neural Networks," IEEE Transactions on Neural Networks and Learning
Systems, vol. 23, no. 3, 2012.