Research Journal of Applied Sciences, Engineering and Technology 4(24): 5552-5556, 2012

ISSN: 2040-7467

© Maxwell Scientific Organization, 2012

Submitted: April 10, 2012 Accepted: April 30, 2012 Published: December 15, 2012

Survey on Digital Image Processing in Sports

Girish G. Koundinya, G. Jaikumar, N. Rahul Akash and M.S.V. Subramanian

SASTRA University, Tanjore, India

Abstract: Digital Image Processing has been found to be useful in many domains. In sports, it can be used either as an analytical tool to determine strategic instances in a game or in the broadcast of video to television viewers. Modern-day coverage of sports involves multiple cameras and an array of technologies to support it. Since manually going through every video coming to a station would be a near-impossible task, a wide range of Digital Image Processing (DIP) algorithms are applied to do the same. Highlight Generation and Event Detection are the foremost areas in sports where a multitude of DIP algorithms exist. This study provides an insight into the applications of Digital Image Processing in Sports, concentrating on algorithms related to video broadcast while listing their advantages and drawbacks.

Keywords: DIP, event detection, highlight generation, situation recognition

INTRODUCTION

Digital Image Processing is the technique of using computer algorithms to process the digital image. It is a technology used for Pattern Recognition, Feature Extraction, Classification, Projection and Multi-Scale signal analysis. Some of the commonly used techniques include Pixelization, Linear Filtering, Hidden Markov Models, Neural Networks and Wavelets. It has varied applications in different fields. One of the fields where it plays a major role is sports, where it has numerous applications, a few important ones being:

• Event Detection
• Highlights Generation
• Summarization

Event Detection is the process of identifying the important events occurring in a sport, such as a goal-scoring event in soccer, using systems built on certain methods. One of the most common methods is the Hidden Markov Model (HMM), which is a probabilistic model. Using these systems we can identify the important events that occur in a particular sport, which is useful when broadcasting sports videos to users. Event detection can also be used for generating scoreboards in matches and for detecting slow-motion replays.

In every sport, generation of highlights is an important task. Highlights focus on the significant events that occur in a sport, which helps users identify the important moments of the match. Highlight generation includes the process of event detection using the Hidden Markov Model (HMM). It can be used for slow-motion replays too.

Summarization is the process of generating a summary from the highlights generated. The collected highlights are concatenated to obtain a summary of the match, which gives users an overview of the match.

The main objective of this survey was to understand the various algorithms which are used in different phases of sports video generation. Path-breaking and innovative ideas can only emerge once we are aware of the existing mechanisms and their flaws. This study provides a critical analysis of the various methods used in the sports domain.

Survey on digital image processing in sports: In Refaey Elsayed et al. (2009), an algorithm is introduced using Fuzzy Logic for shot detection, concentrating on Digital Video Effects like cut, fade, dissolve and wipe transitions. The need for such an algorithm is highlighted by problems like hard cut thresholds and the requirement of large volumes of training data. This study proposes a method integrating many features like color histogram, edgeness, intensity variance, etc. When evaluated under different capturing environments, varied production styles and frame sizes, it achieves high degrees of recall, precision and total performance.
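To make the contrast between a hard cut threshold and a fuzzy decision concrete, the following is a minimal sketch and not the algorithm of Refaey Elsayed et al. (2009). It computes one of the features named above, a histogram difference between two frames, and maps it to a degree of membership in a fuzzy "cut" set through a linear ramp. Frames are assumed to be flat lists of 8-bit gray levels; the function names, the bin count and the ramp limits (0.2 and 0.6) are illustrative assumptions.

```python
def histogram(frame, bins=8):
    """Normalized gray-level histogram of a frame given as a flat list of 0-255 values."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // 256] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_difference(frame_a, frame_b, bins=8):
    """Half the L1 distance between the two histograms, a value in [0, 1]."""
    ha, hb = histogram(frame_a, bins), histogram(frame_b, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(ha, hb))


def cut_membership(difference, low=0.2, high=0.6):
    """Fuzzy membership in the 'cut' set: 0 below low, 1 above high, linear in between.

    The ramp limits are assumed values chosen for illustration only.
    """
    if difference <= low:
        return 0.0
    if difference >= high:
        return 1.0
    return (difference - low) / (high - low)


# Toy frames: uniformly dark, uniformly bright, and half dark / half bright.
dark = [10] * 100
bright = [240] * 100
half = [10] * 50 + [240] * 50

same_shot = cut_membership(histogram_difference(dark, dark))
hard_cut = cut_membership(histogram_difference(dark, bright))
gradual = cut_membership(histogram_difference(dark, half))
```

A partial change such as `gradual` receives an intermediate membership instead of being forced to one side of a single threshold, which is the property that lets several such features be combined by fuzzy rules.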

In Song et al. (2011) a unique method for detection of goals in a video is proposed based on changes in the scoreboard. The method comprises three parts:

• Locating the scoreboard
• Extracting the score
• Detection using score part

Corresponding Author: Girish G. Koundinya, Student, SASTRA University, Tanjore, India

Since the score part in the video only changes when a goal event occurs, in this method subsequent frames are compared and the number of unchanged pixels is analyzed. The advantage of such a method is the reduction in training data and in structural analysis of a game video, since there is no need to divide videos into shots and scenes. This method does not provide satisfactory results when there is interference from the television station clock, match logo, white advertisement signs and fans in the stadium wearing white clothes, and it also has a low precision ratio.
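The frame comparison step can be sketched as follows. This is an illustration under stated assumptions, not the implementation of Song et al. (2011): the score region is assumed to be already located and cropped, each region is a flat list of gray levels, and the tolerance (10) and minimum unchanged ratio (0.9) are hypothetical values.

```python
def unchanged_ratio(region_a, region_b, tolerance=10):
    """Fraction of pixels whose gray level differs by at most `tolerance`."""
    same = sum(1 for a, b in zip(region_a, region_b) if abs(a - b) <= tolerance)
    return same / len(region_a)


def detect_score_changes(score_regions, min_unchanged=0.9):
    """Indices of frames whose score region differs from the previous frame."""
    changes = []
    for i in range(1, len(score_regions)):
        if unchanged_ratio(score_regions[i - 1], score_regions[i]) < min_unchanged:
            changes.append(i)
    return changes


# Toy 2x3 score regions flattened to lists; the digit changes at frame 2,
# while frame 1 differs from frame 0 only by small noise.
frames = [
    [0, 255, 0, 0, 255, 0],
    [2, 250, 0, 0, 255, 3],
    [255, 255, 255, 0, 255, 0],
    [255, 255, 255, 0, 255, 0],
]
goal_frames = detect_score_changes(frames)
```

The sketch also shows why the interference sources listed above hurt precision: any bright object entering the cropped region lowers the unchanged ratio in the same way a score change does.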

• Ekin and Tekalp (2003) suggest the usage of cinematic features such as shot classes and slow-motion replays in probabilistic event detection methods. Algorithms have been developed for an effective sports video processing framework and include automatic dominant color region detection, shot boundary detection and domain-specific shot classification algorithms. The two most essential processes are Cinematic Feature Extraction and Event Detection and Summarization. The algorithms have been developed mainly for soccer and basketball. Apart from the prerequisite to specify the color ratio in the first few frames, the system is fully automatic. The main advantage of this proposed method is that it can be run in real time, which is essential in sports video analysis.
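One way dominant color information can drive shot classification is sketched below. It is a simplified illustration, not the algorithm of Ekin and Tekalp (2003): a pixel is counted as field-colored when its hue is close to green, and the shot class is chosen from the fraction of such pixels. The hue tolerance, saturation floor, class thresholds and class names are all assumptions made for this example.

```python
import colorsys


def field_ratio(pixels, field_hue=1 / 3, hue_tol=0.1, min_sat=0.3):
    """Fraction of (r, g, b) pixels whose hue is near the assumed field hue (green)."""
    count = 0
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if abs(h - field_hue) <= hue_tol and s >= min_sat:
            count += 1
    return count / len(pixels)


def classify_shot(pixels, long_min=0.6, medium_min=0.2):
    """Map the field-color ratio to a shot class using assumed thresholds."""
    ratio = field_ratio(pixels)
    if ratio >= long_min:
        return "long"
    if ratio >= medium_min:
        return "medium"
    return "close-up or out-of-field"


grass = (30, 160, 40)   # a green, field-like pixel
shirt = (200, 30, 30)   # a red, non-field pixel

wide = [grass] * 9 + [shirt] * 1
mid = [grass] * 4 + [shirt] * 6
close = [grass] * 1 + [shirt] * 9

labels = [classify_shot(wide), classify_shot(mid), classify_shot(close)]
```

Because each frame needs only a single pass over its pixels, a feature of this kind is cheap enough to be compatible with the real-time operation emphasized above.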

• Sadlier and O'Connor (2005) outline an audio-visual feature based framework for event detection in field sports. Robust detectors are built to analyze a video and to indicate the features of significant events. The information gathered by the detectors is then processed by a Support Vector Machine, which infers the occurrence of an event based on a previously generated model. On testing, results show that both high event retrieval and high content rejection statistics are achievable through this method, and that it is applicable to a wide range of sports from the field sport domain.

• Using a condensation-based approach, a framework has been proposed in Needham and Boyle (2001) primarily to deal with tracking sports players over a large playing area. The advantages of a condensation approach over multiple single-point trackers are manifold, especially in situations where the playing area is cluttered with multiple players. Kalman filtering is employed in this method by altering the predictive step of condensation. Each player is tracked and fitted individually, and the sampling probability of the whole group is then calculated. Results show that the initial method provides 28% of usable tracking information, whereas the altered method with Kalman filtering provides a better result of 56%. The flaw of the system is that the bounding box does not fit to the feet of the player.
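For readers unfamiliar with Kalman filtering, the predict and update steps can be sketched in one dimension as below. This shows only a scalar Kalman filter, not the condensation tracker of Needham and Boyle (2001) and not how they integrate the two; the constant velocity, the noise values and the measurements are made-up illustration data.

```python
def kalman_predict(mean, variance, velocity, process_noise):
    """Predict step: move the estimate by the assumed velocity and grow the uncertainty."""
    return mean + velocity, variance + process_noise


def kalman_update(mean, variance, measurement, measurement_noise):
    """Update step: blend prediction and measurement using the Kalman gain."""
    gain = variance / (variance + measurement_noise)
    new_mean = mean + gain * (measurement - mean)
    new_variance = (1.0 - gain) * variance
    return new_mean, new_variance


# Track a player's position along one axis from noisy measurements (assumed data).
mean, variance = 0.0, 1.0
measurements = [1.1, 1.9, 3.2, 4.0]
track = []
for z in measurements:
    mean, variance = kalman_predict(mean, variance, velocity=1.0, process_noise=0.1)
    mean, variance = kalman_update(mean, variance, z, measurement_noise=0.5)
    track.append(mean)
```

The predicted mean and variance are the quantities a sampling-based tracker could use to decide where to place its samples for the next frame.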


• Pingali et al. (1998) describe in detail a real-time tracking system which produces spatio-temporal trajectories of the motion of the player and the ball, which can provide useful game-related statistics. A player is tracked by dynamically clustering tracks of local features. The difficulties involved in player tracking are differentiating the player from the background, wide variations in player apparel and differences in court characteristics. Ball segmentation and tracking is based on the shape and colour features of the ball, and a high-speed technique is required for ball tracking. The results of this study are satisfactory and are based on stationary cameras; future development is required using multiple cameras for 3D ball localization.

• Yasuo et al. (2006) propose an automatic production system for soccer videos using a digital shooting technique based on event recognition. First the event is recognized, and then the digital shooting technique is carried out, which consists of digital camera work and a digital switching technique. Next is the camera work module, in which digital panning and zooming are controlled, followed by the situation recognition module, which recognizes the situation of the game based on parameters such as the ball and players. Finally, all these processes are tuned according to the user's preference. The AHP evaluation showed that the proposed system had the same score as TV contents for basic composition but a lower score in terms of panning and zooming.

• Babaguchi and Nitta (2003) present a unique strategy for content analysis in broadcast sports video. A broadcast video can be viewed as a set of multimodal streams such as visual, auditory, text and graphical streams. Case studies are used to show the effectiveness of this mechanism. This method is mainly used for highlight extraction, annotation generation and story segmentation. Intermodal collaboration between
  - Text and visual streams
  - Text, auditory and visual streams
  - Graphics stream and external metadata
is explained in Refaey Elsayed et al. (2009).

• A deeper level of analysis of sentences in the Closed Caption and more complicated image analysis such as motion analysis are required in order to address a few problems that arise in this method. A more flexible approach towards integration of each type of analysis should be further explored with the aim of providing better results with this method.

The aim of Naphade and Huang (2002) is to find the recurring events that form the definitive structure of a video. The algorithm uses a Hidden Markov Model to identify the state transitions in a video and is trained using the Expectation Maximization algorithm. Experimental results show that the algorithm is efficient at identifying the recurring events. Repeated application of the algorithm at coarse granularity can be used for finding the long-term structure.

• Pan et al. (2001) describe a method for generating a summary of sports video highlights. It does so by localizing the semantically important events by detecting the slow-motion replays in sports, and it also filters out the slow-motion replays in commercials. A Hidden Markov Model (HMM) combined with the Viterbi algorithm is used. It is based on a moving measure of zero crossings and variations in amplitude of the fields in the video. It also detects the boundaries of replay segments using an inference algorithm. It is a generic algorithm with good efficiency, but detection of slow-motion replay segments with poor editing effects is difficult.
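How an HMM and the Viterbi algorithm turn a per-frame feature into labeled segments can be sketched as follows. This is a generic Viterbi decoder on a toy model, not the model of Pan et al. (2001): the two states, the "low"/"high" observation symbols (a hypothetical quantization of a frame feature) and every probability are assumptions chosen for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely state sequence for a discrete HMM, computed in log space."""
    scores = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in state s at this step.
            best_prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (
                scores[best_prev]
                + math.log(trans_p[best_prev][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


states = ("normal", "replay")
start_p = {"normal": 0.9, "replay": 0.1}
trans_p = {
    "normal": {"normal": 0.8, "replay": 0.2},
    "replay": {"normal": 0.2, "replay": 0.8},
}
emit_p = {
    "normal": {"low": 0.9, "high": 0.1},
    "replay": {"low": 0.1, "high": 0.9},
}
observations = ["low", "low", "high", "high", "high", "low"]
path = viterbi(observations, states, start_p, trans_p, emit_p)
```

The positions where the decoded label changes give candidate start and end frames of a replay segment in this toy setting.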

• The aim of Li and Sezan (2001) is to address summarization in sports videos of a particular class using the concept of a “play”, which is the basic segment of time during which an important event occurs. The major task is to detect the start and end points of a play. The play is characterized by a set of low-level visual features which detect the sequence of events, and inference is then done based on the rules of the game. There are also features to remove replay segments of a game. Experimental results show that this approach works well and has a good compaction ratio, but it is not a generic algorithm.

• Ronald et al. (2008) suggest a new approach for classifying MPEG-2 video sequences as “sport” and “non-sport” videos. It does so by analyzing new high-level audiovisual features of consecutive frames in real time. It consists of three presented high-level descriptors, one low-level descriptor, static area and audio noise for classification, and a support vector machine for combining results, deriving a probability that a given video sequence is sport.

• Huang-Chia and Chung-Lin (2007) propose semantics for interpreting superimposed captions (SCB), which have a digit object and an annotative object, in sports videos. The study describes how to construct the SCB template automatically and how to understand the ongoing game statistics from the video caption. The system has three phases: the first is localization of the SCB, the second is extraction and recognition of the generated template, and the third is interpreting the semantic meaning using the generated template. Experimental results show that the algorithm is robust, generic and performs well, but there is no one-to-one correspondence.

• Luca and Markus (2008) explain an optimization method for encoding low-resolution soccer videos. It aims at maximizing the user-perceived quality, considering the characteristics of soccer videos. It consists of segmentation of the picture to recognize:
  - The field
  - The audience
  - The remaining objects: players, ball, etc.
First, three groups of image components are defined by means of a segmentation mechanism, and then each component is encoded according to its compression degree. Experimental results show that the three groups were encoded separately and that errors occurred only in regions where the human visual system is less sensitive, thereby only marginally reducing the user-perceived quality.

• Serhan and Mohamed (2004) give a generic algorithm for detection of highlights in sports videos and for application in content-based browsing of sports videos. The algorithm suggests a multimodal approach for the detection of highlight events using audio, visual and text features of the video. The segments of video with high energy levels in the audio signal are extracted as highlight events. Textual transcripts are used along with the audio modality to detect highlight events and reduce false positives. Experimental results show that the visual modality gives a high recall rate but poor precision; to increase precision, the audio and text modalities are combined with the visual modality for an efficient presentation of multimedia content.
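The audio part of such a pipeline, extracting segments with high energy, can be sketched as below. This is a minimal illustration and not the method of Serhan and Mohamed (2004): the window length, the threshold and the sample values are assumptions, and real audio would use far longer windows.

```python
def short_time_energy(samples, window):
    """Mean squared amplitude of consecutive, non-overlapping windows."""
    return [
        sum(x * x for x in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


def high_energy_segments(energies, threshold):
    """Group consecutive windows above the threshold into (start, end) pairs, end exclusive."""
    segments, start = [], None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i
        elif energy <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments


# Toy signal: four samples per window, quiet play with two loud bursts.
quiet = [0.1, -0.1] * 2
loud = [0.9, -0.8] * 2
audio = quiet + quiet + loud + loud + quiet + loud

energies = short_time_energy(audio, window=4)
segments = high_energy_segments(energies, threshold=0.25)
```

Each returned pair is a candidate highlight in window units; in a multimodal system these candidates would then be confirmed or rejected using the other modalities.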

• The main idea of Wei-Ta and Ja-Ling (2008) is to present an algorithm that detects events in broadcast baseball videos, facilitating practical applications. An event occurs between two pitches, and events are detected by rule-based or model-based detection. Shots are detected and classified using colour and geometric information from the field colour and pitcher detection. This can also be extended to applications such as automatic box score, summarization and highlights generation. The algorithm concentrates mainly on baseball videos and uses the rules of the game extensively for its applications.

• Jenny and Nandan (2004) present an idea for automating the archiving of tennis video clips. It makes use of calibrated cameras to extract the trajectory of a tennis ball. The archived video can later be used for training, pattern analysis, broadcasting, etc. This study classifies a tennis game into 58 winning patterns based on the trajectory of the ball. It makes use of improved Bayesian networks for classification of the different patterns. To categorize the landing position in a court there are 13 clusters. However, it is not possible to calculate the precise landing location to identify which cluster it belongs to. An accurate system cannot be built as the error can come from the player. For simplification, clusters were classified into five major patterns. Since the ball's landing position cannot be calculated accurately, a probability is attached to each possible cluster. The result of this study is based on a small number of patterns, and large-scale testing is yet to be carried out.

Assflag et al. (2003) address the detection and recognition of highlights in sports videos. Live logging and posterity logging are the two basic applications for annotation of a video database. The steps required to automatically create a video annotation include temporal segmentation of video streams into shots, detection and recognition of text appearing in captions, interpretation of the audio track, visual summarization of shot content and semantic annotation. Annotation can be done either manually or automatically. The role of automatic annotation is to detect highlights and let the assistant producers work on subjective annotation where automation is not possible. From the results of this experiment, based on tests done on one hour of video, it can be noticed that highlights that can be easily defined obtain good recognition rates. Temporal logic is used to recognize the highlights. Experimental results of this study are extremely promising.

CONCLUSION

Digital Image Processing is the use of computer algorithms to perform image processing. Compared to other image processing techniques, its versatility has made it the most commonly used method. It is also very economical. The multi-faceted use of image processing is applicable in the field of sports in a variety of ways. Since most sports are viewed on television, image processing algorithms are required to enhance the experience. We have surveyed the various image processing techniques used in many sports and analyzed the algorithms. Image processing in sports is primarily used to detect significant instances from various video inputs. We have also discussed the merits and demerits of each technique and identified future scope and areas of research for most of them.

REFERENCES

Assflag, J., M. Bertini, C. Colombo, A. Del Bimbo and W. Nunziati, 2003. Annotation and retrieval of video documents: Sports videos. Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, ISPA 2003, Sept. 18-20, Italy, 1: 295-300.


Babaguchi, N. and N. Nitta, 2003. Intermodal collaboration: A strategy for semantic content analysis for broadcasted sports video. International Conference on Image Processing, ICIP 2003, Sept. 14-17, Japan, 1: 13-16.

Ekin, A. and A.M. Tekalp, 2003. Generic event detection in sports video using cinematic features. Conference on Computer Vision and Pattern Recognition Workshop, CVPRW '03, 16-22 June, University of Rochester, 4: 36.

Huang-Chia, S. and H. Chung-Lin, 2007. Semantic interpretation of superimposed captions in sports. IEEE 9th Workshop on Multimedia Signal Processing, MMSP 2007, 1-3 Oct., Hsinchu, pp: 235-238.

Jenny, R.W. and P. Nandan, 2004. Detecting tactics patterns for archiving tennis video clips. Proceedings of IEEE 6th International Symposium on Multimedia Software Engineering, 13-15 Dec., Australia, pp: 186-192.

Luca, S. and R. Markus, 2008. Encoding optimization of low resolution soccer video sequences. IEEE International Conference on Multimedia and Expo, Vienna Univ. of Technol., Vienna, pp: 841-844.

Li, B. and M.I. Sezan, 2001. Event detection and summarization in sports video. Proceedings of the IEEE Workshop on Content-based Access of Image and Video Libraries (CBAIVL'01), Washington, DC, USA, pp: 132.

Naphade, M.R. and T.S. Huang, 2002. Discovering recurrent events in video using unsupervised methods. International Conference on Image Processing, NY, USA.

Needham, C.J. and R.D. Boyle, 2001. Tracking Multiple Sports Players through Occlusion, Congestion and Scale. University of Leeds, United Kingdom.

Pan, H., P. Van Beek and M.I. Sezan, 2001. Detection of slow-motion replay segments. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, (ICASSP '01), Illinois Univ., Urbana, IL, 3: 1649-1652.

Pingali, G.S., Y. Jean and I. Carlbom, 1998. Real time tracking for tennis broadcasts. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR '98, Washington, DC, USA, pp: 260.

Refaey Elsayed, M.A., K.M. Hanafy and S.M. Davis, 2009. Concurrent transition and shot detection in football videos using fuzzy logic. 16th IEEE International Conference on Image Processing (ICIP), 7-10 Nov., MD, USA, pp: 4341-4344.

Ronald, G., S. Sebastian, O. Hüseyin, K. Pascal and S. Thomas, 2008. Real-time detection of sport in mpeg-2 sequences using high-level AV-descriptors and SVM. Third International Conference on Digital Information Management, ICDIM 2008, 13-16 Nov., Berlin, pp: 727-732.


Sadlier, D.A. and N.E. O'Connor, 2005. Event detection in field sports video using audio-visual features and a support vector machine. IEEE T. Circ. Syst. Vid., 15(10): 1225-1233, ISSN: 1051-8215.

Serhan, D. and A.M. Mohamed, 2004. Multimodal detection of highlights for multimedia content. Multimed. Syst., 9(6): 586-593, DOI: 10.1007/s00530-003-0130-3.

Song, Y., X. Wen, Y. Sun, L. Zhang, L. Yan and H. Lin, 2011. A scoreboard based method for goal events detecting in football videos. Workshop on Digital Media and Digital Content Management (DMDCM), 15-16 May, Beijing, China, pp: 248-251.

Wei-Ta, C. and W. Ja-Ling, 2008. Explicit semantic events detection and development of realistic applications for broadcasting baseball videos. Multimed. Tools Appl., 38(1): 27-50.

Yasuo, A., K. Shintaro and K. Masahito, 2006. Automatic production system of soccer sports video by digital camera work based on situation recognition. Eighth IEEE International Symposium on Multimedia, ISM'06, Kobe Univ., pp: 851-860.
