Football Player Detection in Video Broadcast
Abstract
The paper describes a novel segmentation system for football video based on the combination of Histogram of Oriented Gradients
(HOG) descriptors and linear Support Vector Machine (SVM) classification. HOG methods have recently been widely used for pedestrian
detection. The presented experimental results show that the combination of HOG and SVM is also very promising for locating and
segmenting players. The proposed system introduces dominant color based segmentation for football playfield detection and 3D
playfield modeling based on the Hough transform. The system is evaluated experimentally on SD (720×576) and HD (1280×720)
test sequences. Additionally, we test the performance of the proposed system under different lighting conditions (non-uniform pitch
lighting, multiple player shadows) as well as for various positions of the cameras used for acquisition.
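As a rough illustration of dominant color based playfield detection, the sketch below builds a hue histogram, takes its peak as the playfield color, and marks pixels whose hue falls near that peak. The function name, the bin count, and the tolerance are assumptions made for this example; the abstract does not specify how the paper's segmentation is implemented.

```python
import colorsys


def dominant_hue_mask(pixels, bins=36, tolerance=1):
    """Hypothetical helper: mark pixels whose hue is close to the dominant hue.

    pixels: list of rows, each row a list of (r, g, b) tuples in 0..255.
    Returns (dominant_bin, mask), where mask[y][x] is True for playfield-like pixels.
    """
    hist = [0] * bins
    hue_bins = []
    for row in pixels:
        bin_row = []
        for r, g, b in row:
            h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            idx = min(int(h * bins), bins - 1)
            hist[idx] += 1
            bin_row.append(idx)
        hue_bins.append(bin_row)

    # The most populated hue bin is assumed to be the playfield color.
    dominant = max(range(bins), key=hist.__getitem__)

    def close(idx):
        # Hue is circular, so measure the distance around the color wheel.
        d = abs(idx - dominant)
        return min(d, bins - d) <= tolerance

    mask = [[close(idx) for idx in row] for row in hue_bins]
    return dominant, mask


green, red = (30, 160, 40), (200, 20, 20)
image = [[green, green, green], [green, red, green]]
dominant, mask = dominant_hue_mask(image)
```

In this toy image the single red pixel is the only one left outside the mask, which is the kind of region a later player detector would examine.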
... Traditional player detection models include connected component analysis [8], shallow convolutional neural networks [9], histogram of oriented gradients with support vector machines (HOG-SVM) [10], and the deformable part model (DPM) [11]. Figure 1 shows different situations in football player detection. ...
... Traditional models can generally detect players in this situation, but they can hardly detect adjacent players (Figures 1(b)-1(d)) correctly in harder situations. Besides, HOG-SVM needs domain knowledge and more manual work in order to conduct background segmentation [10]. Non-maximum suppression restricts the performance of DPM when detecting close players [12]. ...
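The effect of non-maximum suppression on close players can be seen in a small sketch. This is the common greedy NMS over axis-aligned boxes, written for illustration; the box coordinates, scores, and threshold values are invented and not taken from the cited work.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop any box overlapping a kept one too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two adjacent players (boxes 0 and 1) and one isolated player (box 2).
boxes = [(0, 0, 10, 30), (3, 0, 13, 30), (50, 0, 60, 30)]
scores = [0.9, 0.8, 0.7]
kept_default = nms(boxes, scores, iou_threshold=0.5)
kept_relaxed = nms(boxes, scores, iou_threshold=0.6)
```

With the default threshold the second of the two adjacent players is suppressed because its box overlaps the first one by more than half; relaxing the threshold keeps it, at the price of more duplicate detections elsewhere.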
The main task of football video analysis is to detect and track players. In this work, we propose a deep convolutional neural network-based football video analysis algorithm. This algorithm aims to detect football players in real time. First, five convolution blocks are used to extract feature maps of football players at different spatial resolutions. Then, features from different levels are combined with weighted parameters to improve detection accuracy and adapt the model to input images with various resolutions and qualities. Moreover, this algorithm can be extended to a framework for detecting players in other sports. The experimental results confirm the effectiveness of our algorithm.
... Several studies have appeared in this domain. In order to tackle the drawbacks of RGB based field recognition [12,15] and edge detection based methods [11], we designated the corner points on the two sides of the field. In the first step, we took a frame from the videos and undistorted it with the distortion matrix of the camera lens to get straight lines on the image. ...
... Player detection is a broad research area in sport video analysis. It can be based on Histogram of Oriented Gradients (HOG) [11], or some feature selection algorithms such as dominant color-based background subtraction [9], and edge detection [3]. Since our videos were recorded from stationary points, we opted for a movement-based background-foreground separation [6] method to separate moving objects from the background. ...
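For a stationary camera, movement-based background-foreground separation can be sketched with a running-average background model. This is only one simple variant, chosen here for illustration; the snippet does not say which method [6] uses, and the learning rate and threshold below are assumed values.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new grayscale frame into the background model (running average)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, threshold=25):
    """Mark pixels that differ from the background model by more than threshold."""
    return [[abs(f - b) > threshold for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


# A static 3x3 background and a frame where one pixel has changed (a moving object).
background = [[100.0] * 3 for _ in range(3)]
frame = [[100, 100, 100], [100, 200, 100], [100, 100, 100]]
mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

Only the changed pixel is flagged as foreground, and the background model drifts slowly toward the new value, so an object that stops moving is eventually absorbed into the background.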
Sports analytics are on the rise in European football; however, due to the high cost, only the top tier leagues and championships have so far had the privilege of collecting high precision data to build upon. We believe that this opportunity should be available for everyone, especially for youth teams, to develop and recognize talent earlier. We therefore set the goal of creating a low-cost player tracking system that could be applied in a wide base of football clubs and pitches, which in turn would widen the reach of sports analytics, ultimately assisting the work of scouts and coaches in general. In this paper, we present a low-cost optical tracking solution based on cheap action cameras and cloud-deployed data processing. As we build on existing research results in terms of methods for player detection, i.e., background-foreground separation, and for tracking, i.e., Kalman filter, we adapt those algorithms with the aim of sacrificing as little as possible on accuracy while keeping costs low. The results are promising: our system yields significantly better accuracy than a standard deep learning based tracking model at a fraction of its cost. In fact, at a cost of $2.4 per match spent on cloud processing of videos for real-time results, all players can be tracked with a 11-meter precision on average.
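The Kalman filter mentioned above can be illustrated with a minimal one-dimensional constant-velocity tracker. The class name, the noise variances, and the initial covariance are assumptions chosen for the example; the abstract does not give the paper's state model or tuning.

```python
class ConstantVelocityKalman1D:
    """Minimal 1D Kalman filter with state (position, velocity); illustrative only."""

    def __init__(self, dt=1.0, process_var=1e-3, meas_var=1.0):
        self.dt, self.q, self.r = dt, process_var, meas_var
        self.p, self.v = 0.0, 0.0                     # state estimate
        self.P = [[100.0, 0.0], [0.0, 100.0]]         # large initial uncertainty

    def predict(self):
        dt = self.dt
        (p00, p01), (p10, p11) = self.P
        self.p += self.v * dt                         # x = F x with F = [[1, dt], [0, 1]]
        # P = F P F^T + Q, with Q = diag(q, q)
        self.P = [[p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
                  [p10 + dt * p11, p11 + self.q]]

    def update(self, z):
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r                              # innovation variance, H = [1, 0]
        k0, k1 = p00 / s, p10 / s                     # Kalman gain
        y = z - self.p                                # innovation
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]

    def step(self, z):
        self.predict()
        self.update(z)
        return self.p, self.v


# A player moving along one axis at 2 units per frame, observed without noise.
kf = ConstantVelocityKalman1D()
for k in range(30):
    position, velocity = kf.step(2.0 * k)
```

A real tracker would use a two-dimensional state per player and associate detections with tracks before each update; the predict and update steps keep the same form.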
... This feature vector can be used to classify objects into different classes, e.g., player, background, and ball. This method is used by Mackowiak et al. [2010] and Cheshire et al. [2015]. ...
Sports analysis has gained paramount importance for coaches, scouts, and fans. Recently, computer vision researchers have taken on the challenge of collecting the necessary data by proposing several methods of automatic player and ball tracking. Building on the gathered tracking data, data miners are able to perform quantitative analysis on the performance of players and teams. With this survey, our goal is to provide a basic understanding for quantitative data analysts about the process of creating the input data and the characteristics thereof. Thus, we summarize the recent methods of optical tracking by providing a comprehensive taxonomy of conventional and deep learning methods, separately. Moreover, we discuss the preprocessing steps of tracking, the most common challenges in this domain, and the application of tracking data to sports teams. Finally, we compare the methods by their cost and limitations, and conclude the work by highlighting potential future research directions.
... Drawing lessons from general object detection approaches, researchers analyzing football video content have proposed a few detection approaches based on statistical classifiers. Of these, the method based on the two-class support vector machine (TCSVM) [11,2] and the one based on Adaboost [6,12] are representative. From the perspective of classifier selection, those methods employ two kinds of classifiers. ...
An automatic player detection method based on a fuzzy decision making one-class SVM is proposed. Player detection methods based on statistical classifiers give better detection results than rule-based methods. However, these statistical classifier-based methods rely on manually labelled training samples, so cost is an important concern. To resolve this problem, we propose an instinctive player detection method using a fuzzy decision making one-class SVM and automatically collected player samples. In this method, a one-class SVM (OCSVM) is introduced to train the player detector by drawing lessons from the human object category classification mechanism. Additionally, the decision function of the OCSVM is improved by dividing the decision value dynamically using the fuzzy decision method, which reduces the detection error caused by the insufficient representativeness of the automatically collected training samples. Finally, a set of criteria is introduced to obtain the training samples automatically, and player detection experiments are performed on these training samples using FD-OCSVM. Experiments show that the proposed method obtains better detection results when automatically collected training samples are used, which increases the degree of automation of player detection.
... In their work, players' positions and trajectory information are analyzed for situations such as identifying open events, repossessing video clips, and planning implicit/explicit strategies. Slawomir et al. [22] design a novel segmentation system for football player detection in broadcast videos. The system utilizes Histogram of Oriented Gradients (HoG) and Support Vector Machine (SVM). ...
Augmented Reality (AR) overlays virtual information on real world data, such as displaying useful information on videos/images of a scene. This paper presents an Enhanced AR (EAR) system that displays useful statistical information about players on captured images of a sports game. We focus on the situation where the input image is degraded by strong sunlight. The proposed EAR system includes an image enhancement technique to improve the accuracy of subsequent player and face detection. The image enhancement is followed by player and face detection, face recognition, and display of players' statistics. First, an algorithm based on multi-scale retinex is proposed for image enhancement. Then, to detect players and faces, we use adaptive boosting and Haar features for feature extraction and classification. The player face recognition algorithm uses boosted linear discriminant analysis to select features and a nearest neighbor classifier for classification. The system can be adjusted to work in different types of sports where the input is an image and the desired output is a display of information near the recognized players. Simulations are carried out on 2096 different images that contain players in diverse conditions. The proposed EAR system demonstrates the great potential of computer vision based approaches for developing AR applications.
The accurate detection of ice hockey players and teams during a game is crucial to the tracking of individual players on the rink and team tactical decision making and is therefore becoming an important task for coaches and other analysts. However, hockey is a fluid sport due to its complex situation and the frequent substitutions by both teams, resulting in the players taking various postures during a game. Few player detection models from basketball and soccer take these characteristics into account, especially for team detection without prior annotations. Here, a two-phase cascaded convolutional neural network (CNN) model is designed for the detection of individual ice hockey players, and the jersey color of the detected players is extracted to further identify team affiliations. Our model filters most of the disturbing information, such as the audience and sideline advertising bars, in Phase I and refines the detection of the targeted players in Phase II, resulting in an accurate detection with a precision of 98.75% and a recall of 94.11% for individual players and an average accuracy of 93.05% for team classification with a self-built dataset of collected images from the 2018 Winter Olympics. The results for the regular season games of the 2019-2020 National Hockey League (NHL) covering all 31 teams are also presented to show the robustness of our model. Compared to state-of-the-art approaches, our player detection model achieves the highest accuracy with the self-built dataset.
This paper describes a new process of generating a top view figure from football game videos that shows the positions of players to facilitate an efficient game analysis. At present, the top view figure is often created manually at the practical level and requires much time, thereby highlighting the need to automate the creation of this figure. In the proposed process, the top view figure is created in four steps. First, lines are detected from binarized images to recognize the area in front of the goal. Second, by using the recognized area and the predefined image of the football field, a projective transformation matrix is calculated to transform the point of view. Third, the players are extracted from the image by using the selective search method, while the sides of these players are determined based on their color information. The camera movement must also be detected in each frame and its influence must be ignored by tracking the feature points of the audience's seats. Fourth, considering the player information, the projective transformation matrix, and the camera movement, the top view figures are created by calculating the actual positions of players. Although the experiment results show a few problems, we have successfully created top view figures for all frames in the selected football game video.
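The role of the projective transformation matrix can be shown with a short sketch that maps an image point to top view coordinates. The 3×3 matrices below are made-up values for illustration; in the process described above the matrix would be estimated from the recognized goal area and the predefined field image.

```python
def apply_homography(H, x, y):
    """Map image point (x, y) through a 3x3 projective matrix H (list of rows)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to 2D.
    return u / w, v / w


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
perspective = [[1, 0, 0], [0, 1, 0], [0, 0.001, 1]]   # hypothetical matrix

same_point = apply_homography(identity, 100, 1000)
top_view_point = apply_homography(perspective, 100, 1000)
```

The division by `w` is what distinguishes a projective mapping from an affine one: points further up the image (larger `y` here) are scaled differently from points near the bottom.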
In this paper, a novel method is proposed to extract flux from two-dimensional spectral images observed with LAMOST (Large Sky Area Multi-Object Fiber Spectroscopic Telescope). First, the spectral images are preprocessed. Then, in the flux extraction algorithm, a GRNN (General Regression Neural Network) and a double Gaussian function are employed to simulate the profile of each spectrum in the spatial orientation. We run our experiment with the same radial basis function using both the GRNN and RBFNN (Radial Basis Function Neural Network) methods. The experimental results show that our method achieves a higher SNR (Signal-to-Noise Ratio) and lower time consumption, making it more applicable to such massive spectral data.
In this paper, a novel multiple objects detection and tracking approach based on support vector machine and particle filter is proposed to track players in broadcast sports video. Compared with previous work, the contributions of this paper are focused on three aspects. First, an improved particle filter called SVR particle filter is proposed as the player tracker by integrating support vector regression (SVR) into sequential Monte Carlo framework. SVR particle filter enhances the performance of classical particle filter with small sample set and improves the efficiency of tracking system. Second, support vector classification combined with playfield segmentation is employed to automatically detect the players in sports video as the initialization of tracker. Third, a unified framework for automatic object detection and tracking is proposed based on support vector machine and particle filter. The experimental results are encouraging and demonstrate that our approach is effective.
A novel thin line detection algorithm for use in low-altitude aerial vehicles is presented. This algorithm is able to detect thin obstacles such as cables, power lines, and wires. The system is intended to be used during urban search and rescue operations and is capable of dealing with low-quality images while being robust to image clutter, bad weather, and sensor artifacts. The detection process uses motion estimation at the pixel level, combined with edge detection, followed by a windowed Hough transform. The evidence of lines is tracked over time in the resulting parameter spaces using a dynamic line movement model. The algorithm's receiver operating characteristic (ROC) curve is shown, based on a multi-site dataset of 86 videos with 10160 wires spanning 5576 frames.
Classifying video content into different semantic granularities is a possible way to support flexible video indexing, browsing and retrieval. In this paper, a placed kick refinement algorithm is proposed for use after semantic based event detection or manual annotation. The placed kick event is further classified into the following three types: free kick, corner kick and penalty, according to ball and field line detection and the determination of their relationships. First, we carry out ball detection in the global shot of the placed kick event. According to the ball detection results, we further determine whether to detect field lines using the Hough transform. Finally, the ball and field line detection results are integrated in the decision making stage. Experimental results show the effectiveness of the proposed method.
Detecting pedestrians accurately is the first fundamental step for many computer vision applications such as video surveillance, smart vehicles, intersection traffic analysis and so on. The authors present an experimental study on pedestrian detection using state-of-the-art local feature extraction and support vector machine (SVM) classifiers. The performance of pedestrian detection using region covariance, histogram of oriented gradients (HOG) and local receptive fields (LRF) feature descriptors is experimentally evaluated. The experiments are performed on the DaimlerChrysler benchmarking data set, the MIT CBCL data set and the Institut National de Recherche en Informatique et en Automatique (INRIA) data set. All can be publicly accessed. The experimental results show that region covariance features with radial basis function kernel SVM and HOG features with quadratic kernel SVM outperform the combination of LRF features with quadratic kernel SVM. Furthermore, the results reveal that both covariance and HOG features perform very well in the context of pedestrian detection.
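To make the HOG descriptor more concrete, the following sketch computes the orientation histogram of a single cell: gradients are taken with central differences and each pixel votes for an unsigned orientation bin with its gradient magnitude. It is a simplified illustration only; it omits the bin interpolation and block normalization of the full descriptor, and the function name and the nine-bin choice are assumptions of this example.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram of one grayscale cell (list of rows), unsigned 0..180 degrees."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]      # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]      # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# An 8x8 cell with a vertical edge: dark left half, bright right half.
vertical_edge = [[0] * 4 + [255] * 4 for _ in range(8)]
histogram = hog_cell_histogram(vertical_edge)
```

For the vertical edge all gradient energy lands in the first bin (orientation near 0 degrees). Concatenating such histograms over many cells gives the feature vector that an SVM classifier is trained on.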
Detecting lines in a digital image is very important in image processing. An efficient line detection method based on a randomized approach is presented. Unlike previous HT (Hough Transform) based methods, which vote in a parameter space, this algorithm does not need an accumulator to represent the parameter space. The main concept of the proposed method is as follows: first, it selects two different edge points from an edge image to form a candidate line; second, under the given distance tolerance, a strip image region along the candidate line direction is obtained, and the number of edge points in this strip region is counted; and last, threshold rules are applied to determine whether the candidate line is the desired one. Experimental results show that this approach can accurately find lines in noisy images. Compared with HT and RHT (Randomized Hough Transform), the proposed algorithm has the advantages of less storage space and shorter computational time.
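The three steps described above (pick two edge points, count the edge points inside a strip around the candidate line, apply a threshold) can be sketched as follows. This is a loose reading of the abstract rather than the authors' implementation; the parameter names, the default values, and the fixed random seed are assumptions.

```python
import math
import random


def detect_line(edge_points, dist_tol=1.0, min_support=20, max_trials=200, seed=0):
    """Return ((a, b, c), support) for a line a*x + b*y + c = 0, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(max_trials):
        # Step 1: two different edge points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(edge_points, 2)
        a, b, c = y1 - y2, x2 - x1, x1 * y2 - x2 * y1
        norm = math.hypot(a, b)
        if norm == 0:
            continue                                   # coincident points, no line
        # Step 2: collect the edge points inside the strip around the candidate.
        support = [(x, y) for x, y in edge_points
                   if abs(a * x + b * y + c) / norm <= dist_tol]
        # Step 3: accept the candidate if enough points support it.
        if len(support) >= min_support:
            return (a / norm, b / norm, c / norm), support
    return None


# 30 points on y = 2x + 1 plus 10 scattered outliers well away from the line.
line_points = [(x, 2 * x + 1) for x in range(30)]
outliers = [(i, 100 + (i * 37) % 23) for i in range(10)]
result = detect_line(line_points + outliers)
```

No accumulator array is allocated at any point, which is the storage advantage over Hough-style voting that the abstract claims.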
The detection of lines in an image is an important task. The well-known Standard Hough Transform (SHT) and Progressive Probabilistic Hough Transform (PPHT) are two of the most efficient algorithms for line detection. SHT can detect almost straight lines in the image; moreover, it is highly resistant to noise. Line segments are found effectively by PPHT, but a few problems give this algorithm lower accuracy than SHT. This paper proposes an extension of this robust algorithm to detect line segments accurately. The proposal contains three extensions: the technique of accumulation, the application of a local maxima rule in the SHT space, and detection of line segments. The PPHT algorithm is used to compare the experimental results to the results of the proposed method.
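For readers unfamiliar with the SHT, the sketch below shows the basic voting scheme: every edge point votes for all (rho, theta) pairs of lines passing through it, and peaks in the accumulator correspond to lines. It illustrates only the standard transform, not the extensions proposed in the paper; the resolution parameters and function names are assumed.

```python
import math


def hough_accumulate(points, width, height, n_theta=180, rho_step=1.0):
    """Vote in (rho, theta) space using rho = x*cos(theta) + y*sin(theta)."""
    diag = math.hypot(width, height)
    n_rho = int(math.ceil(2 * diag / rho_step)) + 2
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    trig = [(math.cos(t), math.sin(t)) for t in thetas]
    acc = [[0] * n_theta for _ in range(n_rho)]
    for x, y in points:
        for t, (cos_t, sin_t) in enumerate(trig):
            rho = x * cos_t + y * sin_t
            acc[int(round((rho + diag) / rho_step))][t] += 1   # shift so the index is >= 0
    return acc, thetas, diag


def strongest_line(acc, thetas, diag, rho_step=1.0):
    """Return (rho, theta, votes) of the accumulator cell with the most votes."""
    votes, r, t = max((acc[r][t], r, t)
                      for r in range(len(acc)) for t in range(len(thetas)))
    return r * rho_step - diag, thetas[t], votes


# A horizontal line y = 5 in a 20x20 edge image.
points = [(x, 5) for x in range(20)]
acc, thetas, diag = hough_accumulate(points, 20, 20)
rho, theta, votes = strongest_line(acc, thetas, diag)
```

Because of accumulator quantization, several neighboring cells can collect the same number of votes for a short line, so the recovered angle is only accurate to within a few bins. That kind of ambiguity is one reason a local maxima rule in the parameter space is useful.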
In this paper, we propose an original approach in order to improve the results of color image segmentation by pixel classification. We define a new kind of color space by selecting a set of color components which can belong to any of the different classical color spaces. Such spaces, which have neither psycho-visual nor physical color significance, are named hybrid color spaces. We propose to classify pixels represented in the hybrid color space which is specifically designed to yield the best discrimination between the pixel classes. This space, which is called the adapted hybrid color space, is built by means of a sequential supervised feature selection scheme. This procedure determines the adapted hybrid color space associated with a given family of images. Its dimension is not always equal to three, as for classical color spaces. The effectiveness of our color segmentation method is assessed in the framework of soccer image analysis. The team of each player is identified by the colors of its soccer suit. The aim of the segmentation procedure is to extract meaningful regions representing the players and to recognize their teams.
We present a statistical approach for parsing football video structures. Based on video production conventions, a new generic structure called attack is identified, which is an equivalent of scene in other video domains. We define four video segments to construct it, namely play, focus, replay and break. Two middle level visual features, play field ratio and zoom size, are also computed. The detection process includes a two-pass classifier, a combination of Gaussian Mixture Model and Hidden Markov Models. A general suffix tree is introduced to identify and organize attack. In experiments, video structure classification accuracy of about 86% is achieved on broadcasting World Cup 2002 video data.
The dominant color descriptor (DCD) is one of the MPEG-7 color descriptors and is widely applied in image retrieval. DCD describes the representative color distributions and features in an image or a region of interest in an effective, compact and intuitive format. A novel image retrieval method based on a fixed-number MPEG-7 dominant color descriptor is proposed. The feature extraction process does not need a threshold value, and the number of dominant colors is fixed at eight. The histogram intersection algorithm is used to compare features, which simplifies the similarity computation. The experimental results show that the precision and recall of this method are higher than those of the non-fixed-number dominant color retrieval method.
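Histogram intersection itself is simple to state in code. The sketch below assumes that both descriptors have already been reduced to the same eight color bins, so that bin i means the same color in both images; how the paper aligns the dominant colors is not described in the abstract, and the sample values are invented.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of the two normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


# Eight-bin descriptors holding pixel counts per dominant color (made-up values).
query = [4, 4, 0, 0, 0, 0, 0, 0]
candidate = [0, 4, 4, 0, 0, 0, 0, 0]
similarity = histogram_intersection(query, candidate)
```

With a fixed number of bins the comparison is a single pass over eight values, which is the reduction in similarity computation that the abstract refers to.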