Article

Finish-line photography system based on multi-scale convolutional neural network deblurring

Abstract

In sports science research, the dynamic non-uniform blur caused by the movement of running athletes is a challenging computer vision problem that seriously affects the judgment accuracy of finish-line photography systems. With the rapid development of deep learning, image preprocessing, object identification, and object classification have been widely studied and applied. This work proposes a multi-scale convolutional neural network for image deblurring to eliminate the dynamic blur that athletes generate during shooting. The network comprises three end-to-end convolutional subnetworks at different scales, which recover athlete images blurred by various factors on the field. The system effectively extracts detailed image edges at each scale, from coarse to fine. Extensive experiments show that this method can deblur images captured by the finish-line photography system in real time and rapidly achieves a better visual effect on dynamic athlete images.
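The coarse-to-fine pipeline described in the abstract can be sketched as an image-pyramid loop: deblur the coarsest scale first, then pass each upsampled estimate to the next finer scale. This is a minimal stand-in in plain Python, assuming a 2x average pyramid and a placeholder blend in place of each trained CNN subnetwork; it illustrates only the data flow, not the paper's actual network.

```python
# Coarse-to-fine multi-scale sketch. The per-scale "subnetwork" here is a
# placeholder blend; in the paper it is a trained convolutional subnetwork.

def downsample(img):
    """2x2 average pooling (assumes even dimensions)."""
    return [[(img[2*i][2*j] + img[2*i][2*j+1] +
              img[2*i+1][2*j] + img[2*i+1][2*j+1]) / 4.0
             for j in range(len(img[0]) // 2)]
            for i in range(len(img) // 2)]

def upsample(img):
    """Nearest-neighbour 2x upsampling."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def subnetwork(blurry, prior):
    """Placeholder for one CNN scale: blend the blurry input with the
    upsampled coarser estimate (a real model would learn this mapping)."""
    return [[0.5 * b + 0.5 * p for b, p in zip(br, pr)]
            for br, pr in zip(blurry, prior)]

def multiscale_deblur(img, levels=3):
    # Build a blur pyramid: index 0 is the finest scale.
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    # Coarse-to-fine pass: each scale refines the upsampled coarser result.
    estimate = pyramid[-1]
    for level in range(levels - 2, -1, -1):
        estimate = subnetwork(pyramid[level], upsample(estimate))
    return estimate
```

The three-scale case (`levels=3`) mirrors the three subnetworks described in the abstract; each scale sees both its own blurry input and the coarser scale's restored output.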

Article
Full-text available
When a table tennis ball is hit by a racket, the ball spins and undergoes a complex trajectory in the air. In this article, a model of a spinning ball is proposed for simulating and predicting the ball flight trajectory including the topspin, backspin, rightward spin, leftward spin, and combined spin. The actual trajectory and rotational motion of a flying ball are captured by three high-speed cameras and then reconstructed using a modified vision tracking algorithm. For the purpose of model validation, the simulated trajectory is compared to the reconstructed trajectory, resulting in a deviation of only 2.42%. Such high modeling accuracy makes this proposed method an ideal tool for developing the virtual vision systems emulating the games that can be used to train table tennis players efficiently.
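The forward model behind such trajectory simulation combines gravity, air drag, and the spin-dependent Magnus force. A toy Euler integration in Python, with illustrative coefficients (not the identified parameters from the paper):

```python
import math

# Toy simulation of a spinning ball under gravity, drag, and the Magnus
# force. Mass is a table tennis ball's; KD and KM are assumed lumped
# coefficients for illustration only.

G = 9.81          # gravity, m/s^2
MASS = 0.0027     # ball mass, kg
KD = 0.0005       # lumped drag coefficient (assumed)
KM = 0.00002      # lumped Magnus coefficient (assumed)

def step(pos, vel, spin, dt=0.001):
    """One Euler step; spin is the angular velocity vector (rad/s)."""
    speed = math.sqrt(sum(v * v for v in vel))
    drag = [-KD * speed * v / MASS for v in vel]
    # Magnus acceleration ~ (spin x vel), scaled by KM.
    magnus = [KM * (spin[1]*vel[2] - spin[2]*vel[1]) / MASS,
              KM * (spin[2]*vel[0] - spin[0]*vel[2]) / MASS,
              KM * (spin[0]*vel[1] - spin[1]*vel[0]) / MASS]
    acc = [drag[0] + magnus[0], drag[1] + magnus[1], drag[2] + magnus[2] - G]
    new_vel = [v + a * dt for v, a in zip(vel, acc)]
    new_pos = [p + v * dt for p, v in zip(pos, new_vel)]
    return new_pos, new_vel

def simulate(vel, spin, steps=300):
    """Integrate from a fixed launch point; returns the final position."""
    pos = [0.0, 0.0, 0.3]
    for _ in range(steps):
        pos, vel = step(pos, vel, spin)
    return pos
```

With these signs, topspin pulls the ball downward relative to a no-spin launch, which is the qualitative behaviour the article's model captures.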
Article
Full-text available
Deep neural networks have achieved impressive successes in fields ranging from object recognition to complex games such as Go. Navigation, however, remains a substantial challenge for artificial agents, with deep neural networks trained by reinforcement learning failing to rival the proficiency of mammalian spatial behaviour, which is underpinned by grid cells in the entorhinal cortex. Grid cells are thought to provide a multi-scale periodic representation that functions as a metric for coding space and is critical for integrating self-motion (path integration) and planning direct trajectories to goals (vector-based navigation). Here we set out to leverage the computational functions of grid cells to develop a deep reinforcement learning agent with mammal-like navigational abilities. We first trained a recurrent network to perform path integration, leading to the emergence of representations resembling grid cells, as well as other entorhinal cell types. We then showed that this representation provided an effective basis for an agent to locate goals in challenging, unfamiliar, and changeable environments, optimizing the primary objective of navigation through deep reinforcement learning. The performance of agents endowed with grid-like representations surpassed that of an expert human and comparison agents, with the metric quantities necessary for vector-based navigation derived from grid-like units within the network. Furthermore, grid-like representations enabled agents to conduct shortcut behaviours reminiscent of those performed by mammals. Our findings show that emergent grid-like representations furnish agents with a Euclidean spatial metric and associated vector operations, providing a foundation for proficient navigation. As such, our results support neuroscientific theories that see grid cells as critical for vector-based navigation, demonstrating that the latter can be combined with path-based strategies to support navigation in challenging environments.
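Path integration itself, the task the recurrent network was trained on, is simply dead-reckoning of position from self-motion signals:

```python
import math

# Minimal path-integration sketch: accumulate position from per-step speed
# and heading, the self-motion signal a path integrator must integrate.

def path_integrate(start, motions):
    """motions: list of (speed, heading_radians) per time step."""
    x, y = start
    for speed, heading in motions:
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
    return x, y
```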
Article
Full-text available
We introduce a new iterative regularization procedure for inverse problems based on the use of Bregman distances, with particular focus on problems arising in image processing. We are motivated by the problem of restoring noisy and blurry images via variational methods, by using total variation regularization. We obtain rigorous convergence results, and effective stopping criteria for the general procedure. The numerical results for denoising appear to give significant improvement over standard models, and preliminary results for deblurring/denoising are very encouraging.
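For a quadratic data term, the Bregman iteration referred to above takes the following standard form (a generic statement of the procedure, with symbols chosen here rather than copied from the paper; J is the regularizer, e.g. total variation):

```latex
% Bregman distance of J at v with subgradient p:
D_J^{p}(u, v) = J(u) - J(v) - \langle p,\, u - v \rangle

% Iterative regularization (p^0 = 0):
u^{k+1} = \arg\min_{u}\; D_J^{p^k}\!\left(u, u^{k}\right)
          + \frac{\lambda}{2}\,\lVert Au - f \rVert_2^2,
\qquad
p^{k+1} = p^{k} - \lambda\, A^{\top}\!\left(Au^{k+1} - f\right)
```

Each iterate re-solves a variational problem with the Bregman distance replacing the raw regularizer, which progressively restores detail lost to over-regularization.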
Conference Paper
Full-text available
We present a novel single image deblurring method to estimate spatially non-uniform blur that results from camera shake. We use existing spatially invariant deconvolution methods in a local and robust way to compute initial estimates of the latent image. The camera motion is represented as a Motion Density Function (MDF) which records the fraction of time spent in each discretized portion of the space of all possible camera poses. Spatially varying blur kernels are derived directly from the MDF. We show that 6D camera motion is well approximated by 3 degrees of motion (in-plane translation and rotation) and analyze the scope of this approximation. We present results on both synthetic and captured data. Our system outperforms current approaches that assume spatially invariant blur.
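The MDF forward model can be sketched as a time-weighted average of pose-warped copies of the sharp image. Restricting poses to integer in-plane translations is a simplification of this sketch (the paper's pose space also includes rotation):

```python
# Synthesizing spatially varying blur from a motion density function (MDF):
# the blurred image is a time-weighted average of the sharp image warped by
# each discretized camera pose. Poses here are integer translations only.

def shift(img, dx, dy):
    """Translate an image with zero padding."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            si, sj = i - dy, j - dx
            if 0 <= si < h and 0 <= sj < w:
                out[i][j] = img[si][sj]
    return out

def blur_from_mdf(img, mdf):
    """mdf: list of ((dx, dy), weight) pairs with weights summing to 1."""
    h, w = len(img), len(img[0])
    acc = [[0.0] * w for _ in range(h)]
    for (dx, dy), wgt in mdf:
        warped = shift(img, dx, dy)
        for i in range(h):
            for j in range(w):
                acc[i][j] += wgt * warped[i][j]
    return acc
```

Inverting this forward model (recovering `img` from `blur_from_mdf`'s output) is the deblurring problem the paper addresses.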
Article
The Winter Olympics are a highly competitive sporting environment where subtle improvements in performance can impact the finishing order in many events. Aerodynamic drag is known to be a significant resistive force to human movement in high-speed sports, such as alpine skiing, speed skating and bobsleigh. Aerodynamic drag also represents an important determinant of performance in sports such as ice hockey, snowboard cross and cross-country skiing. From 2000 to 2018, a series of wind tunnel–based research projects were conducted to provide aerodynamically optimized apparel, equipment and wind tunnel simulation training to elite Canadian and American winter sports athletes involved in bobsleigh, skeleton, luge, ice hockey, speed skating, cross-country, alpine and para-alpine skiing, biathlon, ski-cross and snowboard cross. This article reviews the role of aerodynamic drag in winter sports, considers fundamental principles of air flow around bluff bodies and methods of drag reduction in ice and snow sports, while providing experimental results from an extensive database of wind tunnel investigations. Deficits in the literature suggest productive areas for future research to improve athletic performance in these sports.
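The drag force discussed throughout follows the standard bluff-body equation F_d = ½ ρ C_d A v²; the parameter values in the example below are illustrative, not measured athlete data.

```python
# Aerodynamic drag: F_d = 0.5 * rho * Cd * A * v^2
# rho: air density (kg/m^3), cd: drag coefficient (dimensionless),
# area: frontal area (m^2), v: airspeed (m/s). Returns newtons.

def drag_force(rho, cd, area, v):
    return 0.5 * rho * cd * area * v * v
```

The quadratic dependence on speed is why small drag reductions matter most in the fastest events: doubling speed quadruples the drag force.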
Article
Image deblurring and super-resolution (SR) are computer vision tasks aiming to restore image detail and spatial scale, respectively. Despite significant research effort over the past years, joint image deblurring and SR via deep networks remains challenging. Besides, only a few recent studies address this task, as conventional methods deal with SR or deblurring separately. To address this weakness, we propose a novel network that handles both tasks jointly and in this way greatly boosts SR performance from blurry input. To fully exploit the representation capacity of our model, dual supervised learning is proposed to impose the constraint between low-resolution (LR) and high-resolution (HR) images. Our model consists of three parts: (i) a deblurring module equipped with channel attention residual blocks that removes the fuzziness from input images, (ii) an SR module to super-resolve the image based on the feature maps from the deblurring module serving as input, and (iii) a dual module which exploits the dependencies between LR and HR images. Extensive experiments indicate that the proposed attention dual supervised network (ADSN) not only generates remarkably clear HR images, but also achieves compelling results for the joint image deblurring and SR task.
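The dual module's LR-HR constraint can be illustrated as a consistency loss: downsample the estimated HR image back to LR resolution and compare it with the LR input. Function names below are illustrative, not the authors' ADSN code.

```python
# Dual-supervision consistency sketch: an HR estimate, downsampled back to
# the LR grid, should reproduce the LR input.

def downsample2x(img):
    """2x2 average pooling (assumes even dimensions)."""
    return [[(img[2*i][2*j] + img[2*i][2*j+1] +
              img[2*i+1][2*j] + img[2*i+1][2*j+1]) / 4.0
             for j in range(len(img[0]) // 2)]
            for i in range(len(img) // 2)]

def dual_loss(hr_estimate, lr_input):
    """MSE between the downsampled HR estimate and the LR input."""
    down = downsample2x(hr_estimate)
    n = len(down) * len(down[0])
    return sum((d - l) ** 2
               for dr, lr in zip(down, lr_input)
               for d, l in zip(dr, lr)) / n
```

Minimizing this term alongside the usual SR loss penalizes HR estimates that are inconsistent with the observed LR image.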
Article
Biomechanical analysis has typically been confined to a laboratory setting. While attempts have been made to take laboratory testing into the field, this study was designed to assess whether augmented reality (AR) could be used to bring the field into the laboratory. This study aimed to measure knee load in volleyball players through a jump task incorporating AR while maintaining the perception-action couplings by replicating the visual features of a volleyball court. Twelve male volleyball athletes completed four tasks: drop landing, hop jump, spike jump, and spike jump while wearing AR smart glasses. Biomechanical variables included patellar tendon force, knee moment and kinematics of the ankle, knee, hip, pelvis and thorax. The drop landing showed differences in patellar tendon force and knee moment when compared to the other conditions. The hop jump did not present differences in kinetics when compared to the spike conditions, instead displaying the greatest kinematic differences. As a measure of patellar tendon loading, the AR condition showed a close approximation to the spike jump, with no differences present when comparing landing forces and mechanics. Thus, AR may be used in clinical assessment to better replicate information from the competitive environment.
Article
In sports science research, there are many topics that utilize the body motion of athletes extracted by motion capture system, since motion information is valuable data for improving an athlete’s skills. However, one of the unsolved challenges in motion capture is extraction of athletes’ motion information during the actual game or match, as placing markers on athletes is a challenge during game play. In this research, the authors propose a method for acquisition of motion information without attaching a marker, utilizing computer vision technology. In the proposed method, the three-dimensional world joint position of the athlete’s body can be acquired using just two cameras without any visual markers. Furthermore, the athlete’s three-dimensional joint position during game play can also be obtained without complicated preparations. Camera calibration that estimates the projective relationship between three-dimensional world and two-dimensional image spaces is one of the principal processes for the respective three-dimensional image processing, such as three-dimensional reconstruction and three-dimensional tracking. A strong-calibration method, which needs to set up landmarks with known three-dimensional positions, is a common technique. However, as the target space expands, landmark placement becomes increasingly complicated. Although a weak-calibration method does not need known landmarks, the estimation precision depends on the accuracy of the correspondence between image captures. When multiple cameras are arranged sparsely, sufficient detection of corresponding points is difficult. In this research, the authors propose a calibration method that bridges multiple sparsely distributed cameras using mobile camera images. Appropriate spacing was confirmed between the images through comparative experiments evaluating camera calibration accuracy by changing the number of bridging images. 
Furthermore, the proposed method was applied to multiple capturing experiments in a large-scale space to verify its robustness. As a relevant example, the proposed method was applied to the three-dimensional skeleton estimation of badminton players. Subsequently, a quantitative evaluation was conducted on camera calibration for the three-dimensional skeleton. The reprojection error of each part of the skeletons and standard deviations were approximately 2.72 and 0.81 mm, respectively, confirming that the proposed method was highly accurate when applied to camera calibration. Finally, the proposed calibration method was quantitatively compared with a calibration method using the coordinates of eight manually specified points. In conclusion, the proposed method stabilizes calibration accuracy in the vertical direction of the world coordinate system.
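The reprojection error used in the evaluation above can be sketched as follows: project each 3D point through an estimated camera and measure the distance to the observed 2D detection. The distortion-free, rotation-free pinhole model below is an assumption of this sketch, not the paper's full calibration model.

```python
import math

# Reprojection error for a pinhole camera (no distortion, identity rotation,
# translation t): project each 3D point and measure the pixel distance to
# the observed detection.

def project(point, fx, fy, cx, cy, t):
    """Pinhole projection of a world point after translation t."""
    x, y, z = (p + o for p, o in zip(point, t))
    return (fx * x / z + cx, fy * y / z + cy)

def reprojection_error(points3d, points2d, fx, fy, cx, cy, t):
    """RMS pixel distance between projections and observations."""
    sq = 0.0
    for p3, p2 in zip(points3d, points2d):
        u, v = project(p3, fx, fy, cx, cy, t)
        sq += (u - p2[0]) ** 2 + (v - p2[1]) ** 2
    return math.sqrt(sq / len(points3d))
```

Calibration minimizes exactly this quantity over the camera parameters; the paper reports it per skeleton part after calibration.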
Chapter
Convolutional neural network (CNN) depth is of crucial importance for image super-resolution (SR). However, we observe that deeper networks for image SR are more difficult to train. The low-resolution inputs and features contain abundant low-frequency information, which is treated equally across channels, hence hindering the representational ability of CNNs. To solve these problems, we propose the very deep residual channel attention networks (RCAN). Specifically, we propose a residual in residual (RIR) structure to form a very deep network, which consists of several residual groups with long skip connections. Each residual group contains some residual blocks with short skip connections. Meanwhile, RIR allows abundant low-frequency information to be bypassed through multiple skip connections, making the main network focus on learning high-frequency information. Furthermore, we propose a channel attention mechanism to adaptively rescale channel-wise features by considering interdependencies among channels. Extensive experiments show that our RCAN achieves better accuracy and visual quality than state-of-the-art methods.
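The channel-attention step can be sketched in a few lines: squeeze each channel to its global average, pass it through a gate, and rescale the channel. RCAN learns the gate with a small two-layer bottleneck; a plain sigmoid stands in here as a simplification.

```python
import math

# Minimal channel-attention sketch: squeeze (global average per channel),
# gate (plain sigmoid standing in for RCAN's learned bottleneck), and
# channel-wise rescaling.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(features):
    """features: list of channels, each a flat list of activations."""
    out = []
    for chan in features:
        gate = sigmoid(sum(chan) / len(chan))  # squeeze + gate
        out.append([v * gate for v in chan])   # rescale the whole channel
    return out
```

Because the gate depends on the channel's own statistics, informative channels can be amplified relative to flat, low-frequency ones, which is the mechanism's purpose in RCAN.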
Article
It is a challenging task to recover a high quality image from degraded images. This paper proposes a fast image deblurring algorithm. To deal with the limitations of the proximal Newton splitting scheme, a sparse framework is presented, which is characterized by utilizing the sparse pattern of the approximated inverse Hessian matrix and by relaxing the original assumption of a constant penalty parameter. The proposed framework provides a common update strategy by exploiting second-derivative information. To alleviate the difficulties introduced by the sub-problem of this framework, an approximate solution to the weighted-norm-based primal-dual problem is derived and studied, and its theoretical aspects are investigated. Compared with state-of-the-art methods in several numerical experiments, the proposed algorithm demonstrates improved performance and efficiency.
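Proximal splitting schemes of this kind are built from proximal operators; the classic example, soft-thresholding, is the proximal operator of the scaled l1 norm. This is a generic building block of such solvers, not the paper's full weighted-norm primal-dual method.

```python
# Soft-thresholding: the proximal operator of tau * |x|.
# prox_{tau|.|}(x) = sign(x) * max(|x| - tau, 0)

def soft_threshold(x, tau):
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0
```

Applied elementwise inside an iterative scheme, it shrinks small coefficients to exactly zero, which is what produces sparse solutions.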
Conference Paper
Motion blur due to camera shake is one of the predominant sources of degradation in handheld photography. Single image blind deconvolution (BD) or motion deblurring aims at restoring a sharp latent image from the blurred recorded picture without knowing the camera motion that took place during the exposure. BD is a long-standing problem, but has attracted much attention recently, culminating in several algorithms able to restore photos degraded by real camera motion in high quality. In this paper, we present a benchmark dataset for motion deblurring that allows quantitative performance evaluation and comparison of recent approaches featuring non-uniform blur models. To this end, we record and analyse real camera motion, which is played back on a robot platform such that we can record a sequence of sharp images sampling the six dimensional camera motion trajectory. The goal of deblurring is to recover one of these sharp images, and our dataset contains all information to assess how closely various algorithms approximate that goal. In a comprehensive comparison, we evaluate state-of-the-art single image BD algorithms incorporating uniform and non-uniform blur models.
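Quantitative evaluation on such a benchmark commonly reports PSNR between the restored image and the sharp ground truth; a minimal implementation (the benchmark's actual protocol is richer):

```python
import math

# Peak signal-to-noise ratio in decibels: a standard quantitative score for
# comparing a restored image against the sharp reference.

def psnr(restored, reference, peak=255.0):
    n = len(restored) * len(restored[0])
    mse = sum((r - g) ** 2
              for rr, gr in zip(restored, reference)
              for r, g in zip(rr, gr)) / n
    if mse == 0:
        return float('inf')
    return 10.0 * math.log10(peak * peak / mse)
```

Higher is better; identical images score infinite PSNR, and each halving of RMS error adds about 6 dB.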
Article
As a powerful statistical image modeling technique, sparse representation has been successfully used in various image restoration applications. The success of sparse representation owes to the development of l1-norm optimization techniques and the fact that natural images are intrinsically sparse in some domains. The image restoration quality largely depends on whether the employed sparse domain can represent the underlying image well. Considering that the contents can vary significantly across different images or different patches in a single image, we propose to learn various sets of bases from a precollected dataset of example image patches, and then, for a given patch to be processed, one set of bases is adaptively selected to characterize the local sparse domain. We further introduce two adaptive regularization terms into the sparse representation framework. First, a set of autoregressive (AR) models are learned from the dataset of example image patches. The best fitted AR models to a given patch are adaptively selected to regularize the image local structures. Second, the image nonlocal self-similarity is introduced as another regularization term. In addition, the sparsity regularization parameter is adaptively estimated for better image restoration performance. Extensive experiments on image deblurring and super-resolution validate that by using adaptive sparse domain selection and adaptive regularization, the proposed method achieves much better results than many state-of-the-art algorithms in terms of both PSNR and visual perception.
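Adaptive sparse-domain selection can be illustrated by choosing, per patch, the candidate basis whose best single atom leaves the smallest residual. The tiny orthonormal bases in the example are illustrative stand-ins for the learned sub-dictionaries, not the paper's trained ones.

```python
# Per-patch basis selection sketch: with orthonormal atoms, the residual
# energy after keeping one coefficient c is ||patch||^2 - c^2, so pick the
# basis whose best atom removes the most energy.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def select_basis(patch, bases):
    """bases: dict name -> list of orthonormal atoms (flat lists).
    Returns the name of the basis whose single best atom leaves the
    least residual energy."""
    best_name, best_resid = None, float('inf')
    energy = dot(patch, patch)
    for name, atoms in bases.items():
        best_coef = max(abs(dot(patch, atom)) for atom in atoms)
        resid = energy - best_coef ** 2
        if resid < best_resid:
            best_name, best_resid = name, resid
    return best_name
```

In the paper the selected sub-dictionary then defines the sparse domain in which the patch is coded and regularized.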