Conference Paper

YOLOv4 Pedestrian Target Detection Based on Embedded Platform

Article
Full-text available
This work proposes a new approach to improve swarm intelligence algorithms for dynamic optimization problems by promoting a balance between the transfer of knowledge and the diversity of particles. The proposed method was designed to be applied to the problem of tracking targets in video in environments with nearly constant lighting. This approach also delimits the solution space for a more efficient search. A version of the double exponential smoothing (DES) model that is robust to outliers is used to predict the target position in the frame, delimiting the solution space to a more promising region for target tracking. To assess the quality of the proposed approach, a tracker appropriate for a discrete solution space was implemented using the meta-heuristic Shuffled Frog Leaping Algorithm (SFLA) adapted to dynamic optimization problems, named the Dynamic Shuffled Frog Leaping Algorithm (DSFLA). The DSFLA was compared with other classic and current trackers whose algorithms are based on swarm intelligence. The trackers were compared in terms of the average processing time per frame and the area under the curve of the success rate according to the Pascal metric. For the experiment, we used a random sample of videos obtained from the public Hanyang visual tracker benchmark. The experimental results suggest that the DSFLA has an efficient processing time and higher tracking quality compared with the other competing trackers analyzed in this work. The success rate of the DSFLA tracker is, on average, about 7.2 to 76.6% higher than that of its competitors. The average processing time per frame is at least about 10% lower than that of the competing trackers, except for one tracker, which was about 26% faster than the DSFLA. The results also show that the predictions of the robust DES model are quite accurate.
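
A minimal sketch of the prediction step described above, using standard (Brown's) double exponential smoothing as a one-step-ahead position predictor; this is not the authors' outlier-robust variant, and the smoothing factor and position history are invented:

    # Brown's double exponential smoothing (DES) applied to one image axis of the
    # target position. A robust variant, as in the paper, would downweight outliers.
    def des_predict(xs, alpha=0.5):
        """One-step-ahead forecast for a 1-D sequence of positions."""
        s1 = s2 = xs[0]
        for x in xs[1:]:
            s1 = alpha * x + (1 - alpha) * s1      # first smoothing
            s2 = alpha * s1 + (1 - alpha) * s2     # second smoothing
        a = 2 * s1 - s2
        b = alpha / (1 - alpha) * (s1 - s2)
        return a + b                               # forecast for the next frame

    # Centre-x of the target over recent frames; the forecast can be used to centre
    # a reduced search window for the swarm-based tracker.
    history_x = [120, 124, 129, 133, 138]
    print(des_predict(history_x))                  # about 141
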
Article
Full-text available
Manual microscopic examination of Leishman/Giemsa stained thin and thick blood smears is still the “gold standard” for malaria diagnosis. One of the drawbacks of this method is that its accuracy, consistency, and diagnosis speed depend on microscopists’ diagnostic and technical skills. It is difficult to find highly skilled microscopists in remote areas of developing countries. To alleviate this problem, in this paper, we investigate state-of-the-art one-stage and two-stage object detection algorithms for automated malaria parasite screening from microscopic images of thick blood slides. YOLOV3 and YOLOV4 models, which are state-of-the-art object detectors in accuracy and speed, are not optimized for detecting small objects such as malaria parasites in microscopic images. We modify these models by increasing the feature scale and adding more detection layers to enhance their capability of detecting small objects without notably decreasing detection speed. We propose one modified YOLOV4 model, called YOLOV4-MOD, and two modified YOLOV3 models, called YOLOV3-MOD1 and YOLOV3-MOD2. In addition, new anchor box sizes are generated using the K-means clustering algorithm to exploit the potential of these models in small object detection. The performance of the modified YOLOV3 and YOLOV4 models was evaluated on a publicly available malaria dataset. These models achieve state-of-the-art accuracy, exceeding the performance of their original versions, Faster R-CNN, and SSD in terms of mean average precision (mAP), recall, precision, F1 score, and average IOU. YOLOV4-MOD achieved the best detection accuracy among all the models, with a mAP of 96.32%. YOLOV3-MOD2 and YOLOV3-MOD1 achieved mAPs of 96.14% and 95.46%, respectively. The experimental results of this study demonstrate that the performance of the modified YOLOV3 and YOLOV4 models is highly promising for detecting malaria parasites from images captured by a smartphone camera over the microscope eyepiece. The proposed system is suitable for deployment in low-resource settings.
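
A minimal sketch of anchor-box generation by K-means with an IoU-based distance, as popularised by the YOLO family; the box sizes and k below are illustrative, not the values used in the paper:

    import numpy as np

    def iou_wh(boxes, anchors):
        """IoU between boxes (N, 2) and anchors (K, 2), assuming a shared corner."""
        inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
                np.minimum(boxes[:, None, 1], anchors[None, :, 1])
        union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[:, 0] * anchors[:, 1] - inter
        return inter / union

    def kmeans_anchors(boxes, k=3, iters=100, seed=0):
        rng = np.random.default_rng(seed)
        anchors = boxes[rng.choice(len(boxes), k, replace=False)]
        for _ in range(iters):
            assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest = highest IoU
            for j in range(k):
                if np.any(assign == j):
                    anchors[j] = boxes[assign == j].mean(axis=0)
        return anchors[np.argsort(anchors.prod(axis=1))]        # sorted by area

    # Toy (width, height) pairs of annotated small objects, in pixels.
    wh = np.array([[12, 14], [15, 18], [30, 34], [28, 40], [60, 62], [10, 11]], float)
    print(kmeans_anchors(wh, k=3))
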
Article
Full-text available
Novel drivetrain concepts such as electric direct drives can improve vehicle dynamic control due to the faster, more accurate, and more flexible generation of wheel-individual propulsion and braking torques. Exact and robust estimation of the vehicle state of motion in the presence of unknown disturbances, such as changes in road conditions, is crucial for the realization of such control systems. This article presents the design, tuning, implementation, and test of a state estimator with individual tire model adaption for direct drive electric vehicles. The vehicle dynamics are modeled using a double-track model with an adaptive tire model. State-of-the-art sensors (an inertial measurement unit together with steering angle, wheel speed, and motor current sensors) are used to provide the measurements. Due to the nonlinearity of the vehicle model, an Unscented Kalman Filter (UKF) is used for simultaneous state and parameter estimation. To simplify the difficult task of UKF tuning, an optimization-based method using real-vehicle data is utilized. The UKF is implemented on an electronic control unit and tested with real-vehicle data in a hardware-in-the-loop simulation. High precision is achieved even in severe driving maneuvers under various road conditions. Nonlinear state and parameter estimation for all-wheel-drive electric vehicles using a UKF and optimization-based tuning is shown to provide high precision with minimal manual tuning effort.
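
A minimal UKF sketch on a toy constant-velocity model, standing in for (and far simpler than) the paper's double-track vehicle model with tire adaption; it relies on the third-party filterpy package, and all noise values and measurements are arbitrary:

    import numpy as np
    from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

    dt = 0.01                         # assumed 100 Hz update rate

    def fx(x, dt):                    # process model: position integrates velocity
        return np.array([x[0] + dt * x[1], x[1]])

    def hx(x):                        # measurement model: only position is observed
        return np.array([x[0]])

    points = MerweScaledSigmaPoints(n=2, alpha=1e-3, beta=2.0, kappa=0.0)
    ukf = UnscentedKalmanFilter(dim_x=2, dim_z=1, dt=dt, fx=fx, hx=hx, points=points)
    ukf.x = np.array([0.0, 1.0])      # initial position and velocity
    ukf.P *= 0.1
    ukf.R = np.array([[0.05]])        # measurement noise
    ukf.Q = np.eye(2) * 1e-4          # process noise

    for z in [0.011, 0.019, 0.032, 0.041]:   # synthetic position measurements
        ukf.predict()
        ukf.update(np.array([z]))
    print(ukf.x)                      # estimated position and velocity
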
Article
Full-text available
The aim is to better recognize human behaviors from surveillance videos. Human behavior recognition technology based on surveillance videos is studied, given the trend toward intelligent processing of massive, full-coverage surveillance video data. This technology builds a human behavior detection and recognition model using the new Single Shot MultiBox Detector (SSD) algorithm, which improves recognition accuracy. The constructed model’s effectiveness is verified through comparisons with other traditional human behavior recognition algorithms via the TensorFlow framework. Results demonstrate that the accuracy of the SSD model-based recognition algorithm is significantly higher than that of the Direct Part Marking and Fast Convolutional Neural Network (CNN) algorithms. SSD’s average speed is 0.146 s/frame, and its average accuracy on different datasets is 82.8%. Even if the target is close or partially occluded, the SSD algorithm can accurately detect the central target, and its detection efficiency is twice that of the R-CNN algorithm. The proposed algorithm has a simple structure and a fast processing speed, and can address the problems of target detection. The research results can provide a theoretical basis for research on target detection related to human behavior recognition.
Article
Full-text available
Detecting concealed cracks in asphalt pavement is a challenging task because the locations of these cracks are not visible at the surface. This study proposes an effective method to automatically recognize and locate concealed cracks based on 3-D ground penetrating radar (GPR) and deep learning models. Using a 3-D GPR and a filtering process, a dataset was constructed comprising 303 GPR images and 1306 cracks. You Only Look Once (YOLO) models were then introduced as the deep learning models for detecting concealed cracks from GPR data. The results reveal that the proposed method is feasible for the detection of concealed cracks. Compared with YOLO version 3, YOLO version 4 (YOLOv4) and YOLO version 5 (YOLOv5) both achieve clear improvements, even on a small dataset. The fastest YOLOv4 model reaches a detection speed of 10.16 frames per second using only a mid-range CPU, and the best mAP of the YOLOv5 models is up to 94.39%. In addition, the YOLOv4 models show better robustness than the YOLOv5 models and can accurately distinguish between concealed cracks and pseudo cracks.
Article
Full-text available
Surface quality inspection and control are extremely important in electronics manufacturing. The use of machine vision technology to automatically detect product defects has become an indispensable means for better quality control. A machine vision-based surface quality inspection system is usually composed of two processes: image acquisition and automatic defect detection. In this paper, we propose a deep learning-based approach for defect detection in Copper Clad Laminate (CCL) images acquired from an industrial CCL production line. In the proposed approach, a new convolutional neural network (CNN) that realizes fast defect detection while maintaining high accuracy is designed. Our approach makes four contributions. First, we introduce depthwise separable convolution to reduce the calculation time. Second, we improve the squeeze-and-excitation block to improve network performance. Third, we introduce the squeeze-and-expand mechanism to reduce the computation cost. Fourth, we employ a smoother activation function (Mish) to allow improved information flow. The proposed network is compared with benchmark CNNs (including Inception, ResNet and MobileNet). The experimental results show that, compared with the benchmark networks, our proposed network achieves the best accuracy and the second-best speed. Therefore, our method has been integrated into an industrial CCL production line to guide the online rejection of defective products.
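
A minimal PyTorch sketch of the building blocks named above, a depthwise separable convolution followed by a squeeze-and-excitation gate with Mish activations; it requires a recent PyTorch (nn.Mish), and the channel sizes and reduction ratio are illustrative, not the paper's:

    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        def __init__(self, channels, reduction=4):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)                  # squeeze
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.Mish(),
                nn.Linear(channels // reduction, channels), nn.Sigmoid())  # excite

        def forward(self, x):
            b, c, _, _ = x.shape
            w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
            return x * w                                         # channel re-weighting

    class DepthwiseSeparableBlock(nn.Module):
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch, bias=False)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.Mish()
            self.se = SEBlock(out_ch)

        def forward(self, x):
            return self.se(self.act(self.bn(self.pointwise(self.depthwise(x)))))

    x = torch.randn(1, 32, 64, 64)                    # dummy feature map
    print(DepthwiseSeparableBlock(32, 64)(x).shape)   # torch.Size([1, 64, 64, 64])
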
Article
Full-text available
Most sound imaging instruments are currently used as measurement tools that can provide quantitative data; however, a uniform method to directly and comprehensively evaluate the results of combining acoustic and optical images is not available. Therefore, in this study, we define a localization error index for sound imaging instruments and propose an acoustic phase cloud map evaluation method based on an improved YOLOv4 algorithm to directly and objectively evaluate the sound source localization results of a sound imaging instrument. The evaluation method begins with image augmentation of acoustic phase cloud maps obtained from different tests of a sound imaging instrument to produce the dataset required for training the convolutional network. Subsequently, we combine DenseNet with existing clustering algorithms to improve the YOLOv4 algorithm and train the neural network for easier feature extraction. The trained neural network is then used to localize the target sound source and its pseudo-color map in the acoustic phase cloud map to obtain a pixel-level localization error. Finally, a standard chessboard grid is used to obtain the proportional relationship between the size of the acoustic phase cloud map and the actual physical distance; the true lateral and longitudinal positioning errors of the sound imaging instrument can then be obtained. Experimental results show that the mean average precision of the improved YOLOv4 algorithm in acoustic phase cloud map detection is 96.3%, the F1-score is 95.2%, and the detection speed is up to 34.6 fps. The improved algorithm can rapidly and accurately determine the positioning error of a sound imaging instrument and can thus be used to analyze and evaluate its positioning performance.
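
A minimal sketch of the final pixel-to-physical conversion via a chessboard of known square size; all numbers below (square size, pixel span, pixel errors) are invented:

    # Proportional conversion from pixel-level localisation error to physical error.
    square_size_mm = 25.0      # physical side length of one chessboard square
    square_size_px = 48.0      # side length of the same square measured in the image
    mm_per_px = square_size_mm / square_size_px

    err_px_lateral, err_px_longitudinal = 6.2, 4.8   # detected vs. reference offsets
    print(f"lateral error ≈ {err_px_lateral * mm_per_px:.1f} mm, "
          f"longitudinal error ≈ {err_px_longitudinal * mm_per_px:.1f} mm")
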
Article
Full-text available
The sustainability of ornamental crop production is of increasing concern to both producers and consumers. As resources become more limited, it is important for greenhouse growers to reduce production inputs such as water and chemical fertilizers, without sacrificing crop quality. Plant growth promoting rhizobacteria (PGPR) can stimulate plant growth under resource-limiting conditions by enhancing tolerance to abiotic stress and increasing nutrient availability, uptake, and assimilation. PGPR are beneficial bacteria that colonize the rhizosphere, the narrow zone of soil in the vicinity of the roots that is influenced by root exudates. In this study, in vitro experiments were utilized to screen a collection of 44 Pseudomonas strains for their ability to withstand osmotic stress. A high-throughput greenhouse experiment was then utilized to evaluate selected strains for their ability to stimulate plant growth under resource-limiting conditions when applied to ornamental crop production systems. The development of a high-throughput greenhouse trial identified two pseudomonads, P. poae 29G9 and P. fluorescens 90F12-2, that increased petunia flower number and plant biomass under drought and low-nutrient conditions. These two strains were validated in a production-scale experiment to evaluate the effects on growth promotion of three economically important crops: Petunia × hybrida, Impatiens walleriana, and Viola × wittrockiana. Plants treated with the two bacteria strains had greater shoot biomass than untreated control plants when grown under low-nutrient conditions and after recovery from drought stress. Bacteria treatment resulted in increased flower numbers in drought-stressed P. hybrida and I. walleriana. In addition, bacteria-treated plants grown under low-nutrient conditions had higher leaf nutrient content compared to the untreated plants. Collectively, these results show that the combination of in vitro and greenhouse experiments can efficiently identify beneficial Pseudomonas strains that increase the quality of ornamental crops grown under resource-limiting conditions.
Article
Full-text available
Distracted driver action is the main cause of road traffic crashes, threatening the security of human life and public property. Based on the observation that cues (like the hand holding a cigarette) reveal what the driver is doing, a driver action recognition model is proposed, called the deformable and dilated Faster R-CNN (DD-RCNN). Our approach utilizes the detection of motion-specific objects to classify driver actions exhibiting great intra-class differences and inter-class similarity. First, deformable and dilated residual blocks are designed to extract features of action-specific RoIs that are small in size and irregular in shape (such as cigarettes and cell phones). Attention modules are embedded in the modified ResNet to reweight features in the channel and spatial dimensions. Then, a region proposal optimization network (RPON) is presented to reduce the number of RoIs entering the R-CNN and improve model efficiency. Lastly, the RoI pooling module is replaced with a deformable one, and the simplified R-CNN without a regression layer is trained as the final classifier. Experiments show that DD-RCNN achieves state-of-the-art results on the Kaggle driving dataset and a self-built dataset.
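
A minimal PyTorch sketch of a dilated residual block of the kind described above for enlarging the receptive field over small action-specific objects; the deformable-convolution and attention parts of the paper's block are omitted, and the channel count and dilation rate are illustrative:

    import torch
    import torch.nn as nn

    class DilatedResidualBlock(nn.Module):
        def __init__(self, channels, dilation=2):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation, bias=False),
                nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation, bias=False),
                nn.BatchNorm2d(channels))
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.relu(x + self.body(x))       # identity shortcut keeps resolution

    feat = torch.randn(1, 64, 56, 56)
    print(DilatedResidualBlock(64)(feat).shape)      # torch.Size([1, 64, 56, 56])
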
Article
Full-text available
Vehicle detection is a challenging task in computer vision. In recent years, numerous vehicle detection methods have been proposed. Since vehicles in a scene may have varying sizes, and the vehicle (foreground) and background samples in a scene may be imbalanced, the performance of vehicle detection is affected. To obtain better performance, a multi-scale vehicle detection method is proposed in this paper by improving YOLOv2. The main contributions of this paper are: (1) a new anchor box generation method, Rk-means++, is proposed to enhance the adaptation to varying sizes of vehicles and achieve multi-scale detection; (2) Focal Loss is introduced into YOLOv2 for vehicle detection to reduce the negative influence on training resulting from the imbalance between vehicles and background. The experimental results on the Beijing Institute of Technology (BIT)-Vehicle public dataset demonstrate that the proposed method obtains better performance on vehicle localization and recognition than other existing methods.
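
A minimal sketch of the binary focal loss used to counter the foreground-background imbalance, in its generic form with the commonly used alpha and gamma defaults, which are not necessarily the paper's settings:

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
        """logits and targets have the same shape; targets are 0/1 labels."""
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p_t = torch.exp(-bce)                        # probability of the true class
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        return (alpha_t * (1 - p_t) ** gamma * bce).mean()

    logits = torch.tensor([2.0, -1.5, 0.3])
    labels = torch.tensor([1.0, 0.0, 1.0])
    print(focal_loss(logits, labels))                # easy examples are down-weighted
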
Article
Full-text available
Herein, a novel adaptive dynamic programming (ADP) algorithm is developed to solve the optimal tracking control problem of discrete-time multi-agent systems. In contrast to the classical policy iteration ADP algorithm with two components, policy evaluation and policy improvement, a two-stage policy iteration algorithm is proposed to obtain the iterative control laws and the iterative performance index functions. The proposed algorithm contains a sub-iteration procedure to calculate the iterative performance index functions in the policy evaluation step. Convergence proofs for the iterative performance index functions and the iterative control laws are provided. The stability of the closed-loop error system is also established. Further, an actor-critic neural network (NN) is used to approximate both the iterative control laws and the iterative performance index functions. The actor-critic NN can implement the developed algorithm online without knowledge of the system dynamics. Finally, simulation results are provided to illustrate the performance of our method.
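
For orientation only, a minimal tabular policy iteration sketch showing the two classical components (policy evaluation and policy improvement) that the two-stage scheme above refines; the toy MDP is invented and unrelated to the paper's multi-agent ADP setting:

    import numpy as np

    P = np.array([                      # P[s, a, s']: transition probabilities
        [[0.9, 0.1], [0.2, 0.8]],
        [[0.6, 0.4], [0.1, 0.9]]])
    R = np.array([[1.0, 0.0], [0.5, 2.0]])   # R[s, a]: immediate reward
    gamma = 0.9

    policy = np.zeros(2, dtype=int)
    for _ in range(50):
        # policy evaluation: solve V = R_pi + gamma * P_pi V for the current policy
        P_pi = P[np.arange(2), policy]
        R_pi = R[np.arange(2), policy]
        V = np.linalg.solve(np.eye(2) - gamma * P_pi, R_pi)
        # policy improvement: act greedily with respect to the action-value function
        Q = R + gamma * P @ V
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    print(policy, V)
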
Article
Full-text available
Pose detection of small targets under poor imaging conditions, such as heavy occlusion and low resolution, is still an open and challenging task in computer vision. For instance, detecting students' poses in classrooms, where targets are sometimes indistinguishable even to the human eye, remains a rather difficult task. Motivated by the success of convolutional feature merging and locality preserving, the authors propose a pose detection framework combining merged region of interest (ROI) pooling and locality preserving learning. Unlike usual object detection algorithms, which use general top-level convolutional features as inputs, their method uses a merged ROI pooling structure to merge the semantic feature and the high-resolution feature from the last two levels of convolutional feature maps, so that the merged feature is more expressive than a single-level feature. In addition, locality-preserving feature learning is used in the last fully-connected layer. Through locality preserving learning, features belonging to the same class are forced to be closer in the feature space, which gives the model stronger classification ability. Experimental results show that the proposed method outperforms state-of-the-art methods.
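
A minimal sketch of the merged ROI pooling idea: the top-level semantic map is upsampled, concatenated with the higher-resolution map below it, and ROIs are pooled from the merged map; shapes, stride and the ROI are illustrative, and torchvision's roi_align stands in for the pooling operator:

    import torch
    import torch.nn.functional as F
    from torchvision.ops import roi_align

    c4 = torch.randn(1, 256, 38, 38)          # higher-resolution feature map
    c5 = torch.randn(1, 512, 19, 19)          # top-level semantic feature map

    c5_up = F.interpolate(c5, size=c4.shape[-2:], mode="nearest")
    merged = torch.cat([c4, c5_up], dim=1)    # (1, 768, 38, 38) merged feature

    # One ROI given as (batch_index, x1, y1, x2, y2) in input-image coordinates;
    # spatial_scale maps image coordinates onto the feature grid (stride 16 assumed).
    rois = torch.tensor([[0, 32.0, 48.0, 160.0, 200.0]])
    pooled = roi_align(merged, rois, output_size=(7, 7), spatial_scale=1.0 / 16)
    print(pooled.shape)                       # torch.Size([1, 768, 7, 7])
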
Article
Full-text available
Deep convolutional neural networks have shown great potential in the field of human action recognition. To obtain compact and discriminative feature representations, this paper proposes multiple pooling strategies based on CNN features. We explore three different pooling strategies, called space-time feature pooling (STFP), time filter pooling (TFP) and spatio-temporal pyramid pooling (STPP). STFP shares the advantages of both hand-crafted features and deep ConvNet features. TFP reflects the change of elements on each CNN feature map over time. STPP focuses on the spatial and temporal pyramid structure of the feature maps. We aggregate these pooled features to produce a new discriminative video descriptor. Experimental results show that the three strategies have complementary advantages on the challenging YouTube, UCF50 and UCF101 datasets, and that our video representation is comparable to previous state-of-the-art algorithms.
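
A minimal sketch of a simple spatio-temporal pyramid pooling over per-frame CNN feature maps; the pyramid levels, segment count and tensor sizes are illustrative rather than the exact STPP design:

    import torch
    import torch.nn.functional as F

    def spatio_temporal_pyramid_pool(frames, levels=(1, 2), segments=2):
        """frames: (T, C, H, W) per-frame feature maps -> 1-D video descriptor."""
        descriptor = []
        for chunk in torch.chunk(frames, segments, dim=0):        # temporal segments
            for level in levels:
                pooled = F.adaptive_avg_pool2d(chunk, level)      # (t, C, level, level)
                descriptor.append(pooled.mean(dim=0).flatten())   # average over time
        return torch.cat(descriptor)

    feats = torch.randn(8, 512, 7, 7)         # 8 frames of conv-layer features
    print(spatio_temporal_pyramid_pool(feats).shape)   # 2 segments * (1 + 4) * 512 = 5120
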
Article
The use of object detection algorithms has become extremely important in autonomous vehicles. Object detection at high accuracy and a fast inference speed is essential for safe autonomous driving. Therefore, the balance between the effectiveness and efficiency of the object detector must be considered. This paper proposes a one-stage object detection framework, based on YOLOv4, for improving detection accuracy while supporting true real-time operation. The backbone network in the proposed framework is CSPDarknet53_dcn(P). The last output layer in CSPDarknet53 is replaced with deformable convolution to improve the detection accuracy. To perform feature fusion, a new feature fusion module, PAN++, is designed, and five-scale detection layers are used to improve the detection accuracy of small objects. In addition, this paper proposes an optimized network pruning algorithm to address the problem that the real-time requirement cannot otherwise be satisfied due to the limited computing resources of the vehicle-mounted computing platform. The method of sparse scaling factors is used to improve the existing channel pruning algorithm. Compared to YOLOv4, YOLOv4-5D improves the mean average precision by 4.23% on the BDD dataset and 1.68% on the KITTI dataset. Finally, by pruning the model, the inference speed of YOLOv4-5D is increased by 31.3% and the memory footprint is only 98.1 MB, while the detection accuracy is almost unchanged. The proposed algorithm is capable of real-time detection at faster than 66 frames per second (fps) and shows higher accuracy than previous approaches with a similar fps.
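
A minimal sketch of channel selection by BatchNorm scaling factors, the "network slimming" idea that sparse-scaling-factor channel pruning builds on; the tiny model and the 50% prune ratio are illustrative only:

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU())

    prune_ratio = 0.5
    gammas = torch.cat([m.weight.detach().abs()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, prune_ratio)   # global threshold over all BN layers

    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            keep = m.weight.detach().abs() > threshold
            print(f"layer {name}: keep {int(keep.sum())}/{keep.numel()} channels")
            # a real pruner would rebuild the adjacent convolutions with only the
            # kept channels (after L1-sparsity training on the gammas) and fine-tune
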
Article
Unmanned Aerial Vehicles (UAVs) are promising technologies for many application scenarios, including human detection in search and rescue and surveillance use cases, which have received considerable attention worldwide. However, adverse conditions, such as varying altitude, overhead camera placement, changing illumination and a moving platform, impose challenges for high-performance yet cost-efficient human detection. To overcome these challenges, we propose a novel combination of dilated convolutions with a Path Aggregation Network (PAN) as a new deep neural network-based human detection algorithm that runs in real time. Furthermore, we establish a comprehensive human detection dataset with varying backgrounds, illuminations, and contrast, and train the proposed machine-learning model on the collected dataset. Our approach achieves both high precision (88.0% mean Average Precision (mAP)) and real-time speed (67.0 Frames Per Second (FPS)) on a commercial off-the-shelf PC platform. In terms of accuracy, the result is comparable to the standard You Only Look Once v3 (YOLOv3), while the speed is twice that of the standard YOLOv3. YOLOv4 is slightly more accurate (89.8%) than our approach but slower (38.0 versus 67.0 FPS) and requires more Billion Floating-Point Operations (BFLOPS). The proposed algorithm has also been trained on the VisDrone2019 dataset and compared with seven studies using this dataset; the results further validate the effectiveness of the proposed approach. Moreover, the algorithm has been evaluated on an embedded system (Jetson AGX Xavier), which demonstrates the usefulness of this method on power-constrained devices. The proposed algorithm is fast, memory efficient, and computationally inexpensive while achieving high detection performance. It is expected to contribute significantly to the wider use of UAV applications, including search and rescue missions to locate missing people and surveillance, particularly for applications running on resource-constrained platforms such as smartphones or tablets. The proposed system is now being used in the aerial drone system of Police Scotland to help locate missing and vulnerable people. The results of the project were broadcast by BBC Scotland.
Article
Mixed sample augmentation (MSA) has witnessed great success in the research area of semi-supervised learning (SSL); it is performed by mixing two training samples as an augmentation strategy to effectively smooth the training space. Following the insights on the efficacy of cut-mix in particular, we propose FMixCut, an MSA that combines Fourier space-based data mixing (FMix) and the proposed Fourier space-based data cutting (FCut) for labeled and unlabeled data augmentation. Specifically, for the SSL task, our approach first generates soft pseudo-labels using the model’s previous predictions. The model is then trained to penalize the outputs of the FMix-generated samples so that they are consistent with their mixed soft pseudo-labels. In addition, we propose FCut, a new Cutout-based data augmentation strategy that adopts the two masked sample pairs from FMix for weighted cross-entropy minimization. Furthermore, by implementing two regularization techniques, namely batch label distribution entropy maximization and sample confidence entropy minimization, we further boost the training efficiency. Finally, we introduce a dynamic labeled–unlabeled data mixing (DDM) strategy to further accelerate the convergence of the model. Combining the above components, we call our SSL approach “FMixCutMatch”, FMCmatch for short. As a result, the proposed FMCmatch achieves state-of-the-art performance on CIFAR-10/100, SVHN and Mini-Imagenet across a variety of SSL conditions with the CNN-13, WRN-28-2 and ResNet-18 networks. In particular, our method achieves a 4.54% test error on CIFAR-10 with 4K labels under the CNN-13 and a 41.25% Top-1 test error on Mini-Imagenet with 10K labels under the ResNet-18. Our code for reproducing these results is publicly available at https://github.com/biuyq/FMixCutMatch.
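
A minimal sketch of an FMix-style binary mask obtained by thresholding low-frequency Fourier noise; the image size, decay exponent and mixing proportion are illustrative:

    import numpy as np

    def fourier_mask(h, w, lam=0.5, decay=3.0, seed=0):
        rng = np.random.default_rng(seed)
        fy = np.fft.fftfreq(h)[:, None]
        fx = np.fft.rfftfreq(w)[None, :]
        freq = np.sqrt(fy ** 2 + fx ** 2)
        spectrum = (rng.standard_normal((h, w // 2 + 1)) +
                    1j * rng.standard_normal((h, w // 2 + 1)))
        spectrum /= np.maximum(freq, 1.0 / max(h, w)) ** decay   # damp high frequencies
        gray = np.fft.irfft2(spectrum, s=(h, w))
        threshold = np.quantile(gray, 1 - lam)
        return (gray > threshold).astype(np.float32)             # roughly lam ones

    mask = fourier_mask(32, 32, lam=0.3)
    # mixed = mask[..., None] * img_a + (1 - mask[..., None]) * img_b
    print(mask.mean())                                           # close to 0.3
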
Article
Porphyrins are important molecules widely found in nature in the form of enzyme active sites and visible light absorption units. Recent interest in using these functional molecules as building blocks for the construction of metal–organic frameworks (MOFs) has rapidly increased due to the ease with which the locations of, and the distances between, the porphyrin units can be controlled in these porous crystalline materials. Porphyrin-based MOFs with atomically precise structures provide an ideal platform for the investigation of their structure–function relationships in the solid state without compromising accessibility to the inherent properties of the porphyrin building blocks. This review provides a historical overview of the development and applications of porphyrin-based MOFs, from early studies focused on design and structures to recent efforts on their utilization in biomimetic catalysis, photocatalysis, electrocatalysis, sensing, and biomedical applications.
Article
Traditional methods of breeding soft-shell crabs mainly rely on manual identification, which incurs high manpower and resource costs. Manual inspection may also interfere with the crabs’ molting, causing molting failure and possibly even death, which is costly and inefficient. This paper combines an improved YOLOv3 algorithm with an adaptive dark-channel defogging algorithm to realize real-time detection of whether a swimming crab in a single-crab basket-culture system is molting. To learn more features, affine and rotation transformations and local occlusion are used to augment the training data, simulating the identification difficulties caused by occlusion and by molting occurring under distorted viewing conditions in real culture environments. A k-means++ clustering algorithm is used to obtain prior boxes matching the size of the carapace throughout the entire breeding cycle, thereby improving the Intersection over Union (IOU). The identification network has its structure pruned and its non-maximum suppression function modified to increase speed and accuracy; the improved network can recognize and give early warning of the early stage of molting. The precision of the improved model in clean water reaches 100%, and the running speed is 31 FPS. In turbid water, where the prediction confidence falls below the cut-in threshold of the defogging algorithm (set to 0.8), the precision of the improved model is over 91%, and the speed is still maintained at about 7 FPS.
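
A minimal sketch of the dark-channel prior underlying the defogging step; the patch size and omega are the commonly cited defaults rather than the paper's adaptive settings, and the input frame is synthetic:

    import cv2
    import numpy as np

    def dark_channel(img, patch=15):
        """img: HxWx3 float image in [0, 1] -> HxW dark channel."""
        min_rgb = img.min(axis=2).astype(np.float32)
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
        return cv2.erode(min_rgb, kernel)                # local minimum filter

    def estimate_transmission(img, omega=0.95, patch=15):
        dark = dark_channel(img, patch)
        # atmospheric light: mean colour of the brightest ~0.1% dark-channel pixels
        idx = np.argsort(dark.reshape(-1))[-max(1, dark.size // 1000):]
        A = img.reshape(-1, 3)[idx].mean(axis=0)
        t = 1.0 - omega * dark_channel(img / A, patch)
        return t, A

    frame = np.random.rand(240, 320, 3).astype(np.float32)   # stand-in for a turbid frame
    t, A = estimate_transmission(frame)
    dehazed = (frame - A) / np.clip(t, 0.1, 1.0)[..., None] + A   # usual haze removal
    print(dehazed.shape)
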
Article
This study addresses the tracking–learning–detection (TLD) algorithm for long-term single-target tracking of a moving vehicle in video streams. The problems leading to tracking failures in existing TLD methods are identified, and an improved TLD (ITLD) tracking algorithm is proposed that is more robust to object occlusion and illumination variation. A square root cubature Kalman filter (SRCKF) is employed in the tracker of the TLD to predict the position of the object when occlusion occurs. In addition, this study introduces the fast retina keypoint (FREAK) feature into the tracker to alleviate the instability caused by illumination or scale variation. The overlap comparison and the normalised cross-correlation coefficient (NCC) are introduced into the integrator of the TLD to obtain reliable bounding boxes with improved tracking precision. Experiments are conducted to compare the performance of state-of-the-art trackers and the proposed method, using the object tracking benchmark that includes 50 video sequences (OTB-50) and the TLD datasets. The experimental results show that the proposed ITLD outperforms the compared trackers in both tracking accuracy and robustness. The proposed method can track a moving vehicle even when it is temporarily totally occluded.
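
A minimal sketch of the normalised cross-correlation (NCC) score used in the integrator to compare a candidate patch with a stored template; equal-sized grayscale patches are assumed and the example data are random:

    import numpy as np

    def ncc(patch, template, eps=1e-8):
        a = patch.astype(np.float64) - patch.mean()
        b = template.astype(np.float64) - template.mean()
        return float((a * b).sum() / (np.sqrt((a ** 2).sum() * (b ** 2).sum()) + eps))

    rng = np.random.default_rng(0)
    template = rng.random((24, 24))
    perturbed = template + 0.05 * rng.random((24, 24))       # slightly changed copy
    print(ncc(perturbed, template))                          # close to 1.0
    print(ncc(rng.random((24, 24)), template))               # near 0 for unrelated patch
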
Article
The present experiment sought to further understanding of the effects of personalised audiovisual stimuli on psychological and psychophysiological responses during exercise in adults with obesity. Twenty-four participants (M_age = 28.3, SD = 5.5 years; M_BMI = 32.2, SD = 2.4) engaged in self-paced exercises on a recumbent cycle ergometer and three conditions (sensory stimulation [ST], sensory deprivation [DE], and control [CO]) were administered. Perceptual (attentional focus and perceived exertion), affective (affective state and perceived activation), and psychophysiological (heart rate variability) parameters were monitored throughout the exercise bouts. A one-way repeated measures analysis of variance was used to compare self-reported and psychophysiological variables (main and interaction effects [5 Timepoints × 3 Conditions]). The results indicate that ST increased the use of dissociative thoughts throughout the exercise session (ηp² = .19), ameliorated fatigue-related symptoms (ηp² = .15) and elicited more positive affective responses (ηp² = .12) than CO and DE. Accordingly, personally-compiled videos are highly effective in ameliorating exertional responses and enhancing affective valence during self-paced exercise in adults with obesity. Audiovisual stimuli could be used during the most critical periods of the exercise regimen (e.g., first training sessions) when individuals with obesity are more likely to focus on fatigue-related sensations.
A New Approach to Enhanced Swarm Intelligence Applied to Video Target Tracking
  • Castro E.C.D.
  • Salles
Recognizing human behaviors from surveillance videos using the SSD algorithm
  • Pan H.S.
  • Li Y.Z.
  • Zhao D.Z.
Action Recognition Using Multiple Pooling Strategies of CNN Features
  • Hu H.F.
  • Liao
  • Xiao Z.K.
Pose detection in complex classroom environment based on improved Faster R-CNN
  • Tang
  • Gao L.
Driver action recognition using deformable and dilated faster R-CNN with optimized region proposals
  • Lu M.Q.
  • Hu Y.C.
  • Lu X.B.
Detection of concealed cracks from ground penetrating radar images based on deep learning algorithm
  • Li S.W.
  • Gu X.Y.
  • Dong Q.
Effects of audiovisual stimuli on psychological and psychophysiological responses during exercise in adults with obesity
  • Bigliassi M.
  • Greca J.P.A.
  • Altimari L.R.
YOLOv4-5D: An Effective and Efficient Object Detector for Autonomous Driving
  • Cai
  • Luan T.Y.
  • Li Z.X.
Multi-Scale Vehicle Detection for Foreground-Background Class Imbalance with Improved YOLOv2
  • Wu Z.Y.
  • Sang J.
  • Xia X.F.
Moving vehicle tracking based on improved tracking–learning–detection algorithm
  • Dong E.Z.
  • Deng M.T.
  • Du S.Z.
Study on the Evaluation Method of Sound Phase Cloud Maps Based on an Improved YOLOv4 Algorithm
  • Zhu Q.F.
  • Zheng H.F.
  • Guo S.X.
A deep learning-based approach for the automated surface inspection of copper clad laminate images
  • Zheng X.Q.
  • Chen J.
  • Kong Y.G.