May 2025
Publications (697)
April 2025 · 17 Reads
The Visual Computer
Visual object tracking is crucial for numerous applications ranging from smartphones to autonomous vehicles. However, the impact of input noise on tracking performance remains underexplored. This paper presents a lightweight neural network module designed to enhance the robustness of 2D tracking methods against various types of noise. By performing image-to-image translation, the proposed robust tracking module (RTM) standardizes the operational space of tracking algorithms, thereby improving their resilience. Experimental results on benchmark datasets demonstrate the effectiveness of RTM in mitigating performance degradation caused by noise. Additionally, we introduce an evaluation toolkit that facilitates the assessment of tracking robustness against common noise types. The source code of the proposed method is available at https://github.com/iason1907/RTM.
April 2025 · 25 Reads
March 2025 · 2 Reads
Signal Processing Image Communication
February 2025 · 3 Reads
February 2025 · 1 Read · 2 Citations
IEEE Transactions on Image Processing
Stereo matching has emerged as a cost-effective solution for road surface 3D reconstruction, garnering significant attention towards improving both computational efficiency and accuracy. This article introduces decisive disparity diffusion (D3Stereo), marking the first exploration of dense deep feature matching that adapts pre-trained deep convolutional neural networks (DCNNs) to previously unseen road scenarios. A pyramid of cost volumes is initially created using various levels of learned representations. Subsequently, a novel recursive bilateral filtering algorithm is employed to aggregate these costs. A key innovation of D3Stereo lies in its alternating decisive disparity diffusion strategy, wherein intra-scale diffusion is employed to complete sparse disparity images, while inter-scale inheritance provides valuable prior information for higher resolutions. Extensive experiments conducted on our created UDTIRI-Stereo and Stereo-Road datasets underscore the effectiveness of the D3Stereo strategy in adapting pre-trained DCNNs and its superior performance compared to all other explicit programming-based algorithms designed specifically for road surface 3D reconstruction. Additional experiments conducted on the Middlebury dataset with backbone DCNNs pre-trained on the ImageNet database further validate the versatility of the D3Stereo strategy in tackling general stereo matching problems. Our source code and supplementary material are publicly available at https://mias.group/D3-Stereo.
January 2025 · 6 Reads
January 2025
January 2025 · 5 Reads · 7 Citations
IEEE Transactions on Instrumentation and Measurement
Feature-fusion networks with duplex encoders have proven to be an effective technique to solve the road freespace detection problem. However, despite the compelling results achieved by previous research efforts, the exploration of adequate and discriminative heterogeneous feature fusion, as well as the development of fallibility-aware loss functions, remains relatively scarce. This article makes several significant contributions to address these limitations: 1) it presents a novel heterogeneous feature fusion block, comprising a holistic attention module, a heterogeneous feature contrast descriptor, and an affinity-weighted feature recalibrator, enabling a more in-depth exploitation of the inherent characteristics of the extracted features, 2) it incorporates both inter-scale and intra-scale skip connections into the decoder architecture, while eliminating redundant ones, leading to both improved accuracy and computational efficiency, and 3) it introduces two fallibility-aware loss functions that separately focus on semantic-transition and depth-inconsistent regions, collectively contributing to greater supervision during model training. Our proposed SNE-RoadSegV2, which incorporates all these innovative components, demonstrates superior performance in comparison to all other freespace detection algorithms across multiple public datasets.
January 2025
Citations (49)
... Advancements in machine intelligence and autonomous systems have dramatically fueled the integration of environmental perception technologies into daily life and various industries [1][2][3][4][5]. This widespread adoption is prominently seen in applications such as autonomous cars [6], smart wheelchairs [7], and unmanned ground vehicles [8]. Recently, researchers have shifted their focus toward enhancing both driving safety and comfort [9,10]. ...
- Citing Article
January 2025
IEEE Transactions on Instrumentation and Measurement
... Deep learning has made remarkable advancements in fields such as autonomous driving [1]-[3], with real-time object detection technologies, exemplified by YOLOs and DETRs, gaining widespread adoption [4], [5]. However, due to factors such as light attenuation, color distortion, and difficulty distinguishing targets from coral reefs, mud, and other underwater structures, the development of high-performance real-time underwater object detection (UOD) has been relatively slow [6]. ...
- Citing Article
February 2025
IEEE Transactions on Image Processing
... It is widely accepted that there are three major components impacting inference performance: Efficiency, Consistency and Accuracy [42][43][44]. Efficiency refers to how quickly the model processes and returns responses. It is influenced by (a) Response Time (i.e., the time taken for the model to generate a response after receiving a query); (b) Server Busy Rate (SBR, i.e., the proportion of queries that cannot be processed due to system overload or resource constraints). ...
- Citing Conference Paper
October 2024
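The two efficiency components named in the excerpt above (response time and Server Busy Rate) can be illustrated with a minimal Python sketch. It is not taken from the citing paper: the function name `efficiency_metrics` and the convention that a rejected query is recorded as `None` are assumptions made here for illustration only.

```python
from typing import List, Optional, Tuple


def efficiency_metrics(
    response_times: List[Optional[float]],
) -> Tuple[Optional[float], float]:
    """Return (mean response time of served queries, server busy rate).

    Assumption: each entry is the response time in seconds of one query,
    or None when the query could not be processed (overload or resource
    limits). This encoding is hypothetical, not the citing paper's.
    """
    if not response_times:
        raise ValueError("at least one query is required")

    served = [t for t in response_times if t is not None]
    rejected = len(response_times) - len(served)

    # Mean response time is only defined over queries that were served.
    mean_time = sum(served) / len(served) if served else None
    # SBR: proportion of all queries that could not be processed.
    sbr = rejected / len(response_times)
    return mean_time, sbr
```

For example, `efficiency_metrics([0.2, 0.4, None, 0.6])` treats one of four queries as rejected, so the busy rate is 0.25.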
... Neural networks are a powerful and flexible tool for forecasting [19][20][21]. When determining what exactly should be predicted, it is necessary to specify the variables that are analyzed and used in the forecasting process. ...
- Citing Article
- Full-text available
September 2024
IEEE Transactions on Artificial Intelligence
... Patsiouras et al. [3] ...
- Citing Conference Paper
- Full-text available
September 2023
... A few leading countries have been incorporating UAV swarms in their military attacks [6,7] and they have been proven to be extremely effective. UAV swarms have also been used in the film industry for cinematography purposes [8,9] and in the entertainment sector to create appealing drone shows that grab the attention of tourists [10,11], such as in Dubai. The practical implications of securing UAV operations extend across diverse sectors, showcasing the transformative potential of this technology when effectively safeguarded against threats such as GPS spoofing. ...
- Citing Article
- Full-text available
August 2023
... A typical case where NMS struggles is when it operates on images depicting objects in complex scenes with multiple occlusions [17]. To overcome the limitations of existing approaches, such as background complexity, increased computational time, lower accuracy rates, and intensive calculation protocols [18][19][20], the proposed model uses the Deep Pliable YOLOv5 model for aircraft detection on the aircraft and airbus dataset, with an Adaptive Spatial Pooling layer that obtains features via a parallelised method and is more advanced than the traditional pooling methods used in image recognition. ...
- Citing Conference Paper
- Full-text available
April 2023
... Parallel implementations have significantly improved computational efficiency on embedded GPU architectures, demonstrating speedups of more than 40 times while maintaining detection accuracy [29]. Neural attention-driven techniques have emerged to better handle occlusions in pedestrian detection by jointly processing geometric and visual properties through sequence-to-sequence formulations [30]. For overlapping object scenarios, adaptive NMS strategies dynamically adjust IoU thresholds based on object density, substantially enhancing performance for weed detection applications [31]. ...
- Citing Article
- Full-text available
April 2023
IEEE Transactions on Image Processing
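The density-adaptive NMS idea mentioned in the excerpt above can be sketched in a few lines of Python. This is one common formulation, not the method of any cited work: the suppression threshold for each kept box is raised to that box's density score when the density exceeds a base threshold. The per-box `densities` input, the function names, and the `max(base, density)` rule are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def adaptive_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    densities: Sequence[float],
    base_threshold: float = 0.5,
) -> List[int]:
    """Greedy NMS with a per-box IoU threshold.

    Assumption: densities[i] in [0, 1] estimates how crowded the
    neighbourhood of box i is. Crowded boxes get a higher threshold,
    so heavily overlapping neighbours are less likely to be suppressed.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        threshold = max(base_threshold, densities[best])
        # Drop remaining boxes that overlap the kept box beyond its threshold.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= threshold]
    return keep
```

With all densities set to zero, this reduces to standard greedy NMS at `base_threshold`.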
... Zhang et al. proposed a multidimensional scaling method based on the Wasserstein-Fourier distance to classify complex time series from a frequency domain perspective. A fast multidimensional scaling method on big geospatial data using neural networks is presented by Mademlis et al., where sampling a small subset of the original dataset is conducted (Mademlis et al., 2023). Multidimensional scaling is also combined with principal component analysis to analyze varietal sedimentary provenance data (Vermeesch et al., 2023), and it is used to investigate and compare the similarity of writing prompts in the IELTS and TOEFL iBT tests (Khademi, 2023). ...
- Citing Article
- Full-text available
May 2023
Earth Science Informatics
... In early approaches, the summarization component was composed of LSTM units that estimated the frames' importance according to their temporal dependence (thus indicating the most significant video parts for inclusion in the summary), while the reconstruction of the video based on the specified summary was performed using trainable auto-encoders (Mahasseni et al., 2017; Apostolidis et al., 2019; Yuan et al., 2020) that in some cases were combined with tailored attention mechanisms (Jung et al., 2019; Apostolidis et al., 2020; Kanafani et al., 2021). In more recent methods, the selection of the most important frames or fragments for the summary was assisted by trainable Actor-Critic models (Apostolidis et al., 2021a; Alexoudi et al., 2023), self-attention mechanisms (He et al., 2019; Jung et al., 2020; Liang et al., 2022), spatio-temporal networks (Wu et al., 2021) or knowledge distillation mechanisms (Sreeja & Kovoor, 2022). A less popular approach for unsupervised video summarization is based on the definition of hand-crafted reward functions about specific properties of the generated summary, and the use of the computed rewards for training video summarization architectures based on reinforcement learning. ...
- Citing Conference Paper
- Full-text available
April 2023