Kaoru Hirota’s research while affiliated with Beijing Institute of Technology and other places


Publications (561)


Gaussian and Non-Gaussian Noise Effects on Data-Driven Modeling: A Comparative Investigation
  • Chapter

April 2025

Bemnet Wondimagegnehu Mersha · Yaping Dai · Kaoru Hirota



Learning Sequential Variation Information for Dynamic Facial Expression Recognition

April 2025 · 1 Read

IEEE Transactions on Neural Networks and Learning Systems

A multiscale sequence information fusion (MSSIF) method is presented for dynamic facial expression recognition (DFER) in video sequences. It exploits multiscale information by integrating features from individual frames, subsequences, and entire sequences through a transformer-based architecture. This hierarchical feature fusion process includes deep feature extraction at the frame level to capture intricate visual details, intrasubsequence fusion using self-attention mechanisms for analyzing adjacent frames, and intersubsequence fusion to synthesize long-term emotional dynamics across time scales. The efficacy of MSSIF is demonstrated through extensive evaluation on three video datasets: eNTERFACE’05, BAUM-1s, and AFEW, where it achieves overall recognition accuracies of 60.1%, 60.7%, and 58.8%, respectively. These results substantiate MSSIF’s superior performance in accurately recognizing facial expressions by managing short- and long-term dependencies within video sequences, making it a potent tool for real-world applications requiring nuanced dynamic facial expression detection.




Figures: outline of the proposed X-ray detection system; time-domain signals recorded by (a) the first and (b) the second detector set up in Fig. 1; frequency-domain signals from (a) the first and (b) the second detector; example of a signal decomposed into approximation and detail parts using DWT; optimal features selected by the SA model.

Towards an intelligent integrated methodology for accurate determination of volume percentages in three-phase flow systems
  • Article
  • Full-text available

March 2025 · 10 Reads

Accurate determination of volume percentages in three-phase fluids is paramount for the success of various industrial processes, ranging from oil and gas production to chemical engineering. This study presents a comprehensive approach to this challenge by leveraging advanced signal processing techniques and machine learning paradigms. Our methodology integrates the time, frequency, and wavelet transform features extracted from X-ray-based measurement systems whose structure consists of an X-ray tube source, two sodium iodide detectors, and a test pipe, all of which were simulated using the Monte Carlo N Particle code. The amalgamation of these features provides a rich representation of the fluid composition that captures both temporal and spectral characteristics. To enhance the discriminative power of the features, we employ a simulated annealing algorithm to strategically reduce their dimensionality and select pertinent features. The simulated annealing unit systematically evaluates the contribution of each feature to predictive accuracy. Further, through iterative elimination and re-evaluation, the algorithm refines the feature set, retaining only those with the highest relevance to the three-phase fluid composition. This feature selection process optimises the performance of subsequent machine learning models, streamlining the input space for enhanced interpretability and efficiency. Finally, to determine the volume percentages, we employ a support vector regression (SVR) model, which is trained on the refined dataset and can handle complex relationships and high-dimensional data. The proposed approach demonstrates superior accuracy in determining volume percentages of three-phase fluids compared to traditional methods, thereby making it an effective and integrated technique to analyse fluid composition in a variety of industrial settings and applications.
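The feature selection step described in this abstract can be illustrated with a minimal simulated annealing sketch. This is not the authors' implementation: the function names, the cooling schedule, and the toy scoring function are all assumptions made for illustration. In the setting the abstract describes, the score would come from the predictive accuracy of a regression model (for example, a cross-validated SVR) trained on the selected features.

```python
import math
import random


def anneal_feature_subset(n_features, score, n_iter=2000,
                          t_start=1.0, t_end=1e-3, seed=0):
    """Search for a boolean feature mask that maximises score(mask).

    Hypothetical helper: the geometric cooling schedule and the
    single-bit-flip neighbourhood are illustrative choices.
    """
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(n_iter):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(1, n_iter - 1))
        # Neighbour: toggle one randomly chosen feature on or off.
        candidate = current[:]
        idx = rng.randrange(n_features)
        candidate[idx] = not candidate[idx]
        cand_score = score(candidate)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp), which shrinks as temp falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
    return best, best_score


# Toy stand-in for model accuracy (an assumption): features 1, 4 and 7
# are informative, and every other selected feature costs 0.5.
RELEVANT = {1, 4, 7}


def toy_score(mask):
    hits = sum(1 for i, on in enumerate(mask) if on and i in RELEVANT)
    extras = sum(1 for i, on in enumerate(mask) if on and i not in RELEVANT)
    return hits - 0.5 * extras


mask, value = anneal_feature_subset(10, toy_score)
selected = [i for i, on in enumerate(mask) if on]
print(selected, value)
```

Passing the scorer in as a callable keeps the search loop independent of the downstream model, so the toy score could be swapped for a real validation metric without changing the annealing code.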




Dynamic Multi-Population Mutation Architecture-Based Equilibrium Optimizer and Its Engineering Application

February 2025 · 13 Reads · 1 Citation

To strengthen the population diversity and search capability of the equilibrium optimizer (EO), a dynamic multi-population mutation architecture-based equilibrium optimizer (DMMAEO) is proposed. Firstly, a dynamic multi-population guidance mechanism is constructed to enhance population diversity. Secondly, a dynamic Gaussian mutation-based sub-population concentration updating mechanism is introduced to strengthen exploitation ability. Finally, a dynamic Cauchy mutation-based sub-population equilibrium candidate generation mechanism is integrated to boost exploration ability. The optimization ability of DMMAEO is assessed through a comparison with several recent promising algorithms on 58 test functions (including 29 representative test functions and 29 CEC2017 test functions). The comparison results reveal that DMMAEO outperforms the compared algorithms in seeking the global optimum. DMMAEO is further employed in addressing six engineering design problems and a UGV multi-target path planning problem. The results show the practicality of DMMAEO in addressing engineering application tasks. The aforementioned numerical optimization and engineering application experimental results show that the three enhancement mechanisms of DMMAEO improve the optimization ability of the canonical EO, and that DMMAEO is competitive in tackling various kinds of complex numerical optimization and engineering application problems.
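The abstract pairs Gaussian mutation with exploitation and Cauchy mutation with exploration. The sketch below shows why that pairing is common in metaheuristics: Gaussian steps stay close to the parent, while the heavy tails of the Cauchy distribution occasionally produce very long jumps. It is a generic illustration under assumed names, scales, and bounds, and does not reproduce the DMMAEO update rules, which the abstract does not specify.

```python
import math
import random


def gaussian_step(rng, scale):
    """Zero-mean Gaussian perturbation: mostly small, local moves."""
    return rng.gauss(0.0, scale)


def cauchy_step(rng, scale):
    """Cauchy perturbation via inverse-CDF sampling: heavy tails
    give occasional very large moves."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def mutate(position, step_fn, rng, scale, lower, upper):
    """Perturb each coordinate and clip it to [lower, upper].

    Hypothetical helper; the bounds and the scale are assumptions.
    """
    return [min(upper, max(lower, x + step_fn(rng, scale)))
            for x in position]


rng = random.Random(42)
parent = [0.0, 1.0, -2.0]
exploit_child = mutate(parent, gaussian_step, rng, 0.1, -5.0, 5.0)
explore_child = mutate(parent, cauchy_step, rng, 0.1, -5.0, 5.0)

# Compare the largest step each distribution produces at unit scale.
gauss_steps = [abs(gaussian_step(rng, 1.0)) for _ in range(10000)]
cauchy_steps = [abs(cauchy_step(rng, 1.0)) for _ in range(10000)]
print(max(gauss_steps), max(cauchy_steps))
```

In a multi-population scheme of the kind the abstract outlines, one sub-population could use the Gaussian operator to refine good candidates while another uses the Cauchy operator to escape local optima.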


Citations (52)


... Gaussian methods have a wide range of applications [14][15][16][17][18] in a variety of areas. Gaussian blur has been widely applied in image processing [19]; however, it has not been applied to the face generation field. ...

Reference:

Continuous Talking Face Generation Based on Gaussian Blur and Dynamic Convolution
Dynamic Multi-Population Mutation Architecture-Based Equilibrium Optimizer and Its Engineering Application

... To indefinitely protect multimedia content, both techniques are used for secure content sharing [28]. Their combination is therefore increasing [29][30][31][32]. However, due to the large volume of multimedia data, the efficiency of content sharing is low if the encryption procedure is conducted in the spatial domain. ...

Dual medical image watermarking using SRU-enhanced network and EICC chaotic map

Complex & Intelligent Systems

... Recently, state-of-the-art deep learning algorithms have been proposed for various classification and segmentation tasks (24)(25)(26)(27)(28). These algorithms include convolutional neural networks (CNNs) (29,30), Transformer (31)(32)(33), quantum-enhanced deep learning (34)(35)(36), and ensemble learning (37,38). Among these algorithms, CNNs are some of the most widely used methods and are capable of building strong non-linear relationships between training data and labels (39)(40)(41). ...

Review of medical image processing using quantum-enabled algorithms

Artificial Intelligence Review

... Knapova et al. analyzed survey data from 1,135 participants in industrial environments using regression models to explore the relationship between physical activity intent and actual behavior, providing a basis for improvements in ergonomic design [27]. Li et al. developed a personal information fuzzy emotional inference model (BDFEI) based on deep fusion networks for understanding emotional intent and recognizing emotional behavior in human-computer interaction [28]. Dino et al. conducted a survey study in a Filipino community senior center and constructed a theoretical model of how behavior intent influences elderly physical exercise [29]. ...

Broad-deep network-based fuzzy emotional inference model with personal information for intention understanding in human–robot interaction
  • Citing Article
  • January 2024

Annual Reviews in Control

... 4.2.1 Ensemble learning. Ensemble learning can enhance the accuracy and robustness of dynamic MaE recognition through frame aggregation, which can combine temporal dynamic features across multiple frames and frame-level emotion classification results. In deep dynamic MaE recognition, frame aggregation can be implemented at the decision level [91,92] and the feature level [132,134,152,157]. The decision-level frame aggregation integrates the classification results of individual frames in the form of class probability vectors. For instance, Kahou et al. [92] explored the decision-level frame aggregation by using averaging and expansion for deep dynamic MaE recognition. ...

Adaptive key-frame selection-based facial expression recognition via multi-cue dynamic features hybrid fusion
  • Citing Article
  • March 2024

Information Sciences
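The excerpt for this entry describes decision-level frame aggregation as combining per-frame class probability vectors. A minimal sketch of the averaging variant follows. The function name and the example probabilities are made up for illustration, and the sketch is not taken from any of the cited works.

```python
def aggregate_by_averaging(frame_probs):
    """Average per-frame class probability vectors and return the
    mean vector together with the index of the winning class.

    Hypothetical helper: assumes every frame has the same number of
    classes and that at least one frame is given.
    """
    n_frames = len(frame_probs)
    n_classes = len(frame_probs[0])
    mean = [sum(p[c] for p in frame_probs) / n_frames
            for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    return mean, label


# Three frames, three emotion classes (illustrative numbers only).
frames = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
mean_probs, predicted = aggregate_by_averaging(frames)
print(mean_probs, predicted)
```

Averaging lets a confident majority of frames outvote a single outlier frame, which is one reason it serves as a simple baseline before feature-level fusion is considered.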

... The rapid progress of the Internet of Medical Things (IoMT), wireless communications, and artificial intelligence has propelled smart healthcare into a paramount role in the everyday lives of individuals [1]. This progress advances healthcare toward a patient-centered remote diagnosis model [2]. With the continuous promotion of telemedicine, an increasing number of medical images, including computed tomography (CT), magnetic resonance (MR), ultrasound, and X-ray scans, are being transmitted over public networks [3]. ...

A disease diagnosis system for smart healthcare based on fuzzy clustering and battle royale optimization
  • Citing Article
  • December 2023

Applied Soft Computing

... On the other hand, starting from the pioneering work in [20], neuro-computing engines are appearing more and more interesting from different points of view. In the meantime, emotional and affective computing has enabled the move from fully rational AI to humanized AI [21]. Great effort by the scientific community is devoted to emotional and affective experience in many fields and contexts, primarily to better understand and model intentions. ...

A broad-deep fusion network-based fuzzy emotional intention inference model for teaching validity evaluation
  • Citing Article
  • November 2023

Information Sciences

... Implementation has resulted in a 65% reduction in payment delays and an 84% decrease in reconciliation queries across healthcare providers [12]. attempts by 91% compared to basic authentication methods [13]. ...

Insights into security and privacy issues in smart healthcare systems based on medical images
  • Citing Article
  • November 2023

Journal of Information Security and Applications

... Multimodal Data Fusion: Deep learning methods implement early, late, or hybrid fusion techniques to combine speech, text, and visual data into multimodal data, which ultimately supports the development of advanced artificial intelligence systems with more elaborate context adaptation [19,20]. ...

A review of multimodal emotion recognition from datasets, preprocessing, features, and fusion methods
  • Citing Article
  • October 2023

Neurocomputing

... This enables the analysis of the human skeleton in both temporal and spatial dimensions. This method effectively extracts spatial features from the topological graph for learning purposes, exhibiting wide-ranging potential applications in behavior recognition, action prediction, video understanding, intelligent monitoring, pedestrian tracking, human-computer interaction, and various other fields [13][14][15][16]. ...

Shuffle Graph Convolutional Network for Skeleton-Based Action Recognition
  • Citing Article
  • September 2023

Journal of Advanced Computational Intelligence and Intelligent Informatics