Article

Neural Networks for Pattern Recognition.

Taylor & Francis
Journal of the American Statistical Association
Authors: Christopher M. Bishop

... An artificial neural network [1,2] is commonly defined as a function N(x, w), where the vector x stands for the input pattern and the vector w (weight vector) represents the vector of parameters for this particular network. This function is used in classification and regression problems, and the training procedure refers to the adaptation of w by minimizing the so-called training error, defined as E(w) = Σ_{i=1}^{M} [N(x_i, w) − y_i]² (1), where the set {(x_i, y_i)}, i = 1, . . . , M stands for the training set used in the training process. ...
... Due to the widespread use of artificial neural networks, a significant range of methods has been developed that identify the optimal parameter vector by minimizing Equation (1). This set of methods includes the Backpropagation method [14,15], the RPROP method [16], the Adam optimizer [17], etc. ...
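As a minimal illustration of the training error in Equation (1) and its minimization by a gradient-based method (a generic sketch with made-up data, not code from the cited works), a tiny one-hidden-layer network can be trained by plain gradient descent:
```python
# Sketch only: minimize E(w) = sum_i (N(x_i, w) - y_i)^2 by gradient descent
# for a small one-hidden-layer network N(x, w). Data and sizes are invented.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # M = 100 training patterns
y = np.sin(X).sum(axis=1)                      # synthetic targets

W1 = rng.normal(scale=0.1, size=(3, 8))        # hidden-layer weights
W2 = rng.normal(scale=0.1, size=(8,))          # output weights

def forward(X):
    h = np.tanh(X @ W1)                        # hidden activations
    return h, h @ W2                           # network output N(x, w)

eta = 0.01                                     # learning rate
for _ in range(500):
    h, out = forward(X)
    err = out - y                              # residuals
    # gradients of E(w) with respect to W2 and W1 (chain rule)
    gW2 = 2 * h.T @ err
    gW1 = 2 * X.T @ ((err[:, None] * W2) * (1 - h**2))
    W2 -= eta * gW2 / len(X)
    W1 -= eta * gW1 / len(X)

print("training error:", float(np.sum((forward(X)[1] - y) ** 2)))
```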
Article
Full-text available
Artificial neural networks have proven to be an important machine learning model that has been widely used in recent decades to tackle a number of difficult classification or data fitting problems within real-world areas. Due to their significance, several techniques have been developed to efficiently identify the parameter vectors for these models. These techniques usually come from the field of optimization and, by minimizing the training error of artificial neural networks, can estimate the vector of their parameters. However, these techniques often either get trapped in the local minima of a training error or lead to overfitting in the artificial neural network, resulting in poor performance when applied to data that were not present during the training process. This paper presents an innovative training technique for artificial neural networks based on the differential evolution optimization method. This new technique creates an initial population of artificial neural networks that evolve, as well as periodically applies a local optimization technique in order to accelerate the training of these networks. The application of the local minimization technique was performed in such a way as to avoid the phenomenon of overfitting. This new method was successfully applied to a series of classification and data fitting problems, and a comparative study was conducted with other training techniques from the relevant literature.
... ANN, on the other hand, provides a flexible framework and a better insight into the dynamical aspects of the problem can be gained using a properly trained ANN model. If the parameters under question are considered to be random variables then the output of the ANN model approaches conditional expectation of the observed parameter conditioned on the input vector (Bishop 1995) provided a large number of samples have been used. The large deviation compensates for the small probability of occurrence. ...
... Since the lag correlations have been calculated at most two years in advance, the first two years have been kept out only as the input and not as output. Overfitting was taken care of as suggested in Haykin (2002) and Bishop (1995). During the 'training' phase, estimation and validation sets were employed, and the training was interrupted every 10 iterations to assess the performance on new data. ...
... During the 'training' phase, estimation and validation sets were employed, and the training was interrupted every 10 iterations to assess the performance on new data. To prevent overfitting (Haykin 2002;Bishop 1995), the training was finally terminated when the error on the estimation set and the validation set began to converge and the error on the validation set stopped increasing or decreasing (Haykin 2002). This allowed the network to generalise (Haykin 2002) when 'test' data was presented to it. ...
Article
Full-text available
In light of the importance of the formation of dipoles in the Indian Ocean (IO), it becomes pertinent to investigate whether or not such events are inherently predictable. The authors investigate if the formation of a dipole is the result of local weather events or that of the dynamics of the system that generates the sea surface temperature (SST) time series. In the present study, artificial neural network prediction errors in different temporal regions have been analysed to answer the question for the 1997 event. It is found that the phenomenon was a consequence of the state of the SST system as a whole together with the evolution laws. As El‐Nino and intraseasonal oscillations (ISO) are believed to have forced the formation of the 1997 dipole, the prediction errors are also analysed to statistically investigate such possibility. It is concluded that the ISO may provide the stochastic forcing to the Indian Ocean dipole (IOD) which is in agreement with the observations made by dynamical modelling of the system. The model is further evaluated for categorical forecast skills to forecast the anomalous points. The analysis shows that the model is capable of forecasting the anomalous points in the SST time series and that the dipole formation is a result of the deterministic laws governing the IO SST time series.
... Regularization reduces variance by increasing bias, following the bias-variance trade-off [138]. Effective regularizers minimize variance while limiting bias impact [139]. ...
... Early stopping also acts as a regularization method, especially with the MSE function [139]. Here, the term 1/(ηt), where η denotes the learning rate and t is the iteration index, serves as the regularization parameter, with the number of effective non-zero weights increasing during training. ...
... Introducing jitter to input data enhances generalization by serving as a form of smoothing regularization, with the noise variance acting as the regularization parameter [139,141]. This technique helps prevent overfitting in large networks that lack sufficient constraints. ...
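A brief sketch of two of the regularizers discussed in these excerpts, early stopping and input jitter, is shown below; the use of scikit-learn and the specific settings are illustrative assumptions, not taken from the cited works.
```python
# Sketch: early stopping monitors a held-out validation split, while "jitter"
# adds zero-mean Gaussian noise to the inputs, its variance acting as the
# regularization parameter. Dataset and hyperparameters are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Early stopping: hold out 20% of the training data and stop once the
# validation score has not improved for 10 consecutive epochs.
clf = MLPClassifier(hidden_layer_sizes=(50,), early_stopping=True,
                    validation_fraction=0.2, n_iter_no_change=10,
                    max_iter=1000, random_state=0)

# Input jitter: perturb each training pattern with Gaussian noise.
sigma = 0.1  # noise standard deviation; its square is the regularization parameter
X_jittered = X + np.random.default_rng(0).normal(scale=sigma, size=X.shape)

clf.fit(X_jittered, y)
print("stopped after", clf.n_iter_, "epochs")
```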
Article
Full-text available
Machine learning has become indispensable across various domains, yet understanding its theoretical underpinnings remains challenging for many practitioners and researchers. Despite the availability of numerous resources, there is a need for a cohesive tutorial that integrates foundational principles with state-of-the-art theories. This paper addresses the fundamental concepts and theories of machine learning, with an emphasis on neural networks, serving as both a foundational exploration and a tutorial. It begins by introducing essential concepts in machine learning, including various learning and inference methods, followed by criterion functions, robust learning, discussions on learning and generalization, model selection, bias–variance trade-off, and the role of neural networks as universal approximators. Subsequently, the paper delves into computational learning theory, with probably approximately correct (PAC) learning theory forming its cornerstone. Key concepts such as the VC-dimension, Rademacher complexity, and empirical risk minimization principle are introduced as tools for establishing generalization error bounds in trained models. The fundamental theorem of learning theory establishes the relationship between PAC learnability, Vapnik–Chervonenkis (VC)-dimension, and the empirical risk minimization principle. Additionally, the paper discusses the no-free-lunch theorem, another pivotal result in computational learning theory. By laying a rigorous theoretical foundation, this paper provides a comprehensive tutorial for understanding the principles underpinning machine learning.
... Various studies have been conducted in the literature to mitigate the negative effects of visual stimuli on system users. For instance, in [45], a BCI system operating at high frequencies (56-70) was proposed to reduce the sensation of flicker caused by vibrating stimuli. The system was tested with low-frequency (26-40) stimuli. ...
... Linear discriminant analysis (LDA) is a commonly used machine-learning algorithm for distinguishing and classifying groups of attributes within a dataset. The algorithm aims to maximize the differences between attribute groups while minimizing the variations within each class [68]. In addition to classification, LDA is used for dimensionality reduction. ...
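For illustration only (the dataset and settings are assumptions, not the study's pipeline), LDA can be used both as a classifier and as a supervised dimensionality-reduction step:
```python
# Sketch: LDA maximizes between-class separation while minimizing within-class
# variation, and can also project the data onto at most (n_classes - 1) axes.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis(n_components=2)   # classifier + 2-D projection
lda.fit(X_tr, y_tr)
print("test accuracy:", lda.score(X_te, y_te))
print("reduced shape:", lda.transform(X_te).shape)  # (n_samples, 2)
```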
Article
Full-text available
Nowadays, brain-computer interface (BCI) systems are frequently used to connect individuals who have lost their mobility with the outside world. These BCI systems enable individuals to control external devices using brain signals. However, these systems have certain disadvantages for users. This paper proposes a novel approach to minimize the disadvantages of visual stimuli on the eye health of system users in BCI systems employing visual evoked potential (VEP) and P300 methods. The approach employs moving objects with different trajectories instead of visual stimuli. It uses a light-emitting diode (LED) with a frequency of 7 Hz as a condition for the BCI system to be active. The LED is assigned to the system to prevent it from being triggered by any involuntary or independent eye movements of the user. Thus, the system user will be able to use a safe BCI system with a single visual stimulus that blinks on the side without needing to focus on any visual stimulus through moving balls. Data were recorded in two phases: when the LED was on and when the LED was off. The recorded data were processed using a Butterworth filter and the power spectral density (PSD) method. In the first classification phase, which was performed for the system to detect the LED in the background, the highest accuracy rate of 99.57% was achieved with the random forest (RF) classification algorithm. In the second classification phase, which involves classifying moving objects within the proposed approach, the highest accuracy rate of 97.89% and an information transfer rate (ITR) value of 36.75 (bits/min) were achieved using the RF classifier.
... Its use allows the generation of systems that offer assistance in diagnosing diseases, predictions in financial and advertising services, and improvements and automation in the industry [6], [7], [8]. The use of neural networks for pattern recognition allows for the performance of classification tasks on large amounts of data, for which automated processes are required [9] and optimized with previously cataloged data. The combination of texture feature extraction techniques with neural networks has shown promising results in various applications [10], showing the ability of these methodologies to adjust to data of different natures. ...
Article
Full-text available
In this paper, a methodology for real-time image classification on multimedia platforms has been developed. For this purpose, six feedforward neural network models were trained with images from two databases, which were preprocessed by three texture extraction methods: local binary pattern-uniform (LBP-U), gray level cooccurrence matrix (GLCM), and wavelet image scattering (WIS). The databases used consist of 157,448 images of the sections with the thumbnails of the platform content (mosaics) representing 14 classes and 38,214 images with the descriptions of the available content (descriptors) representing 11 classes, where all images have a resolution of 1280x720 pixels. The six models (three for mosaics and three for descriptors) were validated with images from the databases, which were not part of the training process, to obtain their performance metrics. The training and validation process was performed 30 times, and the average results were compared. The most outstanding models for each database were the neural networks trained with the wavelet image scattering method, with metrics of 99.97±0.01% accuracy, 99.99±0.01% specificity, 99.84±0.06% sensitivity, 99.59±0.13% precision and 99.71±0.08% of F1 score with a response time of 0.7349 seconds for the model trained with mosaics and with metrics of 99.90±0.03% of accuracy, 99.94±0.02% of specificity, 99.58±0.15% of sensitivity, 98.63±0.55% of precision and 99.09±0.30% of F1 score with a response time of 0.6227 seconds for the model trained with descriptor images. The results are very significant due to the high efficiency obtained and confirm the effectiveness of the models with the WIS method for the classification of multimedia platform images with the characteristics of the databases used. It is suggested that the remaining methods be adjusted to improve their performance.
... This procedure was performed using 100 iterations, ensuring complete data sample consideration for model learning in each increment. Model performance was evaluated for each shuffle using leave-one-out cross-validation to calculate regression metrics (Kohavi 1995; Bishop 1995). Learning algorithms were implemented using scikit-learn libraries in the Python programming language (Pedregosa et al. 2011). ...
... The Multi-Layer Perceptron (MLP) is a neural network model employed for regression analysis to unveil non-linear relationships concealed within data samples (Gardner and Dorling 1998). The MLP comprises interconnected nodes, with connecting nodes governed by a function representing the sum of node input variables, further modified by a non-linear activation function (Bishop 1995). The architecture of MLP involves multiple layers of nodes, incorporating hidden layers positioned between the input and output layers, which help unravel complex variable interactions leading to a more accurate prediction (Amadio et al. 2019). ...
Thesis
Full-text available
Residential buildings are critical for the social and economic function of society. Globally, residential buildings constitute a high proportion of any country's fixed asset wealth but are highly susceptible to flood damage. This thesis investigates residential building exposure and vulnerability to flooding, with a focus on New Zealand, where like many other countries information scarcity poses significant challenges for flood risk assessment. The thesis objective is to formulate a detailed understanding of factors driving residential building exposure and vulnerability in New Zealand, and similar global flood risk contexts. Firstly, a metamodel was constructed to determine how residential buildings have evolved in New Zealand flood hazard areas (FHAs). Object-specific building attributes were simulated using geospatial data integration models and imputation from classification and regression-based learning algorithms. Residential building area and monetary value exposure in FHAs increased nationwide each decade since 1900. Present-day housing monetary exposure is valued at NZD 213 billion, approximately 12% of New Zealand’s housing value. Nearly half of the monetary exposure occurs in five major urban areas. A novel on-site damage assessment approach used for six major New Zealand flood events has created new knowledge on flood vulnerability. Empirical damage data identifies several highly important hazard and exposure explanatory variables (i.e., water depth, flow velocity, area, floor height) driving direct damage at building and component levels. Investigations on how model complexity influences predictive damage model accuracy and reliability have found that more complex ensemble-based regression algorithms such as Random Forests, which consider multiple explanatory hazard and exposure variables outperform more simple univariable approaches. Damage model transfer between geographic locations is highly dependent on the model's capacity to replicate multiple explanatory damage variables driving local residential building damage. The thesis findings encourage transitions from traditional univariable model (i.e., depth-damage curves), toward multivariable model approaches in flood risk analyses. Flood risk model outcome variance is highly influenced by the choice of exposure and vulnerability model approach to estimate residential building replacement value and direct damage. Multivariable exposure and vulnerability model approaches were successfully demonstrated to reduce epistemic uncertainties causing flood risk model outcome variance.
... + C_12 α f_2(y(n − k_12)), y(n + 1) = β_2 y(n) + α f_1 ...
... Then there exists r_1 ...
Article
Full-text available
A generalized neural network with four arbitrary delays is considered in this paper. A criterion of both Devaney chaos and Li-Yorke chaos is given under some weak conditions. Furthermore, lower bounds on the parameters that make the network chaotic are effectively determined. For different values of the delays, the proving processes of chaos are different and extremely complicated, while the results are very beautiful. In order to show the chaotic behaviors better, two examples are provided and their chaotic phenomena and the trends of the largest Lyapunov exponents for different parameters are displayed.
... The "101" denotes the number of weight layers. ResNet reformulates the network as a residual framework relative to the source input signal [39, 40, 73-75]. The ResNet framework exploits identity shortcuts to capture essential residual information. ...
... A typical ResNet model includes five layers of CNNs, each consisting of convolution, batch normalization, and activation processes. Figure 9 illustrates the ResNet101 model framework used in the proposed ensemble approach, where element-wise computation between the residual R(x) and input x is conducted to better preserve features [39,73,74]. ...
Article
Full-text available
The degeneration of the intervertebral discs in the lumbar spine is the common cause of neurological and physical dysfunctions and chronic disability of patients, which can be stratified into single—(e.g., disc herniation, prolapse, or bulge) and comorbidity-type degeneration (e.g., simultaneous presence of two or more conditions), respectively. A sample of lumbar magnetic resonance imaging (MRI) images from multiple clinical hospitals in China was collected and used in the proposal assessment. We devised a weighted transfer learning framework WDRIV-Net by ensembling four pre-trained models including Densenet169, ResNet101, InceptionV3, and VGG19. The proposed approach was applied to the clinical data and achieved 96.25% accuracy, surpassing the benchmark ResNet101 (87.5%), DenseNet169 (82.5%), VGG19 (88.75%), InceptionV3 (93.75%), and other state-of-the-art (SOTA) ensemble deep learning models. Furthermore, improved performance was observed as well for the metric of the area under the curve (AUC), producing a ≥ 7% increase versus other SOTA ensemble learning, a ≥ 6% increase versus most-studied models, and a ≥ 2% increase versus the baselines. WDRIV-Net can serve as a guide in the initial and efficient type screening of complex degeneration of lumbar intervertebral discs (LID) and assist in the early-stage selection of clinically differentiated treatment options.
... MDNs address non-unique inverse problems, like inferring BPs from R rs , by modeling the output as a probability distribution over possible output values. Therefore, the distribution is modeled using a mixture of Gaussians (Bishop, 1994;1995). Instead of providing the average value of the expected output distribution, like typical multilayer perceptrons (MLPs), MDN provides the full output distribution, enabling the user to study the distribution and choose an appropriate output value, enhancing the overall estimation when the predicted output distribution is multimodal or asymmetric. ...
... This study leverages this characteristic to estimate optically relevant BPs and IOPs through MDNs. MDNs differ from traditional neural networks (Bishop, 1995;Bricaud et al., 2007;Jamet et al., 2012) by approximating the likelihoods of generated estimates as mixture of Gaussians (Bishop, 1994), thereby accommodating multimodal target distributions, a fundamental characteristic of inverse problems where a nonunique relationship exists between input and output features. MDNs offer a distinct approach by modeling conditional probabilities of the target variables based on input data, and thus, acquiring a comprehensive understanding of the probability distribution within the target space. ...
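The core idea, that the network outputs the parameters of a Gaussian mixture rather than a single value, can be sketched as follows; the mixture values below are invented for illustration and are not from the cited model:
```python
# Sketch: the mixture-of-Gaussians likelihood an MDN places over a scalar
# target y, given mixing weights pi_k, means mu_k and spreads sigma_k that
# the network would produce for one input spectrum.
import numpy as np

def mdn_neg_log_likelihood(y, pi, mu, sigma):
    """Negative log-likelihood of y under sum_k pi_k * N(y; mu_k, sigma_k^2)."""
    gauss = np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return -np.log(np.sum(pi * gauss))

# Example: a bimodal predictive distribution, as expected for a non-unique inverse.
pi    = np.array([0.6, 0.4])        # mixing coefficients (sum to 1)
mu    = np.array([0.8, 2.5])        # component means
sigma = np.array([0.2, 0.3])        # component spreads
print(mdn_neg_log_likelihood(1.0, pi, mu, sigma))
```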
Article
Full-text available
Ocean color remote sensing tracks water quality globally, but multispectral ocean color sensors often struggle with complex coastal and inland waters. Traditional models have difficulty capturing detailed relationships between remote sensing reflectance ( R rs ), biogeochemical properties (BPs), and inherent optical properties (IOPs) in these complex water bodies. We developed a robust Mixture Density Network (MDN) model to retrieve 10 relevant biogeochemical and optical variables from heritage multispectral ocean color missions. These variables include chlorophyll-a ( Chla ) and total suspended solids ( TSS ), as well as the absorbing components of IOPs at their reference wavelengths. The heritage missions include the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard Aqua and Terra, the Environmental Satellite (Envisat) Medium Resolution Imaging Spectrometer (MERIS), and the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the Suomi National Polar-orbiting Partnership (Suomi NPP). Our model is trained and tested on all available in situ spectra from an augmented version of the GLObal Reflectance community dataset for Imaging and optical sensing of Aquatic environments (GLORIA) (N = 9,956) after having added globally distributed in situ IOP measurements. Our model is validated on satellite match-ups corresponding to the SeaWiFS Bio-optical Archive and Storage System (SeaBASS) database. For both training and validation, the hyperspectral in situ radiometric and absorption datasets were resampled via the relative spectral response functions of MODIS, MERIS, and VIIRS to simulate the response of each multispectral ocean color mission. Using hold-out (80–20 split) and leave-one-out testing methods, the retrieved parameters exhibited variable uncertainty represented by the Median Symmetric Residual ( MdSR ) for each parameter and sensor combination. The median MdSR over all 10 variables for the hold-out testing method was 25.9%, 24.5%, and 28.9% for MODIS, MERIS, and VIIRS, respectively. TSS was the parameter with the highest MdSR for all three sensors (MODIS, VIIRS, and MERIS). The developed MDN was applied to satellite-derived R rs products to practically validate their quality via the SeaBASS dataset. The median MdSR from all estimated variables for each sensor from the matchup analysis is 63.21% for MODIS/A, 63.15% for MODIS/T, 60.45% for MERIS, and 75.19% for VIIRS. We found that the MDN model is sensitive to the instrument noise and uncertainties from atmospheric correction present in multispectral satellite-derived R rs . The overall performance of the MDN model presented here was also analyzed qualitatively for near-simultaneous images of MODIS/A and VIIRS as well as MODIS/T and MERIS to understand and demonstrate the product resemblance and discrepancies in retrieved variables. The developed MDN is shown to be capable of robustly retrieving 10 water quality variables for monitoring coastal and inland waters from multiple multispectral satellite sensors (MODIS, MERIS, and VIIRS).
... Inspired by the function of the brain, artificial neural networks have been found to be outstanding computing systems which can be applied in many disciplines. As one of the popular artificial neural networks, Multi-Layer Perceptron (MLP) applies a supervised training procedure using examples of data with known outputs (Bishop, 1995). To understand MLP, we will firstly introduce the single-layer perceptron. ...
... Figure 3(b) shows the multi-layer with three inputs, two hidden layers and two outputs. The MLP is a layered feedforward neural network in which the information flows unidirectionally from the input layer to the output layer, passing through the hidden layers (Bishop, 1995). Each connection between neurons has its own weight. ...
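A minimal sketch of the layered feedforward computation described above (three inputs, two hidden layers, two outputs, with random placeholder weights) might look like this:
```python
# Sketch: information flows only from the input layer towards the output
# layer, through hidden layers with a nonlinear activation. Weights are
# random placeholders, not a trained model.
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [3, 4, 4, 2]                    # inputs, two hidden layers, outputs
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases  = [np.zeros(n) for n in layer_sizes[1:]]

def mlp_forward(x):
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(a @ W + b)                # hidden layers
    return a @ weights[-1] + biases[-1]       # linear output layer

print(mlp_forward(np.array([0.1, -0.5, 2.0])))
```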
Preprint
Full-text available
Citation recommendation aims to locate the important papers for scholars to cite. When writing citing sentences, authors usually hold different citing intents, which are referred to as citation functions in citation analysis. Since argumentative zoning identifies the argumentative and rhetorical structure of scientific literature, we want to use this information to improve the citation recommendation task. In this paper, a multi-task learning model is built for citation recommendation and argumentative zoning classification. We also generated an annotated corpus of data from PubMed Central based on a new argumentative zoning schema. The experimental results show that, by considering the argumentative information in the citing sentence, the citation recommendation model achieves better performance.
... In this research, first, a simple linear regression model is trained to predict biomass moisture after the drying process using a small dataset. Then, a model for predicting the absolute humidity of the air exhausted from the dryer is developed using LM (James et al., 2013), eXtreme gradient boosting (XGBoost) (Chen and Guestrin, 2016), GBM (Hastie et al., 2001), random forest (RF) (Breiman, 2001), and multilayer perceptron (MLP) (Rosenblatt, 1958;Bishop, 1995) modeling methods, and the root causes behind the undesirable humidity levels are identified. The modeling results have been analysed and visualized using XAI methods. ...
... (Hastie et al., 2001;Chen and Guestrin, 2016) MLP is a type of artificial neural network (ANN) that consists of multiple layers of interconnected neurons, including an input layer, one or more hidden layers, and an output layer. Each neuron in an MLP is connected to every neuron in the adjacent layers, allowing for complex non-linear mappings between input and output features (Bishop, 1995). ...
Conference Paper
Full-text available
The utilization of biomass as a renewable energy source holds significant promise for climate mitigation efforts. Excess heat from Nordic data centers offers opportunities for sustainable energy utilization. This research explores the feasibility of using data center excess heat for biomass drying to enhance the biomass energy value. In this study, the challenge of predicting biomass moisture under demanding measurement conditions is addressed by developing a predictive model for exhaust air humidity from the dryer. This model indirectly describes biomass moisture and employs machine learning methods such as linear regression model (LM), gradient boosting machines (GBM), eXtreme gradient boosting (XGBoost), random forest (RF), and multilayer perceptron (MLP), while enhancing transparency through explainable artificial intelligence (XAI) techniques for analyzing and visualizing humidity fluctuations. Based on this study, it can be demonstrated that tree-based ensemble methods GBM, RF, and XGBoost can accurately predict the humidity of air exiting the dryer with coefficient of determination from 0.88 to 0.89. Weather conditions, supply air humidity, and dryer fan speed emerged as key factors affecting drying efficiency, providing actionable insights for process optimization. Specific thresholds for these features can be defined to facilitate process settings. Moreover, improving system air tightness enhances drying efficiency and mitigates weather effects. The model shows promising predictive capabilities for exhaust air humidity, enabling future dynamic modeling to indirectly predict biomass end moisture, enabling adaptive control of drying processes, optimizing production capacities, and advancing sustainable energy through AI-driven solutions.
... This method provided high accuracy, was effective in high-dimensional spaces, and robust to overfitting, though computationally expensive, especially for large datasets, and required careful tuning of hyperparameters. Additionally, a fully connected neural network with an input layer, several hidden layers, and an output layer proportional to the number of classes was trained for classification (Bishop, 1995). Although this method was very adaptable and could model intricate non-linear relationships, it was also prone to overfitting and necessitated a significant quantity of data and processing power. ...
Article
Iris detection and recognition are critical components in biometric identification systems, offering high accuracy and reliability. This study presents a comprehensive approach to iris detection and recognition using advanced classification algorithms in machine learning, applied to the FRGC dataset. Initially, eye detection is performed using the Viola-Jones face detection method, ensuring robust and rapid identification of the eye region. Following this, iris segmentation is achieved through the Hough Transform, effectively isolating the iris from the sclera utilizing Canny edge detection for precise boundary delineation. To discriminate and classify the intricate texture patterns of the iris, a Convolutional Neural Network (CNN) is employed, leveraging its powerful feature extraction and classification capabilities. By combining these approaches, a high-performance iris recognition system is guaranteed, exhibiting notable gains in processing speed and accuracy. The efficiency of the suggested method is confirmed by experimental results, which also show that it has the potential to be used in practical biometric applications. The work opens the door for upcoming developments in biometric identification technology by highlighting the complementary nature of contemporary machine learning techniques and traditional image processing methods.
... According to official statistics from one of China's major steel producers, the hydraulic cylinder accounts for over 33% of hydraulic component faults [8]. However, the above-mentioned research mainly focused on fault detection and diagnosis of the HAGC control system, and only a few works investigated hydraulic components such as the hydraulic cylinder [9][10]. As the key component in direct contact with the working rolls that adjusts the rolling gap, an abnormal cylinder condition can directly influence the rolling process and introduce underlying safety issues. A nonlinear model-based adaptive robust observer was designed for fault detection and diagnosis of typical faults of the electro-hydraulic cylinder (EHC), with verification supported by simulation results [9]. Unexpected loading of the cylinder is considered a major cause of malfunction of the HAGC hydraulic screw-down [8] and can lead to cylinder faults such as leakages or broken seals. The current process for identifying insufficient loading is based on reading and processing the pressure signals; since the direct reading from the pressure monitor only provides a static display of the working pressure, and the pressure signals are sensitive to fluctuations and noise [10], it offers engineers little help in spotting the malfunction. A wavelet-energy-based approach [11] was therefore derived to extract significant features from pressure signals acquired in the field, which acted as inputs to a BP neural network [12], and the unexpected loading caused by inner leakage was correctly identified [10]. However, with the wavelet energy and neural network approach employed to deal with the noise issue, a large computational burden can be placed on the diagnostic system. ...
Article
Full-text available
The growing quality requirements for steel products impose ever tighter restrictions on the rolling process. With its fast response, reliable control and low maintenance requirements, the hydraulic automatic gauge control system has been widely applied to screw down the roller and maintain precise rolling spaces for product quality control. Owing to the non-stop heavy duty imposed on the system, unavoidable faults and malfunctions not only influence product quality but also bring underlying safety issues. The hydraulic cylinder is the executing component of the roller screw-down and accounts for the dominant share of hydraulic component faults. Working under unexpected loading is one of the major factors that causes several chain effects in the cylinder; as the classical diagnostic process lacks cross-validation and is time-consuming, this paper explores the potential of using acoustic emission to fill this gap. The work includes a data acquisition process to record ultrasound acoustic signals from the hydraulic cylinder while it was loaded under 6 types of conditions. A modified image-based acoustic emission approach, constructed using 8 significant waveform features, was applied to generate visual representations and transform the cylinder acoustic emission signals under various loadings into a uniform format; the subtle differences among the loadings can be observed from the pixel and intensity changes of the images. By applying principal component analysis to project the acoustic emission image profiles into 3D space, a clear trajectory can be observed, with normal and overload conditions allocated to the positive and negative sides of the axis. The result demonstrates not only the potential of using acoustic emission for dynamic identification of subtle state changes, but also opens up the possibility of preventive measures for cylinders at risk in the future.
... Neural networks, as an active interdisciplinary research frontier, have garnered considerable scientific interest in recent decades owing to applications as diverse as pattern recognition and deep learning [1,2]. These applications are closely contingent on the dynamical mechanisms of the designed neural networks. ...
Article
This paper investigates stability switches induced by Hopf bifurcation in a fractional three-neuron network that incorporates both neutral time delay and communication delay, as well as a general structure. Initially, we simplified the characteristic equation by eliminating trigonometric terms associated with purely imaginary roots, enabling us to derive the Hopf bifurcation conditions for communication delay while treating the neutral time delay as a constant. The results reveal that communication delay can drive a stable equilibrium into instability once it exceeds the Hopf bifurcation threshold. Furthermore, we performed a sensitivity analysis to identify the fractional order and neutral delay as the two most sensitive parameters influencing the bifurcation value for the illustrative example. Notably, in contrast to neural networks with only retarded delays, our numerical observations show that the Hopf bifurcation curve is non-monotonic, highlighting that the neural network with a fixed communication delay can exhibit stability switches and eventually stabilize as the neutral delay increases.
... The second equation is just the constraint h(x*) = 0 itself. For more detailed justifications of this procedure, see Bishop (1995), Appendix C; Milman (1999), Chapter 14; or Cristianini and Shawe-Taylor (2000), Chapter 5. As a first example illustrating the Lagrange method, let f(x) = a x_1^2 + b x_2^2 and h(x) = x_1 + x_2 − 1. ...
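Carrying that quoted example one step further (a standard worked solution, not text from the cited sources), the stationarity conditions of the Lagrangian give:
```latex
% Worked continuation of the quoted example: minimize f(x) = a x_1^2 + b x_2^2
% subject to h(x) = x_1 + x_2 - 1 = 0 (assuming a, b > 0).
\begin{align*}
  L(x,\lambda) &= a x_1^2 + b x_2^2 + \lambda\,(x_1 + x_2 - 1),\\
  \partial L/\partial x_1 &= 2 a x_1 + \lambda = 0, \qquad
  \partial L/\partial x_2 = 2 b x_2 + \lambda = 0, \qquad
  h(x^*) = 0,\\
  \Rightarrow\; x_1^* &= \frac{b}{a+b}, \qquad
  x_2^* = \frac{a}{a+b}, \qquad
  \lambda^* = -\frac{2ab}{a+b}.
\end{align*}
```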
... Interpolation-based approaches utilize mathematical and statistical methodologies to estimate radio maps. Well-known methods include radial basis functions (RBF) [10], spline [11], and ordinary kriging [12]. However, these methods often fail to construct high-quality radio maps for real-world environments that are dynamic and geographically complex. ...
Preprint
Fine-grained radio map presents communication parameters of interest, e.g., received signal strength, at every point across a large geographical region. It can be leveraged to improve the efficiency of spectrum utilization for a large area, particularly critical for the unlicensed WiFi spectrum. The problem of fine-grained radio map estimation is to utilize radio samples collected by sparsely distributed sensors to infer the map. This problem is challenging due to the ultra-low sampling rate, where the number of available samples is far less than the fine-grained resolution required for radio map estimation. We propose WiFi-Diffusion -- a novel generative framework for achieving fine-grained WiFi radio map estimation using diffusion models. WiFi-Diffusion employs the creative power of generative AI to address the ultra-low sampling rate challenge and consists of three blocks: 1) a boost block, using prior information such as the layout of obstacles to optimize the diffusion model; 2) a generation block, leveraging the diffusion model to generate a candidate set of radio maps; and 3) an election block, utilizing the radio propagation model as a guide to find the best radio map from the candidate set. Extensive simulations demonstrate that 1) the fine-grained radio map generated by WiFi-Diffusion is ten times better than those produced by state-of-the-art (SOTA) when they use the same ultra-low sampling rate; and 2) WiFi-Diffusion achieves comparable fine-grained radio map quality with only one-fifth of the sampling rate required by SOTA.
... Trained with the backpropagation algorithm and with several hidden layers of artificial neurons, the MLP network is one of the first steps toward DNN (Data Science Academy 2019). This neural network architecture, commonly used for classifying and regressing problems (Bishop 1995), is defined by Eq. (5). ...
Article
Full-text available
Driven by digitalization and accelerated by the COVID-19 pandemic, e-commerce has experienced strong growth, especially in the last 4 years. This transformation has reshaped consumer behavior, business models, and workplace dynamics, where digitalization, such as artificial intelligence and automation, has improved operational efficiency, personalization, and market reach. This study explores these dynamics and provides an overview of e-commerce in the U.S. through a time series approach, analyzing five key variables: sales, employment, hours worked, costs, and the producer price index. It also models and forecasts sales and the producer price index using classic, deep learning, and hybrid methods. The results show that while sales have increased, employment and labor hours have fallen, alongside stable production costs and a reduction in the producer price index over the past 2 years. In forecasting, deep neural networks offer superior predictive performance, although classic methods provide similarly accurate results in series with clear trends and seasonality, making them a more computationally efficient alternative. This research contributes to decision-making in e-commerce by exploring the relationships between sales growth and labor market dynamics, evaluating the effectiveness of different forecasting methods, and highlighting the need for strategic adaptability in a digitalized sector.
... The incorporation of categorical variables, such as Water Year Type (WYT), month, and sub-region, requires appropriate encoding techniques to ensure efficient use by ML algorithms. A commonly used method is one-hot encoding [181], which represents each category as a unique binary vector. However, depending on the specific modeling requirements and dataset characteristics, alternative techniques, such as multi-hot encoding [182], may also be employed to capture additional categorical relationships more effectively. ...
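A small sketch of the two encodings mentioned above is given below; the category labels are illustrative placeholders, not the study's actual WYT, month, or sub-region values:
```python
# Sketch: one-hot gives each category a unique binary vector with a single 1;
# multi-hot allows several categories to be active in the same vector.
import numpy as np

categories = ["Wet", "Above Normal", "Dry", "Critical"]   # hypothetical labels

def one_hot(value):
    vec = np.zeros(len(categories), dtype=int)
    vec[categories.index(value)] = 1
    return vec

def multi_hot(values):
    vec = np.zeros(len(categories), dtype=int)
    for v in values:
        vec[categories.index(v)] = 1
    return vec

print(one_hot("Dry"))                 # [0 0 1 0]
print(multi_hot(["Wet", "Dry"]))      # [1 0 1 0]
```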
Article
Full-text available
The recent surge in popularity of generative artificial intelligence (GenAI) tools like ChatGPT has reignited global interest in AI, a technology with a well-established history spanning several decades. The California Department of Water Resources (DWR) has been at the forefront of this field, leveraging Artificial Neural Networks (ANNs), a core technique in machine learning (ML), which is a subfield of AI, for water and environmental modeling (WEM) since the early 1990s. While protocols for WEM exist in California, they were designed primarily for traditional statistical or process-based models that rely on predefined equations and physical principles. In contrast, ML models learn patterns from data and require different development methodologies, which existing protocols do not address. This study, drawing on DWR’s extensive experience in ML, addresses this gap by developing standardized protocols for the development and implementation of ML models in WEM in California. The proposed protocols cover four key phases of ML development and implementation: (1) problem definition, ensuring clear objectives and contextual understanding; (2) data preparation, emphasizing standardized collection, quality control, and accessibility; (3) model development, advocating for a progression from simple models to hybrid and ensemble approaches while integrating domain knowledge for improved accuracy; and (4) model deployment, highlighting documentation, training, and open-source practices to enhance transparency and collaboration. A case study is provided to demonstrate the practical application of these protocols step by step. Once implemented, these protocols can help achieve standardization, quality assurance, interoperability, and transparency in water and environmental modeling using machine learning in California.
... The hard negative samples ê′_i generated by HNS, along with the original positive samples e_j, form the training set E_T, which is then fed into a multilayer perceptron (MLP)-based classifier [62] for end-to-end training. This process jointly optimizes the parameters of both the encoder and the classifier. ...
Preprint
Hypergraph, which allows each hyperedge to encompass an arbitrary number of nodes, is a powerful tool for modeling multi-entity interactions. Hyperedge prediction is a fundamental task that aims to predict future hyperedges or identify existent but unobserved hyperedges based on those observed. In link prediction for simple graphs, most observed links are treated as positive samples, while all unobserved links are considered negative samples. However, this full-sampling strategy is impractical for hyperedge prediction, because the number of unobserved hyperedges in a hypergraph significantly exceeds the number of observed ones. Therefore, one has to utilize negative sampling methods to generate negative samples, ensuring their quantity is comparable to that of positive samples. In current hyperedge prediction, randomly selecting negative samples is a routine practice. But through experimental analysis, we discover a critical limitation of random selection: the generated negative samples are too easily distinguishable from positive samples. This leads to premature convergence of the model and reduces the accuracy of prediction. To overcome this issue, we propose a novel method to generate negative samples, named hard negative sampling (HNS). Unlike traditional methods that construct negative hyperedges by selecting node sets from the original hypergraph, HNS directly synthesizes negative samples in the hyperedge embedding space, thereby generating more challenging and informative negative samples. Our results demonstrate that HNS significantly enhances both the accuracy and robustness of the prediction. Moreover, as a plug-and-play technique, HNS can be easily applied in the training of various hyperedge prediction models based on representation learning.
... According to [2] and [22], K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. ...
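The k-means procedure summarized above can be sketched in a few lines (synthetic data, illustrative only):
```python
# Sketch: assign each point to the nearest of k centroids, move each centroid
# to the mean of its cluster, and repeat until the centroids stop moving.
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
               for c in ([0, 0], [3, 3], [0, 3])])        # three synthetic blobs

k = 3
centroids = X[rng.choice(len(X), size=k, replace=False)]  # k fixed a priori
for _ in range(100):
    labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
    new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print(centroids)
```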
... For instance, logistic regression handles binary classification problems well with the sigmoid function [108,109]. SVM and ANN are popular classification algorithms to deal with high-dimensional data and complex non-linear relationships [110,111]. Bayesian classification, involving calculation of class probabilities grounded in Bayesian principles and updating these probabilities as new features or evidence becomes available, has emerged as an effective tool for spectroscopic data [112,113]. The readers are encouraged to refer to the cited sources for additional details of these and other methods. ...
... Before being fed into the DL architecture, a min-max normalization is applied to the aforementioned features, to ensure that all data falls within the same scale range. This is performed because unscaled input variables can result in a slow or unstable learning process [37]. ...
Preprint
Full-text available
Land Cover (LC) mapping using satellite imagery is critical for environmental monitoring and management. Deep Learning (DL), particularly Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have revolutionized this field by enhancing the accuracy of classification tasks. In this work, a novel approach combining a transformer-based Swin-Unet architecture with seasonal synthesized spatio-temporal images has been employed to classify LC types using spatio-temporal features extracted from Sentinel-1 (S1) Synthetic Aperture Radar (SAR) data, organized into seasonal clusters. The study focuses on three distinct regions - Amazonia, Africa, and Siberia - and evaluates the model performance across diverse ecoregions within these areas. By utilizing seasonal feature sequences instead of dense temporal sequences, notable performance improvements have been achieved, especially in regions with temporal data gaps like Siberia, where S1 data distribution is uneven and non-uniform. The results demonstrate the effectiveness and the generalization capabilities of the proposed methodology in achieving high overall accuracy (O.A.) values, even in regions with limited training data.
... The backpropagation algorithm is commonly used for training, adjusting weights based on the local gradient of the error surface. In summary, the network is initialized with weights, processes input vectors to generate output, calculates error signals, and iteratively adjusts weights to minimize errors until an acceptable level is reached [105]. ...
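As a rough illustration of that loop (initialize weights, propagate inputs, compute the error, adjust weights by backpropagation until the error is acceptable), a small regression surrogate can be trained as follows; the use of scikit-learn and the synthetic data are assumptions, not the thesis' actual implementation:
```python
# Sketch: iterative weight adjustment continues until the loss improvement
# falls below a tolerance or the iteration budget is exhausted.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(np.pi * X[:, 0]) * X[:, 1]          # stand-in for a property table

surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                         solver="adam", tol=1e-6, max_iter=2000,
                         random_state=0)
surrogate.fit(X, y)
print("iterations:", surrogate.n_iter_, " final loss:", surrogate.loss_)
```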
Thesis
Full-text available
Machine learning (ML) based on neural networks (NN) facilitates data-driven techniques for handling large amounts of data, obtained either through experiments or through simulations at multiple spatio-temporal scales, thereby finding the hidden patterns underlying these data and promoting efficient research methods. The main purpose of this work is to extend the capabilities of a new OpenFOAM solver, called realFluidReactingNNFoam, under development at the University of Perugia, with an NN algorithm that replaces complex real-fluid thermophysical property evaluations, using the approach of coupling OpenFOAM and Python-trained NN models. Currently, NN models are trained against data generated using the Peng-Robinson equation of state (PR-EoS) assuming mixture frozen temperature. The OpenFOAM solver, where needed, calls the NN models in each grid-cell with appropriate inputs, and the returned results are used and stored in suitable OpenFOAM data structures. Such inference of the thermophysical properties is achieved via two different methods: the “Neural Network Inference in C made Easy” (NNICE) library, which proved to be very efficient and robust, and the execution of Python code directly within an OpenFOAM solver, without the need for Python code translation, using the PythonFOAM method. The overall model is validated considering a liquid-rocket benchmark comprised of liquid-oxygen (LOX) and gaseous-hydrogen (GH2) streams. The model accounts for real-fluid thermodynamics and transport properties, making use of the PR-EoS and the Chung transport models. First, the development of a real-fluid model (RFM) with an artificial neural network is described in detail. Then, the numerical results of the transcritical mixing layer (LOX/GH2) benchmark are presented and analyzed in terms of accuracy and computational efficiency. Results of the overall implementation indicate that the combined OpenFOAM and ML approach provides a speed-up factor higher than seven, while preserving the original solver accuracy. On the other hand, the efficiency and combustion performance of propulsion systems, like internal combustion (IC) engines and gas turbines, are known to be related to the performance of the fuel and air mixing process. Operating conditions and fuels are rapidly changing; therefore, new CFD models that accurately account for all physical aspects while maintaining a simple framework are extremely important. This work also considers the drift velocity contribution, which is often overlooked or neglected; it is defined as the velocity of the dispersed phase relative to the mixture volumetric mean velocity in a single-fluid formulation and is a key variable in the two-phase mixture model. Water test cases are considered here for the study. The present work investigates the structure and the droplet velocity field of a plain liquid jet injected into a high-pressure air crossflow. Because of the large scale separation between the small features of the interface and the overall jet, a diffuse-interface treatment is used in a single-fluid Eulerian framework. A Σ-Y family model, which includes liquid diffusion due to drift-flux velocities and a new formulation of the spray atomization, is implemented in the OpenFOAM framework. The main objective is to explore the droplet velocity distribution and the jet structure with and without the drift-flux correction and to compare the results with experimental data.
... The silver data had 1 ADL observation. The UDL and ADL observations were replaced according to the recommendations of the US Environmental Protection Agency and described in detail in publications [35-37, 58]. The actual values and replacements are shown in Fig. 2. ...
Article
Full-text available
The recovery of precious metals from incinerator bottom ash (IBA) is a way of moving towards a circular economy. The paper presents a detailed analysis of the concentration of precious metals in IBA. The average values of precious metals in the samples analyzed are: Ag—6973 ppb, Au—313.90 ppb, Pd—41.26 ppb, Pt—13.81 ppb—all of these values being many times higher than the values of these elements in the Earth’s crust. The time series of the precious metals in the IBA were analyzed to assess the trend, seasonality, and outliers and to detect differences between the designated seasons of the year. Data analysis was performed following the CRISP-DM methodology using statistical and data mining methods. The analyses confirmed a higher Ag concentration than in comparable European waste incinerator plants. The Au concentration was comparable to those reported in other incinerators, while the values were lower for Pt and Pd. The time series of precious metals show no trend or seasonality, but numerous outliers. Due to the stationarity of precious metals, recovery can be expected to be constant, and the presence of numerous outliers can increase the potential return on investment.
... The final stage is the backpropagation process, which adjusts the weights according to the differences between predictions and observations. More comprehensive information about the MLP can be found in the relevant literature [23,24]. ...
Article
Mean streamflows contain important clues about the adequacy of a basin's water resources. With climate change, regional changes are occurring in parameters that directly affect streamflow, such as precipitation and temperature. These changes also lead to regional differences in mean flows. In this study, the mean flows of the Köprüçay station, located in the Beşkonak district of Serik, Antalya province, whose records are shared by the Electricity Administration (Elektrik İdaresi), were examined. The station's mean flows between 1957 and 2011 were modeled with Multi-Layer Perceptron (MLP), Support Vector Machine (SVM) and Random Forest (RF) machine learning algorithms. The study consists of two parts. In the first part, the data from 1957-2011 were used as both training and test sets, and the most suitable algorithm was selected in this way. In the second part, after the algorithm selection, the mean flows for 2012-2022, for which no records are available, were estimated. In the modeling, annual mean maximum, minimum and mean temperature and mean precipitation data for Turkey were used as inputs. As a result, RF was found to be the most suitable algorithm for mean flow estimation for the Köprüçay station.
... where, as before, we write ψ_n(x) and E_n(x) for the eigenstates and energies of the error Hamiltonian H(x) given by (1). Notice that (6) is exact, despite being derived using perturbation theory. ...
Article
Full-text available
We propose a new data representation method based on Quantum Cognition Machine Learning and apply it to manifold learning, specifically to the estimation of intrinsic dimension of data sets. The idea is to learn a representation of each data point as a quantum state, encoding both local properties of the point as well as its relation with the entire data. Inspired by ideas from quantum geometry, we then construct from the quantum states a point cloud equipped with a quantum metric. The metric exhibits a spectral gap whose location corresponds to the intrinsic dimension of the data. The proposed estimator is based on the detection of this spectral gap. When tested on synthetic manifold benchmarks, our estimates are shown to be robust with respect to the introduction of point-wise Gaussian noise. This is in contrast to current state-of-the-art estimators, which tend to attribute artificial “shadow dimensions” to noise artifacts, leading to overestimates. This is a significant advantage when dealing with real data sets, which are inevitably affected by unknown levels of noise. We show the applicability and robustness of our method on real data, by testing it on the ISOMAP face database, MNIST, and the Wisconsin Breast Cancer Dataset.
... LR as a traditional statistical method is a binary classification algorithm with a decision function represented by a conditional probability distribution P(Y|X) (Bishop, 1995;Hosmer et al., 2013). It maps the results of linear regression operations to values in the interval [0,1] using a Sigmoid function that predicts categories in the form of probabilities. ...
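The decision function described in that excerpt can be written out directly; the coefficients below are hypothetical, chosen only to illustrate the sigmoid mapping to a probability in [0, 1]:
```python
# Sketch: a linear score mapped through the sigmoid to P(Y = 1 | X = x),
# then thresholded at 0.5. Coefficients are illustrative, not fitted.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.2, -0.7])      # hypothetical coefficients
b = 0.1                        # hypothetical intercept

def predict_proba(x):
    return sigmoid(x @ w + b)  # probability of the positive class

x = np.array([0.5, 0.2])
p = predict_proba(x)
print(p, "-> class", int(p >= 0.5))
```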
Article
Full-text available
Ensemble learning algorithms show good forecasting performances for financial distress in many studies. Despite considering the feature selection and feature importance procedures, most overlook imbalanced data handling. This study proposes the Easyensemble method based on undersampling and combines it with ensemble learning models to predict financial distress. The results show that Easyensemble sampling presents better forecasting performance than SMOTE sampling. We subsequently conduct Permutation Importance (PIMP), Recursive Feature Elimination (RFE), and partial dependence plots, and the experimental results show that the feature selection procedure can effectively reduce the number of indicators without affecting the prediction accuracy, improve the prediction efficiency as well as save processing time. In addition, the indicators from profitability, cash flow, solvency, and structural ratios are essential in predicting financial distress.
... It should be noted that a "min-max" normalisation was used (i.e., values were normalised between 0 and 1). The input normalisation procedure was implemented to increase the learning rate and ensure faster convergence, as a model with large weights tends to be unstable, suffer from poor performance during learning and be sensitive to input values, resulting in higher generalisation error (Bishop 1995;Goodfellow, Bengio, and Courville 2016). ...
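A minimal sketch of the min-max normalisation described above (illustrative numbers, not the study's data):
```python
# Sketch: rescale each input variable to [0, 1] using the minimum and maximum
# observed in the training data, and reuse those statistics for new data.
import numpy as np

X_train = np.array([[10.0, 200.0], [30.0, 800.0], [20.0, 500.0]])
x_min, x_max = X_train.min(axis=0), X_train.max(axis=0)

def min_max_scale(X):
    return (X - x_min) / (x_max - x_min)

print(min_max_scale(X_train))                   # values now lie in [0, 1]
print(min_max_scale(np.array([[25.0, 650.0]]))) # new sample, same statistics
```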
Article
Grasslands play a critical role in providing diverse ecosystem services. Sown biodiverse pastures (SBP) rich in legumes are an important agricultural innovation that increases grassland productivity and reduces the need for fertilisers. This study developed a machine learning model to obtain spatially explicit estimations of the productivity of SBP, based on field sampling data from five Portuguese farms during four production years (2018–2021) and under two fertilisation regimes (conventional and variable rate). Weather data (such as temperature, precipitation and radiation), soil properties (including sand, silt, clay and pH), terrain characteristics (including elevation, slope, aspect, hillshade and topographic position index), and management data (including fertiliser application) were used as predictors. A variance inflation factor (VIF) approach was used to measure multicollinearity between input variables, leading to only 11 of the 53 input variables being used. Artificial neural network (ANN) methods were used to estimate pasture productivity, and hyper-parameterization optimization was performed to fine-tune the model. Plots under variable rate fertilisation were significantly improved by up to 20 kg P ha⁻¹ applied in the same year. Plots under conventional fertilisation benefitted the most from fertilisation in past years. The model demonstrated good generalisation, with similar estimation errors for both the training and test sets: for an average yield of 6096 kg ha⁻¹ in the sample, the root mean squared errors (RMSE) for the training and test sets were respectively 882 and 1125 kg ha⁻¹. These results indicate that the model did not overfit the training data and can be used to estimate SBP productivity maps in the sampled farms. However, further studies are required to assess if the obtained model can be applied to new unseen data.
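As a rough illustration of the VIF screening step mentioned in the abstract above (the feature names, data, and the threshold of 5 are assumptions, not the study's choices), one possible sketch with statsmodels:

```python
# Sketch: screen predictors for multicollinearity via variance inflation factors.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "temperature": rng.normal(15, 5, 200),
    "radiation": rng.normal(200, 30, 200),
    "precipitation": rng.normal(60, 20, 200),
})
# A deliberately collinear column to illustrate what VIF flags.
df["heat_index"] = 0.9 * df["temperature"] + rng.normal(0, 0.5, 200)

X = sm.add_constant(df)                        # add intercept column for the regression
vifs = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
    index=df.columns,
)
print(vifs)
keep = vifs[vifs < 5].index.tolist()           # e.g. drop predictors with VIF >= 5
print("retained predictors:", keep)
```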
... To enable machine learning in the context of these challenges, Federated Learning (FL) trains the model on local datasets that remain in each healthcare institution and shares only the model parameters with the central server for model averaging [3]. Although FL has been widely used in clinical settings [4-6], traditional FL assumes the same data distribution for each silo [7-12]. This is an unrealistic assumption because of covariate shifts due to differences in patient demographics, clinical practices, and data collection methods between institutions [13, 14]. ...
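The excerpt describes the standard federated averaging idea: local training at each site, with only parameters sent to the server. A minimal numerical sketch of that averaging step, with hypothetical sites and plain logistic-regression updates (this is the baseline FedAvg idea, not the FedWeight method of the paper below):

```python
# Sketch of federated averaging (FedAvg): local gradient steps, then a
# size-weighted average of the local models at the central server.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """A few steps of local gradient descent for logistic regression."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(global_w, site_data):
    """One communication round: average local models, weighted by site size."""
    sizes = np.array([len(y) for _, y in site_data], dtype=float)
    local_ws = [local_update(global_w, X, y) for X, y in site_data]
    return np.average(local_ws, axis=0, weights=sizes)

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
sites = []
for _ in range(3):                               # three hypothetical clinical sites
    X = rng.normal(size=(300, 3))
    y = (X @ true_w + rng.normal(scale=0.3, size=300) > 0).astype(float)
    sites.append((X, y))

w = np.zeros(3)
for _ in range(20):                              # 20 communication rounds
    w = fed_avg(w, sites)
print("federated model weights:", np.round(w, 2))
```

Only the weight vector crosses the site boundary in this sketch; the raw patient-level data never leave the simulated institutions.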
Preprint
Full-text available
Federated Learning (FL) has emerged as a promising approach for research on real-world medical data distributed across different organizations, as it allows analysis of distributed data while preserving patient privacy. However, one of the prominent challenges in FL is covariate shift, where data distributions differ significantly across different clinical sites, like hospitals and outpatient clinics. These differences in demographics, clinical practices, and data collection processes may lead to significant performance degradation of the shared model when deployed for a target population. In this study, we propose a Federatively Weighted (FedWeight) framework to mitigate the effect of covariate shift on Federated Learning. Leveraging the data distribution estimated by density estimator models, we re-weight the patients from the source clinical sites, making the trained model aligned with the data distribution of the target site, thus mitigating the covariate shift between source and target sites. To make our approach also applicable to unsupervised learning, we integrate FedWeight into a novel federated embedded topic model (ETM), namely FedWeight-ETM. We evaluated FedWeight in cross-site FL within the eICU dataset and also cross-dataset FL between eICU and MIMIC-III data. Compared with the baseline, FedWeight-corrected FL models demonstrate superior performance for predicting patient mortality, ventilator use, sepsis diagnosis, and length of stay in the intensive care unit (ICU). Moreover, FedWeight outperforms FedAvg in identifying important features relevant to the clinical outcomes. Leveraging Shapley Additive Explanations (SHAP), the FedWeight-corrected classifiers reveal subtle yet significant associations between drugs, lab tests, and patient outcomes. Using FedWeight-ETM, we identified known disease topics involving renal or heart failure predictive of future mortality at the ICU readmission. Together, FedWeight provides a robust FL framework to address the challenge of covariate shift from clinical silos in predicting critical patient outcomes and providing meaningful clinical features.
... A Multi-Layer Perceptron (MLP) is a type of artificial neural network (ANN) that consists of multiple layers of interconnected nodes, or neurons [27]. It is a versatile and powerful model used for supervised learning tasks such as classification and regression [28] (Figure 15). ...
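A short, generic sketch of such an MLP with scikit-learn; the synthetic data and layer sizes are assumptions for illustration only.

```python
# Sketch: a multi-layer perceptron for supervised classification.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Two hidden layers; each neuron applies a weighted sum followed by a ReLU.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), activation="relu",
                  max_iter=2000, random_state=0),
)
mlp.fit(X_tr, y_tr)
print(f"test accuracy: {mlp.score(X_te, y_te):.3f}")
```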
Article
Full-text available
The municipality of Hammam N’bails, located 37 km east of the capital of Guelma province (eastern Algeria), is accessible via RN20 and CW19 roads. It borders the municipalities of Khemissa and El Henancha in Souk-Ahras province. With a population of approximately 16,000 and covering an area of 164 km², this region is characterized by mountainous terrain, with elevations ranging from 112 to 292 meters. The area experiences cold, snowy winters and hot, dry summers, with an average annual rainfall of about 600 mm. Renowned for its natural thermal springs, Hammam N’bails is also a notable tourist destination. The rugged topography of the region leads to frequent landslides, particularly on medium and low slopes. Landslide susceptibility is assessed using raster calculators in ArcGIS and efficient machine learning algorithms, such as Decision Trees, Bagging, Random Forest, SVM, and MLP. Factors considered in the analysis include slope, elevation, geology, aspect, proximity to streams and roads, land cover, and rainfall. The performance of these models is evaluated using ROC-AUC curves, providing a robust method to understand and mitigate geological risks in this area.
... Hence, in general, the neuron will look like this: There are various types of neural networks, each suitable for different kinds of tasks. The NNs have been broadly classified (Cheng and Titterington, 1994; Eva and Friedman, 1994; Bishop, 1995; Chen and Aihara, 1999; Kenol et al., 2002) as: ...
Article
Full-text available
The idea of neural networks has emerged in recent times and has shown great results in complex and nonlinear systems. Many aerospace engineering areas like autonomous systems, adaptive control, and flight dynamics modeling can be better handled with neural networks. Selecting an orbit that consumes the fewest resources and reduces expenditure can broadly be called orbit optimization. In this study, the application of neural networks in orbit optimization in various space missions and satellite ventures is brought together.
... In future research, additional IML methods, such as Accumulated Local Effects (ALE) plots (Apley and Zhu, 2020) and Local Interpretable Model-agnostic Explanations (LIME; Ribeiro et al., 2016) can also be used. Furthermore, one could investigate more complex machine learning models, such as deep learning approaches (Bishop, 1995;LeCun et al., 2015). Additionally, similar to soccer (Groll et al., 2019b), one could also focus on tournament outcomes. ...
Preprint
Full-text available
In this manuscript, we concentrate on a specific type of covariates, which we call statistically enhanced, for modeling tennis matches for men at Grand Slam tournaments. Our goal is to assess whether these enhanced covariates have the potential to improve statistical learning approaches, in particular with regard to predictive performance. For this purpose, various proposed regression and machine learning model classes are compared with and without such features. To achieve this, we considered three slightly enhanced variables, namely the Elo rating along with two different player age variables. This concept has already been successfully applied in football, where additional team ability parameters, which were obtained from separate statistical models, were able to improve the predictive performance. In addition, different interpretable machine learning (IML) tools are employed to gain insights into the factors influencing the outcomes of tennis matches predicted by complex machine learning models, such as the random forest. Specifically, partial dependence plots (PDP) and individual conditional expectation (ICE) plots are employed to provide better interpretability for the most promising ML model from this work. Furthermore, we conduct a comparison of different regression and machine learning approaches in terms of various predictive performance measures such as classification rate, predictive Bernoulli likelihood, and Brier score. This comparison is carried out on external test data using cross-validation, rolling window, and expanding window strategies.
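For readers unfamiliar with these interpretability tools, a small sketch of PDP and ICE curves for a random forest using scikit-learn's inspection module; the regression data are synthetic, not the tennis data.

```python
# Sketch: partial dependence (PDP) and individual conditional expectation (ICE)
# curves for a fitted random forest.
import matplotlib
matplotlib.use("Agg")                            # headless backend for scripts
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# kind="both" draws the individual ICE curves plus their average (the PDP)
# for the first two features.
display = PartialDependenceDisplay.from_estimator(rf, X, features=[0, 1], kind="both")
display.figure_.savefig("pdp_ice.png")
```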
... The local singular spectrum analysis (SSA) method [10], [11] has found application in the elimination of high-level EOG artifacts from single-channel EEG signals [12], [13]. In the proposed method, feature vectors are generated by sorting delayed signals, which are then clustered using the K-means algorithm [14]. Additionally, in the process of singular value decomposition, we calculate the eigenvectors and eigenvalues of the covariance matrix for each category. ...
Article
Full-text available
One of the most frequently used techniques for removing background noise from electroencephalogram (EEG) data is adaptive noise cancellation (ANC). Nonetheless, there are two primary disadvantages associated with the adaptive noise reduction of EEG signals: the adaptive filter, which is supposed to approximate the contaminating noise, lacks a reference signal, and the mean squared error (MSE) criterion frequently employed in adaptive filters, which only considers second-order errors, cannot be used since neither the EEG signal nor the EOG artifact is Gaussian. In this work, we employ an ANC system, deriving an estimate of the EOG noise with a discrete wavelet transform (DWT) and feeding this signal into the reference input of the ANC system. An entropy-based error metric is used to reduce the error signal instead of the MSE. Results from computer simulations demonstrate that the suggested system outperforms competing methods with respect to root-mean-square error, signal-to-noise ratio, and coherence measurements.
... After this step, X contains only binary features, with n samples and p 1 binary columns. This transformation preserves the categorical information by expanding each category into its own column, a technique fundamental in neural networks and machine learning [2,15]. More recently, it has also been applied in hypothesis testing [17] and graph embedding [22]. ...
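A brief sketch of the one-hot binarization step described above, using either pandas or a reusable scikit-learn encoder; the column and category names are made up for illustration.

```python
# Sketch: expand each categorical column into 0/1 indicator columns.
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "region": ["Europe", "Asia", "Asia", "Americas"],
    "gender": ["F", "M", "F", "F"],
})

# pandas route: one indicator column per category.
print(pd.get_dummies(df))

# scikit-learn route: a fitted encoder that can be reused on new data.
enc = OneHotEncoder(handle_unknown="ignore")
X_bin = enc.fit_transform(df).toarray()
print(enc.get_feature_names_out(), X_bin.shape)   # n samples x p1 binary columns
```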
Preprint
Modern datasets often consist of numerous samples with abundant features and associated timestamps. Analyzing such datasets to uncover underlying events typically requires complex statistical methods and substantial domain expertise. A notable example, and the primary data focus of this paper, is the global synthetic dataset from the Counter Trafficking Data Collaborative (CTDC) -- a global hub of human trafficking data containing over 200,000 anonymized records spanning from 2002 to 2022, with numerous categorical features for each record. In this paper, we propose a fast and scalable method for analyzing and extracting significant categorical feature interactions, and querying large language models (LLMs) to generate data-driven insights that explain these interactions. Our approach begins with a binarization step for categorical features using one-hot encoding, followed by the computation of graph covariance at each time. This graph covariance quantifies temporal changes in dependence structures within categorical data and is established as a consistent dependence measure under the Bernoulli distribution. We use this measure to identify significant feature pairs, such as those with the most frequent trends over time or those exhibiting sudden spikes in dependence at specific moments. These extracted feature pairs, along with their timestamps, are subsequently passed to an LLM tasked with generating potential explanations of the underlying events driving these dependence changes. The effectiveness of our method is demonstrated through extensive simulations, and its application to the CTDC dataset reveals meaningful feature pairs and potential data stories underlying the observed feature interactions.
... Since the dataset was highly imbalanced (the number of samples without complications is much larger than the number of samples with complications), we have applied the well-known SMOTE method [8] to address the class imbalance. To set the hyperparameters of each ML model (i.e., the parameters which cannot be tuned during the training phase), we have applied a grid-search cross-validation technique [9]. ...
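To illustrate the two steps named in this excerpt, a small sketch combining SMOTE with grid-search cross-validation; the toy data, classifier, and parameter grid are assumptions rather than the study's actual setup.

```python
# Sketch: SMOTE oversampling inside a pipeline, tuned with grid-search CV.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Imbalanced toy data: roughly 10% positives, mimicking a rare-complication outcome.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# Putting SMOTE inside an imblearn Pipeline keeps oversampling restricted to the
# training folds of each cross-validation split (no leakage into validation folds).
pipe = Pipeline([("smote", SMOTE(random_state=0)), ("svc", SVC())])
grid = GridSearchCV(
    pipe,
    param_grid={"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]},
    cv=5,
    scoring="balanced_accuracy",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```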
Article
Full-text available
Clinical risk prediction models are ubiquitous in many surgical domains. The traditional approach to develop these models involves the use of regression analysis. Machine learning algorithms are gaining in popularity as an alternative approach for prediction and classification problems. They can detect non-linear relationships between independent and dependent variables and incorporate many of them. In our work, we aimed to investigate the potential role of machine learning versus classical logistic regression for preoperative risk assessment in proctological surgery. We used clinical data from a nationwide audit: the database consisted of 1510 patients affected by Goligher’s grade III hemorrhoidal disease who underwent elective surgery. We collected anthropometric, clinical, and surgical data and we considered ten predictors to evaluate model-predictive performance. The clinical outcome was the complication rate evaluated at 30-day follow-up. Logistic regression and three machine learning techniques (Decision Tree, Support Vector Machine, Extreme Gradient Boosting) were compared in terms of area under the curve, balanced accuracy, sensitivity, and specificity. In our setting, machine learning and logistic regression models reached an equivalent predictive performance. Regarding the relative importance of the input features, all models agreed in identifying the most important factor. Combining and comparing statistical analysis and machine learning approaches in the clinical field should be a common ambition, focused on improving and expanding interdisciplinary cooperation.
... In the development of algorithms for data analysis, it is desirable to select the best subset of the available original attributes, preserving all or most of the information in the data and eliminating attributes that are irrelevant or redundant. Even though, in general, a reduction in the dimension of the input vector implies a loss of information, in real applications, where the amount of data is limited, the curse of dimensionality leads to sparse data and can reduce the performance of classification systems (Bishop, 1995). ...
Conference Paper
This article presents a proposal for a tool to combat non-technical losses in electricity distribution systems, based on Machine Learning techniques and consumer data from a Brazilian energy distributor. An exploratory analysis of the data was carried out to identify the variables to be used in training the models, with the purpose of selecting facilities to be inspected on site. The model with the best theoretical result was then used in on-site tests.
... Artificial neural networks are mathematical models that learn and create nonlinear relationships between two datasets and can discover intricate patterns [15,16]. They mimic the functioning of neuron cells, which receive data from the input layer, process the data through an activation function at each node, and send the data to the output layer. ...
Article
Full-text available
Net radiation is the difference between downward and upward radiation, considering both shortwave and longwave radiation. The net radiation controls the water cycle, plant photosynthesis, the earth’s climate changes, and the energy balance. In this paper, the Artificial Neural Network (ANN) model is developed for estimating daily net radiation from meteorological data that are based on maximum air temperature, minimum air temperature, daily relative humidity, and daily solar radiation. Net radiation and meteorological data collected for 7 years (2017-2023) from Chiang Mai meteorological station (CM: 18.77°N, 98.96°E), Ubon Ratchathani meteorological station (UB: 15.24°N, 105.02°E), Nakhon Pathom meteorological station (NP: 14.01°N, 99.96°E), and Songkhla meteorological station (SK: 7.41°N, 100.62°E) were used to train and test the model. The discrepancy between the net radiation estimated by the ANN and the measured net radiation was presented in terms of determination coefficient (R2), relative root mean square error (RMSE), and relative mean bias error (MBE). The model showed 0.98, 14.48%, and -2.17%, respectively. The result shows that the artificial neural network model is an accurate and easy option for estimating surface net radiation.
... E. Hinton, Srivastava, Krizhevsky, Sutskever, & Salakhutdinov, 2012) and stride are used. In addition to these, many activations such as ReLU, ELU, SELU, Tanh, Softplus, Softmax (Bishop, 1995; Dugas, Bengio, Bélisle, Nadeau, & Garcia, 2000; G. E. Hinton et al., 2012; Lin & Shen, 2018; M. ...
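For reference, plain NumPy definitions of the activation functions listed in this excerpt (written here for illustration, not taken from any of the cited sources):

```python
# Common activation functions, defined directly in NumPy.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def elu(x, a=1.0):
    return np.where(x > 0, x, a * (np.exp(x) - 1.0))

def selu(x, a=1.6733, s=1.0507):
    return s * np.where(x > 0, x, a * (np.exp(x) - 1.0))

def softplus(x):
    return np.log1p(np.exp(x))

def softmax(x):
    e = np.exp(x - np.max(x))          # shift for numerical stability
    return e / e.sum()

x = np.array([-2.0, -0.5, 0.0, 1.5])
print("ReLU    ", relu(x))
print("ELU     ", elu(x))
print("SELU    ", selu(x))
print("tanh    ", np.tanh(x))
print("softplus", softplus(x))
print("softmax ", softmax(x))          # non-negative and sums to 1
```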
Chapter
Full-text available
Cryptography is a subfield of cryptology. Its main function is to transform the original data into a very different content using mathematical operations and to prevent malicious individuals from accessing the original data content. This explains data confidentiality and holds a very important place in the field of security. In addition to ensuring data confidentiality, cryptology is also used for purposes such as data integrity, authorization, access control, and non-repudiation. The objectives of cryptography in data security have materialized in areas such as the Internet of Things (IoT), blockchain applications, digital signatures, and cloud computing. In these fields, numerous cryptographic algorithms have been developed from past to present to ensure data security. These algorithms, which can be described as traditional, are sufficient in terms of security in today's world. However, the widespread adoption of quantum computers in the near future is anticipated. Due to the high computational power of quantum computers, it is inevitable that the data security provided by traditional cryptographic algorithms will be compromised. Post-quantum cryptography (PQC) studies conducted in recent years aim to eliminate this threat and ensure post-quantum security. Cryptography has found many different application areas. In this book chapter, cryptography's most widely used areas, such as digital signature applications, cryptographic applications in the Internet of Things, blockchain technology applications, and cryptography in cloud computing, have been presented to the reader with a straightforward explanation.
... Every layer is made up of neurons that are connected to neurons in adjoining layers, and each connection is weighted based on its strength [50]. Similar to other FFANNs, information travels in one direction from the input to the output via hidden layers [51]. In MLP, the input vector and output class sizes define the number of neurons in the input and output layers, respectively. However, the quantity of hidden layers and neurons inside these layers should be determined experimentally because too many induce overfitting, and too few cause underfitting. ...
Article
Full-text available
Breast cancer is the primary cause of death among women globally, and it is becoming more prevalent. Early detection and precise diagnosis of breast cancer can reduce the disease’s mortality rate. Recent advances in machine learning have benefited in this regard. However, if the dataset contains duplicate or irrelevant features, machine learning-based algorithms are unable to give the intended results. To address this issue, a series of effective strategies such as the Robust Scaler is used for data scaling, Synthetic Minority Over-sampling Technique-Edited Nearest Neighbor (SMOTE-ENN) is utilized for class balancing, and Boruta and Coefficient-Based Feature Selection (CBFS) methods are employed for feature selection. For more accurate and reliable breast cancer classification, this study proposes a soft voting-based ensemble model that harnesses the capabilities of three diverse classifiers: Multilayer Perceptron (MLP), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost). To show the efficiency of the proposed ensemble model, it is compared with its base classifiers using the Wisconsin Diagnosis Breast Cancer Dataset (WDBC). The results of the experiment revealed that the soft voting classifier achieved high scores with an accuracy of 99.42%, precision of 100.0%, recall of 98.41%, F1 score of 99.20%, and AUC of 1.0 when it is trained on optimal features obtained from the CBFS method. However, with tenfold cross-validation (10-FCV), it shows a mean accuracy score of 99.34%. A comprehensive analysis of the results revealed that the suggested technique outperformed the existing state-of-the-art methods due to the efficient data preprocessing, feature selection, and ensemble methods.
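A compact sketch of a soft-voting ensemble in the spirit of the study above, using scikit-learn's copy of the Wisconsin breast cancer data; gradient boosting stands in for XGBoost here, and the preprocessing and hyperparameters are simplified assumptions.

```python
# Sketch: soft-voting ensemble of an MLP, an SVM, and a boosted-tree classifier.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# probability=True lets the SVM contribute class probabilities to the soft vote,
# which averages the predicted probabilities of all base classifiers.
ensemble = VotingClassifier(
    estimators=[
        ("mlp", make_pipeline(StandardScaler(),
                              MLPClassifier(hidden_layer_sizes=(64,),
                                            max_iter=2000, random_state=0))),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)
print(f"5-fold CV accuracy: {cross_val_score(ensemble, X, y, cv=5).mean():.3f}")
```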
... Other layers are called hidden neural layers and they are connected to the input and output layers as shown in Figure 17. Furthermore, the backpropagation algorithm has been widely employed for training ANNs [253,254]. Training an ANN enhances its efficiency for feature extraction. ...
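As a minimal illustration of the backpropagation training mentioned here, a one-hidden-layer network trained with hand-coded gradients in NumPy (purely didactic, not code from the cited review):

```python
# Sketch: backpropagation for a tiny one-hidden-layer network.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)   # XOR-like target

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros((1, 1))
lr = 0.5

for epoch in range(2000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))        # sigmoid output

    # backward pass: gradients of the cross-entropy loss, layer by layer
    d_out = (p - y) / len(y)                         # gradient at the output pre-activation
    dW2 = h.T @ d_out;  db2 = d_out.sum(axis=0, keepdims=True)
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)            # propagate through the tanh derivative
    dW1 = X.T @ d_h;    db1 = d_h.sum(axis=0, keepdims=True)

    # gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("training accuracy:", ((p > 0.5) == y).mean())
```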
Article
Full-text available
The anticipated diagnosis of cancers and other fatal diseases from the simple analysis of the volatiles emitted by the body (volatolome) is getting closer and closer to becoming reality. The promise of vapour sensor arrays is to provide a rapid, reliable, non-invasive and ready-to-use method for clinical applications by making an olfactive fingerprint characteristic of people’s health state, to increase their chance of early recovery. However, the different steps of this complex and ambitious process are still paved with difficulties needing innovative answers. The purpose of this review is to provide a statement of the blocks composing the diagnostic chain to identify the improvements still needed. Nanocomposite chemo-resistive transducers have unique prospects to enhance both the selectivity and sensitivity to volatile biomarkers. The variety of their formulations offers multiple possibilities for chemical functionalization and conductive architectures that should provide solutions to discrimination and stability issues. A focus will be made on the protocols for the collection of volatile organic compounds (VOC) from the body, the choice of vapour sensors assembled into an array (e-nose), in particular chemo-resistive vapour sensors, their principle, fabrication and characteristics, and the way to extract pertinent features and analyse them with suitable algorithms that are able to find and produce a health diagnosis.
Preprint
Full-text available
Machine learning and artificial intelligence models have become popular for climate change prediction. Forested regions in California and Western Australia are increasingly facing intense wildfires, while other parts of the world face various climate-related challenges. To address these issues, machine learning and artificial intelligence models have been developed to predict wildfire risks and support mitigation strategies. Our study focuses on developing wildfire prediction models using one-class classification algorithms. These include Support Vector Machine, Isolation Forest, AutoEncoder, Variational AutoEncoder, Deep Support Vector Data Description, and Adversarially Learned Anomaly Detection. The models were validated through five-fold cross-validation to minimize bias in selecting training and testing data. The results showed that these one-class machine learning models outperformed two-class machine learning models based on the same ground truth data, achieving mean accuracy levels between 90% and 99%. Additionally, we employed Shapley values to identify the most significant features affecting the wildfire prediction models, contributing a novel perspective to wildfire prediction research. When analyzing models trained on the California dataset, seasonal maximum and mean dew point temperatures were critical factors. These insights can significantly improve wildfire mitigation strategies. Furthermore, we have made these models accessible and user-friendly by operationalizing them through a REST API using Python Flask and developing a web-based tool.
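A small sketch of the one-class idea underlying these models, fitting OneClassSVM and IsolationForest on a single class and flagging departures from it; the synthetic features are placeholders, not the paper's wildfire data.

```python
# Sketch: one-class classification, trained on "normal" data only.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 4))    # single training class
X_test = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 4)),                    # similar to the training regime
    rng.normal(6.0, 1.0, size=(50, 4)),                    # clearly different regime
])

for model in (OneClassSVM(nu=0.05, gamma="scale"),
              IsolationForest(contamination=0.05, random_state=0)):
    model.fit(X_train)
    pred = model.predict(X_test)                           # +1 = inlier, -1 = outlier
    print(type(model).__name__, "flagged", int((pred == -1).sum()), "of", len(X_test))
```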
Article
Deep learning has revolutionized the field of artificial intelligence, enabling breakthroughs in computer vision, natural language processing, and reinforcement learning. This paper explores the mathematical foundations of deep learning, focusing on the theoretical underpinnings, algorithmic frameworks, and practical applications. We provide a rigorous treatment of key concepts, including optimization, generalization, and neural network architectures, supported by mathematical derivations and illustrative examples. The paper concludes with a discussion of open challenges and future directions in the field.
Article
Full-text available
This study investigated the accurate prediction of the calorific value of municipal solid waste (MSW) using soft computing systems, namely artificial neural networks (ANN), adaptive neural fuzzy inference system (ANFIS), support vector machine (SVM), and multi-layer perceptron (MLP). The calorific value of MSW is a crucial factor that indicates the energy content of MSW; however, determining the calorific value with conventional laboratory methods is quite expensive and difficult. The research focused on proximate analysis parameters obtained from the laboratory and utilized the measured calorific value to develop predictive models. All the models demonstrated a very good correlation between input and output, with consistently strong values of the coefficient of determination (R²). The ANN, SVM, MLP, and ANFIS models have respective R² values of 0.9397, 0.8195, 0.7212, and 0.9979. ANFIS showed the best correlation with exceptional predictive power. Statistical parameters were determined to compare model accuracy, with ANFIS exhibiting the top performance, followed by ANN, and then MLP, which had the lowest values of mean square error (MSE), root mean square error (RMSE), mean absolute deviation (MAD), and mean absolute percentage error (MAPE) at 8.704E-07, 0.00019, 0.00016, and 1.295E-05 respectively. However, SVM has the least capability to predict calorific values accurately compared to the other models. Soft computing models, particularly ANFIS, demonstrated remarkable accuracy in predicting the calorific value.
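For clarity on the accuracy metrics compared in this abstract, a short NumPy sketch computing MSE, RMSE, MAD, and MAPE on hypothetical predicted and measured calorific values:

```python
# Sketch: standard error metrics on illustrative measured vs. predicted values.
import numpy as np

measured = np.array([12.1, 10.8, 14.3, 11.5, 13.0])    # hypothetical calorific values (MJ/kg)
predicted = np.array([11.9, 11.0, 14.0, 11.7, 12.8])

err = predicted - measured
mse = np.mean(err ** 2)                    # mean square error
rmse = np.sqrt(mse)                        # root mean square error
mad = np.mean(np.abs(err))                 # mean absolute deviation
mape = np.mean(np.abs(err / measured)) * 100.0   # mean absolute percentage error

print(f"MSE={mse:.4f}  RMSE={rmse:.4f}  MAD={mad:.4f}  MAPE={mape:.2f}%")
```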
Article
Full-text available
Neurodegenerative diseases present complex challenges that demand advanced analytical techniques to decode intricate brain structures and their changes over time. Curvature estimation within datasets has emerged as a critical tool in areas like neuroimaging and pattern recognition, with significant applications in diagnosing and understanding neurodegenerative diseases. This systematic review assesses state-of-the-art curvature estimation methodologies, covering classical mathematical techniques, machine learning, deep learning, and hybrid methods. Analysing 105 research papers from 2010 to 2023, we explore how each approach enhances our understanding of structural variations in neurodegenerative pathology. Our findings highlight a shift from classical methods to machine learning and deep learning, with neural network regression and convolutional neural networks gaining traction due to their precision in handling complex geometries and data-driven modelling. Hybrid methods further demonstrate the potential to merge classical and modern techniques for robust curvature estimation. This comprehensive review aims to equip researchers and clinicians with insights into effective curvature estimation methods, supporting the development of enhanced diagnostic tools and interventions for neurodegenerative diseases.
Preprint
Full-text available
Due to climate change, forest regions in California, Western Australia, and Saskatchewan, Canada, are increasingly experiencing severe wildfires, with other climate-related issues affecting the rest of the world. Machine learning (ML) and artificial intelligence (AI) models have emerged to predict wildfire hazards and aid mitigation efforts. However, inconsistencies arise in the wildfire prediction modeling domain due to the database adjustments required to enable complex and real-time modeling. To help address this issue, our paper focuses on creating wildfire prediction models through One-class classification algorithms: Support Vector Machine, Isolation Forest, AutoEncoder, Variational AutoEncoder, Deep Support Vector Data Description, and Adversarially Learned Anomaly Detection. Five-fold Cross-Validation was used to validate all One-class ML models to minimize bias in the selection of the training and testing data. These One-class ML models outperformed Two-class ML models using the same ground truth data, with mean accuracy levels between 90 and 99 percent. Shapley values were used to derive the most important features affecting the wildfire prediction model, which is a novel contribution to the field of wildfire prediction. Among the most important factors for models trained on the California data set were the seasonal maximum and mean dew point temperatures. These insights will support mitigation strategies. In providing access to our algorithms, using Python Flask and a web-based tool, the top-performing models were operationalized for deployment as a REST API, with outcomes supporting the potential of our solution for strengthening wildfire mitigation strategies.
Preprint
Full-text available
In this book, written in Portuguese, we discuss what ill-posed problems are and how the regularization method is used to solve them. In the form of questions and answers, we reflect on the origins and future of regularization, relating the similarities and differences of its meaning in different areas, including inverse problems, statistics, machine learning, and deep learning.
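As a concrete, elementary example of regularization for an ill-posed (here, nearly collinear) problem, a short ridge-versus-least-squares sketch; it is illustrative only and not drawn from the book.

```python
# Sketch: Tikhonov (ridge) regularization stabilizes a nearly collinear regression.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, x + 1e-4 * rng.normal(size=200)])   # two almost identical columns
y = 3.0 * x + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# OLS coefficients are typically large and unstable under near-collinearity;
# the ridge penalty shrinks them toward small, stable values with a similar fit.
print("OLS coefficients:  ", np.round(ols.coef_, 2))
print("Ridge coefficients:", np.round(ridge.coef_, 2))
```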
Article
An empirical study of the economic impact of cybersecurity breaches and computer fraud on Small and Medium Enterprises (SMEs) is presented in this research paper. As reliance on digital infrastructure continues to rise, SMEs increasingly find themselves targeted by cyber attacks, resulting in substantial financial losses and operational disruptions. Using SME data from various sectors, the study examines the extent and nature of these impacts. We examine key areas of focus associated with breaches, including direct financial costs, indirect costs (such as reputational damage), and broader impacts on business continuity. The research employs quantitative analysis methods to identify trends and correlations between the frequency of cyber attacks and the financial resilience of SMEs. The findings highlight the need for robust cybersecurity measures and provide lessons for policymakers and business owners on how to reduce risks. The study thereby contributes to the growing SME cybersecurity literature by providing a foundation for further research and strategic policy development.