Table 1 - uploaded by Damien François
Normalized mean square error on the test set for the nitrogen content prediction problem

Source publication
Conference Paper
In spectrometric problems, objects are characterized by high-resolution spectra that correspond to hundreds to thousands of variables. In this context, even fast variable selection methods lead to high computational load. However, spectra are generally smooth and can therefore be accurately approximated by splines. In this paper, we propose to use...

Context in source publication

Context 1
... results on the test set (NMSE) for the studied methods are given in Table 1. A linear model built on the 10 variables selected by maximizing the mutual information does not reach a performance comparable to that of the optimal linear models. ...
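
The excerpt above reports test-set results as NMSE but does not show how the error is normalized. A minimal sketch follows, assuming the common convention of dividing the mean square error by the variance of the test targets, so that a model that always predicts the test mean scores 1. The function name `nmse` and the sample values are hypothetical and are not taken from the paper.

```python
def nmse(y_true, y_pred):
    """Normalized mean square error: MSE divided by the variance of y_true.

    Assumption: normalization by the (population) variance of the test
    targets. The source excerpt does not state which normalization it uses.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    if n == 0:
        raise ValueError("inputs must not be empty")

    mean_y = sum(y_true) / n
    # Mean square error between targets and predictions.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    # Variance of the targets, used as the normalizing constant.
    var_y = sum((t - mean_y) ** 2 for t in y_true) / n
    if var_y == 0:
        raise ValueError("targets are constant; NMSE is undefined")
    return mse / var_y


# Hypothetical values, for illustration only.
targets = [1.0, 2.0, 3.0]
perfect = nmse(targets, [1.0, 2.0, 3.0])   # exact predictions
baseline = nmse(targets, [2.0, 2.0, 2.0])  # always predict the mean
```

Under this convention, values below 1 indicate a model that does better than predicting the mean of the test targets.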

Similar publications

Article
As spatial correlation and heterogeneity often coincide in the data, we propose a spatial single-index varying-coefficient model. For the model, in this paper, a robust variable selection method based on spline estimation and exponential squared loss is offered to estimate parameters and identify significant variables. We establish the theoretical...
Article
Nonparametric varying coefficient models are useful for the analysis of repeated measurements. While many procedures have been developed for estimating varying-coefficients, there have been few results on variable selection for such models. Recently, Wang, Chen and Li (2007) proposed a group SCAD procedure for model selection in varying-coefficien...
Article
We extend the adaptive regression spline model by incorporating saturation, the natural requirement that a function extend as a constant outside a certain range. We fit saturating splines to data using a convex optimization problem over a space of measures, which we solve using an efficient algorithm based on the conditional gradient method. Unlike...
Conference Paper
In this paper, we compare two approaches for predicting errors in manufacturing processes, to detect if the manufacturing processes contain errors or not. The first approach is based on machine learning techniques and the second approach is based on deep learning techniques. The proposed approaches are applied on a dataset from literature, the SECO...
Article
The predictive value of a statistical model can often be improved by applying shrinkage methods. This can be achieved, e.g., by regularized regression or empirical Bayes approaches. Various types of shrinkage factors can also be estimated after a maximum likelihood fit has been obtained: while global shrinkage modifies all regression coefficients b...

Citations

... In this paper, the noise was reduced by the thresholding method based on the analysis of absolute values of the coefficients. This sufficiently sophisticated type of raw data analysis was effectively implemented for the characterization of high-resolution spectrometric problems [67]. Besides, this solution was directly applied in many complex data approximation issues [68,69,70]. ...
Article
In this paper, different methods of processing surface topography images (STI) are proposed for characterizing selected features of the surface texture. The analysis was performed to improve the functionality of image assessment in characterizing material contact properties (e.g. wear resistance, lubricant retention, sealing or friction) from the results of surface topography measurement. Various types of surfaces from manufactured parts were considered, e.g. turned, ground, honed, laser-textured or isotropic. They were measured with a white light interferometer. Results were analyzed by characterizing areal surface topography maps (images). It was found that wavelet filtering approaches might be valuable in assessing selected surface texture features, such as dimples (size or distribution) or treatment trace distances, obtained by applying STI processing methods. Additionally, the digital image extension method proposed in the paper might be crucial in defining feature size (depth, width), especially when features are located on the edge of the analyzed (measured) part. Moreover, STI might be advantageous in reducing some surface texture measurement errors (a selected type of noise).
Article
New recommendations require airlines to establish a safety management strategy to keep reducing the number of accidents. Flight data recorders have to be systematically analysed in order to identify, measure and monitor the evolution of risk. The aim of this thesis is to propose methodological tools to address the problem of flight data analysis. Our work revolves around two statistical topics: variable selection in supervised learning and functional data analysis. Random forests are used because they provide importance measures that can be embedded in selection procedures. First, we study the permutation importance measure when the variables are correlated. This criterion is extended to groups of variables, and a new selection algorithm for functional variables is introduced. These methods are applied to the risks of long landing and hard landing, which are two important questions for airlines. Finally, we present the integration of the proposed methods in the software FlightScanner implemented by Safety Line. This new solution for air transport helps safety managers monitor risks and identify contributing factors.
Article
The selection of grouped variables using the random forest algorithm is considered. First a new importance measure adapted for groups of variables is proposed. Theoretical insights into this criterion are given for additive regression models. Second, an original method for selecting functional variables based on the grouped variable importance measure is developed. Using a wavelet basis, it is proposed to regroup all of the wavelet coefficients for a given functional variable and use a wrapper selection algorithm with these groups. Various other groupings which take advantage of the frequency and time localization of the wavelet basis are proposed. An extensive simulation study is performed to illustrate the use of the grouped importance measure in this context. The method is applied to a real life problem coming from aviation safety.