Serge Guillaume

French National Institute for Agriculture, Food, and Environment (INRAE)

About

Publications: 170
Reads: 18,549
Citations: 3,003

Publications (170)
Article
A new clustering algorithm, Path-scan, is proposed for discovering natural partitions. It is based on the idea that a (k,ɛ) coreset of the original database, represented by core and support patterns, can be path-connected via a density differential approach. The Path-scan algorithm is structured in two main parts producing a connectivity matrix...
Article
Density-based clustering algorithms have made a large impact on a wide range of application fields. As more data become available, with growing size and varied internal organization, non-parametric unsupervised procedures are becoming ever more important in understanding datasets. In this paper a new clustering algorithm, S-DBSCAN, is propo...
Article
Clustering algorithms are becoming more and more sophisticated to cope with large data sets of increasing complexity. Sampling selection methods are likely to provide an interesting alternative, as they can reduce both memory requirements and execution time. Many sampling algorithms for clustering are efficient but they each have their own limitations...
Article
A soil chemical quality index is designed for cacao production systems. It is based on worldwide scientific knowledge and built using data from three municipalities of Tolima department in Colombia. Fuzzy logic is used in a multicriteria decision making framework in two different ways. First, fuzzy sets are used to model an expert preference relati...
Article
New clustering algorithms are expected to manage complex data, meaning various shapes and densities, while being user friendly. This work addresses this challenge. A new clustering algorithm, KdMutual, driven by the number of clusters, is proposed. The idea behind the algorithm is based on the assumption that working with cluster cores rather than co...
Chapter
Fuzzy logic is widely used in linguistic modeling. In this work, fuzzy logic is used in a multicriteria decision making framework in two different ways. First, fuzzy sets are used to model an expert preference relation for each of the individual information sources to turn raw data into satisfaction degrees. Second, fuzzy rules are used to model th...
Data
This is the online presentation given at the GISTAM conference, which was held as a virtual conference due to the COVID-19 outbreak in the first half of 2020. If you have any questions or comments on the content of the presentation, please feel free to contact me. I retain the rights to the presentation and the content within and this shoul...
Conference Paper
This paper presents an application of fuzzy logic, well known for its linguistic modeling ability, in a multicriteria decision making framework applied to spatial data sets. Fuzzy logic is integrated in two different ways. First, fuzzy sets are used to model an expert preference relation for each of the individual spatial information sources to...
Article
Full-text available
Single nucleotide variants (SNVs) occurring in a protein coding gene may disrupt its function in multiple ways. Predicting this disruption has been recognized as an important problem in bioinformatics research. Many tools, hereafter p-tools, have been designed to perform these predictions and many of them are now of common use in scientific researc...
Book
Smart farming is about how emerging and evolving technologies support the farmer, and their professional network, in the management of production and of information related to production. Decision support is therefore a core concern. As decision support is actually about making the right decisions and undertaking the right action, it relates to pre...
Chapter
Many Bioinformatics tools, known as p-tools, have been developed to predict the effect of single nucleotide polymorphisms (SNPs) on gene functionality, in an effort to reduce the need for in-vivo assays. However, the large number of p-tools available and the heterogeneity of their output make their selection and comparison difficult. To study the c...
Chapter
Three algorithms for unsupervised sampling are introduced. They are easy to tune, scalable, and yield a small size sample. They are based on the same concepts: they combine density and distance, they use the farthest-first traversal that allows for runtime optimization, they yield a coreset, and they are driven by a single user parameter. DIDES giv...
Chapter
This chapter reviews the data reduction problem for instance and feature selection methods in the context of supervised classification. In the first part, instance and feature selections are studied separately. As instance and feature selection are not independent, algorithms dealing with simultaneous selection are then presented. To provide a comp...
Chapter
Smart farming is about supporting the farmer and their professional network, about the management of production and about managing information related to production. The advent of digital agriculture is pushing agricultural decision towards new standards, with regard to the complexity and intensity of information handled as inputs or outputs and t...
Book
Three algorithms for unsupervised sampling are introduced. They are easy to tune, scalable, and yield a small size sample. They are based on the same concepts: they combine density and distance, they use the farthest-first traversal that allows for runtime optimization, they yield a coreset, and they are driven by a single user parameter. DIDES gives...
Book
This chapter reviews the data reduction problem for instance and feature selection methods in the context of supervised classification. In the first part, instance and feature selections are studied separately. As instance and feature selection are not independent, algorithms dealing with simultaneous selection are then presented. To provide a co...
Book
This book describes in detail sampling techniques that can be used for unsupervised and supervised cases, with a focus on sampling techniques for machine learning algorithms. It covers theory and models of sampling methods for managing scalability and the “curse of dimensionality”, their implementations, evaluations, and applications. A large part...
Presentation
Full-text available
These are the slides from an invited presentation to the First Symposium on Precision Management of Orchards and Vineyards that was sponsored by the ISHS. The full paper is available at Taylor, J.A., Bates, T.R., Manfrini, L. and Guillaume, S. (2021). Zoning and data fusion in precision horticulture: current and needed capabilities to assist deci...
Article
Full-text available
Fruit load estimation at plot level before harvest is a key issue in fruit growing. To face this challenge, two sampling methods to estimate fruit load in a peach tree orchard were compared: simple random sampling (SRS) and ranked set sampling (RSS). The study was carried out in a peach orchard (Prunus persica cv. 'Platycarpa') covering a total are...
Article
Full-text available
This paper proposes a methodology to improve grape yield sampling and yield estimation for the current season by using historical yield data. This approach is based on the joint use of (i) historical yield data covering the whole study field and (ii) several yield measurements collected at specific sites within the field during the current season. The...
Article
Hierarchical clustering is widely used in data mining. The single linkage criterion is powerful, as it allows for handling various shapes and densities, but it is sensitive to noise (sample code is available at http://frederic.rosresearch.free.fr/mydata/homepage/). Two improvements are proposed in this work to deal with noise. Firs...
Article
Full-text available
New clustering algorithms are expected to find the appropriate number of clusters when dealing with complex data, meaning various shapes and densities. They also have to be self-tuning and adaptive for the input parameters to differentiate only between acceptable solutions. This work addresses this challenge. At the beginning mutual nearest n...
Article
Full-text available
The world we live in is an increasingly spatial and temporal data-rich environment, and agriculture is no exception. However, data needs to be processed in order to first get information and then make informed management decisions. The concepts of ‘Precision Agriculture’ and ‘Smart Agriculture’ are and will be fully effective when methods and tools...
Article
Full-text available
In the process of knowledge discovery in big data, sampling is a building block that can be included in a more general framework to speed up existing algorithms and contribute to the scalability issue. Two challenging and connected problems arise with complexity: tuning and timing. ProTraS is a new algorithm that fulfills both requirements. I...
Conference Paper
Full-text available
This paper proposes a methodology aiming at using historical yield data to improve yield sampling and yield estimation. The sampling method is based on a collaboration between historical data (at least three years) and yield measurements of the year performed on some sites within the field. It assumes a temporal stability of within field yield spat...
Article
Full-text available
Source to sink size ratio, i.e. the relative abundance of photosynthetically active organs (leaves) with regard to photosynthate-demanding organs (mainly bunches), is widely known to be one of the main drivers of grape oenological quality. However, due to the difficulty of remote sink size estimation, Precision Viticulture (PV) has been mainly ba...
Article
Fuzzy measures are powerful at modeling interactions between elements. Unfortunately, they use a number of coefficients that exponentially grows with the number of elements. Beyond the computational complexity, assigning a value to any coalition of a large set of elements does not make sense. k-Order measures model interactions involving at most k...
Article
Full-text available
As clustering algorithms become more and more sophisticated to cope with current needs, namely large data sets of increasing complexity, sampling is likely to provide an interesting alternative. The proposal is a distance-based algorithm: the idea is to iteratively include in the sample the furthest item from all the already selected ones. Density is mana...
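The distance-based step described in this abstract can be illustrated with a minimal sketch. It assumes that "the furthest item from all the already selected ones" means the item whose distance to its nearest selected item is largest, which is the usual farthest-first traversal. The function name `farthest_first_sample`, its parameters, and the toy data are hypothetical, and the density handling mentioned in the abstract is not covered, so this is not the authors' algorithm.

```python
import math


def farthest_first_sample(points, sample_size, start_index=0):
    """Return indices of a sample built by farthest-first traversal.

    Illustrative sketch only: distance-based selection, no density handling.
    """
    if not points or sample_size <= 0:
        return []
    sample_size = min(sample_size, len(points))
    selected = [start_index]
    # Distance from every point to its nearest already-selected point.
    nearest = [math.dist(p, points[start_index]) for p in points]
    while len(selected) < sample_size:
        # The next sample item is the point farthest from the current sample.
        candidate = max(range(len(points)), key=nearest.__getitem__)
        if nearest[candidate] == 0.0:
            break  # every remaining point coincides with a selected one
        selected.append(candidate)
        # Update nearest-sample distances with the newly added item.
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[candidate]))
    return selected


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(farthest_first_sample(points, 3))  # prints [0, 4, 2]
```

Keeping a running list of nearest-sample distances means each new item costs one pass over the data, instead of recomputing distances to every selected item.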
Article
To deal with large datasets, sampling can be used as a preprocessing step for clustering. In this paper, a hybrid sampling algorithm is proposed. It is density-based while managing distance concepts to ensure space coverage and fit cluster shapes. At each step a new item is added to the sample: it is chosen as the furthest from the representative...
Article
Infrared spectroscopy data is characterized by the presence of a huge number of variables. Applications of infrared spectroscopy in the mid-infrared (MIR) and near-infrared (NIR) bands are of widespread use in many fields. To effectively handle this type of data, suitable dimensionality reduction methods are required. In this paper, a dimensionalit...
Article
Full-text available
In this paper, a new instance selection algorithm is proposed in the context of classification to manage non-trivial database sizes. The algorithm is hybrid and runs with only a few parameters that directly control the balance between the three objectives of classification, i.e. errors, storage requirements and runtime. It comprises different mecha...
Conference Paper
Full-text available
To face the big data challenge, sampling can be used as a preprocessing step for clustering. In this paper, a hybrid algorithm is proposed. It is density-based while managing distance concepts. The algorithm behavior is investigated using synthetic and real-world data sets. The first experiments proved it can be accurate, according to the Rand Inde...
Conference Paper
Fuzzy logic is a powerful interface between linguistic and numerical spaces. It allows the design of transparent models based upon linguistic rules. The FisPro open source software includes learning algorithms as well as a user-friendly Java interface. In this paper, it is used to model a composite agronomical feature, the vine vigor. The system behavio...
Book
Fuzzy logic is a powerful interface between linguistic and numerical spaces. It allows the design of transparent models based upon linguistic rules. The FisPro open source software includes learning algorithms as well as a user-friendly Java interface. In this paper, it is used to model a composite agronomical feature, the vine vigor. The system behavio...
Conference Paper
Full-text available
Fuzzy logic is a powerful interface between linguistic and numerical spaces. It allows the design of transparent models based upon linguistic rules. The FisPro open source software includes learning algorithms as well as a user-friendly Java interface. In this paper, it is used to model a composite agronomical feature, the vine vigor. The system behav...
Article
An important limitation of fuzzy integrals for information fusion is the exponential growth of coefficients for an increasing number of information sources. To overcome this problem a variety of fuzzy measure identification algorithms has been proposed. HLMS is a simple gradient-based algorithm for fuzzy measure identification which suffers from so...
Article
Vine vigor, a key agronomic parameter, depends on environmental factors but also on agricultural practices. The goal of this paper is to model vine vigor level according to the most influential variables. The approach was based upon a dataset collected in a French vineyard in the middle Loire valley and the available expert knowledge. The input fe...
Article
Full-text available
Early definition of oenologically significant zones within a vineyard is one of the main goals of precision viticulture, as it would allow an increase in profitability through the adaptation of agronomic practices to the specific requirements of each zone, and/or segregation of the harvest into different batches to produce wines with different qual...
Conference Paper
Full-text available
A univariate segmentation algorithm has recently been developed for precision agricultural applications. This is adapted to a bivariate analysis to investigate zoning based on yield and protein responses in an eastern Australian wheat field. The intention is to provide a zone-by-zone interpretation of the agronomic response to N. The segmentation a...
Conference Paper
This work discusses the implementation of a semi-distance based on fuzzy partitions, which allows expert knowledge to be introduced into distance computations on numerical data. It can be used in various kinds of statistical clustering or other applications. The semi-distance univariate behaviour is first studied, then a multivariate clustering cas...
Article
Full-text available
In many fields, due to the increasing number of automatic sensors and devices, there is an emerging need to integrate georeferenced and temporal data into decision support tools. Geographic Information Systems (GIS) and Geostatistics lack some functionalities for modelling and reasoning using georeferenced data. Soft computing techniques and softwa...
Article
Monitoring water status at different points within a single field is time-consuming and expensive. Nevertheless, it is necessary to consider within-field variability since water status is usually highly variable and has a large effect on grape quality. To overcome this situation, models that allow estimation of the relative difference in vine water...
Article
Region growing methods are of potential interest to define within-field zones and resulting site-specific management. These methods are unsupervised and based on regions which grow from the initial seeds according to homogeneity criteria. However, the determination of seed number and seed locations has strong repercussions on the zoning output. Thi...
Article
Full-text available
This work presents a robust sequence for optimizing the parameters of a fuzzy inference system. Each step optimizes a set of interdependent parameters, according to criteria of numerical performance as well as coverage. The structure of the system is not modified by the procedure, and constraints...
Book
This work presents a robust sequence for optimizing the parameters of a fuzzy inference system. Each step optimizes a set of interdependent parameters, according to criteria of numerical performance as well as coverage. The structure of the system is not modified by the procedure, and constraints are imposed to guarantee its interpretability. Ten training-test pairs are generated for each set of...
Article
Full-text available
Aims: The evolution of the economic and environmental context (low-input management practices, increasing energy costs and climate change) requires adaptation and/or optimization of winegrowers’ practices in order to produce wines that are competitive yet still of high quality. To adapt and sustain their practices at the p...
Conference Paper
Full-text available
In Agronomy and Environment, due to the increasing number of automatic sensors and devices, there is an emerging need to integrate georeferenced and temporal data into decision support tools, traditionally based on expert knowledge. Soft computing techniques and software suited to these needs may be very useful for modelling and decision making. Th...
Conference Paper
Full-text available
This paper proposes a flexible optimization sequence that can be applied to any parameter of a fuzzy inference system. Interrelated parameters can be optimized together, and criteria include system accuracy and coverage. The fuzzy inference system structure is preserved and constraints are imposed to respect the fuzzy partition semantics. The proce...
Book
In Agronomy and Environment, due to the increasing number of automatic sensors and devices, there is an emerging need to integrate georeferenced and temporal data into decision support tools, traditionally based on expert knowledge. Soft computing techniques and software suited to these needs may be very useful for modelling and decision making. Th...
Book
This paper proposes a flexible optimization sequence that can be applied to any parameter of a fuzzy inference system. Interrelated parameters can be optimized together, and criteria include system accuracy and coverage. The fuzzy inference system structure is preserved and constraints are imposed to respect the fuzzy partition semantics. The proce...
Article
Full-text available
Precision viticulture (PV) has been mainly applied at the field level, for which the ability of high resolution data to match within-field variability has been already shown. However, the interest of PV for grape growers would be greater if its principles could also apply at a larger scale, as most growers still focus their management on a multi-fi...
Article
Fuzzy inference systems (FIS) are likely to play a significant part in system modeling, provided that they remain interpretable following learning from data. The aim of this paper is to set up some guidelines for interpretable FIS learning, based on practical experience with fuzzy modeling in various fields. An open source software system called Fi...
Article
Physically based hydrological models are increasingly used to simulate the impact of land use changes on water and mass transfers. The problems associated with this type of parameter-rich model from a water management perspective are related to the need for (1) a large number of local parameters instead of only a few catchment-scale decision variab...
Article
Full-text available
In multicriteria decision making, the study of attribute contributions is crucial to attain correct decisions. Fuzzy measures allow a complete description of the joint behavior of attribute subsets. However, the determination of fuzzy measures is often hard. A common way to identify fuzzy measures is the HLMS (Heuristic Least Mean Squares) algorithm. I...
Conference Paper
Full-text available
This work studies a new distance function which takes into account expert knowledge by making use of fuzzy partitions. It considers the symbolic distances between concepts and is equivalent to the Euclidean distance for regular partitions made of triangular membership functions. Its behaviour is investigated in comparison with that of the Euclidean...
Book
This work studies a new distance function which takes into account expert knowledge by making use of fuzzy partitions. It considers the symbolic distances between concepts and is equivalent to the Euclidean distance for regular partitions made of triangular membership functions. Its behaviour is investigated in comparison with that of the Euclidean...
Conference Paper
Full-text available
A fuzzy inference system (FIS) was developed to generate recommendations for spatially variable applications of nitrogen (N) fertilizer using soil, plant and precipitation information. Experiments were conducted over three seasons (2005-2007) to assess the effects of soil electrical conductivity (ECa), nitrogen sufficiency index (NSI), and precipit...
Article
Full-text available
Aim: Recent work has identified strong intra-field relationships of predawn leaf water potential (Ψ_PD) between paired sites. This study investigates whether these relationships exist at the inter-field level when soil types between fields are constant or different in a vineyard in Southern France. Method and...
Chapter
Fuzzy logic inference systems (FISs) can help provide within-field nitrogen (N) fertilization recommendations by combining critical plant- and soil-based spatial information. This chapter describes how, based on spatially distributed information, FIS can be used to develop in-season N recommendations. A sample problem is provided. Soil and plant infor...
Article
Full-text available
In production agriculture, savings in herbicides can be achieved if weeds can be discriminated from crop, allowing the targeting of weed control to weed-infested areas only. Previous studies demonstrated the potential of ultraviolet (UV) induced fluorescence to discriminate corn from weeds and recently, robust models have been obtained for the disc...
Article
Full-text available
A fuzzy inference system (FIS) was developed to generate recommendations for spatially variable applications of N fertilizer. Key soil and plant properties were identified based on experiments with rates ranging from 0 to 250 kg N ha⁻¹ conducted over three seasons (2005, 2006 and 2007) on fields with contrasting apparent soil electrical conductivity (...