Camila Maione’s research while affiliated with Universidade Federal de Goiás and other places

Publications (12)


Location of the study area and collected samples.
Distribution of the samples among the three clusters found by K-means, plotted on two principal components. Cluster 1 = black circles, Cluster 2 = red triangles and Cluster 3 = green crosses.
Mean values for the concentration of the ten most relevant elements determined by the CFS feature selection technique.
Map of the locations where the samples were collected. Cluster 1 = gray, Cluster 2 = red and Cluster 3 = green.
A Cluster Analysis Methodology for the Categorization of Soil Samples for Forensic Sciences Based on Elemental Fingerprint

December 2021

·

82 Reads

·

5 Citations

Camila Maione

Soil and dirt fragments are easily transferred from the ground to objects such as clothing, shoes, skin, nails, and tires. Elemental analysis of such samples can be an important source of information for forensic scientists, because it can provide substantial evidence linking victims, suspects, crime scenes and other relevant actors or places. In this work we present a promising new approach for the study of soil samples, using data mining techniques applied to the elemental fingerprints of soil. We experimented on soil samples obtained from southeast Oregon and northern Nevada, two neighboring states in the United States that have similar geological characteristics while also displaying some specific differences. The chemical composition of the soil and sediment samples was determined by inductively coupled plasma mass spectrometry (ICP-MS). Thirty-three elements were analyzed, and we used their concentrations to conduct the analysis. Cluster analysis was performed with the K-means clustering algorithm. We found three clusters that showed interesting chemical patterns. To identify the chemical elements that most strongly distinguish the clusters, we employed the Correlation-Based Feature Selection (CFS) algorithm. Lastly, we developed a classification model based on a support vector machine (SVM), which predicts the cluster into which an arbitrary soil sample would fall with 99% accuracy when all 32 variables were used for training, and with 95% accuracy when only the 10 most relevant elements were used. Following this methodology, forensic scientists and experts would be able to establish profiles of soil samples extracted from the crime scene and nearby regions, and use classification models to predict which of these profiles an arbitrary soil sample found on the subjects involved in the crime would be associated with.
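
The clustering step described in this abstract can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. This is not the authors' implementation: the two-element concentration values are invented for illustration, the helper names (`sq_dist`, `farthest_point_init`, `kmeans`) are hypothetical, and the deterministic farthest-point initialization is an assumption chosen only to make the example reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption, not from the paper):
    start at the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Hypothetical concentrations of two elements in nine soil samples.
samples = [
    (1.0, 1.2), (1.1, 0.9), (0.9, 1.0),
    (5.0, 5.1), (5.2, 4.9), (4.8, 5.0),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.8),
]
labels, centroids = kmeans(samples, k=3)
print(labels)
```

With real ICP-MS data the concentrations would usually be scaled before clustering, since elements measured on very different ranges would otherwise dominate the distance.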


Figure 1 - Map of administrative divisions of Brazil. Pernambuco and São Paulo States are highlighted in green and blue colors, respectively.
Figure 2 - Methodology for construction of the SVM (support vector machines) and LDA (linear discriminant analysis) models and performance measure estimation.
Figure 4 - Mean values of the sum of bases, CEC T, CEC e, exchangeable calcium in soil, well-crystallized aluminum in soil, sand, amorphous aluminum in soil, nickel in soil and zinc in plant for the lettuce samples produced in São Paulo (SP) and Pernambuco (PE) states.
Approximated geographical coordinates of cities where the analyzed lettuce samples were collected. Column # is the number of samples collected from the location.
Determining the geographical origin of lettuce with data mining applied to micronutrients and soil properties

January 2021

·

197 Reads

·

8 Citations

Scientia Agricola

Lettuce (Lactuca sativa) is the main leafy vegetable produced in Brazil. Because its production is widespread all over the country, lettuce traceability and quality assurance are hampered. In this study, we propose a new method to identify the geographical origin of Brazilian lettuce. The method uses a powerful data mining technique called support vector machines (SVM), applied to the elemental composition and soil properties of the samples analyzed. We investigated lettuce produced in São Paulo and Pernambuco, two states in the southeastern and northeastern regions of Brazil, respectively. We assessed the efficiency of the SVM model by comparing its results with those achieved by traditional linear discriminant analysis (LDA). The SVM models outperformed the LDA models in the two scenarios investigated, achieving an average prediction accuracy of 98% in discriminating lettuce from the two states. A feature evaluation formula, called the F-score, was used to measure the discriminative power of the variables analyzed. The soil cation exchange capacity, the soil content of poorly crystallized Al and the Zn content of the lettuce samples were the most relevant components for differentiation. Our results reinforce the potential of data mining and machine learning techniques to support traceability strategies and authentication of leafy vegetables.
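
The abstract names the F-score but does not state its formula. The sketch below assumes the commonly used two-class definition (the squared deviations of the two class means from the overall mean, divided by the sum of the two within-class sample variances); whether this is exactly the variant used in the paper is an assumption. The function name `f_score`, the feature names and all values are hypothetical.

```python
def f_score(pos, neg):
    """Two-class F-score of one feature.

    pos, neg: values of the feature in each class (at least two per class).
    Larger scores mean the class means are far apart relative to the
    spread inside each class.
    """
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    # Between-class term: distance of each class mean from the overall mean.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Within-class term: sum of the two sample variances.
    within = (
        sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
        + sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1)
    )
    return between / within


# Invented measurements for two features in two groups of samples.
features = {
    "feature_a": ([1.0, 2.0, 3.0], [7.0, 8.0, 9.0]),  # well separated
    "feature_b": ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),  # overlapping
}
scores = {name: f_score(pos, neg) for name, (pos, neg) in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

Ranking variables this way treats each one independently, so it does not account for redundancy between correlated variables.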


Predicting the botanical and geographical origin of honey with multivariate data analysis and machine learning techniques: A review

February 2019

·

241 Reads

·

95 Citations

Computers and Electronics in Agriculture

Because honey is a natural product synthesized by bees from secretions of various flowers and plants, its composition and properties are determined by those unique floral origins. Aside from honey's sweet and distinctive flavor, it can provide various human health benefits, which makes its market value favorable compared with that of other sweeteners. Consequently, honey has been a very common and easy target of adulteration for economic profit, making the authenticity of honey a longtime concern for researchers and producers around the world. The oldest and most popular method for verifying authenticity is melissopalynology, the analysis of pollen contained in honey, which can determine its floral origin. However, because this method is time consuming, requires a specialist, and is unable to detect fraudulent pollen contamination, several other methods for the determination of the floral and geographical origin of honey have been proposed in the scientific literature. Multivariate data analysis and machine learning are emerging areas that offer performance and cost advantages for extracting valuable information from raw data sets. They are capable of performing both exploratory and predictive analyses, which can help to uncover trends and hidden patterns within data. Most of the studies reviewed base their methods on atomic spectra and physicochemical properties as descriptive variables for honey. Sensorial data obtained from electronic tongue and nose, and color histograms of honey images, also show high discriminative power for ascertaining honey origin. Principal component analysis (PCA), discriminant analysis (DA), and cluster analysis are the preferred techniques for performing exploratory and predictive analyses for the purpose of identifying origin. PCA and DA continue to be preferred due to their ease of application and interpretation, while machine learning algorithms are more complex to model. Nevertheless, both machine learning algorithms and PCA-DA models achieved excellent results in discriminating honey origin. Finally, a commonly observed tendency is the use of hybrid methods that combine multivariate data analysis and machine learning techniques, as each technique has its own strengths and weaknesses regarding uncovering and classifying patterns in honey data.


Decision boundary, decision boundary margins and SVM parameters
Relative importance of the chemical elements in ecstasy samples according to the F-score values computed
Establishing chemical profiling for ecstasy tablets based on trace element levels and support vector machine

August 2018

·

92 Reads

·

16 Citations

Neural Computing and Applications

Ecstasy is an amphetamine-type substance that belongs to a popular group of illicit drugs known as “club drugs” whose consumption is rising in Brazil. The effects of this substance on the human organism are mainly psychological, including hallucinations, euphoria and other stimulant effects. The distribution of this drug is illegal, and effective strategies are required to curb its growth. One possible way to obtain useful information on ecstasy trafficking routes, sources of supply, clandestine laboratories and synthetic protocols is through its chemical components. In this paper, we present a data mining and predictive analysis of ecstasy tablets seized in two cities of São Paulo state (Brazil), Campinas and Ribeirão Preto, based on their chemical profile. We use the concentrations of 25 elements determined in the ecstasy samples by ICP-MS as our descriptive variables. We develop classification models based on support vector machines capable of predicting in which of the two cities an arbitrary ecstasy sample was most likely to have been seized. Our best model achieved an 81.59% prediction accuracy. The F-score measure shows that Se, Mo and Mg are the most significant elements that differentiate the samples from the two cities, and these three elements alone yield the SVM model with the highest prediction accuracy.


Fig. 1. Map of Ceará state. The six cities of origin of the family farmers analyzed (Barbalha, Guaraciaba do Norte, Boa Viagem, Limoeiro do Norte, Itarema and Parambu) and the capital, Fortaleza, are highlighted. 
Table 2 Best features determined by the CFS method. 
Research on social data by means of cluster analysis

February 2018

·

1,313 Reads

·

52 Citations

Applied Computing and Informatics

This paper presents a data mining study and cluster analysis of social data obtained on small producers and family farmers from six country cities in Ceará state, northeast Brazil. The analyzed data involve demographic, economic, agricultural and food insecurity information. The goal of the study is to establish profiles for the small producer families that reside in the region and to identify relevant features which differentiate these profiles. Moreover, we provide an efficient data mining methodology for the analysis of social data sets which is capable of handling their natural challenges, such as mixed variables and an abundance of null values. We use the Silhouette method to estimate the best number of natural groups within the data, along with the Partitioning Around Medoids clustering algorithm to compute the profiles. The Correlation-Based Feature Selection method is used to identify which social criteria are the most important for differentiating the families of each profile. Classification models based on support vector machines, multilayer perceptron and decision trees were developed to predict which of the identified clusters an arbitrary family would best fit. We obtained a good separation of the families into two clusters, and a multilayer perceptron model with approximately 93.5% prediction accuracy.
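
The Silhouette method mentioned in this abstract scores a partition by comparing, for each item, its mean dissimilarity to its own group with its mean dissimilarity to the nearest other group. The sketch below is a minimal plain-Python version under stated assumptions: the data are invented one-dimensional points, the Manhattan distance stands in for whatever mixed-variable dissimilarity the study used, and the Partitioning Around Medoids step is not reproduced (the two candidate partitions are supplied by hand). The names `manhattan` and `mean_silhouette` are hypothetical.

```python
def manhattan(a, b):
    """Stand-in dissimilarity for numeric tuples (an assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_silhouette(points, labels, dist):
    """Average silhouette width of a partition (needs >= 2 clusters)."""
    scores = []
    for i, p in enumerate(points):
        own = [
            dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Invented data: two tight groups on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = mean_silhouette(points, [0, 0, 1, 1], manhattan)
bad = mean_silhouette(points, [0, 1, 0, 1], manhattan)
print(good, bad)
```

To estimate the number of groups, the same score would be computed for the partition obtained at each candidate number of clusters, keeping the value with the highest average.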


Summary of recent studies on discrimination of rice using multivariate data analysis and machine learning techniques. Aim of the study, properties analyzed and methods used are detailed.
Recent applications of multivariate data analysis methods in the authentication of rice and the most analyzed parameters: A review

January 2018

·

1,479 Reads

·

100 Citations

Rice is one of the most important staple foods around the world. Authentication of rice, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues, is one of the most addressed concerns in the present literature. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality and yield. This paper presents a review of recent research on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, and are widely used in combination with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques, such as support vector machines and artificial neural networks, also appear in some studies and show high performance for discrimination of rice.


Elemental fingerprint profiling with multivariate data analysis to classify organic chocolate samples

December 2017

·

125 Reads

·

12 Citations

Journal of Chemometrics

Chocolate is an appreciated food derived from cacao fruit. The flavonoids and minerals present in chocolate have health benefits, although some specific minerals are known to be toxic. Because of differences in their production systems, organic chocolate has a pattern of mineral concentrations distinguishable from that of conventional chocolate. Aiming for authenticity and study of the toxic elements in organic chocolate, we present in this work a simple method to classify organic chocolate samples based on elemental fingerprint profiling and multivariate data analysis. Thirty‐eight elements (toxic and essential) were determined in 36 chocolate samples (organic and conventional) by using inductively coupled plasma mass spectrometry to establish reference ranges and to identify differences in patterns of elements in both types of samples. Our results showed that Al, Zn, Mn, Cu, and Ba are the most commonly present components for both types of chocolate, and higher concentrations of essential elements Fe, Zn, and Mg are found in the conventional type, opposing the idea that organic food is rich in essential elements. Principal component analysis and linear discriminant analysis were used for multivariate data analysis, and sample differentiation was possible with 83% accuracy.

We present a method to differentiate organic from conventional chocolate based on elemental fingerprint profiling and multivariate data analysis. Thirty‐eight elements (toxic and essential) were determined in samples by inductively coupled plasma mass spectrometry. We found that Al, Zn, Mn, Cu, and Ba are the most commonly present elements for both types of chocolate, whereas higher concentrations of essential elements Fe, Zn, and Mg are found in the conventional samples.


Finding the Most Significant Elements for the Classification of Organic Orange Leaves: A Data Mining Approach

June 2017

·

333 Reads

·

6 Citations

Brazil is the world's largest producer of oranges. The Brazilian conventional citrus crop requires repeated application of agrochemicals to achieve satisfactory levels of productivity. Organic citriculture is an alternative production system, which is environmentally friendly and offers safe food to consumers. However, it is difficult to determine by simple observation whether a food or plant was cultivated in an organic or a conventional system, which leaves customers of the organic food market vulnerable to fraudulent entrepreneurs. In this study we present a data mining approach for the study of Brazilian organic citrus leaves which can aid in certifying the authenticity of the citrus leaves. The elemental composition is determined by inductively coupled plasma mass spectrometry (ICP-MS). We developed classification models based on support vector machines and artificial neural networks capable of predicting whether a citrus leaf is organic or conventional through analysis of the concentration levels of the fourteen chemical elements (Al, Ba, Co, Cr, Cs, Cu, Fe, Mg, Mn, Ni, Rb, Si, Sr and V) found in both types of leaves. Feature selection filter methods are used to determine the most relevant elements for the classification process. Our best model was a support vector machine with approximately 88% prediction accuracy. The elements Mn, Mg, and Rb were evaluated as the most significant for the classification decision. This is the first paper which addresses the problem of classification of organic orange leaves based on chemical composition. The presented methodology is useful for attesting the authenticity of organic citrus leaves and can be adapted for other organic foods or substances.


Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil

February 2017

·

70 Reads

·

7 Citations

Journal of Forensic Sciences

The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets seized by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with a C4.5 decision tree to help us interpret the clustering results. We found that two clusters best describe the data, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy under leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values in the classification of the samples.
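
Leave-one-out cross-validation, the evaluation scheme named in this abstract, holds out each sample once, trains on the rest and checks the prediction for the held-out sample. The sketch below shows only that loop. A nearest-centroid classifier is swapped in for the C4.5 decision tree to keep the example short, and the three-column data, the labels and the function names are invented for illustration.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not C4.5): predict the class whose mean
    vector is closest to x in squared Euclidean distance."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def leave_one_out_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the rest, count correct calls."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


# Invented concentrations of three elements for six tablets in two clusters.
xs = [
    (1.0, 2.0, 0.5), (1.2, 1.8, 0.4), (0.8, 2.1, 0.6),
    (4.0, 5.0, 2.0), (4.2, 5.1, 2.2), (3.9, 4.8, 1.9),
]
ys = ["cluster_1", "cluster_1", "cluster_1",
      "cluster_2", "cluster_2", "cluster_2"]
accuracy = leave_one_out_accuracy(xs, ys, nearest_centroid_predict)
print(accuracy)
```

Because every sample serves as the test case exactly once, this scheme suits small data sets, at the cost of training the model as many times as there are samples.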


Citations (11)


... Another reason is the complexity of elemental composition of the soils, wherein types and quantities of the elements strongly depend on climate of the particular region, types and composition of the local pollutions, human activities etc., thereby being a soilprint. Maione et al. (2021) employed correlation-based feature selection algorithm together with inductively coupled plasma-mass spectrometry to analyze the soil samples. Based on this analysis the authors further developed classification system based on support vector algorithm for the future establishing the profiles of the soil samples collected at the crime site, matching them to the already analyzed groups of the samples, and linking these profiles to the subjects that might be involved in the crime. ...

Reference:

Using Artificial Intelligence in the Forensic Science for the Analysis of Microparticles: A Systematic Review
A Cluster Analysis Methodology for the Categorization of Soil Samples for Forensic Sciences Based on Elemental Fingerprint

... https://doi.org/10.35685/revintera.v6i1.4041 (Maione et al., 2022). Its production stands out among vegetable growers owing to its ease of cultivation and wide acceptance in consumers' diets, which gives this vegetable crop considerable economic importance in all producing regions of the country Costa, 2008;Gusatti et al., 2019). ...

Determining the geographical origin of lettuce with data mining applied to micronutrients and soil properties

Scientia Agricola

... This nondestructive technology provides spectral data that swiftly reflect the chemical composition of samples, enabling rapid and straightforward classification of the geographical origin of samples when combined with a chemometric tool [12][13][14] . Furthermore, machine learning techniques have recently emerged as powerful chemometric tools for determining the geographical origin of various foods, including honey 13 , teas 14,15 , cocoa beans 16 , rice 17 , and sea cucumber 18 . Fortunately, machine learning techniques can directly interpret classification results without human intervention, and can efficiently deal with large amounts of complex data. ...

Predicting the botanical and geographical origin of honey with multivariate data analysis and machine learning techniques: A review
  • Citing Article
  • February 2019

Computers and Electronics in Agriculture

... Regarding chocolate, the usual method for sample preparation for multi-element determination is acid digestion in a closed system, assisted by microwave radiation, which could use highly concentrated acids and acid mixtures (Chekri et al. 2017;Hartwig et al. 2016;Junior et al. 2018;Lo Dicoa et al. 2017;Mrmolanin et al. 2018;Pedro et al. 2006;Peixoto et al. 2012). Although the use of a closed system causes the reduction of problems involved in contamination and losses of analytes, this method demands high amounts of acids that could cause interference and generate harmful waste. ...

Elemental fingerprint profiling with multivariate data analysis to classify organic chocolate samples
  • Citing Article
  • December 2017

Journal of Chemometrics

... This can result in an incomplete understanding, as they fail to capture the dynamic and interrelated elements of food security. Researchers and organizations have developed composite indices to enable a more comprehensive assessment of food security (Biederlack & Rivers, 2009;Maione et al., 2019;Mathenge et al., 2023;Reig, 2012;Santeramo, 2015;Vaitla et al., 2017;Wineman, 2016). ...

Research on social data by means of cluster analysis

Applied Computing and Informatics

... PLS-DA models are commonly employed to classify samples into predefined categories and predict the classification of unknown samples [45,46]. The k-NN algorithm, recognized for being lightweight, simple, and cost-effective, is particularly effective for small datasets and multi-class problems [47]. To develop the most robust model, the discriminative capabilities of all three methods were compared. ...

Recent applications of multivariate data analysis methods in the authentication of rice and the most analyzed parameters: A review

... During the past few decades, considerable efforts have been developed to create non-fungicidal methods to control postharvest decay in citrus (Fig. 3), which are considered to be safer for consumers, workers, and the environment (Maione et al., 2017). ...

Finding the Most Significant Elements for the Classification of Organic Orange Leaves: A Data Mining Approach

... A significant challenge in implementing clustering algorithms, such as Kmeans and its variants, is the necessity to pre-define the number of clusters as input. This study adopts an exploratory data analysis approach, which proves relevant when researchers require pre-defined models or hypotheses but aim to comprehend the general characteristics or structure of high-dimensional data (Maione et al., 2017). The primary objective is to comprehensively understand the data, and identifying the number of emerging clusters is one of the principal goals. ...

Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil
  • Citing Article
  • February 2017

Journal of Forensic Sciences

... A similar study was conducted on seizures from two cities in the same state in Brazil with the aim being to discriminate between seizures originating from both. Researchers analyzed 25 elements using ICP-MS and found significant differences in certain elements among the 25 [47]. To predict the samples' origin, a statistical classification model was created based on three elements: Se, Mo, and Mg. ...

Establishing chemical profiling for ecstasy tablets based on trace element levels and support vector machine

Neural Computing and Applications

... Overall, it is important to acknowledge that the use of decision trees as a machine learning technique for food authentication and traceability still remains relatively uncommon, with only a few prior studies focusing on plant-based foods and rarely measuring lanthanides (Maione et al., 2016;Sim et al., 2023;Vanderschueren et al., 2019). Consequently, making direct comparisons and drawing conclusions from the results in the context of existing literature poses significant challenges. ...

Classification of geographic origin of rice by data mining and inductively coupled plasma mass spectrometry
  • Citing Article
  • February 2016

Computers and Electronics in Agriculture