Article

Using machine learning regression models to predict the pellet quality of pelleted feeds

Abstract

Pelleted feeds are widely used in monogastric animal production systems because they not only improve animal performance (increasing digestibility and feed consumption) but are convenient to store and handle. However, pellet quality can be affected by many factors. While previous studies have reported the effect of a single or several factors on pellet quality, no studies have investigated how pellet quality can be affected by the large number of factors that vary during feed manufacturing. Therefore, the current study reports using machine learning regression models to predict pellet quality using commercial feed mill data. A dataset consisting of 2,471 observations describing the pellet manufacturing process, the feed formulation, and environmental conditions (e.g. outdoor temperature) was collected from two feed mill lines for 8 months. Sixteen features (13 continuous, 3 categorical) were used for building the regression models, and the output was the pellet durability index (PDI) of the pelleted feeds. Twelve regression algorithms including Linear Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO) regression, Ridge Regression (RR), Support Vector Regression (SVR), Linear Support Vector Regression (LSVR), Random Forest (RF), Decision Tree (DT), Gradient Boosting Regression (GBR), Adaptive Boosting Regression (ABR), Multi-Layer Perceptron (MLP) neural network, K-Nearest Neighbor (KNN), and Stacking Regression (SR) were examined in this study. Feature importance analysis using permutation importance was performed to identify which features were most relevant for each model. Average outdoor temperature, bakery byproduct and wheat inclusion levels, as well as production line, all had high permutation importance values, while the fat added into the mixer (with controls at the mill already in place to limit it) was less important than most features.
The cleaned dataset was preprocessed and then split into a training (80% of total samples, n = 1,147) and a testing (20% of total samples, n = 287) set. A 5-fold cross-validation process was applied and learning curves were used to verify the presence of overfitting for each algorithm before and after tuning the hyperparameters on the training set. The models that exhibited overfitting were excluded from the final results and only models with tuned hyperparameters were evaluated on the testing set. The SVR algorithm was selected as the best overall model for predicting PDI, as it had the lowest mean absolute and mean squared prediction errors (MAE = 3.280, MSPE = 16.192), and the second highest concordance correlation coefficient (CCC = 0.636). In conclusion, this study shows that feed mill features describing manufacturing parameters, feed formulation, and environmental data can be successfully used to build machine learning regression models for pellet quality prediction.
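The three evaluation metrics named in the abstract (MAE, MSPE, and CCC) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the PDI values below are hypothetical, and the CCC is written in Lin's standard form using population (divide-by-n) moments, which is an assumption about how it was computed in the study.

```python
from statistics import fmean


def mae(actual, predicted):
    """Mean absolute error between observed and predicted values."""
    return fmean([abs(a - p) for a, p in zip(actual, predicted)])


def mspe(actual, predicted):
    """Mean squared prediction error."""
    return fmean([(a - p) ** 2 for a, p in zip(actual, predicted)])


def ccc(actual, predicted):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(actual)
    mean_a, mean_p = fmean(actual), fmean(predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((a - mean_a) * (p - mean_p)
              for a, p in zip(actual, predicted)) / n
    # Penalises both poor correlation and systematic shift in mean or scale.
    return 2 * cov / (var_a + var_p + (mean_a - mean_p) ** 2)


# Hypothetical PDI values, made up purely for illustration.
observed_pdi = [90, 85, 80, 95]
predicted_pdi = [88, 86, 83, 94]
print(mae(observed_pdi, predicted_pdi))
print(mspe(observed_pdi, predicted_pdi))
print(ccc(observed_pdi, predicted_pdi))
```

Unlike MAE and MSPE, which are lower for better models, CCC is higher for better agreement and equals 1 only when predictions match observations exactly.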

... where y_i and ŷ_i are the actual and predicted response variable of soluble nitrogen, respectively, ȳ is the average value of actual soluble nitrogen and n is the number of datapoints in the test set. Next, the potential for overfitting and underfitting was assessed, where a model performs very well on training data but fails to generalize on external, or unseen data such as the test set in this study, or when the dataset is too small, so the model performs poorly on both training and test sets, respectively (You et al., 2022). A learning curve was used as a diagnostic tool to assess machine learning model performance (i.e., mean squared error) on both the training and test sets, while holding the test set fixed and gradually increasing the size of the training data (Giola et al., 2021; Hosseinzadeh, Zhou, Zyaie, et al., 2022). ...
... The gradient boosting and support vector regression algorithms were better able to predict soluble nitrogen compared to the random forest algorithm. While the training and test errors converged to a low value in learning curves plotted for the three algorithms (Fig. A10 and A11 for Mozzarella and Cheddar, respectively), signaling generalizable and well-fitted models that did not over- or underfit the data (You et al., 2022), a larger gap remained between the curves for the random forest algorithm (Fig. A10a and A11a for Mozzarella and Cheddar, respectively). This gap indicates that adding more than 100 datapoints into the training set did not further improve performance for the random forest algorithm. ...
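The learning-curve diagnostic described in these excerpts (fixed test set, growing training set, mean squared error on both) might be sketched as follows. The one-feature least-squares model and the synthetic data are assumptions chosen to keep the example self-contained; the cited studies used gradient boosting, support vector regression, and random forest models instead.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(model, xs, ys):
    """Mean squared error of a fitted (intercept, slope) model."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train, test, sizes):
    """Return (size, train_mse, test_mse) rows; the test set stays fixed."""
    test_x, test_y = zip(*test)
    rows = []
    for size in sizes:
        xs, ys = zip(*train[:size])      # growing prefix of the training data
        model = fit_line(xs, ys)
        rows.append((size, mse(model, xs, ys), mse(model, test_x, test_y)))
    return rows


# Synthetic data (hypothetical): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
data = [(x, 2.0 * x + 1.0 + rng.gauss(0, 1.0)) for x in xs]
train, test = data[:160], data[160:]
for size, train_err, test_err in learning_curve(train, test, [10, 40, 160]):
    print(f"n={size:4d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

Training and test errors that converge to a similar low value suggest a well-fitted model, while a persistent gap between the two curves points to overfitting.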
Article
Proteolysis is a complex biochemical event during cheese storage that affects both functionality and quality, yet there are few tools that can accurately predict proteolysis for Mozzarella and Cheddar cheese across a range of parameters and storage conditions. Machine learning models were developed with input features from the literature. A gradient boosting method outperformed random forest and support vector regression methods in predicting proteolysis for both Mozzarella (R2 = 92%) and Cheddar (R2 = 97%) cheese. Storage time was the most important input feature for both cheese types, followed by coagulating enzyme concentration and calcium content for Mozzarella cheese and fat or moisture content for Cheddar cheese. The ability to predict proteolysis could be useful for manufacturers, assisting in inventory management to ensure optimum Mozzarella functionality and Cheddar with a desired taste, flavor and texture; this approach may also be extended to other types of cheese.
... Such an examination of ML model behavior may lend insight into features of importance in a dataset that might not initially be intuitive. As an example from another field, permutation importance conducted by You et al. (2022) demonstrated the importance of regional outdoor temperature in prediction of pellet quality within a commercial feed mill. Such information may lead to further research on the effect of air quality being pulled into the mill to cool pellets being manufactured. ...
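Permutation importance, as referred to in this excerpt, measures how much a model's error grows when one feature's values are shuffled across rows. The sketch below is a minimal illustration under stated assumptions: the toy model, the two-feature rows, and all function names are hypothetical and are not taken from You et al. (2022).

```python
import random


def mean_squared_error(model, rows, targets):
    """MSE of a callable model over rows of feature tuples."""
    return sum((t - model(r)) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Average increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)        # break the feature-target link
            permuted = [r[:col] + (v,) + r[col + 1:]
                        for r, v in zip(rows, shuffled)]
            increases.append(
                mean_squared_error(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical model that relies on feature 0 and ignores feature 1."""
    return 3.0 * row[0] + 0.0 * row[1]


rng = random.Random(1)
rows = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(50)]
targets = [toy_model(r) for r in rows]
print(permutation_importance(toy_model, rows, targets))
```

A feature the model ignores gets an importance of zero, because shuffling it leaves every prediction unchanged.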
Article
Over the past decade, there has been considerable attention on mitigating enteric methane (CH4) emissions from ruminants through the utilization of antimethanogenic feed additives (AMFA). Administered in small quantities, these additives demonstrate potential for substantial reductions of methanogenesis. Mathematical models play a crucial role in comprehending and predicting the quantitative impact of AMFA on enteric CH4 emissions across diverse diets and production systems. This study provides a comprehensive overview of methodologies for modeling the impact of AMFA on enteric CH4 emissions in ruminants, culminating in a set of recommendations for modeling approaches to quantify the impact of AMFA on CH4 emissions. Key considerations encompass the type of models employed (i.e., empirical models including meta-analyses, machine learning models, and mechanistic models), the modeling objectives, data availability, modeling synergies and trade-offs associated with using AMFA, and model applications for enhanced understanding, prediction, and integration into higher levels of aggregation. Based on an evaluation of these critical aspects, a set of recommendations is presented concerning modeling approaches for quantifying the impact of AMFA on CH4 emissions and in support of farm-level, national, regional, and global inventories for accounting greenhouse gas emissions in ruminant production systems.
... Receiving feed materials, grinding, proportioning, mixing, conditioning, pellet conversion, cooling, and packaging are some of the steps involved in making pelleted feeds [14]. Because it keeps ingredients from being separated, the pellet manufacturing process improves bulk density, handling qualities, and nutritional value in animal feeds [15]. ...
Article
Full-text available
This study is to determine the characteristics of the palm kernel cake (PKC) and Indigofera zollingeriana and its suitability as the ingredients for animal feed. This study focuses on the chicken feed formulation optimization with mixture of the solid waste and the Indigofera zollingeriana. This study also investigates the effect of the formulated chicken feed towards the growth of the chicken within the experimental duration. The formulated chicken feed with palm kernel cake and Indigofera zollingeriana have the nutrients needed for the chicken to grow healthily and this shows that the agro-industrial waste can substitute the corn as the main ingredient in chicken feed.
... In a recent study (Ittiphalin et al. 2017), a model was constructed based on local linear map and artificial neural networks to identify the amount of fat that should go into the mixer for each batch of pelleted feed. Another study from our research group recently reported using ML models to predict pellet quality of pelleted feeds within commercial mills for the first time (You et al. 2022). Pellet quality is important for pelleted feeds and can be affected by factors including manufacturing parameters, feed formulation, and environment. ...
... Pelleting is a common procedure in the feed industry (Bastiaansen et al., 2023; You et al., 2022) and biomass energy, and the core is that the powders are compacted into solid pellets by the ring die and press roll (Holm et al., 2006). However, the quality of the pellet is equally affected by processing conditions (e.g. ...
Article
The discrete element method (DEM) has demonstrated significant advantages in modelling feed-tool interaction. The selection of an appropriate contact model and its parameter calibration is crucial for developing effective DEM simulations. A DEM model of feed powder was developed to forecast powder compression and densification in feed pelleting by using the Edinburgh elasto-plastic adhesion contact model. Firstly, a calibration method of DEM parameters based on the Plackett–Burman (PB) and the central composite (CC) test was proposed to enhance calibration efficiency. Both experimentally and through DEM simulation of the uniaxial confined compression experiments, powder repose angle experiments, and powder static friction angle experiments were used to calibrate the DEM parameters. Extract the DEM parameters that have a significant impact on feed pelleting through PB testing. Then, the effects of these parameters on response were analysed using CC testing, and ascertain their optimal values. Finally, the single-hole open compression experiment and feed pelleting experiment were used to evaluate the accuracy of the parameters. The results show that the average relative error between the predicted ultimate compressive force in the open compression experiment and the measured value is 8.16%, and that of the torque of the pellet mill is 2.10%. Thus, it was shown that the modelling and calibration method can effectively predict the compression and densification of powder in the feed pelleting.
... The development of modern software for studying and summarizing the properties of regression statistics [1][2][3] makes it possible to process large volumes of combinatorially combined features in quantitative and categorical data. This raises the problem of assigning meaning to these combinations, so that the goals of data analysis and processing are not undermined by a contradictory set of machine-generated results. ...
Article
Full-text available
Aim to statistically determine the distribution of chronic diseases in the chronology of observations; to show the specifics of methods for testing hypotheses in the quantitative and probabilistic prediction of the prevalence of polypous rhinosinusitis. Material and methods. The outpatient data for the period of 2017–2021 and quantitative information about the cases with polypous rhinosinusitis as main or concomitant diagnosis registered by medical organizations of 25 districts of the Samara region were used in the study. Results. The synthesis of the initial data statistics, which amounted to the volume of the numerical expansion of primary indicators in the following ratio: categories 15.8%, counting data 26.3%; quantitative values 21.1%; 26.7% relative incidence and prevalence data. The rest of the data is the descriptive statistics and indicators in the form of tables of correlation coefficients. For extensions of the synthesized data, distributions were evaluated and hypotheses tested using statistical criteria. Conclusion. The count of the number of chronic diseases is approximated by the density of atypical distributions. Approximately 58% of samples for diagnoses are not confirmed as obeying the law of distribution. In such a situation, when preparing a forecast for the transition to a time series, it is necessary to solve the problem of obtaining sequences with stationary characteristics. In machine learning, data in predictive calculations must be checked for probabilistic confirmation of the coincidence of related sample parameter distributions. The results of the forecast should be taken as a probabilistic conclusion at the level of an unrejected hypothesis.
... In the food material industry, 3D printing technology is used, with random forest (RF) integrated to improve performance. In the food processing industry, RF and gradient boosting regression (GBR) algorithms are used to optimize pellet quality [43]. In addition to the fields mentioned above, in the paper industry [44], chemical industry [45], pharmaceutical manufacturing [46], plastic products [47], ferrous metal [48], non-ferrous metal [49], automobile manufacturing [50], and transportation [51], methods based on EL innovation, data innovation, and domain innovation have been used to enhance and improve existing manufacturing processes and obtain the best benefits. Table 4 shows that the application of EL has grown by leaps and bounds. ...
Article
This study proposes a systematic review of the application of Ensemble learning (EL) in multiple industries. This study aims to review prevailing applications in multiple industries to guide future practical application. This study also proposes a research method based on Systematic Literature Review (SLR) to address EL literature and help advance our understanding of EL for future optimization. The literature is divided into three categories by the National Bureau of Statistics of China (NBSC): the primary industry, the secondary industry and the tertiary industry. Among existing problems in industrial management systems, the most frequently discussed are quality control, prediction, detection, efficiency and satisfaction. In addition, given the huge potential in various fields, the gaps and further directions are also suggested. This study is essential for industry managers and cross-disciplinary scholars as a guideline for solving issues in practical work, as it provides a panorama of application domains and current problems. This is the first review of the application of EL in multiple industries in the literature. The paper has potential value in broadening the application area of EL and in proposing a novel research method based on SLR to sort out the literature.
Article
Consumer decision-making varies according to an individual's relationship with the recipient of the gift. This study used a mock purchase task to investigate consumer decision-making and its underlying neurological mechanisms when purchasing gifts of different prices for recipients with varying levels of intimacy. Functional near-infrared spectroscopy was used to record neural activity during the task. Behavioral results found that the lover group had a much higher purchasing rate than the friend group, particularly when acquiring premium products. Analysis of the functional near-infrared spectroscopy data found that neural activity in the dorsolateral prefrontal cortex and orbitofrontal cortex decreased when items were discounted, with lower activation in the dorsolateral prefrontal cortex in lovers during the purchasing of premium products. Furthermore, we identified significant differences in functional connectivity between the dorsolateral prefrontal cortex and orbitofrontal cortex under different conditions. We compared the support vector machine algorithm and logistic regression, finding that logistic regression better predicts purchasing tendencies based on neuroactivation levels. In our view, a stronger emotional connection leads to a more rewarding experience for consumers when buying premium products. This study reveals the impact of intimate relationships on consumer decision-making and provides guidance for businesses in developing marketing strategies targeted at the lover's market.
Article
Pellet quality, measured as Pellet Durability Index (PDI), is an important key performance indicator (KPI) for commercial feed manufacturing, as it can impact both mill efficiency and downstream performance of animals fed the manufactured diets. However, it is an ongoing challenge for the feed industry to control pellet quality, due to the complexity of feed manufacturing and the large number of variables influencing the process. Previous studies have explored prediction of pellet quality using either simple empirical models with a few variables or machine learning models with many variables. The objective of the current study was to develop statistical regression models to predict PDI, and to describe the relationship between pellet quality and 55 available variables based on a dataset with 2691 observations collected from a commercial feed mill. In the current study, the response variable (PDI) was transformed using the Box-Cox approach into the transformed response variable (tPDI), which was more normally distributed. Three multiple regression models were developed based on subsets of variables processed by variable selection and dimensionality reduction methods: Forward Selection, Principal Component Analysis, and Partial Least Squares. The results indicated that Model 1 (Forward Selection with manual removal of sparse variables), built on 9 variables, performed better than the other two models. It exhibited consistent model prediction performance on the training data and testing data, in terms of MAE (1.93 ± 0.063 versus 1.96), RMSPE (2.45 ± 0.079 versus 2.45), and CCC (0.549 ± 0.0273 versus 0.550), with a better prediction precision based on the fit plot. Expanding Temperature (℃), Fat Content (%), ADF Content (%), and Indoor Humidity (Pelletizer) (%) were identified as more influential than other variables on the transformed response variable (tPDI) in Model 1, based on a behavior analysis.
The models developed in the current study can be helpful to feed mills for predicting and comprehending the effect of a number of commonly measured variables on pellet quality in the commercial setting.
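The Box-Cox transformation mentioned in this abstract can be sketched in a few lines. The abstract does not report the λ value used, so the λ and the PDI value below are hypothetical placeholders; in practice λ is usually estimated from the data, for example by maximum likelihood.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)               # limiting case as lam -> 0
    return (y ** lam - 1) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1) ** (1 / lam)


# Hypothetical PDI value and lambda, for illustration only.
pdi, lam = 90.0, 2.0
t_pdi = box_cox(pdi, lam)
print(t_pdi, inverse_box_cox(t_pdi, lam))
```

Predictions made on the transformed scale need the inverse transform before they can be compared with observed PDI values.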
Article
Densification is the most effective forming technology, the production of which, pellets are the most intensive energy content of biomass solid fuels. Increasing the overall utilization efficiency of bioenergy, extending available types of wastes and residues for pelleting, and improving the pellet quality comprehensively are necessary. This paper reviewed feedstocks used for pelleting, binding mechanism of biomass pellets, and evaluation systems of pellet qualities. To extend the supply amounts and improve the qualities of raw biomass, various feedstocks and formulation recipes were classified based on their regions, sources, and species. And applicable objects and conditions of several different pretreatment technologies for feedstocks were compared. Physical properties and biochemical analyses are also concluded to evaluate the flowability, compactibility, and potential characteristics as solid fuel. For a better understanding of energy consumption and conversion during the densification process, pelleting devices and binding mechanism are elaborated. Associations and differences between single pelleting press in lab-scale and industrial pelleting in pilot-scale are also discussed. Suggestions are proposed for the optimization design of pelleting devices and processes for different feedstock densification. Lastly, relationships between operating parameter factors of pelleting process and pellet quality are presented. This review could provide theoretical analyses for powder compression, new insights for waste to energy, and guidance for achieving global peak carbon and environmental strategic objectives.
Article
Objectives This study aimed to analyze the effects of the use of binders on the physical quality and digestibility of Alabio ducks (Anas platyrinchos Borneo). Materials and Methods Pellet binders used tapioca meal (TM) (Manihot utilissima), sago meal (SM) (Metroxylon sagu Rottb.), and sweet potato meal (SPM) (Ipomoea batatas) pelleted feed. Laying Alabio ducks, around 120 birds, aged 20 weeks with an average body weight of 1,426 ± 113.5 gm, were used. A fully randomized design with 4 treatments and 15 repeats was used in this study. The variables measured include the physical quality and digestibility of pellet feed. Data analysis used a Fisher test. For the distinction between treatments, the Duncan multiple-range test was conducted. Results The finding showed that the plant-based pellet binder had a natural effect on physical properties, including pellet durability index, moisture content, threshold power, stack density, and stack compacted density. The strength of the pellet binder is seen in the durability index of TM 98.12%, SM 97.64%, and SPM 97.35%, respectively. However, these variables did not differ significantly in terms of specific gravity and stack angle. Pellet binders considerably affect the consumption of feed and vary markedly in dry matter, organic matter, and metabolizable energy digestibility. Conclusion Plant-based pellet binders influence the physical quality and digestibility of pelleted feed in Alabio ducks. TM can maintain physical quality and digestibility compared to SM and SPM as plant-based pellet binders.
Article
The bitterness of soy protein isolate hydrolysates prepared using five proteases at varying degree of hydrolysis (DH) and its relation to physicochemical properties, i.e., surface hydrophobicity (H0), relative hydrophobicity (RH), and molecular weight (MW), were studied and developed for predictive modelling using machine learning. Bitter scores were collected from sensory analysis and assigned as the target, while physicochemical properties were assigned as features. The modelling involved data pre-processing with local outlier factor; model development with support vector machine, linear regression, adaptive boosting and K-nearest neighbors algorithms; and performance evaluation by 10-fold stratified cross-validation. The results indicated that alcalase hydrolysates were the most bitter, followed by protamex, flavorzyme, papain, and bromelain. Distinctive correlation results were found among the physicochemical properties, influenced by the disparity of each protease. Among the features, the combination of RH-MW fitted various classification models and resulted in the best prediction performance.
Article
Full-text available
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves [1] and in the first imaging of a black hole [2]. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
Article
Full-text available
SciPy is an open-source scientific computing library for the Python programming language. Since its initial release in 2001, SciPy has become a de facto standard for leveraging scientific algorithms in Python, with over 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories and millions of downloads per year. In this work, we provide an overview of the capabilities and development practices of SciPy 1.0 and highlight some recent technical developments. This Perspective describes the development and capabilities of SciPy 1.0, an open source scientific computing library for the Python programming language.
Article
Full-text available
Feed manufacturing is an integral component of the poultry industry. A study was conducted to evaluate the impact of feed ingredients, conditioning temperature, and AZOMITE (AZ) level on production rate and pellet quality measured by pellet durability index (PDI). A 2 × 2 × 2 × 3 factorial arrangement within the randomized complete block design was used, for a total of 24 treatments. A broiler grower diet was tested under two conditioning temperatures (82.2°C and 87.8°C), two distillers dried grains with solubles (DDGS) levels (0 and 8%), two meat and bone meal (MBM) levels (0 and 4%), and three levels of AZ (0, 0.25, 0.50%). Conditioning temperature, DDGS, MBM, and AZ levels all influenced production rate. Interactions between the different factors were assessed and interpreted. These interactions indicate that the inclusion of AZ increases production rate in diets containing DDGS. Both DDGS and MBM decreased production rate compared with relative controls, whereas AZ at 0.25% and 0.50% increased the production rate. Increasing conditioning temperature from 82.2°C to 87.8°C improved production rate and had a positive influence on PDI. The level of MBM and AZ did not impact PDI; however, the inclusion of DDGS had a negative impact on PDI. Although AZ did not increase throughput and PDI simultaneously, its ability to improve pellet production while maintaining PDI indicates usefulness in feed manufacturing. These results indicated that temperature and AZ can be used to offset some of the negative effects of DDGS on pelleting production rate.
Article
Full-text available
This paper presents an experimental design approach for the optimization process parameters for the chicken feed pellets. To achieve this goal, the speed parameters and number of grinding wheels are selected and two levels of this parameter are considered. Design of expert (DOE) of tests was used for experimental design and analysis of results. The chicken feed ingredients mixture is formed into pellets using a grinding wheel pellet machine with a variation of 150, 200 and 250 rpm rotation and variations in the number of grinding wheels 4, 6, and 8 pieces. The highest pellet production capacity (26.2 Kg/hour) occurs at 200 rpm rotating speed and 4 pieces of grinding wheels. The highest engine efficiency (87.6%) occurs at 200 rpm rotating speed and 4 pieces of grinding wheels. The highest pellet durability (91.6%) occurs at 200 rpm rotating speed and 4 pieces of grinding wheels. Optimal machining parameters are recommended to produce a pellet production capacity response of 25.6 Kg/hour, 85.53% efficiency machine, and 91.473% durability pellets are 150-225 rpm rotation speed range and 4-5 pieces grinding wheels.
Article
Full-text available
Pelleting is the most common heat processing method used in the poultry feed industry and the quality of feed processing directly impacts the efficiency of the utilization of feedstuffs by broilers, and consequently, their performance. The objective of this experiment was to evaluate the influence of conditioning temperature on pellet physical quality, and on the apparent ileal digestibility coefficients of dry matter (CIADDM), crude protein (CIADCP) and starch (CIADstarch), coefficient of apparent metabolizability of dry matter (CAMDM), and the apparent metabolizable energy content (AME) of the diets. The live performance of broilers (feed intake, FI; weight gain, WG; and feed conversion ratio, FCR) was evaluated. Treatments consisted of a mash diet and pelleted/crumbled corn-soybean meal based diets submitted to different conditioning temperatures (no conditioning or conditioned at 60, 70, 80, and 90°C). Feed was steam-conditioned for 15 s at a 1.5 kgf/cm² for all pelleted treatments. Pellet quality was determined as a function of Pellet Durability Index (PDI), percentage of fines, and pellet hardness. Conditioning temperature linearly increased (P < 0.05) PDI and pellet hardness, CIADDM, CIADCP, and CIADstarch, had quadratic effect on AME and CAMDM (P < 0.05). Broiler FI was not affected by the different conditioning temperatures, but WG and FCR presented a quadratic behavior (P < 0.05). Overall, current results suggest that the increase on conditioning temperature is an important tool to improve physical quality of pelleted diets as well as protein and starch availability. However, high conditioning temperatures may reduce broiler performance.
Article
Full-text available
This experiment was conducted to study the effect of different feeding programs and pelleting on performance, nutrient digestibility, ileal digestible energy (IDE); and carcass yield of broilers from 21 to 35 d of age. In total, 768 male broilers were distributed according to a completely randomized design with 6 treatments and 8 replicates of 16 birds each. The treatments were mash and pelleted diets provided ad libitum, or pelleted and supplied at the same rate (100%) or restricted at 95, 90, and 85% (P100, P95, P90, and P85) of the amount consumed by the birds fed mash diet ad libitum. When supplied ad libitum, the pelleted diet had the highest feed intake and weight gain (WG), better feed conversion ratio (FCR), better feed conversion adjusted for 2.3 kg (AdjFCR, P < 0.001) and caloric conversion (P < 0.001); and higher amount of abdominal fat (P < 0.001) when compared to the control (mash ad libitum). However, there were no effects on nutrient digestibility (P > 0.05). When the pelleted feed was provided in the same amount as in the control group, there were no differences in any of the evaluated parameters (P > 0.05). Limiting pelleted diet to 95, 90, and 85% of free choice mash diet resulted in lower WG (P < 0.001). P90 and P95 treatments resulted in higher dry matter and crude protein digestibility and IDE in relation to the others (P < 0.001). Carcass yield was reduced (P < 0.05) in the birds fed P85 diet. The regression analysis between P100, P95, P90, and P85 showed a linear reduction in WG when restriction was increased (P < 0.01); however, there was a linear increase in the nutrient digestibility (P < 0.001). It is concluded that pelleting improves broiler performance, but these results depend on feed intake. The higher intake provided by pelleting can increase the amount of abdominal fat. Feed intake reduction can result in lower performance and lower carcass and cuts yield in broilers.
Preprint
Full-text available
Over the past decades, researchers and ML practitioners have come up with better and better ways to build, understand and improve the quality of ML models, but mostly under the key assumption that the training data is distributed identically to the testing data. In many real-world applications, however, some potential training examples are unknown to the modeler, due to sample selection bias or, more generally, covariate shift, i.e., a distribution shift between the training and deployment stage. The resulting discrepancy between training and testing distributions leads to poor generalization performance of the ML model and hence biased predictions. We provide novel algorithms that estimate the number and properties of these unknown training examples---unknown unknowns. This information can then be used to correct the training set, prior to seeing any test data. The key idea is to combine species-estimation techniques with data-driven methods for estimating the feature values for the unknown unknowns. Experiments on a variety of ML models and datasets indicate that taking the unknown examples into account can yield a more robust ML model that generalizes better.
Article
Full-text available
Statistics draws population inferences from a sample and machine learning finds generalizable predictive patterns.
Article
Full-text available
This paper is devoted to the comparison of Ridge and LASSO estimators. Test data is used to analyze advantages of each of the two regression analysis methods. All the required calculations are performed using the R software for statistical computing.
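The contrast between the two estimators can be illustrated with a minimal sketch. It is not the paper's R code: it assumes a single centered predictor with no intercept, for which both penalized fits have closed forms, and the function names and toy data are hypothetical.

```python
import math


def ridge_coef(x, y, lam):
    """Minimizer of 0.5*sum((y - b*x)**2) + 0.5*lam*b**2 (one centered predictor)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def lasso_coef(x, y, lam):
    """Minimizer of 0.5*sum((y - b*x)**2) + lam*abs(b) (one centered predictor)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # Soft-thresholding: shrink |sxy| by lam and clip at zero.
    shrunk = max(abs(sxy) - lam, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Toy data (made up for illustration): y = 0.5 * x exactly.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-1.0, -0.5, 0.0, 0.5, 1.0]

for lam in (0.0, 2.0, 10.0):
    print(lam, ridge_coef(x, y, lam), lasso_coef(x, y, lam))
```

In this toy case the ridge coefficient shrinks toward zero as the penalty grows but never reaches it, while the LASSO coefficient becomes exactly zero once the penalty exceeds the absolute cross-product of x and y.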
Article
Full-text available
An experiment was conducted to investigate the effects of the feed form and conditioning time of pelleted diets on pellet quality, broiler performance and nutrient digestibility during the starter phase. A total of 480 male Cobb broilers were distributed according to a completely randomized experimental design into six treatments with eight replicates each. Treatments consisted of a mash diet and five crumbled diets submitted to different conditioning times (zero, 60, 80, 100, or 120 seconds). The broilers fed pelleted diets submitted to steam conditioning presented higher feed intake and BW gain (P ≤ 0.05), higher coefficient of ileal apparent digestibility (CIAD) of DM and CP, as well as higher ileal digestible energy (IDE) (P ≤ 0.05) than those fed the mash diet. However, treatments did not influence FCR or starch digestibility (P > 0.05). Feed intake increased linearly (P ≤ 0.05) with conditioning time while a quadratic response (P ≤ 0.05) was noted for IDE. Conditioning time did not affect the amount of intact pellets or protein solubility (P > 0.05), but increased pellet durability index (P ≤ 0.01), pellet hardness (P ≤ 0.05), and water activity (P ≤ 0.05). It was concluded that feed physical form and conditioning time influence the performance and nutrient digestibility in starter broilers, and that increasing conditioning times promote better pellet quality.
Article
Full-text available
Pelleting is the most popular thermal processing technique in the poultry industry. Birds fed pelleted diets have greater feed intake and weight gain, and better feed conversion ratio. However, this better performance can only be achieved if the pellets remain intact until they are ingested by the birds. Many factors may affect pellet physical quality, such as feed nutritional composition, ingredient particle size, conditioning temperature and time, feed moisture, etc. Despite their importance, sometimes these factors are not managed properly; therefore, pelleted feed may not contain a high amount of intact pellets. In addition, the possible interactions among these variables may yield different responses in comparison with those expected when individual factors are considered. Very few experiments have been conducted to evaluate the impact of combined factors on pellet quality. This may be explained by the presence of many qualitative and quantitative factors in the manufacturing process. Research indicates that heat processing and feed formulation, especially fat inclusion level, are the factors which have the biggest influence on pellet quality. Strategies, such as the expansion process and fat inclusion restriction or post pellet liquid fat application could be implemented to produce high physical quality pellets. More research is needed to identify which factors have a positive or negative effect on pelleting process and to find new strategies to improve pellet physical quality.
Article
Full-text available
An experiment was carried out to evaluate the effect of the inclusion of 20% whole-grain or ground pearl millet (PM) in mash and pelleted diets on the performance, carcass traits, and organ weights of broilers reared until 21 days of age. A randomized block experimental design in a 3 x 2 factorial arrangement (diets containing corn and soybean meal, whole-grain PM, or ground PM x mash or pelleted diets), with five replicates per treatment and 10 birds per experimental unit, was applied. Diets were analyzed for mean geometric diameter, geometric standard deviation, pellet hardness, and density. Broiler performance, carcass yield, and organ weights were evaluated. On day 21, one bird with the average weight of each experimental unit was sacrificed for carcass evaluation. It was concluded that both as whole-grain and ground PM can be added to the diet of broilers up to 21 days of age. The dietary inclusion of PM results in higher abdominal fat deposition. Broilers fed the pelleted diets presented lower feed intake, better feed conversion ratio, lower gizzard and heart percentages, and higher carcass weight.
Article
Full-text available
Pelleting is the most prevalent heat treatment in the production of poultry feed. The objective of pelleting is to agglomerate smaller feed particles into larger particles as pellets to enhance the economics of production by increasing the feed intake, and thus growth performance and feed efficiency. However, due to the heat, moisture and mechanical pressure applied during conditioning and pelleting, some chemical and physical alterations occur that may have beneficial or detrimental effects on feed components, gastrointestinal development and subsequent bird performance. Pelleting process has been shown to gelatinise starch, but only to a small extent, and thus may be of modest relevance in starch digestion. Pelleting process may also result in partial denaturation of proteins; a process which can potentially improve protein and to some extent starch digestibility due to inactivation of proteinaceous enzyme inhibitors. Cell wall breakage, as a result of the physical stress of pelleting, may also provide greater accessibility of nutrient contents, previously encapsulated within endosperm sub-aleurone, to digestive enzymes. In diets based on viscous cereals, nutrient availability may be negatively affected through increased digesta viscosity as a result of either an increase in soluble carbohydrate concentration or changes in the molecular weight of soluble fibres or both, due to pelleting. Pelleting process also remains a potentially aggressive process on the stability of exogenous feed enzymes and vitamins, a major concern of feed manufacturers. Particle size-reducing property of the pelleting process may result in a suboptimal gizzard development and thus reduced nutrient digestibility of diets for poultry. 
While physical pellet quality is a critical factor to optimise feed efficiency and growth response of broilers, the present review highlights that it is the balance between nutrient availability and physical quality of pellets which is critical in determining the actual performance of broilers. Under the conventional pelleting process, good pellet quality is usually obtained at the expense of nutritional quality. Research is warranted to identify and evaluate possible strategies to manufacture highly digestible high quality pellets. Such strategies will require novel approaches of improving feed hygiene which are not detrimental to feed nutrients.
Article
Full-text available
Least-cost diet formulations and pellet mill operating techniques vary widely. As a result, pellet quality is often inconsistent. Past research has associated pellet quality changes with feed formulation and manufacturing techniques. However, the interaction between the 2 factors has rarely been explored. The objective of the current study was to evaluate the effects of altering a least-cost diet (LC) formulation and altering manufacturing techniques on pellet processing variables and quality. Generally, pellet quality improves with higher levels of protein and moisture. Therefore, increased levels of CP and moisture were added to LC broiler starter and grower formulations to compose a research-based (RB) formulation. The LC and RB formulations were pelleted using 2 manufacturing techniques, a thin die with a fast production rate (TF) or a thick die with a slow production rate (TS). During manufacture of the starter diets, the RB formulation improved the pellet durability index (PDI) and modified PDI (MPDI) while decreasing pellet mill relative electrical energy usage (P = 0.05) compared with the LC formulation. The TS technique increased PDI and MPDI while decreasing production of fines (P = 0.05) compared with the TF technique. During manufacture of the grower diets, the RB formulation and TS technique resulted in decreased production of fines (P = 0.05) compared with the LC formulation and TF technique. A significant interaction observed for PDI and MPDI of the grower diets indicated that the RB formulation improved pellet quality and would be even more beneficial if a mill used a TF technique (P = 0.05). We conclude that diet formulation and manufacturing technique are, in fact, linked and must be considered when attempting to optimize pellet quality.
Article
Full-text available
This paper reviews the use of evolutionary algorithms (EAs) to optimize artificial neural networks (ANNs). First, we briefly introduce the basic principles of artificial neural networks and evolutionary algorithms and, by analyzing the advantages and disadvantages of EAs and ANNs, explain the advantages of using EAs to optimize ANNs. We then provide a brief survey on the basic theories and algorithms for optimizing the weights, optimizing the network architecture and optimizing the learning rules, and discuss recent research from these three aspects. Finally, we speculate on new trends in the development of this area.
Article
Full-text available
The majority of broiler feed is in pelleted form. Feeding pelleted diets results in improved BW gain and FE compared with feeding mash diets. However, improvements in performance are contingent on pellet quality. Past research has focused on methods to improve pellet quality without negatively affecting processing variables and performance. The objective of these studies was to evaluate the effects and interactions that occur when small inclusion amounts of fiber, protein, and moisture were formulated into corn- and soybean-based broiler diets. Two experiments were conducted. In experiment 1, small inclusion amounts (5%) of fiber (in the form of cellulose) or protein (in the form of soy protein isolate) improved the pellet durability index and the modified pellet durability index compared with the control (P ≤ 0.05). Moreover, manufacturing variables were not negatively affected (P > 0.05). In experiment 2, small inclusion amounts of protein (2%), in the form of soybean meal, and moisture (2 and 4%), in the form of tap water, improved the pellet durability index and the modified pellet durability index (P ≤ 0.05). Small inclusion amounts of fiber, in the form of oat hulls, negatively affected pellet quality and manufacturing variables. Ingredient interactions were not observed for manufacturing variables or pellet quality. These results demonstrate that small inclusion amounts of supplemental fiber (cellulose), protein (soy protein isolate or soybean meal), and moisture (tap water) can be used to ameliorate poor pellet quality.
Article
Full-text available
Feed currently constitutes 60 to 65% of the total cost of broiler production. Pellets are the primary feed form for commercially reared broilers. Understanding how to optimize pellet quality through precision thermo-mechanical processing may impact broiler performance and nutrient availability, and thus cost of production. Steam conditioning represents a manipulable thermo-mechanical processing variable. Corn-soybean meal-based diets were conditioned with 1 of 4 steam pressure × temperature combinations: 138 kPa at 82.2°C, 138 kPa at 93.3°C, 552 kPa at 82.2°C, or 552 kPa at 93.3°C. High steam pressure and temperature conditioning were shown to increase pellet quality. Three additional diets were prepared: unprocessed mash, the 552 kPa/93.3°C diet reground to mash, and a 50:50 combination of pellets and mash produced from the 552 kPa/93.3°C treatment to simulate a high fine percentage diet. All diets were fed to Cobb 500 broilers from 21 to 38 d, and nutrient availability was determined with Single Comb White Leghorn roosters. Broilers fed pellets conditioned with high steam temperature demonstrated decreased feed intake and feed conversion ratio. Amino acid and energy availability were not affected by variations in steam conditioning.
Article
Full-text available
Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm.
In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes that are very common in bioinformatics.
Article
Full-text available
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.
Article
Full-text available
This experiment was conducted to investigate the effect of the form of diets with different levels of protein and energy on broilers performance at the end of the third week. A total of 2800 male broiler chicks were fed with two forms of diet (mash and crumble-pellet), two levels of protein (23% and 21% CP), and two levels of energy (3200 and 3000 Kcal/Kg ME) from 1 to 21 days of age. The bodyweight (BW) and Feed conversion rate (FCR) were affected by the form of diet with the crumble-pellet form being better (P < .001). The diet with high protein significantly increased BW and decreased FCR (P < .001). The different levels of energy did not affect FCR and BW in crumble-pellet diet but showed a significant effect on them in mash diet (P < .05). There were no significant interactions for any of the parameters tested except for interactions between energy and feed form. BW and FCR were improved by energy when diets were fed in the mash form (unlike the crumble-pellet form) at all ages. It is concluded that feeding crumble-pellets from 1 to 21 days of age improved BW and FCR and that an increase in the protein (unlike energy) content of the diet increased the performance of the chickens at the end of the third week.
Article
Full-text available
In life sciences, interpretability of machine learning models is as important as their prediction accuracy. Linear models are probably the most frequently used methods for assessing feature relevance, despite their relative inflexibility. However, in the past years effective estimators of feature relevance have been derived for highly complex or non-parametric models such as support vector machines and RandomForest (RF) models. Recently, it has been observed that RF models are biased in such a way that categorical variables with a large number of categories are preferred. In this work, we introduce a heuristic for normalizing feature importance measures that can correct the feature importance bias. The method is based on repeated permutations of the outcome vector for estimating the distribution of measured importance for each variable in a non-informative setting. The P-value of the observed importance provides a corrected measure of feature importance. We apply our method to simulated data and demonstrate that (i) non-informative predictors do not receive significant P-values, (ii) informative variables can successfully be recovered among non-informative variables and (iii) P-values computed with permutation importance (PIMP) are very helpful for deciding the significance of variables, and therefore improve model interpretability. Furthermore, PIMP was used to correct RF-based importance measures for two real-world case studies. We propose an improved RF model that uses the significant variables with respect to the PIMP measure and show that its prediction accuracy is superior to that of other existing models. R code for the method presented in this article is available at http://www.mpi-inf.mpg.de/~altmann/download/PIMP.R. Contact: altmann@mpi-inf.mpg.de, laura.tolosi@mpi-inf.mpg.de. Supplementary data are available at Bioinformatics online.
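The permutation idea described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' R implementation: it assumes the absolute Pearson correlation as a stand-in for the RF importance measure, uses a plain empirical P-value rather than any fitted null distribution, and all names and the synthetic data are hypothetical.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def pimp_p_values(features, y, n_perm=200, seed=0):
    """Empirical P-value of each feature's importance under a permuted outcome."""
    rng = random.Random(seed)
    observed = {name: abs_corr(col, y) for name, col in features.items()}
    exceed = {name: 0 for name in features}
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any link between the features and the outcome
        for name, col in features.items():
            if abs_corr(col, y_perm) >= observed[name]:
                exceed[name] += 1
    # Add-one correction keeps the estimate strictly above zero.
    return {name: (exceed[name] + 1) / (n_perm + 1) for name in features}


# Synthetic data (made up): one informative and one non-informative predictor.
data_rng = random.Random(1)
informative = [data_rng.gauss(0, 1) for _ in range(60)]
noise = [data_rng.gauss(0, 1) for _ in range(60)]
outcome = [2 * a + data_rng.gauss(0, 0.5) for a in informative]

p_values = pimp_p_values({"informative": informative, "noise": noise}, outcome)
print(p_values)
```

Swapping `abs_corr` for a model-based importance score would bring the sketch closer to the setting in the abstract, at the cost of refitting the model for every permutation.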
Article
Full-text available
Rations containing varying ratios of corn, high-oil corn, soybean meal, and mechanically expelled soybean meal were pelleted. The effects of ingredients, conditioning steam pressure, and mixing paddle configuration inside the conditioner on pellet quality were investigated. Ration ingredients strongly affected pellet quality. Increasing the protein content increased the pellet durability, whereas increasing the oil content above 7.5% greatly decreased pellet durability. High-oil corn and mechanically expelled soybean meal produced acceptable pellets when combined with soybean meal and regular corn, respectively. However, poor pellet quality resulted when rations containing high-oil corn and mechanically expelled soybean meal were processed. Increasing the residence time in the conditioner by changing mixing paddle pitch resulted in an average 4.5-point increase in pellet durability indices among 65:35 (wt) corn:soybean meal and 65:35 high-oil corn:soybean meal rations.
Article
This study was developed to model the pellet quality to identify influential factors in an industrial pelleting process of feeds for broilers and pigs. Two independent databases were used to calibrate and to validate the models. Each column of the spreadsheet represented a descriptive variable of the manufacturing process (yield, amperage, pressure in the conditioner, and temperatures of the environment, the conditioner, and the cooler), feed characteristics (inclusion of the ingredients in the feed formula and bromatological composition of the main ingredients), and pellet quality (percentage of fines and pellet durability index - PDI). Each row of the spreadsheet represented one observation, or the equivalent of a lot of feed produced. The data were submitted to graphical analysis, descriptive statistics, and regression analysis by step-wise procedure. Three models were developed for each variable (yield, percentage of fines, and PDI): Model I, characteristics of the manufacturing process and inclusion of the ingredients in the formula; Model II, characteristics of the manufacturing process and weighted bromatological composition; and Model III, characteristics of the manufacturing process, inclusion of the ingredients in the formula, and weighted bromatological composition. The accuracy of the models (validation) was evaluated by the mean square of the predicted error (MSPE). The models obtained in this study differed from each other in the number of predictors selected in the statistical procedure. However, the main factors have been found repeatedly in the models. The amperage represented at least 22.84% of the total variance, the cooling temperature responded by at least 2.93%, and the inclusion of soybean oil in the feed formula accounted for at least 4.21%. 
The models that considered characteristics of the manufacturing process and the inclusion of the ingredients in the formula (Models I) were the most accurate (lower MSPE) in relation to the Models II and III of each response. The pelletizing process requires constant monitoring in the feed factories and the models generated in this study are useful in the quality assurance sectors, providing a better definition of the monitoring parameters.
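The validation criterion named in this abstract can be sketched briefly. The snippet assumes the usual definition of the mean square of the predicted error (the average squared difference between observed and predicted values on the validation set); the function name and the PDI values are made up for illustration and are not taken from the study.

```python
def mspe(observed, predicted):
    """Mean square of the prediction error over a validation set."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equally long")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical validation data: observed PDI and predictions from two models.
pdi_observed = [88.0, 90.5, 85.0, 92.0]
pdi_model_a = [87.0, 91.0, 86.0, 91.5]
pdi_model_b = [85.0, 93.0, 88.0, 89.0]

mspe_a = mspe(pdi_observed, pdi_model_a)
mspe_b = mspe(pdi_observed, pdi_model_b)
print(mspe_a, mspe_b)  # the model with the lower MSPE is the more accurate one
```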
Article
Ensemble methods are considered the state‐of‐the‐art solution for many machine learning challenges. Such methods improve the predictive performance of a single model by training multiple models and combining their predictions. This paper introduces the concept of ensemble learning, reviews traditional, novel and state‐of‐the‐art ensemble methods and discusses current challenges and trends in the field. This article is categorized under: • Algorithmic Development > Model Combining • Technologies > Machine Learning • Technologies > Classification
Article
In animal feed pellets, the fat content is obtained either from the feed ingredients or is directly added during processing. Additional fat is required when the fat level in the feed ingredients is less than the desired level. This fat can be added either during the mixing process or after the pelleting process. However, adding fat at different times leads to different results. The addition of an increasing amount of fat during the mixing process decreases the pellet durability but enhances the pellet production rate. To avoid a reduction in the pellet durability, limiting the inclusion of fats in the mixer is suggested. The use of suitable fat addition ratios during mixing and after pelleting can improve the pellet quality and the production capability. Many factors significantly affect the decision of how much fat to add, such as the fiber inclusion content in the feed formulation, pellet die size, required feed durability, total required fat, and required additional fat. Due to frequent changes in the feed mix, anticipating the suitable amount of fat addition during the mixing process becomes a cumbersome task for a mill. In this paper, a model for estimating the amount of fat required in the mixer for each feed formulation is proposed. The model is based on the local linear map (LLM) and the back-propagation neural network (BPNN) methods. The LLM is used to identify which feed formulations require the addition of fat both during mixing and after pelleting, whereas the BPNN is employed for estimating the proper total fat required in the mixer, and the ratio of fat to add during the mixing process is subsequently estimated by subtracting the fat in the raw material from the total fat required in the mixer. The model is developed using data from one of the largest feed mills in Thailand. The proposed model provides an accurate prediction and is practical for implementation in the mill that was studied.
Article
Feed manufacturing faces enormous challenges and with the demand for good quality feed increasing gradually, it becomes essential to improve the processes in a feed mill. This article provides a brief overview of the different processes in feed manufacturing and identifies the critical process parameters. Five critical parameters are identified where the production rate is the output parameter. Mash feed size, steam temperature, conditioning time and feed rate are the input parameters. Artificial neural network is the methodology which is used to optimize the process parameters. Root mean squared error and coefficient of determination and computation time are used as performance measures and it is observed that Polak-Ribiere conjugate gradient backpropagation training function with log sigmoid - pure linear transfer function combination provided good results among the different available alternatives. The process parameters are then optimized using the appropriate ideal settings of neural network parameters. This model is extremely useful for the prediction of production rate for 1 specific recipe in a feed mill.
Article
Pelleting of animal feed occurs extensively throughout the feed manufacturing industry and steam conditioning plays an important role in this process. We investigated the effects of mash moisture, retention time, and steam quality in two conditioners (manufactured by California Pellet Mill (CPM) Co. and Bliss Industries) on pellet quality, electrical energy consumption, and steam flow rate during the pelleting process. Results of this study indicated that pellet quality, energy consumption, and steam flow were significantly related to mash moisture (12 and 14%), retention time (short and long), steam quality (70, 80, 90, and 100%), and their interactions in mash conditioned to a constant 82.2 °C. The maximum pellet quality (88% pellet durability) was achieved with two combinations of steam quality and retention time (70%-short retention time, 80%-long retention time) for the 14% moisture mash using the CPM conditioner. A long retention time resulted in the lowest energy consumption (kWh/t) during pellet production for the 12% moisture mash with the Bliss conditioner. Feed conditioned to 82.2 °C using 100% quality steam required a lower flow rate (kg/h) than did the 70% quality steam for both conditioners. As competitive pressures continue in the global feed business, this study augments the manufacturers' knowledge of how to control this capital intensive cost center.
Article
Cereal Chem. 82(4):462-467. The quality of pelleted feeds is dependent on several variables among which formulation is recognized as having the heaviest influence. An experiment was conducted to investigate the functional properties of feed ingredient components when these ingredients were blended at different proportions on the physical quality of pelleted feeds as measured by the pellet durability index (PDI). Thirteen treatments consisting of different inclusion levels of corn, soybean meal, and soybean oil were designed and a total of 18 batches manufactured, processed, and tested for PDI. Three-dimensional plots were used to depict the relationships between composition of the formulations and PDI. Results showed only a weak dependence of the response on the starch levels in the formulas and the degree of starch gelatinization. Protein, especially from soybean meal, had a positive effect on PDI, whereas a strong negative effect was observed following fat inclusion in the mixer at levels >6.5%. Empirical equations devised by multiple regression analysis adequately predicted PDI of verification batches.
Article
k-nearest neighbor (k-NN) classification is a well-known decision rule that is widely used in pattern classification. However, the traditional implementation of this method is computationally expensive. In this paper we develop two effective techniques, namely, template condensing and preprocessing, to significantly speed up k-NN classification while maintaining the level of accuracy. Our template condensing technique aims at “sparsifying” dense homogeneous clusters of prototypes of any single class. This is implemented by iteratively eliminating patterns which exhibit high attractive capacities. Our preprocessing technique filters a large portion of prototypes which are unlikely to match against the unknown pattern. This again accelerates the classification procedure considerably, especially in cases where the dimensionality of the feature space is high. One of our case studies shows that the incorporation of these two techniques to k-NN rule achieves a seven-fold speed-up without sacrificing accuracy.
Article
A total of 144 ISA-i757 broiler chicks were fed on mash, pellet and crumble diet in the age duration of 21 to 56 days to compare the performance of broiler on different dietary groups. All the forms of feed were of identical composition as well as same environment and management were provided for all the treatments. The body weight of birds fed on mash, pellet and crumble group from 4th to 8th weeks of age differed significantly (P0.01). Total cost of production was significantly (P< 0.01) less for crumble and this was statistically similar with pellet group. The results of this experiment give an impression that crumble form of feed is better than mash and pellet form for the production of commercial broiler for the age duration of 21 to 56 days.
Article
A new reproducibility index is developed and studied. This index is the correlation between the two readings that fall on the 45 degree line through the origin. It is simple to use and possesses desirable properties. The statistical properties of this estimate can be satisfactorily evaluated using an inverse hyperbolic tangent transformation. A Monte Carlo experiment with 5,000 runs was performed to confirm the estimate's validity. An application using actual data is given.
Article
Support vector machines (SVMs) are becoming popular in a wide variety of biological applications. But, what exactly are SVMs and how do they work? And what are their most promising applications in the life sciences?
Article
Matplotlib is a 2D graphics package used for Python for application development, interactive scripting, and publication-quality image generation across user interfaces and operating systems. The latest release of matplotlib runs on all major operating systems, with binaries for Macintosh's OS X, Microsoft Windows, and the major Linux distributions. Matplotlib has a Matlab emulation environment called PyLab, which is a simple wrapper of the matplotlib API. Matplotlib provides access to basic GUI events such as button_press_event, mouse_motion_event and can also be registered with those events to receive callbacks. Event handling code written in matplotlib works across many different GUIs. It supports toolkits for domain specific plotting functionality that is either too big or too narrow in purpose for the main distribution. Matplotlib has three basic API classes, including, FigureCanvasBase, RendererBase and Artist.
Unknown examples & machine learning model generalization
  • Chung
pandas-dev/pandas: Pandas 1.2.3
  • The pandas development team
Behnke, K.C., Fahrenholz, E.A., Stark, C., Jones, C., 1994. Factors Affecting Pellet Quality. Dept. of Poultry Science and Animal Science, College of Agricultural, University of Maryland, College Park, MD, USA, pp. 44-54.