Figure - available from: Sustainability
The general structure of LightGBM.

Source publication
Article
Full-text available
Elastic modulus (E) is a key parameter in predicting the ability of a material to withstand pressure and plays a critical role in the design of rock engineering projects. E has broad applications in the stability of structures in mining, petroleum, geotechnical engineering, etc. E can be determined directly by conducting laboratory tests, which are...

Similar publications

Article
Full-text available
House prices have a significant impact on people's daily lives, and a fixed abode is essential for living, working, and social prosperity and stability. Hence, predicting house prices is a meaningful and significant challenge. To achieve this goal, we use the California Census dataset in this project to how distinctive features (attributes) can make t...

Citations

... Moreover, the present results underline the significant and non-negligible role of the type of the geological formation and different main pressures on drilling performance estimation. Although the well-known influence of physical and other mechanical parameters on ROP prediction is clearly shown in many previous studies [64][65][66][67][68][69], the results obtained from the present study demonstrated the significant and non-negligible role of the different main operating pressures on drilling performance estimation and could greatly help companies make wise decisions and make hydraulic drilling operations more efficient. ...
Article
Full-text available
The population growth of cities in developing countries over the past two decades has led to a sharp increase in water supply needs. This has generated a proliferation of drilling companies and a highly competitive environment, in Cameroon in particular. Regular optimization of drilling operations in geological formations appears crucial and urgent in order to reduce drilling costs, since the drilling equipment used is expensive and sometimes scarce. The present paper investigates an accurate machine learning model for predicting the penetration rate (ROP) in lateritic soil cover layers. The study examines various machine learning techniques, including linear regression, K-Nearest Neighbors, ridge regression and Random Forest, to predict the penetration rate. Data from four (04) defined parameters, namely the percussion pressure, Pp (MPa); the blowing pressure, Ps (MPa); the pressure of compressor, Pc (MPa); and the rotation speed, Vr (tr/min), were used to build the dataset for training and validation tests, with a 70/30 ratio. The drilling time and drilling depth range from 0.3 to 2.8 h and from 0.85 to 4.6 m respectively, for a constant rotation speed of 1350 tr/min, while pressure values range between 2.5 and 294.3 MPa. The key performance metrics, including correlation coefficient (R²), mean absolute error (MAE) and root mean square error (RMSE), were calculated for each method to evaluate the accuracy of the predictions. Results show that the Random Forest model exhibited the best accuracy, with an R² value of 0.999 and RMSE and MAE values of 0.0225 and 0.0121 respectively. A relatively high accuracy was obtained for the K-Nearest Neighbors method, with R², RMSE and MAE values of 0.933, 0.1547 and 0.0802 respectively. Relatively lower accuracy was obtained for the linear regression and ridge regression methods.
The corresponding R², RMSE and MAE values are respectively 0.8612, 0.2232 and 0.1724 for linear regression, and 0.8447, 0.2361 and 0.1894 for ridge regression. The discussion of the obtained results shows that the Random Forest model predicts the ROP value with very good accuracy, and can thus greatly contribute to reducing the costs and time related to drilling operations in the context of lateritic soil covers.
... The samples were analyzed for various properties, including wet density (WD) in (g/cm³), moisture (%), dry density (DD) in (g/cm³), Brazilian tensile strength (BTS) in (MPa), uniaxial compressive strength (UCS) in (MPa), and elastic modulus (E) in (GPa). Unlike a prior study (Shahani et al. 2022) that utilized a single predictive model based on 106 datasets, this research implemented metaheuristic-optimized ensemble models, incorporating a larger dataset to enhance accuracy. Table 2 provides the statistical distribution of the E dataset used in this study. ...
Article
Full-text available
The elastic modulus (E) of rocks is an essential parameter in mining and rock engineering projects, as it directly affects their stability and structural integrity. This study investigates the application of metaheuristic optimization algorithms, specifically Cuckoo Search (CS) and Harris Hawks Optimization (HHO), to fine-tune the hyperparameters of ensemble regression models, including extreme gradient boosting (XGBoost), decision tree (DT), and adaptive boosting (AdaBoost). A dataset of 122 rock samples, including input parameters such as wet density, moisture, dry density, Brazilian tensile strength, and uniaxial compressive strength, was used to predict E. A dataset was split into training and testing datasets with a 70:30 ratio. Model performance was evaluated using metrics like coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) in %, and severity index (SI). The results show that CS-HHO-optimized models significantly outperformed the unoptimized models, with the optimized stacking model providing superior prediction accuracy for predicting E. Both the unoptimized and optimized Stacking Models exhibited superior performance on the test data. The unoptimized Stacking Model achieved an R² of 0.980, RMSE of 0.0483, MAE of 0.0223, MAPE of 12.412%, and SI of 0.1737, while the CS-HHO-optimized Stacking Model yielded a similar R² of 0.980, with slight variations in RMSE (0.0497), MAE (0.0234), MAPE (13.5736%), and SI (0.1786). This study provides a robust predictive framework for rock behavior analysis, contributing to the field of mining and rock engineering project design.
... Performance metrics are the key indicators to assist in model evaluation. The optimal model is considered to have the largest coefficient of determination (R²) (modified after Cai et al. 2022; Shahani et al. 2021), the smallest MSE (Wu et al. 2024), MAE (Ahmed et al. 2023), and RMSE (Zeng et al. 2021), as well as appropriate a20-index (Shahani et al. 2022a; Shahani et al. 2022b) values. Each developed model's performance in predicting backbreak was assessed employing the evaluation metrics below. ...
Article
Backbreak, a recurring issue in blasting operations, causes mine wall instability, equipment failure, inappropriate disintegration, lower drilling efficiency, and increased cost of mining operations. This study aims to address these issues by developing a hybrid LSSVM-GWO model for predicting blast-induced backbreak in open pit mines. To evaluate the effectiveness of the proposed model, its predictive performance was compared with three convolutional models, such as the support vector machine, K-nearest neighbor, and the least square support vector machine. Results demonstrated that the LSSVM-GWO model outperformed the other three models, achieving coefficient of determination values of 0.998 and 0.997, mean absolute error values of 0.0068 and 0.1209, root mean squared error values of 0.0825 and 0.1936, and a20-index values of 0.99 and 1.01 for training and testing datasets, respectively. Furthermore, the SHAP machine learning technique was applied to evaluate the feature importance, revealing that the powder factor had the highest influence, while the burden exhibited the least impact on backbreak. Sensitivity analysis confirmed these findings, highlighting the robustness of the hybrid model. The study concludes that the LSSVM-GWO model significantly enhances the prediction and evaluation of backbreak in open pit mines, providing critical insights to improve blasting operations, reduce costs, and ensure mine safety.
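The evaluation metrics named in the excerpt above (R², MAE, RMSE and the a20-index) can be illustrated with a minimal sketch. It assumes the commonly used a20-index definition, i.e. the fraction of samples whose measured-to-predicted ratio lies between 0.80 and 1.20; the cited papers may define it slightly differently. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE and a20-index for paired measured/predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares used by R^2 and RMSE.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    # Assumed a20 definition: share of samples with 0.8 <= measured/predicted <= 1.2.
    within = sum(
        1 for t, p in zip(y_true, y_pred) if p != 0 and 0.8 <= t / p <= 1.2
    )
    return {"r2": r2, "mae": mae, "rmse": rmse, "a20": within / n}


if __name__ == "__main__":
    # Made-up values for illustration only, not data from any cited study.
    print(regression_metrics([2.0, 4.0, 6.0, 8.0], [2.0, 4.0, 6.0, 12.0]))
```

Under this definition a perfect model has R² = 1, MAE = RMSE = 0 and a20 = 1.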
... The comparison of the approaches' performance in estimating Vs and static characteristics revealed that SVR had superior accuracy compared to other methods. Shahani et al. (2022) developed different models for the prediction of modulus of elasticity using six machine learning based regression models such as light gradient boosting machine (LightGBM), SVM, Catboost, gradient boosted tree regressor (GBRT), RF, and extreme gradient boosting (XGBoost). Four input parameters were used: wet density, moisture content, dry density, and Brazilian tensile strength. ...
Article
Full-text available
The elastic modulus of basalt is a significant engineering parameter required for many projects. Therefore, a total of 137 datasets of basalts from Digor-Kilittasi, Turkey, were used to predict the elastic modulus of intact rock (Ei) for this study. P wave velocity, S wave velocity, apparent porosity, and dry density parameters were employed as input parameters. In order to predict Ei, seven different models with two or three inputs were constructed, employing four different machine learning methods such as Support Vector Machine (SVM), Gaussian Process Regression (GPR), Ensembles of Tree (ET), and Regression Trees (RT). The performance of datasets, models, and methods was evaluated using the coefficient of determination (R²), Root Mean Squared Error (RMSE), Mean Squared Error (MSE), and Mean Absolute Error (MAE). This study presented and analyzed the performance of four machine learning methods. A ranking approach was employed to determine the best performing method and dataset. Based on these evaluations, all four machine learning techniques effectively estimate the value of Ei. While they can be used as an appropriate choice for estimating the elastic modulus of basaltic rocks, the ET approach appears to be the most successful method. However, the performance of the GPR is the worst according to model assessments. The average R² values for Model 1 through 7 of the ET method for the five test datasets are 0.97, 0.93, 0.89, 0.97, 0.91, 0.99, and 0.99, respectively. The average R² values for GPR from Models 1 to 7 for the five test datasets are 0.73, 0.55, 0.69, 0.48, 0.47, 0.73, and 0.56, respectively. An additional indication that the ET performed better than all the other methods was the Taylor diagram, which made it simple to determine how well the model predictions matched the observations.
Furthermore, these findings validate the performance of the machine learning techniques employed in this study as valuable instruments for future investigations into the modeling of complex engineering issues. The results of this study suggest that machine learning algorithms can help reduce the need for high-quality core samples and labor-intensive procedures in predicting the elastic modulus of basaltic rocks, resulting in time and cost savings.
... Azarafza et al. (2022) and Zhang and Afzal (2022) explored deep learning for predicting rock index properties. Shahani et al. (2022) and Wang et al. (2023) employed machine learning and other computer-based techniques to predict Ei's dependence on various parameters. Additionally, one of the recent advancements in machine learning involves predicting soil organic carbon in coal mining, subsidence, slope stability, and mining processes. ...
Article
Full-text available
The accurate determination of rock elasticity modulus is crucial for geomechanical analysis and reliable rock engineering designs. Traditional experimental methods have limitations in estimating elasticity modulus, prompting the adoption of artificial intelligence and data-driven techniques to develop adaptive and accurate predictive models. This study utilized the Deep Random Forest Optimization (DRFO) algorithm, a hybrid approach combining deep learning and random forest algorithms, to predict rock elasticity modulus. The dataset consisted of 350 sedimentary rock samples from various regions in Iran, including sandstone, limestone, marlstone, and mudstone. The performance of the predictive models was assessed using confusion matrices, statistical errors, and the coefficient of determination (R²). The results revealed the superior performance of the DRFO model, exhibiting a remarkably low Mean Absolute Error (MAE) of 0.180 GPa, outperforming other models. The Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) values (0.026 and 0.161, respectively) confirmed the precision of DRFO’s predictions. DRFO demonstrated robustness and generalization capability, yielding excellent performance in both training and testing datasets. Moreover, accuracy and precision evaluation in the training dataset showed a high accuracy (0.97) and precision (0.97), indicating the reliability of DRFO in estimating rock elasticity modulus. The study underscores the significance of data-driven techniques, particularly the potential of DRFO in accurately predicting rock properties. It contributes valuable insights to the field of geotechnical engineering, aiding infrastructure design and ensuring the safety and stability of sedimentary rock-based structures. Further research can explore DRFO’s adaptability to different geological contexts and extend its application to other essential rock properties, advancing geotechnical and geological engineering practices. 
The integration of advanced data-driven approaches like DRFO can enhance rock mechanics understanding, facilitating sustainable engineering solutions for various geotechnical projects.
... To select the top-k elements and apply global voting techniques, the data to be learned are partitioned into several trees. Figure 7 illustrates how LightGBM identifies the leaf with the maximum splitter gain using a leaf-wise approach [48]. combined with analytical boosting methods. ...
... LightGBM general structure [48]. ...
Article
Full-text available
The global shear capacity of steel–concrete composite downstand cellular beams with precast hollow-core units is an important calculation as it affects the span-to-depth ratios and the amount of material used, hence affecting the embodied CO2 calculation when designers are producing floor grids. This paper presents a reliable tool that can be used by designers to alter and optimise grip options during the preliminary design stages, without the need to run onerous calculations. The global shear capacity prediction formula is developed using five machine learning models. First, a finite element model database is developed. The influence of the opening diameter, web opening spacing, tee-section height, concrete topping thickness, interaction degree, and the number of shear studs above the web opening are investigated. Reliability analysis is conducted to assess the design method and propose new partial safety factors. The Catboost regressor algorithm presented better accuracy compared to the other algorithms. An equation to predict the shear capacity of composite cellular beams with hollow-core units is proposed using gene expression programming. In general, the partial safety factor for resistance, according to the reliability analysis, varied between 1.25 and 1.26.
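The leaf-wise growth mentioned in the LightGBM excerpts above (splitting the leaf with the maximum split gain) can be sketched as follows. This is a simplified illustration on a single feature, using the reduction in squared error as the gain; it is not LightGBM's actual histogram-based implementation, and the helper names `best_split` and `grow_leaf_wise` are hypothetical.

```python
def sse(ys):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(points):
    """Return (gain, left, right) for the best threshold on a 1-D leaf.

    `points` is a list of (x, y) pairs. Gain is the drop in squared error.
    If no split improves the leaf, left and right are None and gain is 0.0.
    """
    pts = sorted(points)
    parent = sse([y for _, y in pts])
    best = (0.0, None, None)
    for i in range(1, len(pts)):
        if pts[i][0] == pts[i - 1][0]:
            continue  # cannot place a threshold between equal x values
        left, right = pts[:i], pts[i:]
        gain = parent - sse([y for _, y in left]) - sse([y for _, y in right])
        if gain > best[0]:
            best = (gain, left, right)
    return best


def grow_leaf_wise(points, num_leaves):
    """Repeatedly split whichever current leaf offers the largest gain."""
    leaves = [list(points)]
    while len(leaves) < num_leaves:
        candidates = [(best_split(leaf), idx) for idx, leaf in enumerate(leaves)]
        (gain, left, right), idx = max(candidates, key=lambda c: c[0][0])
        if left is None:
            break  # no leaf can be improved any further
        leaves[idx:idx + 1] = [left, right]
    return leaves


if __name__ == "__main__":
    # Made-up data: three plateaus in y along x.
    data = [(1, 0.0), (2, 0.0), (3, 10.0), (4, 10.0), (5, 30.0), (6, 30.0)]
    for leaf in grow_leaf_wise(data, num_leaves=3):
        print(leaf)
```

A level-wise strategy would instead split every leaf at the current depth before moving on; the leaf-wise loop above always spends the next split where the gain is largest, which is why a cap such as `num_leaves` is used to stop growth.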
... 2. Fitness evaluation: The purpose of this step is to evaluate each individual participating in the search process. In this study, the fivefold cross-validation method is used for designing the fitness function (Shahani et al. 2022b; Qiu and Zhou 2023b; Zhou et al. 2022d). When performing cross-validation, the training set is further divided into five subsets. ...
Article
Full-text available
Whether the underground open stope can remain stable during the planned period is the premise of safe mining. In the field of stability analysis in open stope, there is a famous graphical tool called the Mathews stability graph. However, mining in deep environments and complex conditions has become a mainstream trend, so the stability graph derived from traditional empirical or semi-empirical techniques must be effectively adjusted to ensure its reliability. In response to this problem, this study will incorporate machine learning to explore feasible alternatives to traditional methods. To improve the credibility of the research results, eight classifiers are employed to compare, and a suitable classifier, support vector machine (SVM), is finally determined. In addition, three meta-heuristic strategies, grey wolf optimizer (GWO), particle swarm optimization (PSO), and cuckoo search algorithm (CS), are introduced to further improve the classifier performance and construct the corresponding hybrid models. Through comprehensive evaluation, CS–SVM is determined as the optimal model (accuracy = 0.8462, precision = 0.8125, recall = 0.9559, and F1 score = 0.8784). Based on this hybrid model with reliable generalization ability, the new decision boundary is output to realize the accurate identification of stable and unstable classes. Compared to traditional techniques, the updated stability graph avoids interference from various subjective factors and demonstrates stronger flexibility and interpretability. In addition, a single prediction technique may lead to accidental errors, and the prediction results near the decision boundary inherently possess fuzzy attributes. To overcome these shortcomings, another aspect of this study is to derive a discriminative criterion for convenient use, which consists of two parts: the expression of the fitted curve for the updated decision boundary and the explicit expression output by the genetic programming (GP) technique. 
Finally, based on the above efforts, the corresponding prediction platform is constructed to provide feedback information for on-site decision-making, which has certain engineering application value.
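The fivefold cross-validation fitness described in the excerpt above can be sketched roughly as below. The sketch assumes contiguous, unshuffled folds and a caller-supplied scoring function; `k_fold_indices`, `cross_val_fitness` and `score_fn` are hypothetical names, and the cited study may shuffle or stratify the data differently.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_fitness(n_samples, score_fn, k=5):
    """Average score_fn(train_idx, val_idx) over k train/validation splits.

    score_fn is an assumed callback that trains a candidate model on the
    training indices and returns its score on the validation indices.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except the i-th one forms the training subset.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(score_fn(train_idx, val_idx))
    return sum(scores) / k


if __name__ == "__main__":
    # Dummy scorer that only reports the validation fold size.
    print(cross_val_fitness(12, lambda train, val: len(val)))
```

In a metaheuristic search, each candidate hyperparameter set would be passed through such a function and the averaged score used as its fitness.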
... CatBoost is a gradient-boosting tree construction method [36], which makes use of both symmetric and non-symmetric construction methods. In CatBoost, a tree is learned at each iteration with the aim of reducing the error made by previous trees. ...
Article
Full-text available
One of the main challenges in screening of enhanced oil recovery (EOR) techniques is the class imbalance problem, where the number of different EOR techniques is not equal. This problem hinders the generalization of the data-driven methods used to predict suitable EOR techniques for candidate reservoirs. The main purpose of this paper is to propose a novel approach to overcome the above challenge by taking advantage of the Power-Law Committee Machine (PLCM) technique optimized by Particle Swarm Optimization (PSO) to combine the output of five cutting-edge machine learning methods with different types of learning algorithms. The PLCM method has not been used in previous studies for EOR screening. The machine learning models include the Artificial Neural Network (ANN), CatBoost, Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). The CatBoost is used for the first time in this work for screening of EOR methods. The role of the PSO is to find the optimal values for the coefficients and exponents of the power-law model. In this study, a bigger dataset than those in previous studies, including 2563 successful worldwide EOR experiences, was gathered. A bigger dataset improves the generalization of the data-driven methods and prevents overfitting. The hyperparameters of the individual machine-learning models were tuned using the fivefold cross-validation technique. The results showed that all the individual methods could predict the suitable EOR method for unseen cases with an average score of 0.868. Among the machine learning models, the KNN and SVM had the highest scores with a value of 0.894 and 0.892, respectively. Nonetheless, after combining the output of the models using the PLCM method, the score of the predictions improved to 0.963, which was a substantial increase. Finally, a feature importance analysis was conducted to find out the most influential parameters on the output.
The novelty of this work is having shown the ability of the PLCM technique to construct an accurate model to overcome the class-imbalance issue in EOR screening by utilizing different types of data-driven models. According to feature importance analysis, oil gravity and formation porosity were recognized as the most influential parameters on EOR screening.
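The boosting idea quoted in the CatBoost excerpt above, where each new tree is learned to reduce the error left by the previous trees, can be illustrated with a generic sketch. It uses plain gradient boosting with squared loss and one-split stumps on a single feature; it does not reproduce CatBoost's ordered boosting or symmetric trees, and `fit_stump` and `boost` are hypothetical helper names.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals of a 1-D feature.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = sum((r - left_mean) ** 2 for r in left)
        err += sum((r - right_mean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Build an additive model where each stump fits the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the errors made by all stumps learned so far.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


if __name__ == "__main__":
    # Made-up step-shaped data for illustration only.
    model = boost([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
    print(model(1), model(4))
```

The learning rate shrinks each stump's contribution, so the residuals shrink gradually over the rounds rather than being fitted in a single step.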
... Structure of a Decision Tree composed of a root node, decision nodes, and leaf nodes. Decision Tree models find applications in diverse domains, including finance, healthcare, and engineering fields (Mbonyinshuti et al., 2022; Shahani et al., 2022). They are appreciated for their ability to handle both numerical and categorical data and for not requiring much data pre-processing. ...
... Key properties assessed included wet density (WD), moisture, dry density (DD), Brazilian tensile strength (BTS), shore hardness (SH), elastic modulus (E), and uniaxial compressive strength (UCS). Previously (Shahani et al., 2022a), we used 106 presents three-dimensional (3D) surface plots illustrating the relationships between input parameters and the output variables UCS and E. ...
... In Figure 14, the analysis indicates that the STD of the PSO-XGBoost model is closest to its corresponding original STD, suggesting that it provides a reliable prediction. Drawing from a comprehensive review of publicly available literature (Ghose and Chakraborti, 1986; Katz et al., 2000; Tiryaki, 2008; Jahed Armaghani et al., 2015b; Guha Roy and Singh, 2018; Umrao et al., 2018; Davarpanah et al., 2020; Shahani et al., 2021; Zhong et al., 2021; Shahani et al., 2022a), this study has identified the optimal model, which consistently delivers highly accurate predictions for both UCS and E. While the STD values of other models also approach their original counterparts, they exhibit comparatively lower R² values. SHAP is derived from game theory, and it is a multivariate method used to compute the importance values of each feature, helping to understand the influence of each feature on model predictions. ...
Article
Full-text available
The mechanical characteristics of rocks, specifically uniaxial compressive strength (UCS) and elastic modulus (E), serve as crucial factors in ensuring the integrity and stability of relevant projects in mining and civil engineering. This study proposes a novel hybrid PSO (particle swarm optimization) with tree-based models, such as gradient boosting regressor (GBR), light gradient boosting machine (LightGBM), random forest (RF), and extreme gradient boosting (XGBoost) for predicting UCS and E of rock samples from Block IX of the Thar Coalfield in Pakistan. A total of 122 datasets were divided into training and testing sets, with an 80:20 ratio, respectively, to develop the predictive models. Key performance metrics, including the coefficient of determination (R ²), mean absolute error (MAE), and root mean square error (RMSE), were employed to assess the model’s predictive performance. The results indicate that the PSO-XGBoost model demonstrated the highest accuracy in predicting UCS and E, outperforming the other models, which exhibited inferior predictive performance. Furthermore, this study utilized the SHAP (Shapley Additive exPlanations) machine learning method to enhance our understanding of how each input feature variable influences the output values of UCS and E. In conclusion, the proposed framework offers significant advantages in evaluating the strength and deformation of rocks at Thar Coalfield, with promising applications in the field of mining and rock engineering.