Figure - available from: Remote Sensing
Comparison of the original observed soil moisture (SM; raw data) with the observed SM filtered by isolation forest (IF) considering precipitation (PCP). Blue circles represent the original observed SM before applying the IF method, and red Xs represent the observed SM after applying the IF method.


Source publication
Article
The spatial distribution of soil moisture (SM) was estimated by a multiple quantile regression (MQR) model with Terra Moderate Resolution Imaging Spectroradiometer (MODIS) and filtered SM data from 2013 to 2015 in South Korea. For input data, observed precipitation and SM data were collected from the Korea Meteorological Administration and various...

Citations

... Outlier detection, also referred to as anomaly detection, is a technique employed to identify patterns within data sets that significantly deviate from expected patterns. In this study, the Isolation Forest (IF) algorithm was selected as the outlier detection method (Jung et al. 2020). IF is an ensemble approach that exhibits shorter computational time compared to other anomaly detection algorithms. ...
Article
Forest Canopy Height (FCH) is a crucial parameter that offers valuable insights into forest structure. Spaceborne LiDAR missions provide accurate FCH measurements, but a significant challenge is their point-based measurements lacking spatial continuity. This study integrated ICESat-2’s ATL08-derived FCH values with multi-temporal and multi-source remote sensing (RS) datasets to generate continuous FCH maps for northern forests in Iran. Sentinel-1/2, ALOS-2 PALSAR-2, and FABDEM datasets were prepared in Google Earth Engine (GEE) for FCH mapping, each possessing unique spatial and geometrical characteristics that differ from those of the ATL08 product. Given the importance of accurately representing the geometrical characteristics of the ATL08 segments in modeling FCH, a novel Weighted Kernel (WK) approach was proposed in this paper. The WK approach could better represent the RS datasets within the ATL08 ground segments compared to other commonly used resampling approaches. The correlation between all RS data features improved by approximately 6% compared to previously employed approaches, indicating that the RS data features derived after convolving the WK approach are more predictive of FCH values. Furthermore, the WK approach demonstrated superior performance among machine learning models, with random forests outperforming other models, achieving a coefficient of determination (R²) of 0.71, root mean square error (RMSE) of 4.92 m, and mean absolute percentage error (MAPE) of 29.95%. Furthermore, in contrast to previous studies using only summer datasets, this study included spring and autumn data from Sentinel-1/2, resulting in a 6% increase in R² and a 0.5-m decrease in RMSE. The proposed methodology filled the research gaps and improved the accuracy of FCH estimations.
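The citation excerpt for this article describes Isolation Forest (IF) as an ensemble outlier detector. The core idea can be illustrated with a simplified sketch: points that random axis-aligned splits isolate after only a few cuts are treated as likely outliers. The sketch below is not the cited authors' implementation; it grows each random tree lazily along one point's path and omits the path-length normalization used by the full algorithm. The function names, the parameters, and the (soil moisture, precipitation) values are all hypothetical.

```python
import random


def path_length(point, sample, rng, depth=0, max_depth=10):
    """Number of random splits needed to isolate `point` within `sample`."""
    if len(sample) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))           # pick a random feature
    lo = min(row[dim] for row in sample)
    hi = max(row[dim] for row in sample)
    if lo == hi:                               # nothing left to split on
        return depth
    split = rng.uniform(lo, hi)                # pick a random cut position
    if point[dim] < split:
        side = [row for row in sample if row[dim] < split]
    else:
        side = [row for row in sample if row[dim] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def mean_path_length(point, data, n_trees=100, sample_size=64, seed=0):
    """Average isolation depth over many random subsamples (the 'forest')."""
    rng = random.Random(seed)
    k = min(sample_size, len(data))
    total = 0
    for _ in range(n_trees):
        total += path_length(point, rng.sample(data, k), rng)
    return total / n_trees


# Hypothetical (soil moisture %, precipitation mm) pairs, for illustration only.
gen = random.Random(1)
normal = [(gen.uniform(20, 30), gen.uniform(0, 10)) for _ in range(200)]
suspect = (80.0, 0.0)                          # very wet reading with no rain
data = normal + [suspect]

typical_depth = mean_path_length((25.0, 5.0), data)
suspect_depth = mean_path_length(suspect, data)
```

A shorter mean path for `suspect` than for a typical point is what marks it as a candidate outlier. A production workflow would more likely rely on an established implementation such as scikit-learn's `IsolationForest`.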
... Following the prediction of gas concentration sequences, a Bootstrap algorithm is utilized to resample the forecasted outputs with repetition, leading to the derivation of the ultimate prediction confidence interval, as documented in the literature [30][31][32]. Relying on the mean value extracted from the terminal instant, a real-time approximation of the relative gas emission volume is computed. This computation forms the groundwork for the employment of the PersistAD anomaly detection algorithm, which is strategically tasked with surveilling abrupt amplifications, indicative of potential anomalies potentially linked to distinctive geological features or irregular breaches in coal faces. ...
Article
Addressing common challenges such as limited indicators, poor adaptability, and imprecise modeling in gas pre-warning systems for driving faces, this study proposes a hybrid predictive and pre-warning model grounded in time-series analysis. The aim is to tackle the effects of broad application across diverse mines and insufficient data on warning accuracy. Firstly, we introduce an adaptive normalization (AN) model for standardizing gas sequence data, prioritizing recent information to better capture the time-series characteristics of gas readings. Coupled with the Gated Recurrent Unit (GRU) model, AN demonstrates superior forecasting performance compared to other standardization techniques. Next, Ensemble Empirical Mode Decomposition (EEMD) is used for feature extraction, guiding the selection of the Variational Mode Decomposition (VMD) order. Minimal decomposition errors validate the efficacy of this approach. Furthermore, enhancements to the transformer framework are made to manage non-linearities, overcome gradient vanishing, and effectively analyze long time-series sequences. To boost versatility across different mining scenarios, the Optuna framework facilitates multiparameter optimization, with xgbRegressor employed for accurate error assessment. Predictive outputs are benchmarked against Recurrent Neural Networks (RNN), GRU, Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM), where the hybrid model achieves an R-squared value of 0.980975 and a Mean Absolute Error (MAE) of 0.000149, highlighting its top performance. To cope with data scarcity, bootstrapping is applied to estimate the confidence intervals of the hybrid model. Dimensional analysis aids in creating real-time, relative gas emission metrics, while persistent anomaly detection monitors sudden time-series spikes, enabling unsupervised early alerts for gas bursts. 
This model demonstrates strong predictive prowess and effective pre-warning capabilities, offering technological reinforcement for advancing intelligent coal mine operations.
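The abstract above mentions bootstrapping to estimate confidence intervals when data are scarce. A minimal sketch of one common variant, the percentile bootstrap for a mean, is given here under stated assumptions: the function name, the number of resamples, and the forecast values are hypothetical, and the cited study may use a different bootstrap scheme.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical forecast values, for illustration only.
forecasts = [0.42, 0.45, 0.39, 0.47, 0.44, 0.41, 0.46, 0.43]
ci_low, ci_high = bootstrap_ci(forecasts)
```

The interval narrows as the resampled means agree more closely, so a wide interval signals that the point forecast rests on little or noisy data.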
... Outlier detection, also referred to as anomaly detection, is a technique employed to identify patterns within data sets that significantly deviate from expected patterns. In this study, the Isolation Forest (IF) algorithm was selected as the outlier detection method (Jung et al., 2020). IF is an ensemble approach that exhibits shorter computational time compared to other anomaly detection algorithms. ...
Preprint
Forest Canopy Height (FCH) is a crucial parameter that offers valuable insights into forest structure. Spaceborne LiDAR missions provide accurate FCH measurements, but a major challenge is their point-based measurements lacking spatial continuity. This study integrated ICESat-2's ATL08-derived FCH values with multi-temporal and multi-source Remote Sensing (RS) datasets to generate continuous FCH maps for northern forests in Iran. Sentinel-1/2, ALOS-2 PALSAR-2, and FABDEM datasets were prepared in Google Earth Engine (GEE) for FCH mapping, each possessing unique spatial and geometrical characteristics that differ from those of the ATL08 product. Given the importance of accurately representing the geometrical characteristics of the ATL08 segments in modeling FCH, a novel Weighted Kernel (WK) approach was proposed in this paper. The WK approach could better represent the RS datasets within the ATL08 ground segments compared to other commonly used resampling approaches. The correlation between all RS data features improved by approximately 6% compared to previously employed approaches, indicating that the RS data features derived after convolving the WK approach are more predictive of FCH values. Furthermore, the WK approach demonstrated superior performance among machine learning models, with Random Forests outperforming other models, achieving an R² of 0.71, RMSE of 4.92 m, and MAPE of 29.95%. Furthermore, in contrast to previous studies using only summer datasets, this study included spring and autumn data from S1/2, resulting in a 6% increase in R² and a 0.5 m decrease in RMSE. The proposed methodology succeeded in filling the research gaps and improved the accuracy of FCH estimations.
... Following the optimization, the MLPRegressor, XgbRegressor, and RFR were used to predict personnel distribution, and the results are shown in Figure 7 and Table 7. In Figure 8, Panels a-c depict the prediction error plots for the three algorithms, with the 95% confidence intervals calculated using the BootStrap [23,28] algorithm. Each tunnel is represented by bullets of different colors. ...
Article
Efficient evacuation route planning during underground coal mine fires is essential to minimize casualties. This study addresses current shortcomings by proposing a real-time method that integrates a multifactor coupling analysis and the optimized multilayer perceptron regressor-shortest path faster algorithm (MSPFA). This research aims to enhance evacuation route planning by overcoming factors such as inadequate consideration, low accuracy, and information lag in existing methods. This study improves the shortest path faster algorithm (SPFA) for dynamic route planning, mitigates the impact of fixed walking speed parameters using the particle swarm algorithm, and selects the optimal model (MLPRegressor) through the Bootstrap algorithm for estimating personnel walking speeds. Validated through smoke-spread experiments, the MSPFA algorithm dynamically adjusts evacuation routes, preventing toxic passages. Visualization via drawing interchange format (DXF) successfully enhances route comprehension. The MSPFA algorithm outperforms the Dijkstra algorithm with a runtime of 78.5 ms and a personnel evacuation time of 3344.74 s. This research establishes a theoretical foundation for intelligent evacuation decision making in underground fire disasters. By introducing the MSPFA algorithm, it provides crucial technical support, significantly reducing the risk of casualties during emergencies.
... Most recently, Gallardo et al. [12] proposed a parametric quantile regression model for asymmetric response variables. Jung et al. [13] applied the multiple quantile regression method to the estimation of the spatial distribution of soil moisture. Chen et al. [14] studied estimation and inference for linear quantile regression models with generated regressors using a practical, two-step estimation procedure. ...
Article
Temporal gene expression data contain ample information to characterize gene function and are now widely used in bio-medical research. A dense temporal gene expression usually shows various patterns in expression levels under different biological conditions. The existing literature investigates the gene trajectory using the mean function. However, temporal gene expression curves usually show a strong degree of heterogeneity under multiple conditions. As a result, rates of change for gene expressions may be different in non-central locations and a mean function model may not capture the non-central location of the gene expression distribution. Further, the mean regression model depends on the normality assumptions of the error terms of the model, which may be impractical when analyzing gene expression data. In this research, a linear quantile mixed model is used to find the trajectory of gene expression data. This method enables the changes in gene expression over time to be studied by estimating a family of quantile functions. A statistical test is proposed to test the similarity between two different gene expressions based on estimated parameters using a quantile model. Then, the performance of the proposed test statistic is examined using extensive simulation studies. Simulation studies demonstrate the good statistical performance of this proposed test statistic and show that this method is robust against normal error assumptions. As an illustration, the proposed method is applied to analyze a dataset of 18 genes in P. aeruginosa, expressed in 24 biological conditions. Furthermore, a minimum Mahalanobis distance is used to find the clustering tree for gene expressions.
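The abstract above contrasts mean-based models with estimating a family of quantile functions. As a rough illustration of what distinguishes quantile estimation, the sketch below uses the check (pinball) loss: the constant that minimizes it at level `tau` is a `tau`-quantile of the data, which is why it is less sensitive to extreme values than the mean. This is not the linear quantile mixed model of the cited work; the function names and the data are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check (pinball) loss at quantile level tau, with 0 < tau < 1."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)


def best_constant(y, tau):
    """Observed value that minimizes the pinball loss as a constant predictor."""
    return min(y, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Hypothetical observations with one extreme value.
y = [1, 2, 3, 4, 100]
median_fit = best_constant(y, 0.5)
upper_fit = best_constant(y, 0.9)
```

Here `median_fit` is the sample median despite the extreme observation, while `upper_fit` moves to the upper tail, showing how different quantile levels describe different parts of the distribution.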
... The main differences between these two methods are the electromagnetic energy source, the wavelength region of the electromagnetic spectrum used, the response measured by the sensor and so on [8]. The former estimates SMC by analyzing correlations between SMC and various outputs from optical satellites, such as surface temperature and vegetation-related indices, and uses various statistical, empirical, or machine learning techniques [9][10][11]. The latter method directly estimates SMC using surface backscatter difference water index (NDWI) [42], estimated by optical satellites or ground vegetation measurements [40]. ...
... As an alternative to vegetation parameters, precipitation data were applied by borrowing the concept of antecedent precipitation from the Soil Conservation Service-Curve Number (SCS-CN) method in some studies [9][10][11][44][45][46]. The SCS-CN method was developed by the U.S. Soil Conservation Service (SCS) to create the synthetic unit hydrograph [47]. ...
Article
This study estimates soil moisture content (SMC) using Sentinel-1A/B C-band synthetic aperture radar (SAR) images and an artificial neural network (ANN) over a 40 × 50 km² area located in the Geum River basin in South Korea. The hydrological components characterized by the antecedent precipitation index (API) and dry days were used as input data as well as SAR (cross-polarization (VH) and copolarization (VV) backscattering coefficients and local incidence angle), topographic (elevation and slope), and soil (percentage of clay and sand)-related data in the ANN simulations. A simple logarithmic transformation was useful in establishing the linear relationship between the observed SMC and the API. In the dry period without rainfall, API did not decrease below 0, thus the dry days were applied to express the decreasing SMC. The optimal ANN architecture was constructed in terms of the number of hidden layers, hidden neurons, and activation function. The comparison of the estimated SMC with the observed SMC showed that the Pearson's correlation coefficient (R) and the root mean square error (RMSE) were 0.85 and 4.59%, respectively.
... Most recently, Gallardo et al. [12] proposed a parametric quantile regression model for asymmetric response variables. Jung et al. [13] applied the multiple quantile regression method to the estimation of the spatial distribution of soil moisture. Chen et al. [14] studied estimation and inference for linear quantile regression models with generated regressors using a practical, two-step estimation procedure. ...
Article
Chronic Obstructive Pulmonary Disease (COPD) is a heterogeneous respiratory disease characterized by a progressive, not fully reversible airflow limitation associated with an abnormal inflammatory response of the lung to noxious stimuli. It is a disease presenting with pulmonary inflammation as well as a systemic one. Measurement of inflammatory markers is difficult, but platelet count estimation is easy and less costly. This descriptive, cross-sectional study was carried out at the Department of Medicine, Mymensingh Medical College Hospital, Mymensingh, Bangladesh for a period of twelve months among fifty-nine COPD patients. Data were collected through interview, physical examination and laboratory investigations. Statistical analysis was performed using SPSS version 22.0 for consistency and completeness. Age range of the patients was 40 to 49 years with a mean of 56.3±10.9 years. Age group 40-49 years contained the highest number (19; 32.3%) of patients. The majority, 57 (96.6%), of the respondents were male. Thirty-seven (62.7%) of the patients were illiterate. The majority, 56 (94.9%), of patients resided in rural areas; of them, most, 38 (64.4%), were farmers. According to spirometric measurement among the 59 COPD patients, 3 (5.1%) were in GOLD stage I, 9 (15.3%) in GOLD stage II, 27 (45.8%) in GOLD stage III and 20 (33.9%) in GOLD stage IV. Mean platelet count (10³/μl) was 241.6±86.5 in the mild group, 315.0±47.7 in the moderate, 337.2±76.3 in the severe, and 412.4±67.5 in the very severe group of COPD patients. So the increase in platelet count with severity of COPD is statistically significant. In conclusion, platelet count measurement is a less costly way to categorize COPD and may be a diagnostic marker.
Article
This study estimates soil moisture (SM) using Sentinel-1A/B C-band SAR (synthetic aperture radar) images and a Multiple Linear Regression Model (MLRM) in the Yongdam-Dam watershed of South Korea. Both the Sentinel-1A and -1B images (6-day interval and 10 m resolution) were collected for 5 years from 2015 to 2019. The geometric, radiometric, and noise corrections were performed using the SNAP (SentiNel Application Platform) software, and the images were converted to backscattering coefficients (BSC) of VV and VH polarization. The in-situ SM data measured at 6 locations using TDR were used to validate the estimated SM results. The 5-day antecedent precipitation data were also collected to overcome the estimation difficulty for the vegetated area not reaching the ground. The MLRM modeling was performed using yearly and seasonal data sets, and correlation analysis was performed according to the number of independent variables. The estimated SM was verified against observed SM using the coefficient of determination (R²) and the root mean square error (RMSE). For SM modeling using only BSC in the grass area, R² was 0.13 and RMSE was 4.83%. When 5-day antecedent precipitation data were used, R² was 0.37 and RMSE was 4.11%. With the use of dry days and seasonal regression equations to reflect the decreasing pattern and seasonal variability of SM, the correlation increased significantly, with R² of 0.69 and RMSE of 2.88%.
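The predictors and the error metric named in the abstract above can be sketched as follows. This is only an illustration under assumptions that the abstract does not confirm: antecedent precipitation is taken as the rainfall total over the preceding `window` days excluding the current day, dry days as the running count of consecutive rain-free days including the current day, and all function names and data values are hypothetical.

```python
import math


def antecedent_precipitation(daily_pcp, window=5):
    """Rainfall total over the `window` days before each day (current day excluded)."""
    return [sum(daily_pcp[max(0, i - window):i]) for i in range(len(daily_pcp))]


def dry_days(daily_pcp):
    """Running count of consecutive days without rain, reset on any rainy day."""
    counts, run = [], 0
    for p in daily_pcp:
        run = 0 if p > 0 else run + 1
        counts.append(run)
    return counts


def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical daily precipitation in mm, for illustration only.
pcp = [0, 10, 0, 0, 5, 0, 0, 0]
ap5 = antecedent_precipitation(pcp)
dd = dry_days(pcp)
```

In a regression of the kind described, `ap5` and `dd` would sit alongside the backscattering coefficients as independent variables, and `rmse` would compare the fitted SM with the in-situ measurements.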