Article

Applied Nonparametric Regression


... As an alternative, time series analysis can be carried out with a nonparametric regression model, provided that the stationarity of the data is still satisfied. Hardle [7] offers the solution that stationary time series data can be cast as a regression problem by defining the current observation as the response variable and the observation one period earlier as the predictor variable. Fitri, et. ...
... A value of λ not equal to one indicates that the data are not stationary in variance, so the data must be Box-Cox transformed. The Box-Cox transformation is given by Equation (7). ...
... In practice, the assumption of data independence is often not met, for example in time series data where the current response of an object depends on its previous response. Hardle [7] set out the basic mathematical concepts underlying this modelling, namely: ...
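As a hedged illustration of the transformation mentioned in this excerpt (Equation (7) itself is not reproduced on this page), the sketch below uses the standard Box-Cox form, $(y^\lambda - 1)/\lambda$ for $\lambda \neq 0$ and $\ln y$ for $\lambda = 0$. The function name `box_cox` and the sample values are illustrative assumptions, not taken from the cited paper.

```python
import math

def box_cox(values, lam):
    """Standard Box-Cox transform of strictly positive values (assumed form)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

series = [1.0, 4.0, 9.0]        # made-up positive observations
print(box_cox(series, 0.5))     # square-root-type transform
print(box_cox(series, 0))       # logarithmic transform
```

With `lam = 1` the function only shifts the data by one, which matches the excerpt's point that a transformation is needed only when λ differs from one.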
Article
Classic time series techniques, such as autoregressive models, describe stationary data in which prior observations influence the current observation through a linear regression. Autoregressive models impose several assumptions on the errors, including independence and a normal distribution with zero mean and constant variance. In practice, these assumptions are often hard to verify when modelling real data. Kernel time series regression is an alternative model that does not require these error assumptions. Non-stationary time series data can also be modelled effectively with the kernel time series method: data that are not yet stationary are first made stationary, and the stationary series is then rearranged so that the current value is the response variable and the value from the previous period is the predictor variable. Kernel regression modelling is then carried out by applying a kernel weight function and determining the optimal bandwidth. The optimal bandwidth can be obtained by minimizing the MSE, CV, GCV, or UBR criterion. R² or MAPE can be used as the goodness-of-fit measure for the kernel time series regression model. Modelling MDKA stock price data by kernel regression with the Gaussian kernel function and GCV bandwidth selection yields a strong model, since R² is 0.9828372, above 0.67, and MAPE is 1.985681%, below 10%. Keywords: time series; kernel regression; GCV; MDKA stock price.
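The workflow described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a Nadaraya-Watson estimator with a Gaussian kernel, the common form GCV(h) = MSE(h) / (1 - tr(H)/n)², a made-up stationary series in place of the MDKA data, and hypothetical function names.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nw_weights(x0, xs, h):
    """Nadaraya-Watson weights of every training point for the target x0."""
    raw = [gaussian_kernel((x0 - x) / h) for x in xs]
    total = sum(raw)
    return [r / total for r in raw]

def nw_predict(x0, xs, ys, h):
    return sum(w * y for w, y in zip(nw_weights(x0, xs, h), ys))

def gcv(xs, ys, h):
    """GCV(h) = MSE(h) / (1 - trace(H)/n)^2, H being the smoother matrix."""
    n = len(xs)
    sse, trace = 0.0, 0.0
    for i in range(n):
        w = nw_weights(xs[i], xs, h)
        fit = sum(wj * yj for wj, yj in zip(w, ys))
        sse += (ys[i] - fit) ** 2
        trace += w[i]                      # diagonal entry H[i][i]
    return (sse / n) / (1.0 - trace / n) ** 2

def fit_metrics(xs, ys, h):
    """In-sample R^2 and MAPE (in percent) of the kernel fit."""
    fits = [nw_predict(x, xs, ys, h) for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fits))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    mape = 100.0 * sum(abs((y - f) / y) for y, f in zip(ys, fits)) / len(ys)
    return 1.0 - ss_res / ss_tot, mape

# Made-up stationary series standing in for real data.
series = [10.0 + math.sin(0.3 * t) + 0.1 * math.cos(1.7 * t) for t in range(60)]
xs, ys = series[:-1], series[1:]           # predictor: previous value, response: current value

bandwidths = [0.05 * k for k in range(1, 21)]
best_h = min(bandwidths, key=lambda h: gcv(xs, ys, h))
r2, mape = fit_metrics(xs, ys, best_h)
print(best_h, r2, mape)
```

A grid search is used here only for simplicity; any one-dimensional minimizer over the bandwidth would serve the same purpose.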
... Nonparametric regression is used since it does not depend on a particular model; hence it can be claimed that nonparametric regression offers high data flexibility. In addition, nonparametric regression can also be used to model the existence of nonlinearity in a model [10], [11]. The formulation of nonparametric regression is as follows: ...
... The seven (7) kernel functions are presented as follows [10]: $(1 - u^2)^3$ for $|u| \le 1$, $0$ otherwise. Former studies have used several types of kernel functions in kernel nonparametric regression. Some used the quartic kernel function, such as in modeling the water discharge in the Jangkok Watershed, Lombok Island, by autoregressive pre-whitening [6]; also in predicting the daily rainfall and simulating monthly rainfall data of Dodokan Watershed in the statistical downscaling model [8], [5]. ...
... In time series modeling, the independent variable can be determined from a series $\{Z_t, t > 1\}$ by defining the lagged value $Z_{t-1}$ as the independent variable ($X_t$) and the value $Z_t$ as the dependent variable ($Y_t$). The problem can then be treated as a smoothing regression of the following form [10]: ...
Article
Full-text available
In Central Lombok Regency, the hotel tax is one of the largest contributors to Regional Original Revenue. A hotel tax is a tax on services provided by the hotel. This research aims to estimate the nonparametric kernel regression curve on hotel tax revenue data in Central Lombok. The method used is nonparametric kernel regression analysis with the seven kernel functions. Under the Generalized Cross Validation (GCV) criterion, the optimal bandwidths obtained for the seven kernel functions differ. Although the bandwidths vary, the resulting estimates are similar, and the Mean Square Error (MSE) values of the seven kernel functions do not differ significantly.
... Specifically, we model the dependence of the RV on time and activity indicators by means of kernel regression (hereinafter KR), another non-parametric regression technique. Its foundations are given in, for example, Hardle (1992), Loader (1999), and Takeda ... We apply a locally linear model to fit the RV at a time $t_k$ by minimizing the objective function $Z$ [Eq. (4)], where $N$ is the number of RV and simultaneous activity indicator observations, $RV(t_i)$ the RV at time $t_i$, $x_i \equiv x(t_i)$ a generic indicator at time $t_i$, $\beta_0$ and $\beta_1$ the coefficients with respect to which the objective function $Z$ is minimized, and $W$ the kernel given by ...
... The optimal values of the bandwidths $h_x$ and $h_t$ are obtained by the so-called leave-one-out method (Hardle 1992), that is, by minimizing the function ...
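The two excerpts above describe a locally linear fit whose bandwidths are chosen by leave-one-out. The sketch below is a simplified, hedged illustration of that idea: it uses a single predictor and a single bandwidth `h` with a Gaussian weight (the cited work uses two bandwidths and its own kernel, which are not reproduced on this page), and the data and function names are made up.

```python
import math

def local_linear(x0, xs, ys, h, skip=None):
    """Local linear estimate at x0 with Gaussian weights; `skip` omits one index."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:                    # too few effective points: weighted mean instead
        return t0 / s0
    return (s2 * t0 - s1 * t1) / det    # intercept of the weighted least-squares line

def loo_cv(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    errors = [(ys[i] - local_linear(xs[i], xs, ys, h, skip=i)) ** 2
              for i in range(len(xs))]
    return sum(errors) / len(errors)

# Made-up smooth signal with a small deterministic perturbation.
xs = [i / 20 for i in range(21)]
ys = [math.sin(2 * math.pi * x) + 0.05 * math.cos(40 * x) for x in xs]

grid = [0.03, 0.05, 0.08, 0.12, 0.2, 0.3]
best_h = min(grid, key=lambda h: loo_cv(xs, ys, h))
print(best_h, loo_cv(xs, ys, best_h))
```

Each point is predicted from a fit that excludes it, so the score does not reward a bandwidth merely for interpolating the data.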
Preprint
Stellar activity is the ultimate source of radial-velocity (RV) noise in the search for Earth-mass planets orbiting late-type main-sequence stars. We analyse the performance of four different indicators and the chromospheric index $\log R'_{\rm HK}$ in detecting RV variations induced by stellar activity in 15 slowly rotating ($v\sin i \leq 5$ km/s), weakly active ($\log R'_{\rm HK} \leq -4.95$) solar-like stars observed with the high-resolution spectrograph HARPS-N. We consider indicators of the asymmetry of the cross-correlation function (CCF) between the stellar spectrum and the binary weighted line mask used to compute the RV, that is the bisector inverse span (BIS), $\Delta V$, and a new indicator $V_{\rm asy(mod)}$ together with the full width at half maximum (FWHM) of the CCF. We present methods to evaluate the uncertainties of the CCF indicators and apply a kernel regression (KR) between the RV, the time, and each of the indicators to study their capability of reproducing the RV variations induced by stellar activity. The considered indicators together with the KR prove to be useful to detect activity-induced RV variations in $47 \pm 18$ percent of the stars over a two-year time span when a significance (two-sided p-value) threshold of one percent is adopted. In those cases, KR reduces the standard deviation of the RV time series by a factor of approximately two. The BIS, the FWHM, and the newly introduced $V_{\rm asy(mod)}$ are the best indicators, being useful in $27 \pm 13$, $13 \pm 9$, and $13 \pm 9$ percent of the cases, respectively. The relatively limited performances of the activity indicators are related to the very low activity level and $v\sin i$ of the considered stars. For the application of our approach to sun-like stars, a spectral resolution of at least $10^5$ and highly stabilized spectrographs are recommended.
... This method is known as smoothing spline [35,36] since the solution turns out to be a cubic spline with the sampling times $t_i$, $i = 0, 1, \ldots, N$, as the breakpoints. ...
... In the case of SEECR, $R(\alpha, \theta, \phi_0|y)$ is replaced by $\Lambda(\alpha, \theta, \phi_0|y)$ [see Eq. (19)]. Its minimization over the parameters $\alpha$ and $\phi_0$ yields the fitness function, $F(\theta|y)$, defined in Eq. (36). Thus, $-2 \ln L$ in Eq. (42) is replaced by the minimum value, $F_{M,K}$, of the fitness function, ...
Preprint
A method is described for the detection and estimation of transient chirp signals that are characterized by smoothly evolving, but otherwise unmodeled, amplitude envelopes and instantaneous frequencies. Such signals are particularly relevant for gravitational wave searches, where they may arise in a wide range of astrophysical scenarios. The method uses splines with continuously adjustable breakpoints to represent the amplitude envelope and instantaneous frequency of a signal, and estimates them from noisy data using penalized least squares and model selection. Simulations based on waveforms spanning a wide morphological range show that the method performs well in a signal-to-noise ratio regime where the time-frequency signature of a signal is highly degraded, thereby extending the coverage of current unmodeled gravitational wave searches to a wider class of signals.
... Our approach is based on the theory of kernel smoothers for nonparametric function and derivative estimation [7, 8, 9, 13, 14, 15, 17, 18, 21]. Kernel smoothers are weighted averages of the measured values of a slowly evolving unknown function. ...
... In the appendix, we describe data adaptive methods where we estimate $\partial_t^p g(t)$ using a higher order kernel of order $(p, p + 2)$ and substitute $\partial_t^p g(t)$ into (2.7). More detailed treatments of kernel estimation can be found in [7, 8, 9, 13, 14, 15, 17, 18, 21]. ...
Preprint
We consider kernel estimators of the instantaneous frequency of a slowly evolving sinusoid in white noise. The expected estimation error consists of two terms. The systematic bias error grows as the kernel halfwidth increases while the random error decreases. For a non-modulated signal, $g(t)$, the kernel halfwidth which minimizes the expected error scales as $h \sim \left[ \frac{\sigma^2}{N |\partial_t^2 g|^2} \right]^{1/5}$, where $\sigma^2$ is the noise variance and $N$ is the number of measurements per unit time. We show that estimating the instantaneous frequency corresponds to estimating the first derivative of a modulated signal, $A(t)\exp(i\phi(t))$. For instantaneous frequency estimation, the halfwidth which minimizes the expected error is larger: $h_{1,3} \sim \left[ \frac{\sigma^2}{A^2 N |\partial_t^3 (e^{i\tilde{\phi}(t)})|^2} \right]^{1/7}$. Since the optimal halfwidths depend on derivatives of the unknown function, we initially estimate these derivatives prior to estimating the actual signal.
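The first halfwidth scaling quoted in this abstract follows from the usual bias-variance trade-off. As a hedged sketch, assume the squared bias of a kernel smoother grows like $h^4 |\partial_t^2 g|^2$ and the variance falls like $\sigma^2/(Nh)$; the kernel-dependent constants $c_1$ and $c_2$ and the symbol $E(h)$ for the expected squared error are assumptions of this illustration, not notation from the preprint.

```latex
\begin{align*}
E(h) &\approx c_1\, h^4 \left|\partial_t^2 g\right|^2 + c_2\, \frac{\sigma^2}{N h}, \\
\frac{dE}{dh} &= 4 c_1\, h^3 \left|\partial_t^2 g\right|^2 - c_2\, \frac{\sigma^2}{N h^2} = 0, \\
h^5 &= \frac{c_2}{4 c_1}\, \frac{\sigma^2}{N \left|\partial_t^2 g\right|^2}
\quad\Longrightarrow\quad
h \sim \left[ \frac{\sigma^2}{N \left|\partial_t^2 g\right|^2} \right]^{1/5}.
\end{align*}
```

The exponent $1/7$ for first-derivative estimation arises in the same way if the variance term is taken to scale like $\sigma^2/(N h^3)$ while the squared bias still scales like $h^4$ times the squared third derivative, so that the stationarity condition fixes $h^7$ instead of $h^5$.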
... Regression analysis is one of the most widely used statistical data analysis methods to determine the relationship pattern between the independent variable and the dependent variable. According to Hardle (1994), three approaches can be used to estimate the regression curve: parametric, semiparametric, and nonparametric approaches. In the parametric approach, the relationship between the variables is known or estimated from the shape of the regression curve; for example, it is assumed to form a linear, quadratic, exponential and polynomial pattern. ...
... Then, data patterns were identified using a scatterplot to see the relationship between the dependent variable and each independent variable. The following scatterplot was obtained: Based on Figures 3.1 to 3.5 above, it can be seen that the relationship between the dependent variable, Economic Growth Rate, and each independent variable (Human Development Index, Population Index, General Allocation Fund, Original Local Government Revenue, and Labor Force Participation Rate) does not follow a particular pattern, so the estimated model used is nonparametric regression (Hardle, 1994; Budiantara, 2009). Then, a multicollinearity test is carried out to find out the variables that will be used in the next stage. ...
Article
Full-text available
Economic growth can indicate the success of economic development in people's lives, so it is essential to study the relationship between economic growth and factors that affect economic growth. Regression analysis is one of the most widely used statistical data analysis methods to determine the relationship pattern between the independent and dependent variables. Three methods can be used to estimate the regression curve, one of which is nonparametric regression. Economic growth data is one form of longitudinal data, with observations of independent subjects, with each subject being observed repeatedly over different periods. Kernel nonparametric regression model applications can be used for longitudinal data. This research aims to estimate the curve and get the best regression model. In this research, the smoothing technique chosen to estimate the nonparametric regression model for longitudinal data is the kernel triangle estimator, which can be obtained by minimizing the square of error using Weighted Least Squares (WLS) and selecting the optimum bandwidth using the Generalized Cross Validation (GCV) method. This study uses the economic growth rate in West Nusa Tenggara as the dependent variable and the human development index, population density, general allocation funds, local revenue, and labor force participation as independent variables. The result showed that the model is less accurate because of the low value of the coefficient for determination and the high value of the mean absolute percentage error (MAPE). This can be caused by the selection of bandwidth intervals that are too small.
... independent. In practice, the assumption of data independence is often not met. For example, when observing data arranged in time order from a research object, the object's current response may be influenced by its previous response. Therefore, a model that accounts for this dependence in the data needs to be developed. Hardle (1990) [7] provides the mathematical concepts underlying the model development, namely: 1. Model (S): A stationary sequence $\{(X_i, Y_i), i = 1, 2, \ldots, n\}$ (stochastic dependence between observations is allowed) has been observed, and $\hat{m}$ is to be estimated using $m(x) = E(Y \mid X = x)$. 2. Model (Ts): A time series $\{Z_i, i \geq 1\}$ has been observed and is to be ...
... For example, when observing data arranged in time order from a research object, the object's current response may be influenced by its previous response. Therefore, a model that accounts for this dependence in the data needs to be developed. Hardle (1990) [7] provides the mathematical concepts underlying the model development, namely: 1. Model (S): A stationary sequence $\{(X_i, Y_i), i = 1, 2, \ldots, n\}$ (stochastic dependence between observations is allowed) has been observed, and $\hat{m}$ is to be estimated using $m(x) = E(Y \mid X = x)$. ...
Article
Investment is an important way to manage finances for profit. One of the most popular investments in Indonesia is buying and selling shares. Besides offering profits, shares also carry risks. Therefore, analyzing stock prices before buying and selling is key in stock investing. Investors should buy stocks at a low price and sell them at a high price. One commonly used method is parametric regression analysis, but it relies on assumptions that must be met. A more flexible alternative is local polynomial regression, which does not impose such assumptions. PT Merdeka Copper Gold Tbk, with stock code MDKA, is a company engaged in the mining and industrialization of gold, silver, and other associated minerals. The study of modeling the lowest daily price of MDKA shares using local polynomial regression showed excellent results. The coefficient of determination exceeds 67% on the in-sample data, indicating strong model performance, and the Mean Absolute Percentage Error (MAPE) on the out-of-sample data is less than 10%, indicating excellent model accuracy. Keywords: local polynomial regression; MDKA shares; time series
... where h is the bandwidth, which sets the size of the weights. Further, K is the kernel function, which determines the shape of the kernel weights; see [3]. The following three conditions are met for K: [conditions not recoverable] Actually, the KS estimator yields poor results in very uneven x-spaces [4] and [5] ...
... where h is the bandwidth, which sets the size of the weights. Further, K(·) is the kernel function, which determines the shape of the kernel weights; see [3]. The following three conditions are met for K(·): ...
Article
Full-text available
Estimation of variance is a commonly discussed topic under the simple random sampling (SRS) scheme. The current article deals with the issue of variance estimation using supplementary information with a nonparametric approach under different ranked set sampling (RSS) schemes. We propose a class of nonparametric variance estimators utilizing kernel regression [1] with different bandwidths (Plug-in and Cross-validation) under RSS schemes. A simulation study is provided using diverse data sets. The simulation results are compared among the members of the proposed class with respect to the unbiased variance estimator.
... The k-Nearest Neighbor (k-NN) method is an instance-based machine learning technique designed to handle large datasets effectively [39,42]. It discerns patterns by matching current input variables to comparable data points from past records [39,43]. ...
Article
Full-text available
This study examines how winter weather conditions influence traffic patterns for both passenger vehicles and trucks, using data collected from weigh-in-motion (WIM) stations and nearby weather monitoring sites along Alberta’s Highways 2 and 2A. To explore how snowfall and temperature affect traffic volumes, we developed Ordinary Least Squares Regression (OLSR) models. The findings indicate that passenger car volumes drop more sharply than truck volumes under increased snowfall, with the decline being particularly notable on Highway 2, a rural stretch. In contrast, Highway 2A showed an uptick in truck traffic, likely due to detours from adjacent routes with less winter maintenance. For estimating missing traffic data during severe weather, we employed both OLSR and a machine learning technique, k-Nearest Neighbor (k-NN). In comparing the two approaches, OLSR demonstrated superior accuracy and consistency, making it more effective for filling in missing traffic data throughout the winter season. The performance of the OLSR model underscores its reliability in addressing data gaps during adverse winter conditions. Additionally, this study contributes to sustainable transportation by improving data accuracy, which aids in better resource allocation and enhances road safety during adverse weather. The findings support more efficient traffic management and maintenance strategies, including optimizing winter road maintenance and improving sustainable infrastructure planning, thereby aligning with the goals of sustainable infrastructure development.
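The k-NN idea used in this abstract for filling data gaps can be sketched as below. This is a minimal illustration, assuming Euclidean distance on unscaled features and a plain average of the k nearest records; the feature choice, the numbers, and the name `knn_estimate` are made up and do not come from the study.

```python
def knn_estimate(query, records, k=3):
    """Average the targets of the k records whose features lie closest to `query`."""
    def distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    nearest = sorted(records, key=lambda rec: distance(rec[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical history: (snowfall in cm, temperature in deg C) -> daily traffic volume.
history = [
    ((0.0, -5.0), 1000.0),
    ((1.0, -4.0), 980.0),
    ((2.0, -8.0), 900.0),
    ((10.0, -15.0), 600.0),
    ((12.0, -20.0), 550.0),
]

# Estimate the missing volume for a day with 11 cm of snow at -18 deg C.
print(knn_estimate((11.0, -18.0), history, k=2))
```

In practice the features would usually be rescaled first so that no single variable dominates the distance.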
... The model has two free parameters, $a$ and $b$. The first one corresponds to $\sigma^2$, and the second one corresponds to $|B g''(t)|$ where the $h_j$ are equispaced in $h$ with a cutoff value of $h_j$ chosen such that $CR(h_{\rm cutoff}) \sim 2\,CR(h_{\rm Rice})$. We then select the halfwidth to minimize $aV(h) + bh^4$. ...
Preprint
We determine the expected error by smoothing the data locally. Then we optimize the shape of the kernel smoother to minimize the error. Because the optimal estimator depends on the unknown function, our scheme automatically adjusts to the unknown function. By self-consistently adjusting the kernel smoother, the total estimator adapts to the data. Goodness of fit estimators select a kernel halfwidth by minimizing a function of the halfwidth which is based on the average square residual fit error: ASR(h). A penalty term is included to adjust for using the same data to estimate the function and to evaluate the mean square error. Goodness of fit estimators are relatively simple to implement, but the minimum (of the goodness of fit functional) tends to be sensitive to small perturbations. To remedy this sensitivity problem, we fit the mean square error to a two parameter model prior to determining the optimal halfwidth. Plug-in derivative estimators estimate the second derivative of the unknown function in an initial step, and then substitute this estimate into the asymptotic formula.
... It is well known that when the covariate $X$ is finite dimensional, say $X \in \mathbb{R}^q$, and $X$ has a continuous positive density at $x$, one needs to have a sequence of bandwidths $\{h_n\}$ such that $h_n \to 0$ and $n h_n^q \to \infty$ as $n \to \infty$ to ensure the consistency of the kernel regression estimate $\Theta_n(x)$ (see, e.g., chapter 3 in Hardle (1990)). To deal with covariates, which are not necessar- ...
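As a short worked example of the two bandwidth conditions in this excerpt, consider the choice $h_n = n^{-1/(q+4)}$. This particular rate is used here only as an illustrative assumption; the excerpt does not prescribe it.

```latex
\begin{align*}
h_n &= n^{-1/(q+4)} \;\longrightarrow\; 0
  && \text{as } n \to \infty, \\
n h_n^q &= n \cdot n^{-q/(q+4)} = n^{4/(q+4)} \;\longrightarrow\; \infty
  && \text{as } n \to \infty .
\end{align*}
```

Both conditions hold for every fixed dimension $q \geq 1$, but the growth of $n h_n^q$ slows as $q$ increases, which is one way to see why more data are needed in higher dimensions.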
Preprint
We consider a nonparametric regression setup, where the covariate is a random element in a complete separable metric space, and the parameter of interest associated with the conditional distribution of the response lies in a separable Banach space. We derive the optimum convergence rate for the kernel estimate of the parameter in this setup. The small ball probability in the covariate space plays a critical role in determining the asymptotic variance of kernel estimates. Unlike the case of finite dimensional covariates, we show that the asymptotic orders of the bias and the variance of the estimate achieving the optimum convergence rate may be different for infinite dimensional covariates. Also, the bandwidth, which balances the bias and the variance, may lead to an estimate with suboptimal mean square error for infinite dimensional covariates. We describe a data-driven adaptive choice of the bandwidth, and derive the asymptotic behavior of the adaptive estimate.
... Smoothing methods are used to estimate the regression function non-parametrically (Hardle, 1990). A regression smoother is a tool for summarizing the trend of the response Y as a function of one or more predictor measurements X. ...
Article
Trend and growth rate analyses are extensively employed in the agricultural sector as they have significant policy implications. The present study was undertaken to design a methodology to fit trends in the three phases of different Gram crops grown in Tamil Nadu state using nonparametric regression. Relative growth rates were calibrated based on the non-parametric regression model. On average, the percentage growth rates obtained for the years 1950-1951 to 2009-2010 for the three phases of the different gram crops showed that production increased at a rate of 6.0 per cent per year, attributed to the combined effect of area and productivity, at rates of 2.89 and 3.21 per cent per year.
... The smoothing parameter (bandwidth) is very important in estimating a function at the value of an observation, and it varies according to how far or near the data are. The kernel function is usually chosen to be non-negative, symmetric about zero, continuous, and to have a second derivative [4,9]. ...
Article
Full-text available
The regression method is used to measure the relationship between two variables in the form of a function, relating a dependent variable to one or more explanatory variables. This research presents the semiparametric partial linear regression model, which sits between the parametric regression model and the nonparametric regression model and has found wide acceptance in many studies. Advanced estimation methods were used to estimate the semiparametric partial linear regression model with missing data in the parametric part, namely the MCBEM model calibration method in addition to the MCB model calibration method proposed by the researcher Qi-Hua Wang.
... Moreover, nonparametric regression offers high flexibility, as the regression curve can be adapted to the local nature of the data [1]. However, a nonparametric regression curve cannot be determined arbitrarily without information from the data, such as examining data patterns based on a scatterplot [2][3]. In nonparametric regression, there are several estimator approaches for the regression curve, such as Spline function [4][5][6], Kernel function [7][8][9], Fourier series function [10][11][12][13], and many others. ...
Article
Full-text available
Nonparametric regression is an approximation method in regression analysis that is not constrained by the assumption of knowing the regression curve. One of the functions to approximate the curve is a Fourier series function. The nonparametric regression model with approximation of a Fourier series function has been widely discussed by several researchers. However, discussions on statistical inference, particularly partial hypothesis testing, have not been carried out previously. Therefore, the purpose of this research is to discuss the statistical inference on the nonparametric regression model with approximation of a Fourier series function. The discussion includes parameter and model estimation, and simultaneous and partial hypothesis testing. In the application, we use life expectancy data from East Java Province during 2022. Based on data analysis, we obtain a model estimation with an R-square value of 96.24 %. At a 5 % significance level, the parameters simultaneously have a significant influence on the model. Partially, four parameters are not significant. However, overall, the predictor variables significantly influence the life expectancy data.
• The Fourier series function used is a Fourier series function introduced by Bilodeau (1992).
• The model estimation is obtained by selecting the optimal number of oscillation parameters.
• The statistical test is obtained using the LRT method.
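A Fourier series regression fit of the kind summarized above can be sketched as an ordinary least-squares problem. The sketch assumes the commonly used form $m(x) = b x + a_0/2 + \sum_{k=1}^{K} a_k \cos(kx)$ with a fixed number of oscillation terms `K`; that form, the synthetic data, and all function names are assumptions of this illustration, and the selection of `K` and the LRT-based tests from the paper are not shown.

```python
import math

def design_row(x, K):
    """Basis [x, 1/2, cos(x), ..., cos(Kx)]: linear trend plus cosine terms (assumed form)."""
    return [x, 0.5] + [math.cos(k * x) for k in range(1, K + 1)]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][j] * z[j] for j in range(r + 1, n))) / M[r][r]
    return z

def fit_fourier(xs, ys, K):
    """Least-squares coefficients [b, a0, a1, ..., aK] from the normal equations."""
    X = [design_row(x, K) for x in xs]
    p = K + 2
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(p)]
    return solve(XtX, Xty)

def predict(x, coef):
    return sum(c * b for c, b in zip(coef, design_row(x, len(coef) - 2)))

# Synthetic data generated from known coefficients b = 0.5, a0 = 2, a1 = 2, a2 = -1.
xs = [0.1 * i for i in range(1, 61)]
ys = [0.5 * x + 1.0 + 2.0 * math.cos(x) - math.cos(2 * x) for x in xs]
coef = fit_fourier(xs, ys, K=2)
print(coef)
```

Because the synthetic data contain no noise, the fit should recover the generating coefficients up to rounding; with real data, `K` would be chosen by a criterion such as GCV.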
... In this case the purpose is to test, consecutively, hypotheses about whether it is appropriate to include one variable (one Support point) in the linear multidimensional regression model. This makes it possible to use a single criterion, the partial F-criterion (Norman R. Draper, Harry Smith, 1998; Hardle W., 1990). ...
Article
Full-text available
In the absence of a priori information about the arrangement of the Support points in the Integral Aggregate of nonlinear regression analysis, statistically non-significant Reference points need to be eliminated. It is shown that the task reduces to a well-known problem in multidimensional linear regression analysis: selecting the best regression equation. The efficiency of the developed method is shown on the example of seven-point Integrated Aggregates approximating a parabolic configuration of observed points.
... An increase in the efficiency of traffic filtration is achieved by periodically ranging the filtration rules in descending order of their weights, obtained in accordance with the estimates of the parameters of the filtered information flows. A particularity of the developed approach is the use of the nonparametric method of local approximation (MLA) [4,5] to evaluate the parameters of filtered information flows. In the ranging process for a rule set, the current characteristics and dynamics of changes in the parameters of information flows are considered. ...
Article
Full-text available
The given article is a continuation of a number of works devoted to the development of models and methods for ranging the filtration rules to prevent a decrease in the firewall performance caused by the use of a sequential scheme for checking packet compliance with the rules, as well as by the heterogeneity and variability of network traffic. The article includes a description of a firewall mathematical model given in the form of a complex system and a queuing system with a phase-type discipline for request servicing, which formalizes the network traffic filtering process with the functionality of ranging the rules. The purpose of modeling is to obtain estimates for major firewall performance metrics for various network traffic behavior scenarios, as well as to evaluate an increase in the firewall performance due to ranging a filtration rule set. Calculation of estimates for the firewall (FW) performance metrics was made using the analytical method for a Poisson request flow. Based on the analysis of the modeling results, conclusions were drawn on the effectiveness of ranging the filtration rules in order to improve the firewall performance for traffic scenarios that are close to real ones.
... If the regression curve is a parametric model, according to [12], the parametric estimation will be efficient; otherwise, it will give a misleading interpretation. Therefore, if the shape of the curve is unknown and there is no information about it, a non-parametric approach will be more appropriate. ...
Article
Full-text available
The spline regression model is a nonparametric model and it is applied to data that do not have a certain curve shape and do not have information about it. In this study, the results of the analysis of the B-Spline regression model and the Spline Truncated model were compared on temperature data at several stations on Java Island to obtain the best model that can be used to forecast the temperature for the next few days. Daily temperature data were obtained from BMKG at Semarang, Juanda, Serang, Sleman, Bandung, and Kemayoran stations. The temperature data were modeled with the B-Spline and Spline Truncated regression using the optimal knot point of the GCV, and the best model was obtained. The analysis shows that the B-Spline regression models are better than the truncated Spline models with a fairly small MSE value and a greater coefficient of determination than the truncated Spline model.
... Whether the regression function is treated as known or unknown can be judged from a scatterplot [2]. In this research, we assume it is an unknown function. ...
Article
Full-text available
The nonparametric regression model with the Fourier series approach was first introduced by Bilodeau in 1994. In later years, several researchers developed nonparametric regression models with the Fourier series approach. However, this research is limited to parameter estimation, and there is no research on parameter hypothesis testing. Parameter hypothesis testing is a statistical method used to test the significance of the parameters. In the nonparametric regression model with the Fourier series approach, parameter hypothesis testing is used to determine whether the estimated parameters have a significant influence on the model. Therefore, the purpose of this research is to develop parameter hypothesis testing for the nonparametric regression model with the Fourier series approach. The method that we use for hypothesis testing is the LRT method, which compares the likelihood functions under the parameter space of the null hypothesis and the hypothesis. By using the LRT method, we obtain the form of the statistical test and its distribution as well as the rejection region of the null hypothesis. To apply the method, we use ROA data from 47 publicly listed banks on the Indonesia Stock Exchange in 2020. The highlights of this research are:
• The Fourier series function is assumed as a non-smooth function.
• The form of the statistical test is obtained using the LRT method and is distributed as F distribution.
• The estimated parameters on modelling ROA data have a significant influence on the model.
... These techniques impose only a few assumptions about the shape of the function and are therefore more flexible than the usual parametric regression approaches. Smoothing techniques are commonly used to estimate the function non-parametrically (Hardle, 1990). ...
Article
Full-text available
Castor is an important non-edible oilseed, basically a cash crop, cultivated around the world owing to the commercial importance of its oil. Gujarat is the largest Castor producing state, followed by Andhra Pradesh and Rajasthan. The present study is an attempt to find past trends of Castor seed in India using parametric, nonparametric and semi-parametric regression methods. The performance of each method is compared using higher values of R² and lower values of residual criteria. It is found that the parametric regression gives a good fit for trends in Castor seed in comparison with the nonparametric and semi-parametric regressions. Between the nonparametric and semi-parametric methods, semi-parametric spline regression was selected as the best fitted model for trend analysis. The study advocates technological breakthroughs by researchers in Castor seed production in India.
... parametric models are more flexible in data analysis when compared to models because the data are not dependent on a particular distribution. Hardle (1994) explained that the aim of the nonlinear regression analysis is to analyze the undetermined regression function by reducing error observations that allow for the interpretation of the average response of y / x and to make a curve approximation called the smooth. ...
... The smoothing parameter (bandwidth) is very important in estimating a function of the observed value, and it changes according to how far apart or how close the data are. The kernel function is usually chosen to be non-negative, symmetric about zero, continuous, and to have a second derivative [4,9]. ...
Article
Regression is used to measure the relationship between two variables in the form of a function that relates a dependent variable to one or more explanatory variables. This paper presents the semiparametric partial linear regression model, which combines the parametric and nonparametric regression models and has gained wide acceptance in many studies. Improved estimation methods were used to estimate the semiparametric partial linear regression model when some observations of an explanatory variable in the parametric part are missing, namely the model calibration method (MCM) and the proposed MCBEM method. A simulation was carried out using three sample sizes (n = 60, 90, 120) and three assumed values of the variance, σ² = (1.5, 1, 0.5). The proposed MCBEM method gave more accurate results and performed well in estimation.
... Several types of kernel function can be used, including the Uniform, Triangle, Epanechnikov, Quadratic, Triweight, Cosine, and Gaussian kernels (Hardle, 1990). In this study, the kernel function used is the Gaussian kernel, with the following formula: $K(x) = \frac{1}{\sqrt{2\pi}} \exp\left(\frac{1}{2}(-x^2)\right)$, $-\infty < x < \infty$. The accompanying notation defines the degree of smoothness of the kernel function, the order of the kernel function (indexed from 0), and an indicator. ...
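The Gaussian kernel quoted in the snippet above can be written down directly. The following is a minimal sketch using only the Python standard library; the function names and the integration limits are illustrative choices, not taken from the cited article.

```python
import math


def gaussian_kernel(x: float) -> float:
    """Standard Gaussian kernel K(x) = exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate_kernel(lo: float = -8.0, hi: float = 8.0, steps: int = 16000) -> float:
    """Trapezoid-rule check that the kernel integrates to about 1."""
    h = (hi - lo) / steps
    total = 0.5 * (gaussian_kernel(lo) + gaussian_kernel(hi))
    for i in range(1, steps):
        total += gaussian_kernel(lo + i * h)
    return total * h
```

The kernel is symmetric about zero and non-negative, so it can serve as a weight function in kernel regression once its argument is scaled by a bandwidth.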
Article
Full-text available
In an effort to improve the welfare of the population on the basis of public satisfaction, BPS has measured the Happiness Index since 2012. The Happiness Index of the Indonesian people in 2021 rose from the previous measurement in 2017, even though the Covid-19 pandemic was still ongoing and was causing health complaints among the public. This study examines how the percentage of health complaints affects the Happiness Index in Indonesia in 2021 using the Local Polynomial nonparametric regression model. The results show a fluctuating pattern that tends to decrease, which means that an increase in health complaints among the public tends to cause the health index to decrease.
... As mentioned previously, the parameters were distorted because of contradictory observations. Although outliers were excluded from this study, parametric methods did not produce an appropriate analysis, but nonparametric regression provided a solution [56]. This regression method, which is the opposite of the parametric approach, tried to analyze the relationships between dependent and independent variables without considering any functional form of the model [57]. ...
Article
Full-text available
Marinas are essential for tourism as a customized service, which, in turn, necessitates active customer cooperation. This study investigates the participation behavior of customers in marina service delivery and aims to determine the facilitating factors and consequences of customer participation (CP). A questionnaire survey was performed to evaluate the perception of marina users (i.e., boat owners or captains) who received service from full-service private marinas. The collected data were analyzed using the generalized linear model. The empirical results showed that customer self-efficacy and customer affective trust are significant facilitating factors, and actionable participation is the most essential dimension of CP substantially impacting customer cocreated value. Moreover, “experience at sea” and “marina region” are the factors with high control effects on the relationships between CP, self-efficacy, trust, and cocreated value.
... A knot point is a joint fusion point at which the behavior pattern of a function changes across different intervals [7]. To obtain the best spline regression model, the optimal knot point that best fits the data is sought. ...
... This advantage arises because a spline contains knot points. According to [5], a knot point is a joint fusion point at which the behavior pattern of a function changes across different intervals. A basis of the spline space of order p is given as follows. ...
... The truncated spline nonparametric regression method makes use of knot points. A knot point is a point at which the behavior pattern of a function changes across different intervals [6]. A multivariable truncated spline function of a given order with knot points $K_1, K_2, \ldots$ ...
... One advantage of the spline approach is that the model tends to find its own estimate of the data, following wherever the data pattern moves. This advantage arises because a spline contains knot points [7]. Each function f in the multivariable truncated spline nonparametric regression of order p with knot points $K_1, K_2, \ldots$ can be written as; ...
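To make the role of knot points concrete, here is a minimal sketch of a univariate truncated power basis, the usual building block of a truncated spline of a given degree. The formulas of the cited articles are cut off in the snippets above, so the function names, the coefficient ordering, and the example values are assumptions for illustration only.

```python
def truncated_power(x: float, knot: float, degree: int) -> float:
    """Truncated power term (x - knot)_+^degree: zero at or left of the knot."""
    d = x - knot
    return d ** degree if d > 0 else 0.0


def truncated_spline_row(x: float, knots: list, degree: int) -> list:
    """Basis row [1, x, ..., x^degree, (x - k_1)_+^degree, ..., (x - k_r)_+^degree]."""
    row = [x ** j for j in range(degree + 1)]
    row.extend(truncated_power(x, k, degree) for k in knots)
    return row


def evaluate_spline(x: float, coefs: list, knots: list, degree: int) -> float:
    """Evaluate the spline as the inner product of coefficients and basis row."""
    return sum(c * b for c, b in zip(coefs, truncated_spline_row(x, knots, degree)))
```

With hypothetical coefficients `[1, 2, -2]`, one knot at 2.0, and degree 1, the curve rises with slope 2 up to the knot and is flat afterwards, which is the "change in behavior on different intervals" that the snippets describe.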
... This approach has high flexibility, where data adjust the shape of the regression curve estimation without being influenced by the subjectivity of the researcher [2]. In recent years, researchers used Spline Truncated [3][4][5], Kernel [6][7][8], and Fourier Series [9,10] as estimators of nonparametric regression modeling. Spline Truncated can handle data pattern that changes at certain sub-intervals. ...
Conference Paper
Nonparametric regression is one of the approaches in regression analysis for determining the pattern of the relationship between a predictor variable and a response variable. This approach can be used when the data pattern is unknown. Recently, researchers have assumed that every predictor variable in nonparametric regression has the same data pattern by using one form of estimator for all predictor variables. However, in many cases the relationship between each predictor variable and the response variable follows a different pattern: some change partially in certain sub-intervals, some have no set pattern, and others have a repeating pattern. If each predictor variable is estimated with only one form of estimator, the estimate will be biased. A mixed estimator that matches the data patterns is therefore required to obtain a better nonparametric regression estimate. This research develops a mixed Spline Truncated, Kernel, and Fourier Series estimator for nonparametric regression. It was applied to longitudinal data that were measured repeatedly for each subject at different time intervals. A real case was presented to model poverty in 34 provinces in Indonesia from 2015 to 2020. The Weighted Least Squares (WLS) approach was used as the estimation method. Based on the results of the analysis, the best nonparametric regression model was the model with 1 knot and 1 oscillation, which had the smallest GCV value of 0.25.
Article
Regression analysis is receiving growing and evident attention in most studies, especially economic and medical ones. The nonparametric regression model in general, and multivariate nonparametric regression in particular, is one of the most important and prominent regression models used in recent years, which have seen great expansion, especially in economic and environmental applications. The multivariate Nadaraya-Watson estimator is one of the most important estimators used in the multivariate nonparametric regression model. In estimating the model, this estimator depends on a matrix of parameters called smoothing parameters, whose estimation is of great importance for the goodness of fit of the estimated curve. This paper proposes using a nature-inspired algorithm, the weed algorithm, to estimate the bandwidth matrix in the multivariate Nadaraya-Watson estimator. Monte Carlo simulation was used to generate data following a number of multivariate nonparametric regression models. The simulation results showed the superiority of the proposed method over the other estimation methods, with mean squared error as the comparison criterion.
Article
Full-text available
An investigation was carried out to trace the structural changes in the area, production, and productivity of the sugarcane crop grown in the Anand region of Gujarat State, India, from 1949-50 to 2008-09 using nonparametric regression models. Nonparametric regression with jump points provided a good description of the data under consideration and provided statistical evidence that structural changes took place in the trends in area, production, and productivity.
Preprint
Complex demodulation of evolutionary spectra is formulated as a two-dimensional kernel smoother in the time-frequency domain. In the first stage, a tapered Fourier transform, $y_{\nu}(f,t)$, is calculated. Second, the log-spectral estimate, $\hat{\theta}_{\nu}(f,t) \equiv \ln(|y_{\nu}(f,t)|^2)$, is smoothed. As the characteristic widths of the kernel smoother increase, the bias from temporal and frequency averaging increases while the variance decreases. The demodulation parameters, such as the order, length, and bandwidth of the spectral taper and the kernel smoother, are determined by minimizing the expected error. For well-resolved evolutionary spectra, the optimal taper length is a small fraction of the optimal kernel half-width. The optimal frequency bandwidth, $w$, for the spectral window scales as $w^2 \approx \lambda_F/\tau$, where $\tau$ is the characteristic time and $\lambda_F$ is the characteristic frequency scale-length. In contrast, the optimal half-widths for the second-stage kernel smoother scale as $h \approx 1/(\tau \lambda_F)^{1/(p+2)}$, where $p$ is the order of the kernel smoother. The ratio of the optimal frequency half-width to the optimal time half-width satisfies $h_F/h_T \sim |\partial_t^p \theta| / |\partial_f^p \theta|$. Since the expected loss depends on the unknown evolutionary spectra, we initially estimate $|\partial_t^p \theta|^2$ and $|\partial_f^p \theta|^2$ using higher-order kernel smoothers, and then substitute the estimated derivatives into the expected loss criteria.
Article
Fertility is a live birth, namely the delivery of a baby from a woman's womb with signs of life such as crying, breathing, a beating heart, and so on. The data for this research come from publications on the official website of the Central Statistics Agency (BPS). This study aims to model and predict fertility data in 2020 with kernel nonparametric regression using the Nadaraya-Watson estimator. The nonparametric kernel model shows the relationship between fertility (Y) and the percentage of underage women at first marriage, the percentage of women aged 15-49 who do not use traditional or conventional family planning (KB) methods, the number of active family planning participants, the number of couples of childbearing age, the percentage of the average length of schooling, and the total expenditure per capita, based on the Gaussian kernel function and bandwidth values. Based on the results of the analysis, several of the independent variables have a significant effect on the dependent variable, with an optimum bandwidth value of 0.490, an R² value of 99.6%, and an MSE value of 0.332. Modeling fertility is important because it helps to understand and predict population trends. It provides insight into the potential number of births in a population in the future. This information can be used for policy planning, including health, education, and social policies.
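For readers unfamiliar with the Nadaraya-Watson estimator named in the abstract above, the following is a minimal one-predictor sketch with a Gaussian kernel. The study itself uses several predictors and its own data; the function names and the toy values here are assumptions for illustration.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0: float, xs: list, ys: list, h: float) -> float:
    """Kernel-weighted average of ys, with weights K((x0 - x_i) / h)."""
    if h <= 0.0:
        raise ValueError("bandwidth h must be positive")
    weights = [gaussian_kernel((x0 - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase h")
    return sum(w * yi for w, yi in zip(weights, ys)) / total
```

A small bandwidth `h` makes the estimate follow nearby observations closely, while a large one pulls it toward the overall mean of `ys`, which is why the choice of bandwidth matters for the fit.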
Article
Full-text available
This study aims to investigate the relationship between the Human Development Index (HDI) and poverty levels in North Sumatra. Using data from Central Bureau of Statistics (BPS), the study employs a nonparametric truncated spline regression model to analyze the relationship. The findings reveal that HDI significantly impacts poverty levels, with higher HDI associated with lower poverty rates. The model used in this study offers a robust approach to understanding the dynamics between HDI and poverty, and the results underscore the importance of improving HDI to reduce poverty. The research highlights an R-Squared value of 82.35%, indicating a strong correlation between HDI and poverty in the region.
Article
Vertical movements measured by broadband seismic stations located around the magnitude 9 Tohoku earthquake of 11 March 2011 are calculated. During the 15 years before the earthquake, the MAJO station, the closest to the epicenter at 386 km, showed quiet daily variations. Seismic pulses lasting a few minutes, with amplitudes greater than 10% of the diurnal variations of tidal velocities, were detected in 2009. They appeared under quiet meteorological and geomagnetic conditions. These pulses are not found in the records of stations more than 700 km from the epicenter. It is hypothesized that sharp changes in the low-frequency seismic noise reflect tectonic deformations in the lithosphere of Japan and the adjoining part of the Pacific Ocean.
Research
Full-text available
A detailed study of the local polynomial estimation method for the regression function and its properties
Article
Full-text available
This research aims to determine how regional economies can be improved to achieve a better Indonesian economy and to accelerate the path to a Golden Indonesia in 2045 so that it can be realized in a shorter time. This goal is pursued with statistical analysis, specifically semiparametric truncated spline analysis of indirect and total effects. The use of this method makes the approach original and offers novel insights into the dynamics of regional economic development in Indonesia. The method serves as a tool for analyzing regional economic dynamics, identifying critical factors for improvement, informing policy decisions aimed at realizing Indonesia's economic aspirations, and providing more flexible results. The study used regional expenditure as the exogenous variable, labor absorption as the mediating endogenous variable, and regional economic growth as the pure endogenous variable. The data are for 2020 and were published by the National/Provincial Central Bureau of Statistics in the Indonesian Statistics Book, in BPS provincial publications, and in Provincial Government Financial Statistics, and by the Directorate General of Financial Balance, Sumreg Bappenas, and the ministries, institutions, or agencies that provide data on the variables of this research. The results show that the relationship between the regional expenditure variable and the labor absorption variable has a significant effect on the regional economic growth variable.
Article
Graph-based semi-supervised learning plays an important role in large-scale image classification tasks. However, the problem becomes very challenging in the presence of noisy labels and outliers. Moreover, traditional robust semi-supervised learning solutions suffer from prohibitive computational burdens and thus cannot be computed for streaming data. Motivated by that, we present a novel unified framework for robust structure-aware semi-supervised learning, called Unified RSSL (URSSL), for batch processing and recursive processing that is robust to both outliers and noisy labels. In particular, URSSL iteratively applies joint semi-supervised dimensionality reduction with robust estimators and network sparse regularization on the graph Laplacian matrix to preserve the intrinsic graph structure and ensure robustness to the compound noise. First, to reduce the influence of outliers, a novel semi-supervised robust dimensionality reduction is applied that relies on robust estimators to suppress outliers. Meanwhile, to tackle noisy labels, the denoised graph similarity information is encoded into the network regularization. Moreover, by identifying the strong relevance of dimensionality reduction and network regularization in the context of robust semi-supervised learning (RSSL), a two-step alternating optimization is derived to compute optimal solutions with guaranteed convergence. We further adapt our framework to large-scale semi-supervised learning, making it particularly suitable for large-scale image classification, and demonstrate the model's robustness under different adversarial attacks. For recursive processing, we rely on reparameterization to transform the formulation and address the challenging problem of robust streaming-based semi-supervised learning.
Last but not least, we extend our solution into distributed solutions to resolve the challenging issue of distributed robust semi-supervised learning when images are captured by multiple cameras at different locations. Extensive experimental results demonstrate the promising performance of this framework when applied to multiple benchmark datasets with respect to state-of-the-art approaches for important applications in the areas of image classification and spam data analysis.
Article
In this paper, the nonparametric regression model was estimated using two common estimation methods: the Local Linear Regression Estimator (LLRE) and the K-Nearest Neighbor Estimator (KNNE). The simulation was conducted for these methods in the statistical programming language R, and they were compared using the Average Root Mean Squares Error (ARMSE) criterion, with three sample sizes (50, 100, 150), three levels of error variance (0.3, 0.7, 1), and two non-linear models. The results for the first model showed the superiority of the KNNE method in most cases, while for the second model the LLRE was superior at all sample sizes and for all levels of variance. On the applied side, the effect of hemoglobin (Hb) on the packed cell volume (PCV) of 150 patients with chronic kidney disease was studied. Estimation was carried out with both methods, and they were compared using the root mean square error (RMSE) criterion. The applied results favored the KNNE method.
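The two estimators compared in the abstract above can be sketched in a few lines. The paper's simulations were run in R and its settings are not reproduced here; the Python functions below, the Gaussian kernel weighting in the local linear fit, and the toy data are assumptions made only to illustrate the two ideas.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard Gaussian kernel, used here as the local weight function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_linear(x0: float, xs: list, ys: list, h: float) -> float:
    """Local linear estimate: intercept of a kernel-weighted line centred at x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(xs, ys):
        d = xi - x0
        w = gaussian_kernel(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    denom = s0 * s2 - s1 * s1
    if denom <= 0.0:
        raise ValueError("need at least two distinct x values with nonzero weight")
    # Solve the 2x2 weighted normal equations for the intercept.
    return (s2 * t0 - s1 * t1) / denom


def knn_estimate(x0: float, xs: list, ys: list, k: int) -> float:
    """k-nearest-neighbour estimate: mean response of the k closest x values."""
    if not 1 <= k <= len(xs):
        raise ValueError("k must be between 1 and the sample size")
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k
```

The local linear fit reproduces an exactly linear relationship without bias, whereas the nearest-neighbour average adapts its effective window to the local density of the data, which is one reason the two can rank differently across models.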
Article
Full-text available
Economic growth, as measured by the Gross Regional Domestic Product (GRDP), is one of the benchmarks for the success of development or of increasing welfare in a region's economic sector. GRDP is time series data that often fluctuates, so nonparametric regression is an appropriate method. This study aims to determine the most influential factors for economic growth in North Sumatra in 2019-2021 using the MARS model, with secondary data published by BPS for 2019-2021. The MARS model is obtained by finding the combination of BF, MI, and MO values that has the minimum Generalized Cross Validation (GCV) value. The results indicate that the best MARS model is the combination BF=28, MI=1, and MO=1, with a GCV value of 8.42E+06. Five of the seven variables have a significant effect on economic growth in North Sumatra, namely population (X_7) with an importance level of 100%, domestic investment (X_5) at 76.86%, local revenue (X_1) at 31.14%, special allocation funds (X_3) at 28.89%, and general allocation funds (X_2) at 23.14%.
Article
Regression analysis receives wide and evident attention in most studies, especially economic and medical ones. The nonparametric regression model in general, and multivariate nonparametric regression in particular, is one of the most important regression models used in recent years, which have seen great expansion, especially in economic and environmental applications. The Local Polynomial Kernel Smoother is one of the most important estimators used in the multivariate nonparametric regression model. In estimating the model, this estimator depends on a matrix of parameters called smoothing parameters, whose estimation is of great importance for the goodness of fit of the estimated curve. This paper proposes using a nature-inspired algorithm, the weed algorithm, to estimate the matrix of smoothing parameters in the local polynomial estimator. Monte Carlo simulation was used to generate data following a number of multivariate nonparametric regression models. With mean squared error as the comparison criterion, the simulation results showed the superiority of the proposed method over the other estimation methods.
Article
The paper examines the question of non-anonymous Growth Incidence Curves (na-GIC) from a Bayesian inferential point of view. Building on the notion of conditional quantiles of Barnett (1976. “The Ordering of Multivariate Data.” Journal of the Royal Statistical Society: Series A 139: 318–55), we show that removing the anonymity axiom leads to a complex and shaky curve that has to be smoothed, using a non-parametric approach. We opted for a Bayesian approach using Bernstein polynomials which provides confidence intervals, tests and a simple way to compare two na-GICs. The methodology is applied to examine wage dynamics in a US university with a particular attention devoted to unbundling and anti-discrimination policies. Our findings are the detection of wage scale compression for higher quantiles for all academics and an apparent pro-female wage increase compared to males. But this pro-female policy works only for academics and not for the para-academics categories created by the unbundling policy.
Article
In this paper, the reliability function of several systems (the k-out-of-n system, the series system, and the parallel system) was estimated using classical nonparametric estimators by three different methods: the kernel estimator, the Kaplan-Meier estimator, and the product-limit estimator. These were compared with nonparametric estimators of the reliability function obtained with the Bayesian method proposed by Ferguson in 1973, known as Dirichlet prior processes. To determine which methods are preferable for estimating the system reliability function, simulation was used for the comparison with different sample sizes (14, 30, 60, 100), using the integrated mean squared error (IMSE) as the comparison criterion. The results showed that the classical methods were preferable for the k-out-of-n system and the parallel system, while for the series system the Bayesian method was preferable for small sample sizes (14, 30) and the classical methods for large sample sizes (60, 100).
Book
Spline smoothing is a technique used to filter out noise in time series observations when predicting nonparametric regression models. Its performance depends on the choice of the smoothing parameter lambda. Most existing smoothing methods applied to time series data tend to overfit in the presence of autocorrelated errors. The aim of this study is to propose a smoothing method that is the arithmetic weighted value of the Generalized Cross-Validation (GCV) and Unbiased Risk (UBR) methods. The objectives of the study were to (i) determine the best-fit smoothing method for time series observations; (ii) identify the best smoothing method that does not overfit time series data when autocorrelation is present in the error term; (iii) establish the optimum value of the proposed smoothing method; (iv) compare the GCV, GML, and UBR smoothing methods with the proposed smoothing method in terms of sample size; and (v) test the results of the simulation using real-life data. A hybrid smoothing method was developed by adding the weighted values of GCV and UBR. The Proposed Smoothing Method (PSM) was compared with the Generalized Maximum Likelihood (GML), GCV, and UBR smoothing methods. A Monte Carlo experiment of 1,000 trials was carried out at three sample sizes (20, 60, and 100), three levels of autocorrelation (0.2, 0.5, and 0.8), and four degrees of smoothing (1, 2, 3, and 4). Real-life data on Standard International Trade Classification (SITC) export and import price indices in Nigeria between 1970 and 2018, extracted from the CBN 2019 edition, were also used. The performances of the four smoothing methods were estimated and compared using the Predictive Mean Squared Error (PMSE) criterion.
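Since GCV recurs throughout these abstracts as a criterion for choosing smoothing parameters, a minimal sketch of the standard GCV score for a linear smoother may help. This is the textbook criterion only, not the hybrid GCV/UBR method proposed in the book above, whose weighting is not given here; the function name and the example numbers are hypothetical.

```python
def gcv_score(residuals: list, trace_hat: float) -> float:
    """Generalized cross-validation score for a linear smoother.

    residuals: observed minus fitted values, y_i - yhat_i
    trace_hat: trace of the smoother (hat) matrix, i.e. effective degrees of freedom
    Returns (RSS / n) / (1 - trace_hat / n)^2.
    """
    n = len(residuals)
    if n == 0:
        raise ValueError("residuals must not be empty")
    if not 0.0 <= trace_hat < n:
        raise ValueError("trace_hat must lie in [0, n)")
    rss = sum(r * r for r in residuals)
    return (rss / n) / (1.0 - trace_hat / n) ** 2
```

In practice the score is evaluated over a grid of candidate smoothing parameters, each giving its own residuals and hat-matrix trace, and the candidate with the smallest score is selected.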
Article
Full-text available
Outliers in some phenomena may prevent the desired results from being obtained and lead to conclusions that are far from the reality of the phenomenon under study, so their influence needs to be removed or reduced. Traditional nonparametric estimators are very sensitive to outliers, which prompted us to use robust estimators, because they are little affected by outliers, together with nonparametric regression, because it does not depend on prior determinants or assumptions but directly and fundamentally on the data.
Conference Paper
The general regression model is divided into three forms, namely the parametric, nonparametric, and semiparametric models. Regression is a method used to analyze the relationship between response variables and predictor variables. The form of the regression model depends on the regression curve. Nonparametric regression has attracted the attention of many researchers because it can determine the relationship between the predictor variable and the response variable when the regression curve is unknown. Nonparametric regression is very flexible, so the model can follow linear or nonlinear functions. Several nonparametric regression approaches that are often used are Spline Truncated, Kernel, and Fourier Series. Many studies on nonparametric regression have now been carried out, with either a single estimator or a mixed estimator. So far, research with mixed estimators has mostly used only two estimators, and there have not been many studies of nonparametric regression models involving three mixed estimators. Therefore, the purpose of this study is to find a mixed estimator of Spline Truncated, Kernel, and Fourier Series in biresponse nonparametric regression using the WLS method. The results show that WLS estimation produces a Spline Truncated estimator, a Kernel estimator, a Fourier Series estimator, and a mixture of the three estimators. This mixed estimator is applied to modeling the Percentage of Poor Population (Y1) and the Poverty Depth Index (Y2) in Papua Province in 2019. The best model obtained is a model with 1 knot point and 1 oscillation. In this model, the Morbidity and Gross Regional Domestic Product variables are approximated by Spline Truncated, the Labor Force Participation Rate variable by Kernel, and the School Participation Rate variable by Fourier Series. The best model produces a GCV value of 27.71294, an MSE of 12.56, and an R-Sq of 92.5%.