http://www.aimspress.com/journal/Math
AIMS Mathematics, 7(10): 19202–19220.
DOI: 10.3934/math.20221054
Received: 26 March 2022
Revised: 18 August 2022
Accepted: 26 August 2022
Published: 30 August 2022
Research article
On smoothing of data using Sobolev polynomials
Rolly Czar Joseph Castillo and Renier Mendoza*
Institute of Mathematics, University of the Philippines Diliman, Quezon City, Philippines
*Correspondence: Email: rmendoza@math.upd.edu.ph.
Abstract: Data smoothing is a method that involves finding a sequence of values that exhibits the
trend of a given set of data. This technique has useful applications in dealing with time series data
with underlying fluctuations or seasonality and is commonly carried out by solving a minimization
problem with a discrete solution that takes into account data fidelity and smoothness. In this paper, we
propose a method to obtain the smooth approximation of data by solving a minimization problem in
a function space. The existence of the unique minimizer is shown. Using polynomial basis functions,
the problem is projected to a finite dimension. Unlike the standard discrete approach, the complexity
of our method does not depend on the number of data points. Since the calculated smooth data is
represented by a polynomial, additional information about the behavior of the data, such as rate of
change, extreme values, concavity, etc., can be drawn. Furthermore, interpolation and extrapolation
are straightforward. We demonstrate our proposed method in obtaining smooth mortality rates for
the Philippines, analyzing the underlying trend in COVID-19 datasets, and handling incomplete and
high-frequency data.
Keywords: data smoothing; Whittaker-Henderson method; Sobolev polynomials; high-frequency
data; approximation; generalized cross validation score
Mathematics Subject Classification: 65K10, 90C23, 35A15
1. Introduction
Data smoothing is a method commonly used to obtain a smooth approximation of a crude data set.
If $\{f_1, f_2, \ldots, f_n\}$ is a sequence of $n$ data points, the goal of data smoothing is to find a corresponding sequence of graduated data points $\{u_1, u_2, \ldots, u_n\}$ that provides a better representation of the underlying unknown true values [7,33]. Data smoothing is also referred to as data graduation in actuarial science and is commonly used in actuarial studies to compute smooth mortality rates from crude data.
Goodness-of-fit and smoothness are the two most important criteria in data smoothing [7]. The Whittaker-Henderson method (WHM), a common non-parametric technique often presented as an alternative to the moving average method [18, 20] in data smoothing, generates the smoothed data points while balancing both criteria. In WHM, the sequence of smoothed data points is the minimizer $u = \{u_i\}_{i=1}^{n}$ of the function

$Q(u) = \sum_{i=1}^{n} \omega_i (u_i - f_i)^2 + \lambda \sum_{i=1}^{n-1} (\Delta u_i)^2,$    (1.1)

where
$\omega_i$: positive weights associated with the data,
$\Delta u_i$: Newton's advancing difference operator, defined by $\Delta u_i := u_{i+1} - u_i$,
$\lambda$: a positive smoothing parameter.
The first term is a measure of fidelity to the original data, while the second term is a measure of smoothness. The weights $\omega_i$ are pre-determined and data-dependent. Note that for high values of $\lambda$, smoothing is favored, while lower values result in smoothed data that is closer to the actual values. A WHM where the order of the difference operator is set to 2 is known as the Hodrick-Prescott (HP) filter [15].
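Setting the gradient of $Q$ to zero shows that the minimizer of (1.1) solves the linear system $(W + \lambda D^T D)u = Wf$, where $W = \mathrm{diag}(\omega_1, \ldots, \omega_n)$ and $D$ is the difference matrix. A minimal Matlab sketch for evenly spaced data (our illustration, not code from the paper):

Listing 1. Discrete WHM solve (sketch).
    % Solve (W + lambda*D'*D) u = W f, the normal equations of (1.1).
    % f: column vector of crude data; w: positive weight vector; lambda given.
    n = numel(f);
    D = diff(eye(n));                    % (n-1) x n first-order difference matrix
    W = diag(w);
    u = (W + lambda * (D' * D)) \ (W * f);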
Several techniques for data graduation can be found in [12, 16, 33, 43, 44]. These methods treat data graduation as a minimization problem in $\mathbb{R}^n$. In [26], the WHM in (1.1) is generalized as a minimization problem in the function space $L^2(\Omega)$. By treating the data $\{u_1, u_2, \ldots, u_n\}$ as a piecewise linear function, the data graduation problem in (1.1) is solved by minimizing the functional $\tilde{J}: L^2(\Omega) \to \mathbb{R}$ given by

$\tilde{J}(u) = \frac{1}{2} \int_\Omega \omega (u - f)^2 \, dx + \frac{\lambda}{2} \int_\Omega |Du|^2 \, dx,$    (1.2)

where $f$ is an interpolation of the data points $\{f_1, f_2, \ldots, f_n\}$, $\omega$ is an interpolation of the weights $\{\omega_1, \omega_2, \ldots, \omega_n\}$, and $D$ is the derivative operator.
The functional (1.2) can be viewed as the infinite-dimensional generalization of (1.1). Note that the second term in (1.2) is a penalty term for data smoothing. This technique of adding a term for smoothing is called regularization, which has gained popularity in various applications [2, 30, 31, 34, 39]. In this work, we modify the smoothing term of the functional $\tilde{J}$ in (1.2) and consider

$J(u) = \frac{1}{2} \int_\Omega \omega (u - f)^2 \, dx + \frac{\lambda}{2} \int_\Omega |D^m u|^2 \, dx,$    (1.3)

where $m$ is the order of the differential operator $D$. By doing this, the solution gains more regularity. To minimize $J$ in (1.3), we project the problem onto a finite-dimensional polynomial space. Hence, the solution that we obtain is a polynomial. This approach is motivated by [23], where it is argued that the WHM is useful in obtaining smooth data points when these graduated data points approximate a polynomial. One advantage of having a polynomial solution is that it is easier to analyze: one can readily obtain extreme values, the points at which the graduated data are increasing or decreasing, and the concavity of the smooth data. Furthermore, having a continuous solution instead of discrete points makes interpolation and extrapolation much easier.
Our proposed method can be used to understand time series trends by analyzing the resulting polynomial solution. In [40], the use of polynomials is demonstrated in the analysis of COVID-19 cases. By partitioning the daily new COVID-19 cases into sections and fitting each section with a logarithmic, exponential, or linear function, the authors characterized the trend of the disease and identified periods when its incidence was rapidly or slowly increasing. However, it is not clear how the data set is partitioned. Identifying all possible partitions of the data points and evaluating the fit of these points to each of the three functions may be computationally expensive. With polynomials, partitioning can be set at inflection points.
The function $Q$ in (1.1) uses Newton's advancing difference operator $\Delta u_i$, which assumes that the data points are evenly spaced in the independent variable. Hence, the WHM cannot be used for data with missing terms. Our proposed scheme does not have this limitation because one can still compute the polynomial interpolation of the data, regardless of whether it is unevenly spaced or some of its terms are missing. Another disadvantage of the WHM is that it solves a linear system whose dimension is equal to the number of variables in (1.1). Hence, smoothing high-frequency data using the WHM requires solving a high-dimensional linear system. In our proposed method, the number of variables depends only on the degree of the polynomial. Thus, regardless of having incomplete or high-frequency data, the resulting minimization problem has the same dimension.
This paper is organized as follows: Section 2 discusses the derivation of the minimizer of (1.3)
in a finite-dimensional polynomial space. Section 3 presents our proposed algorithm. In Section 4,
we present the applications of our method on mortality rates and COVID-19 data. We also apply our
method to data with missing terms and high frequency data. Finally, we give our conclusions and
recommendations in Section 5.
2. Theoretical framework
Because we expect the minimizer of (1.3) to be at least $m$-times differentiable, we consider the Sobolev space $H^m(\Omega)$ [9]. We present a first-order optimality condition for (1.3). We show that solving this optimality condition is equivalent to minimizing $J$ in (1.3).
Theorem 1. Suppose $f$ and $\omega$ are sufficiently smooth functions. Because $\omega$ is an interpolation of the positive weights, we can assume that $0 < \omega(x) \le \bar{\omega}$ for all $x \in \Omega$, for some $\bar{\omega} > 0$. Then $u \in H^m(\Omega)$ is a minimizer of (1.3) if and only if $u$ satisfies

$\int_\Omega \omega\, u v \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx = \int_\Omega \omega f v \, dx,$    (2.1)

for all $v \in H^m(\Omega)$, with $m \in \mathbb{N}$.
Proof. Let $v \in H^m(\Omega)$ with $m \in \mathbb{N}$. Define $r: \mathbb{R} \to \mathbb{R}$ by $r(t) := J(u + tv) - J(u)$. Suppose $u$ is the minimizer of (1.3). Then $J(u) \le J(u + tv)$ for any $t \in \mathbb{R}$. Hence, $r(t) \ge 0$ for all $t \in \mathbb{R}$ and $r(t) = 0$ at $t = 0$. By the definition of $J$, we can simplify $r(t)$ as

$r(t) = t \left[ \int_\Omega \omega (uv - fv) \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx \right] + \frac{t^2}{2} \int_\Omega \left( \omega v^2 + \lambda |D^m v|^2 \right) dx.$

Hence,

$r'(t) = \left[ \int_\Omega \omega (uv - fv) \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx \right] + t \int_\Omega \left( \omega v^2 + \lambda |D^m v|^2 \right) dx.$

Since $r$ is optimal at $t = 0$,

$0 = r'(0) = \int_\Omega \omega (uv - fv) \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx.$

Rearranging the above equation gives us (2.1).

Now, suppose $u$ satisfies (2.1). Let $z \in H^m(\Omega)$ and define $v := z - u$. Then,

$J(z) - J(u) = \left[ \int_\Omega \omega (uv - fv) \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx \right] + \frac{1}{2} \int_\Omega \left( \omega v^2 + \lambda |D^m v|^2 \right) dx.$

From the assumption, the bracketed term vanishes. Thus,

$J(z) - J(u) = \frac{1}{2} \int_\Omega \left( \omega v^2 + \lambda |D^m v|^2 \right) dx \ge 0.$

Therefore, $J(z) \ge J(u)$ for any $z \in H^m(\Omega)$, which means that $u$ is a minimizer of $J$.
Theorem 1 tells us that it is sufficient to solve (2.1) to minimize $J$ in (1.3). To show that (2.1) has a unique solution, we first need the following results.
Lemma 1. Let $\Omega \subset \mathbb{R}^d$ be bounded with Lipschitz boundary. Suppose $g$ satisfies the following conditions:
(1) $g: H^m(\Omega) \to [0, +\infty)$ is a seminorm.
(2) There exists a positive constant $C$ such that $0 \le g(v) \le C \|v\|_{H^m(\Omega)}$ for all $v \in H^m(\Omega)$.
(3) If $v \in P_{k-1} := \{$polynomials of degree at most $k-1\}$ and $g(v) = 0$, then $v \equiv 0$.
Then the norm

$\|u\|_{H^m(\Omega)} := \left( \sum_{|\alpha| \le m} \|D^\alpha u\|^2_{L^2(\Omega)} \right)^{1/2}$    (2.2)

is equivalent to the norm

$\|u\|'_{H^m(\Omega)} := \left( g^2(u) + \sum_{|\alpha| = m} \int_\Omega |D^\alpha u|^2 \, dx \right)^{1/2}.$    (2.3)

Proof. This is a special case of the result proven in [38], where the equivalence was shown for a system of functionals and on a more general Sobolev space $W^{m,p}(\Omega)$.
Lemma 2. For $\Omega \subset \mathbb{R}$, we define

$|u|'_{H^m(\Omega)} := \left( \int_\Omega \left( \omega u^2 + \lambda |D^m u|^2 \right) dx \right)^{1/2}.$    (2.4)

Then the norm in (2.4) is equivalent to the norm in (2.2); that is, there exist $\rho_1, \rho_2 > 0$ such that

$\rho_1 \|u\|_{H^m(\Omega)} \le |u|'_{H^m(\Omega)} \le \rho_2 \|u\|_{H^m(\Omega)}.$    (2.5)

Proof. The norm in (2.4) is equivalent to

$|u|'_{H^m(\Omega)} = \left( \lambda \int_\Omega \left( \frac{\omega}{\lambda} u^2 + |D^m u|^2 \right) dx \right)^{1/2}.$

We can set

$g(u) := \left( \int_\Omega \frac{\omega}{\lambda} u^2 \, dx \right)^{1/2}.$

Note that $g$ is a weighted $L^2(\Omega)$-norm, and hence the first condition of Lemma 1 is satisfied. Because $\omega \le \bar{\omega}$ for all $x \in \Omega$, and from the definition of the norm in (2.2), we get

$g(v) \le \sqrt{\frac{\bar{\omega}}{\lambda}} \|v\|_{L^2(\Omega)} \le \sqrt{\frac{\bar{\omega}}{\lambda}} \|v\|_{H^m(\Omega)}.$

Therefore, the second condition of Lemma 1 is satisfied. The last condition follows easily because $\omega(x)/\lambda > 0$ for all $x \in \Omega$. Therefore, by Lemma 1, our assertion holds.
Theorem 2. The variational formulation in (2.1) has a unique solution in $H^m(\Omega)$.

Proof. We define

$a(u, v) := \int_\Omega \omega\, u v \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx$

and

$b(v) := \int_\Omega \omega f v \, dx.$

We use the Lax-Milgram Lemma to prove the existence of the unique solution. For a discussion of this lemma, we refer the readers to [9]. To do this, we need to show that $a$ is bilinear, bounded, and coercive, and that $b$ is linear and bounded. The bilinearity of $a$ and the linearity of $b$ follow directly from their respective definitions.

We now show that $a$ is bounded, that is, there exists $C_1 > 0$ such that $|a(u, v)| \le C_1 \|u\|_{H^m(\Omega)} \|v\|_{H^m(\Omega)}$. We use the triangle inequality, the Cauchy-Schwarz inequality, and the definition of the norm in (2.2). Thus,

$|a(u, v)| \le \left| \int_\Omega \omega\, u v \, dx \right| + \lambda \left| \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx \right| \le \bar{\omega} \|u\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} + \lambda \|D^{(m)} u\|_{L^2(\Omega)} \|D^{(m)} v\|_{L^2(\Omega)} \le \underbrace{\max(\bar{\omega}, \lambda)}_{C_1} \|u\|_{H^m(\Omega)} \|v\|_{H^m(\Omega)}.$

We use (2.5) from Lemma 2 to show that $a$ is coercive, that is, there exists $C_2 > 0$ such that $|a(u, u)| \ge C_2 \|u\|^2_{H^m(\Omega)}$ for all $u \in H^m(\Omega)$. Indeed,

$|a(u, u)| = \int_\Omega \left( \omega u^2 + \lambda |D^m u|^2 \right) dx = \left[ |u|'_{H^m(\Omega)} \right]^2 \ge \underbrace{\rho_1^2}_{C_2} \|u\|^2_{H^m(\Omega)}.$

Finally, we show that $b$ is bounded, that is, there exists $C_3 > 0$ such that $|b(v)| \le C_3 \|v\|_{H^m(\Omega)}$. Using the Cauchy-Schwarz inequality, we get

$|b(v)| \le \underbrace{\bar{\omega} \|f\|_{L^2(\Omega)}}_{C_3} \|v\|_{L^2(\Omega)} \le C_3 \|v\|_{H^m(\Omega)}.$
We have shown in Theorem 1 that we can minimize $J$ in (1.3) by solving an equivalent variational formulation, which we have shown in Theorem 2 to have a unique solution in $H^m(\Omega)$. Because of the equivalence of $\|\cdot\|_{H^m(\Omega)}$ and $|\cdot|'_{H^m(\Omega)}$, we can use the following inner product for $H^m(\Omega)$:

$\langle u, v \rangle_H := \int_\Omega \omega\, u v \, dx + \lambda \int_\Omega D^{(m)} u \cdot D^{(m)} v \, dx,$    (2.6)

where $\omega$ is a continuous function in $\Omega$ and $\lambda > 0$. With the above inner product, we can rewrite the optimality condition in (2.1) as

$\langle u, v \rangle_H = \langle \omega f, v \rangle_{L^2(\Omega)} \quad \forall v \in H^m(\Omega).$

In this work, we pose the above variational formulation in a finite-dimensional subspace $S$ of $H^m(\Omega)$; that is, we solve

$\langle u, v \rangle_H = \langle \omega f, v \rangle_{L^2(\Omega)} \quad \forall v \in S.$

We consider the subspace $S$ spanned by a set of orthonormal polynomials $\{p_1, p_2, \ldots, p_l\}$ for some $l \in \mathbb{N}$. We refer to $\{p_1, p_2, \ldots, p_l\}$ as Sobolev polynomials because they are constructed using the inner product in Eq (2.6), which involves derivatives, so that each polynomial in this subspace is also in the Sobolev space $W^{m,p}(\Omega)$. We were motivated to use Sobolev polynomials because they are the best polynomial approximation to a function $f$ with respect to the $L^2(\Omega)$ norm [28]. A comprehensive review of the history and recent developments in the study of Sobolev polynomials can be found in [27].

Since $S = \mathrm{span}\{p_1, p_2, \ldots, p_l\}$, it is sufficient to find $u \in S$ such that

$\langle u, p_j \rangle_H = \langle \omega f, p_j \rangle_{L^2(\Omega)} \quad \forall j \in \{1, 2, \ldots, l\}.$

If $u \in S$, then $u = \sum_{i=1}^{l} u_i p_i$, and so we solve

$\left\langle \sum_{i=1}^{l} u_i p_i, p_j \right\rangle_H = \langle \omega f, p_j \rangle_{L^2(\Omega)} \quad \forall j \in \{1, 2, \ldots, l\}.$

Equivalently,

$\sum_{i=1}^{l} u_i \langle p_i, p_j \rangle_H = \langle \omega f, p_j \rangle_{L^2(\Omega)} \quad \forall j \in \{1, 2, \ldots, l\}.$

Because $\langle p_i, p_j \rangle_H = \delta_{ij}$, we obtain

$u_i = \langle \omega f, p_i \rangle_{L^2(\Omega)}, \quad i \in \{1, 2, \ldots, l\}.$    (2.7)

Therefore, the solution $u \in S$ of $\langle u, v \rangle_H = \langle \omega f, v \rangle_{L^2(\Omega)}$ for all $v \in S$ is given by

$u = \sum_{i=1}^{l} p_i \langle \omega f, p_i \rangle_{L^2(\Omega)}.$    (2.8)

We construct the orthonormal polynomials using the Gram-Schmidt orthonormalization process [24]. We also use polynomial interpolations for $\omega$ and $f$ so that the solution $u$ is a linear combination of polynomials. The Gram-Schmidt process is started using Chebyshev polynomials as the initial basis functions [36].
3. Proposed algorithm
To compute $u$ in (2.8), we first obtain the polynomial interpolations of $\omega$ and $f$. Then we generate a set of orthonormal polynomials from an independent set of polynomials. In our case, we use the Chebyshev polynomials $\{c_1, c_2, \ldots, c_l\}$ for some user-defined dimension $l \ge 1$. Then, using the Gram-Schmidt orthonormalization process, we construct a set of orthonormal polynomials $\{p_1, p_2, \ldots, p_l\}$. The order of differentiation $m$ is also user-defined. The construction of the Sobolev polynomials is summarized in Algorithm 1.

Algorithm 1 Creating a set of orthonormal Sobolev polynomials from Chebyshev polynomials
1: Input: Set the desired dimension $l \ge 1$ of the polynomial basis, and the endpoints $a, b$ of the interval of interest according to the crude data points.
2: Construct a set of basis Chebyshev polynomials of dimension $l$ over the interval $(a, b)$.
3: Apply the Gram-Schmidt orthogonalization procedure, using the inner product in Eq (2.6), to the set of Chebyshev polynomials from step 2.
4: Output: A set of orthonormal Sobolev polynomials $\{p_1, p_2, \ldots, p_l\}$.
The user can also specify the value of the smoothing parameter $\lambda$. However, by default, the algorithm uses the approach presented in [12, 42]. The smoothing parameter $\lambda$ is computed as the minimizer of the generalized cross validation (GCV) score, which is expressed as

$\mathrm{GCV}(\lambda) = \dfrac{n \sum_{i=1}^{n} (\hat{f}_i - f_i)^2}{\left( n - \sum_{i=1}^{n} (1 + \lambda \gamma_i^2)^{-1} \right)^2},$    (3.1)

where

$\hat{f} = \left( I_n + \lambda D^T D \right)^{-1} f,$

$n$ is the number of data points, the $\gamma_i^2$'s are the eigenvalues of $D^T D$, and $D$ is a tridiagonal matrix with entries given by

$D_{i,i-1} = \dfrac{2}{h_{i-1}(h_{i-1} + h_i)}, \quad D_{i,i} = \dfrac{-2}{h_{i-1} h_i}, \quad D_{i,i+1} = \dfrac{2}{h_i(h_{i-1} + h_i)},$

with $h_i$ representing the step between $\hat{f}_i$ and $\hat{f}_{i+1}$.
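A direct Matlab transcription of (3.1), for illustration only (the paper uses the far more efficient routine in the Appendix of [12]); it assumes a column data vector f and the steps h between consecutive points:

Listing 3. GCV score (3.1) (sketch).
    function g = gcv_score(lambda, f, h)
        % f: column vector of data; h: vector of steps between consecutive points.
        n = numel(f);
        D = zeros(n, n);                   % second divided-difference matrix
        for i = 2:n-1
            D(i, i-1) =  2 / (h(i-1) * (h(i-1) + h(i)));
            D(i, i)   = -2 / (h(i-1) * h(i));
            D(i, i+1) =  2 / (h(i)   * (h(i-1) + h(i)));
        end
        fhat = (eye(n) + lambda * (D' * D)) \ f;   % smoothed data f-hat
        g2 = eig(D' * D);                          % the gamma_i^2 in (3.1)
        g = n * sum((fhat - f).^2) / (n - sum(1 ./ (1 + lambda * g2)))^2;
    end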
For a detailed discussion on how to use GCV to identify the smoothing parameter, we refer the readers to [11, 12, 14, 41, 42]. To compute the GCV value numerically, we used the Matlab code provided in the Appendix of [12]. The GCV function in (3.1) may have multiple local minima, so we have to use a global minimizer. In this study, we use the Genetic Algorithm (GA), which has growing applications in science and engineering because of its capability to estimate the global minimum and its non-reliance on the derivative of the objective function [21, 22, 37]. We utilize the Matlab built-in function ga, which only requires the upper and lower bounds for the smoothing parameter. Although GA can converge to the global minimum, it can still sometimes get stuck at a local minimum. To make global convergence more likely, we run ga 10 times and store the best solution $\lambda_{GA}$. To further improve accuracy, we hybridize GA with fmincon, a Matlab built-in code for the interior point algorithm [6], which is a local search technique. We use $\lambda_{GA}$ as the initial guess of fmincon. We denote by $\lambda^*$ the minimizer found by the hybrid GA-interior point method. Upon obtaining $\lambda^*$, we can use the orthonormal polynomials calculated in Algorithm 1 and obtain the smooth polynomial approximation $u$ in (2.8). A sketch of this hybrid search is given below; the proposed method is summarized in Algorithm 2.
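The following sketch illustrates the hybrid search (steps 5 and 6 of Algorithm 2), assuming the gcv_score routine from Listing 3 and its data f, h; ga requires the Global Optimization Toolbox:

Listing 4. Hybrid GA-interior point search for lambda* (sketch).
    gcv = @(lam) gcv_score(lam, f, h);
    lb = 0.1; ub = 5;                          % default bounds used in Section 4
    bestval = inf;
    for run = 1:10                             % run ga 10 times, keep the best
        [lam, val] = ga(gcv, 1, [], [], [], [], lb, ub);
        if val < bestval
            bestval = val; lamGA = lam;
        end
    end
    opts = optimoptions('fmincon', 'Algorithm', 'interior-point');
    lamStar = fmincon(gcv, lamGA, [], [], [], [], lb, ub, [], opts);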
Algorithm 2 Data graduation using Sobolev polynomials
1: Input: The crude data $\{f_i\}_{i=1}^{n}$.
2: Set the value for $l$ (dimension of the polynomial space) and $m$ (order of differentiation).
3: Determine the polynomial interpolations, $\omega$ and $f$, of the weights and the data, respectively.
4: Set the bounds for the smoothing parameter $[\lambda_{\min}, \lambda_{\max}]$.
5: Estimate the minimizer of the GCV function in (3.1) over the interval $[\lambda_{\min}, \lambda_{\max}]$ by running the Matlab built-in program ga 10 times. Set $\lambda_{GA}$ as the minimizer among the 10 solutions with the least GCV score.
6: To get a more accurate global minimizer $\lambda^*$, apply the interior point method to the GCV function in (3.1) over the interval $[\lambda_{\min}, \lambda_{\max}]$ using the Matlab built-in program fmincon, with $\lambda_{GA}$ as the initial guess.
7: Using the smoothing parameter $\lambda^*$, obtain a set of orthonormal Sobolev polynomials $\{p_1, p_2, \ldots, p_l\}$ according to Algorithm 1.
8: Output: The smooth approximation of the data as a polynomial function $u = \sum_{i=1}^{l} p_i \langle \omega f, p_i \rangle_{L^2(\Omega)}$.
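With the basis from Algorithm 1 in hand, steps 7 and 8 reduce to the quadratures in (2.7) and (2.8). A minimal sketch (our illustration), assuming omega and f are function handles for the interpolated weights and data and P comes from Listing 2:

Listing 5. Coefficients (2.7) and smooth approximation (2.8) (sketch).
    P = sobolev_basis(l, m, a, b, omega, lamStar);
    uvec = zeros(l, 1);
    for i = 1:l
        uvec(i) = integral(@(x) omega(x) .* f(x) .* polyval(P(i,:), x), a, b);
    end
    u_poly = uvec' * P;                % coefficient row of u = sum_i u_i p_i
    u = @(x) polyval(u_poly, x);       % evaluate, interpolate, or extrapolate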
Given data $\{f_1, f_2, \ldots, f_n\}$, a unique smooth approximation $u$ is calculated using (2.8). Note that the orthonormal polynomials $p_i$ do not rely on the data. This means that if the data has a particular error structure, uncertainty might occur in the coefficients $u_i$ in (2.7). In this study, we rely on a general bootstrap approach presented in [8, 10] to quantify the uncertainty of the coefficients and construct confidence intervals. The step-by-step procedure is stated in Algorithm 3.
Algorithm 3 Uncertainty quantification
1: Input: The coefficients $u_i$ calculated from (2.7).
2: Generate $M$ simulated data sets $\{\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_n\}$ by adding noise, drawn from an assumed error structure, to the calculated Sobolev polynomial $u$ evaluated at the points $x_1, x_2, \ldots, x_n$.
3: Recalculate the coefficients $\hat{u}_i$ from (2.7) for each simulated data set $\{\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_n\}$.
4: From the $M$ recalculated values of each $\hat{u}_i$, characterize the distribution of the coefficients and calculate the confidence intervals.
5: Output: Histograms displaying the empirical distributions of the coefficients and the corresponding confidence intervals.
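A sketch of Algorithm 3 (our own illustration, building on Listings 2 and 5). The noise generator noise_fn, which draws from the assumed error structure (e.g., Gamma, as in Section 4), is a hypothetical helper; prctile requires the Statistics and Machine Learning Toolbox.

Listing 6. Bootstrap uncertainty quantification (sketch).
    M = 10000;                                      % bootstrap replicates
    U = zeros(M, l);
    for k = 1:M
        fsim = polyval(u_poly, x) + noise_fn(x);    % step 2 (noise_fn: hypothetical)
        fs = @(t) interp1(x, fsim, t, 'linear', 'extrap');
        for i = 1:l                                 % step 3: recompute (2.7)
            U(k, i) = integral(@(t) omega(t) .* fs(t) .* polyval(P(i,:), t), a, b);
        end
    end
    ci = prctile(U, [2.5 97.5]);                    % step 4: 95% confidence intervals
    histogram(U(:, 1));                             % step 5: empirical distribution of u_1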
4. Results and discussion
In this section, we test the proposed algorithm to determine the graduated values of different datasets. All numerical simulations were performed in Matlab 2021a on a computer with an Intel(R) Core(TM) i7-8550U CPU (1.80 GHz base, running at 1.99 GHz) and 8 GB of RAM, running the Windows 10 OS. By default, we set the order of the derivative to 10 and the degree of the polynomial to 8. In all simulations, the bounds for the smoothing parameter are set to $[\lambda_{\min}, \lambda_{\max}] := [0.1, 5]$.
For the first application, we test our method by calculating the graduated values of male mortality rates. We obtained the crude mortality rates from the 2017 Philippine Intercompany Mortality Study of the Actuarial Society of the Philippines [1]. The illustration of our proposed scheme is shown in the left panel of Figure 1. The red curve represents the linearly interpolated data and the black dashed curve shows our proposed method. The smooth approximation using Whittaker-Henderson graduation is shown in green. It can be seen that our proposed method obtained results similar to the Whittaker-Henderson method. Both methods produced mortality rates that corrected the fluctuations observed in the crude data. Moreover, the plots produced by both methods also increase continuously across all ages. While graduated data produced by the discrete Whittaker-Henderson method may appear smooth because of interpolation, the graduated data produced by our method is guaranteed to be smooth because it is represented by a polynomial. Note that the crude mortality rates for ages 80 to 85 are not available. The Whittaker-Henderson method cannot be used to obtain graduated values outside the range of the data set. In [1], extrapolation was used to calculate the graduated values for ages 80-85; the authors used the Gompertz-Makeham model (blue curve) to do this. An advantage of our proposed method is that extrapolation is straightforward: one simply evaluates the polynomial at points outside the range of the data set. The graduated values computed using the Gompertz-Makeham model and our method are both shown in Figure 1. The right panel shows the plot of the GCV function in (3.1) for $\lambda \in [0.1, 5]$. Observe that the plot (blue curve) is nonlinear and multi-modal, which justifies the use of the hybrid GA and interior point method as a global minimization algorithm. The red dot shows the global minimizer ($\lambda^* = 0.5285$), which we set as the smoothing parameter for this problem. Figure 2 presents our uncertainty analysis for this problem using Algorithm 3. For this simulation, we assume a Gamma error structure [5]. The uncertainty in our calculated coefficient values translates into the confidence bounds (red dashed lines) around the smooth approximation of the data. Since the degree of the polynomial is 8, the dimension of the subspace $S$ is 9, which means that we calculate 9 coefficients $u_i$. The histograms and confidence intervals are presented in the bottom panels of Figure 2. Observe that all the computed coefficients fall within their corresponding 95% confidence intervals.
In the proposed algorithm, the user can choose the degree of the polynomial. The left panel of Figure 3 demonstrates our next example, which determines the graduation of male mortality rates for ages 0-30 using various degrees of polynomial while fixing the order of the derivative to 10. For this example, we calculated $\lambda^* = 1.3798$. As presented, the graduation of mortality rates can produce a range of polynomial functions that practitioners can choose from depending on their purpose. However, it is also possible to set the degree of the polynomial based on the fit of the smoothing curve. One approach is to iterate over a set of positive integers and select the degree that produces the polynomial with the least distance to the piecewise linear interpolation of the data. The right panel of Figure 3 demonstrates this and shows that a polynomial of degree 8 ($l = 9$) has the best fit. This polynomial is shown as the red curve in the left panel.
We also applied our proposed method to obtain a smooth time series of daily new COVID-19 cases in Germany [17]. This dataset spans new COVID-19 cases from 1 September 2020 to 31 May 2021. Here, $\lambda^* = 0.7227$. The left panel in Figure 4 presents the crude daily new COVID-19 cases in Germany as a blue line, and the graduated data as a red line. As shown, the plot of the graduated data points exhibits peaks and troughs that follow the crude data. The underlying trend in the data can also be observed after smoothing.

The right panel in Figure 4 compares the results of smoothing using different methods, implemented using the Matlab built-in function smoothdata. The simple moving average, Gaussian (kernel) smoothing, and locally weighted scatter plot smoothing (LOWESS) all use a window size of 7, because this covers both the incubation period and the time from the first appearance of symptoms to diagnosis [19].
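For reference, a sketch of how the comparison methods can be invoked through the built-in smoothdata with a 7-day window (the method names below are its documented options; cases denotes the daily case vector):

Listing 7. Benchmark smoothers (sketch).
    s_movmean = smoothdata(cases, 'movmean', 7);    % simple moving average
    s_gauss   = smoothdata(cases, 'gaussian', 7);   % Gaussian kernel smoothing
    s_lowess  = smoothdata(cases, 'lowess', 7);     % LOWESS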
Daily new COVID-19 cases can be viewed as the change in the total number of cases over each small change in time. Hence, by integrating the graduated polynomial, the cumulative number of COVID-19 cases can be approximated. As an illustration, we integrate the polynomial obtained in Figure 4 to compute the cumulative number of cases. In Figure 5 (left panel), we illustrate how the integral of the polynomial closely approximates the actual cumulative data. The resulting approximation of the cumulative data is also a polynomial, which makes finding the inflection points and concavity of the curve easier. These changes in concavity are shown in Figure 5 (right panel). The segments of the plot in red indicate a deceleration in the number of cases, while the blue segments indicate acceleration. Plots like this are useful in evaluating the effectiveness of policies in containing the pandemic, or in identifying events that may have contributed to a more rapid spread of the disease.
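Because the smooth fit is a polynomial (the coefficient row u_poly of Listing 5), the cumulative curve and its inflection points come directly from calculus on the coefficients. A minimal sketch:

Listing 8. Cumulative cases and inflection points from the polynomial fit (sketch).
    cum_poly = polyint(u_poly);                 % antiderivative: cumulative cases
    d2cum = polyder(u_poly);                    % second derivative of the cumulative curve
    r = roots(d2cum);                           % candidate inflection points
    infl = r(imag(r) == 0 & r >= a & r <= b);   % keep real roots inside the data range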
We also applied our method to obtain the trend in the total COVID-19 deaths in China from 23 January 2020 to 22 February 2020 [29]. For this example, we computed $\lambda^* = 0.8018$. It was shown in [3] that the growth in the number of deaths related to COVID-19 in the early stages of the pandemic follows a quadratic trend. By setting the degree of the polynomial to 2, we obtain the graduation in Figure 6, which captures the trend of the COVID-19 deaths data.
One of the important applications of our proposed method is in handling time series data with missing terms. Figure 7 demonstrates this in the case of missing time series data points for Schistosomiasis cases in the years 2009-2011 and 2013 [35]. Here, $\lambda^* = 1.5577$. As shown in the left panel, our method produced graduated data points that are consistent with the general trend in the dataset. Note that both the WHM and the moving average technique require that the time series data be evenly spaced. We also applied other methods to fill in the missing data points. The right panel of Figure 7 compares the results of our smoothing algorithm when paired with linear, spline, piecewise cubic Hermite interpolating polynomial (PCHIP), or modified Akima piecewise cubic Hermite interpolation (MAkima) to fill the missing data. We used the Matlab built-in function fillmissing with these different methods in our numerical experiments; our method was applied after filling in the data. For this example, smoothing using the points generated by piecewise linear interpolation produced the least $L^2$-norm error. The user can choose which approach they prefer when filling in missing data. This is a pre-processing step that should be implemented before applying our method to data with missing values.
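A sketch of this pre-processing step using the built-in fillmissing (the method names below are its documented options); f is the data vector with NaN entries marking the gaps:

Listing 9. Filling missing data before smoothing (sketch).
    f_lin    = fillmissing(f, 'linear');    % piecewise linear interpolation
    f_spline = fillmissing(f, 'spline');    % cubic spline
    f_pchip  = fillmissing(f, 'pchip');     % piecewise cubic Hermite (PCHIP)
    f_makima = fillmissing(f, 'makima');    % modified Akima (MAkima)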
Our method offers a useful alternative for smoothing high-frequency datasets. The WHM entails a high computational cost in smoothing large datasets because it requires solving a linear system whose dimension is equal to the number of data points. In contrast, our method is more suitable because the number of unknowns depends only on the degree of the polynomial used in the graduation. We demonstrate this in Figure 8 using temperature data from [32] collected every 5 minutes from 24 April 2019 to 1 October 2020. The data set considered includes 140 thousand data points. For this example, we calculated $\lambda^* = 0.9586$. Applying the WHM to this dataset is not possible because our computer can only handle a linear system of dimension at most 30 thousand. Moreover, observe that there are missing data points, shown in Figure 8 as a break in the blue line. Based on the results of Figure 7, we used linear interpolation to fill the missing data. Our method was able to provide predicted graduated values for the missing sequence in this high-frequency dataset. Moreover, the smooth approximation also exhibits the trend in the data.
Figure 1. Graduation of male mortality rates. The black dotted line in the left panel
represents the graduated rates using our method. The minimization of the GCV function
is presented on the right panel.
Figure 2. Graduation of male mortality rates (red solid line) with the quantified uncertainty (uppermost panel). The blue circles are the mortality data. The cyan lines are the 10000 realizations of the mortality data assuming a Gamma error structure. The red dashed lines illustrate the 95% confidence bands around the smooth approximation. The bottom panels show histograms of the empirical distributions of the coefficients $u_i$.
Figure 3. Graduation of male mortality rates for age group 0-30. The left panel presents the smoothing of the data using different basis dimensions (i.e., polynomials of degree $l - 1$), while the right panel presents the $L^2$-norm error corresponding to each basis dimension. The polynomial of degree 8, shown as the red curve in the left panel, produced the best fit according to its $L^2$-norm error, shown as the red bar in the right panel.
Figure 4. Data smoothing of daily new COVID-19 cases in Germany, September 2020–May
2021. The red line in the left panel represents the graduated new cases. The right panel
compares our proposed method with other smoothing techniques.
Figure 5. Cumulative COVID-19 cases in Germany, September 2020-May 2021. The left panel compares the cumulative COVID-19 data with the proposed method. The right panel visualizes the time intervals when cases are accelerating and decelerating based on the concavity of the calculated Sobolev polynomial.
Figure 6. Graduation of total COVID-19 deaths in China, 23 January–22 February 2020.
The red line represents the graduated total number of deaths due to COVID-19, assuming
that the data follow a quadratic trend [3].
Figure 7. Schistosomiasis cases, 2000-2019. The left panel presents the smoothed data using different techniques to fill the missing data, while the right panel shows the corresponding $L^2$-norm error.
Figure 8. Graduation of water temperature, 24 April 2019–1 October 2020. Blue line
represents the crude data. Red line represents the smoothed data.
5. Conclusions
In this paper, we presented an alternative method for data graduation that uses Sobolev polynomials. We proved that the resulting minimization problem has a unique solution in a suitable function space. Furthermore, we formulated an approach using the Gram-Schmidt orthogonalization process to find the solution in an approximating polynomial space. We applied this method to obtain graduated values of male mortality rates in the Philippines and of COVID-19 data. We also demonstrated the usefulness of our method in addressing missing data points and in smoothing high-frequency data. Because the obtained results are polynomials, interpolation and extrapolation become straightforward. Furthermore, inflection points and the concavity of the graduated values are easier to identify. The regularity of polynomials makes the analysis simpler.
Our study has some limitations that can be explored in future research. First, although one can calculate the smoothing parameter by minimizing the GCV score, the order of the derivative and the degree of the polynomial are entirely user-defined. If the user does not want to use the default values, the user can fix the order of the derivative and choose the degree of the polynomial that yields the least $L^2$-error. This can be time-consuming, especially if the user sets the order of the derivative to a high value. As a future study, we can explore how both of these relevant parameters can be chosen based solely on the data. Second, our study is posed for one-dimensional time series data. For future work, one can extend our method to solve multi-dimensional data smoothing problems; for example, one can consider mortality rates that depend on age and policy duration. Third, one can also consider using other regularization terms (e.g., total variation) and other non-polynomial basis functions. Fourth, in the case of high-frequency data, one can cut the computational cost by subdividing the interval into segments to reduce the data size [15]. Once the data is segmented, the smooth approximation can be implemented using parallel computing; the solution in each segment then represents a piece of a spline of the entire smooth approximation. However, this will require continuity conditions between segments, which needs a rigorous theoretical analysis. Fifth, another area that can be explored is the relation of our method to state-space models. Since we treat the smoothing as a minimization in an infinite-dimensional function space, an extensive study on how to formulate the corresponding difference equations needs to be carried out. These are exciting research directions but demand a thorough investigation and would require other numerical optimization algorithms.
Acknowledgments
This research was supported by a grant from the Computational Research Laboratory of the Institute
of Mathematics, University of the Philippines Diliman.
Conflict of interest
The authors declare that they have no conflict of interest.
References
1. Philippine intercompany mortality study 2017, Actuarial Society of the Philippines, 2017. Available from: http://www.actuary.org.ph/wp-content/uploads/2017/05/2017-PICM-Study-Final-Report-18May2017.pdf.
2. L. Ambrosio, V. M. Tortorelli, Approximation of functionals depending on jumps by elliptic functionals via Γ-convergence, Commun. Pure Appl. Math., 43 (1990), 999-1036. https://doi.org/10.1002/cpa.3160430805
3. A. Brandenburg, Piecewise quadratic growth during the 2019 novel coronavirus epidemic, Infect.
Dis. Model.,5(2020), 681–690. https://doi.org/10.1016/j.idm.2020.08.014
4. R. J. Brooks, M. Stone, F. Y. Chan, L. K. Chan, Cross-validatory graduation, Insur. Math. Econ.,7
(1988), 59–66. https://doi.org/10.1016/0167-6687(88)90097-2
5. M. A. Buford, W. L. Hafley, Probability distributions as models for mortality, Forest Sci.,31 (1985),
331–341.
6. R. H. Byrd, J. C. Gilbert, J. Nocedal, A trust region method based on interior
point techniques for nonlinear programming, Math. Program.,89 (2000), 149–185.
https://doi.org/10.1007/PL00011391
7. F. Y. Chan, L. K. Chan, E. R. Mead, Properties and modifications of
Whittaker-Henderson graduation, Scand. Actuar. J.,1982 (1982), 57–61.
https://doi.org/10.1080/03461238.1982.10405433
8. G. Chowell, Fitting dynamic models to epidemic outbreaks with quantified uncertainty: A primer
for parameter uncertainty, identifiability, and forecasts, Infect. Dis. Model.,2(2017), 379–398.
9. D. Cioranescu, P. Donato, M. P. Roque, An introduction to second order partial differential equations: Classical and variational solutions, World Scientific, Singapore, 2018.
10. B. Efron, R. J. Tibshirani, An introduction to the bootstrap, CRC Press, 1992.
11. P. H. Eilers, A perfect smoother, Anal. Chem., 75 (2003), 3631-3636. https://doi.org/10.1021/ac034173t
12. D. Garcia, Robust smoothing of gridded data in one and higher dimensions with missing values,
Comput. Stat. Data Anal.,54 (2010), 1167–1178. https://doi.org/10.1016/j.csda.2009.09.020
13. L. Grafakos, Classical Fourier analysis, Springer, New York, 2008.
14. P. Craven, G. Wahba, Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation, Numer. Math., 31 (1978), 377-403.
15. V. Guerrero, E. Silva, Smoothing a time series by segments of the data range, Commun. Stat.-Theor.
M.,44 (2015), 4568–4585. https://doi.org/10.1080/03610926.2014.901372
16. V. Guerrero, Estimating trends with percentage of smoothness chosen by the user, Int. Stat. Rev.,
76 (2008), 182–202. https://doi.org/10.1111/j.1751-5823.2008.00047.x
17. H. Ritchie, E. Mathieu, L. Rodés-Guirao, C. Appel, C. Giattino, E. Ortiz-Ospina, et al., Coronavirus pandemic (COVID-19), Our World in Data, 2020. Available from: https://ourworldindata.org/coronavirus.
18. S. Hansun, A new approach of moving average method in time series analysis, 2013 conference on
new media studies (CoNMedia), IEEE, 2013, 1–4.
19. Y. He, X. Wang, H. He, J. Zhai, B. Wang, Moving average based index for judging
the peak of COVID-19 epidemic, Int. J. Environ. Res. Pub. He.,17 (2021), 5288.
https://doi.org/10.3390/ijerph17155288
20. R. J. Hyndman, Moving averages, in International Encyclopedia of Statistical Science, Springer, Berlin, Heidelberg, 2011, 866-896. https://doi.org/10.1007/978-3-642-04898-2_380
21. C. U. Jamilla, R. G. Mendoza, V. M. P. Mendoza, Parameter estimation in neutral delay differential equations using genetic algorithm with multi-parent crossover, IEEE Access, 9 (2021), 131348-131364. https://doi.org/10.1109/ACCESS.2021.3113677
22. S. Katoch, S. S. Chauhan, V. Kumar, A review on genetic algorithm: Past, present, and future, Multimed. Tools Appl., 80 (2021), 8091-8126.
23. F. Knorr, Multidimensional Whittaker-Henderson graduation, Trans. Soc. Actuar.,36 (1984), 213–
255.
24. D. C. Lay, Linear algebra and its applications, 5 Eds., Pearson, Boston, 2016.
25. F. Macaulay, The Whittaker-Henderson method of graduation, in The Smoothing of Time Series, National Bureau of Economic Research, New York, 1931, 89-99.
26. J. L. Manejero, R. Mendoza, Variational approach to data graduation, Philipp. J. Sci.,149 (2020),
431–449.
27. F. Marcellan, Y. Xu, On Sobolev orthogonal polynomials, Expo. Math.,33 (2015), 308–352.
https://doi.org/10.1016/j.exmath.2014.10.002
28. F. Marcellan, M. Alfaro, M. L. Rezola, Orthogonal polynomials on Sobolev spaces: Old and new directions, J. Comput. Appl. Math., 48 (1993), 113-131. https://doi.org/10.1016/0377-0427(93)90318-6
29. E. Mathieu, H. Ritchie, E. Ortiz-Ospina, M. Roser, J. Hasell, C. Appel, et al., A global database of COVID-19 vaccinations, Nat. Hum. Behav., 5 (2021), 947-953. https://doi.org/10.1038/s41562-021-01122-8
30. R. Mendoza, S. Keeling, A two-phase segmentation approach to the impedance tomography
problem, Inverse Probl.,33 (2016), 015001. https://doi.org/10.1088/0266-5611/33/1/015001
31. D. B. Mumford, J. Shah, Optimal approximations by piecewise smooth functions and associated variational problems, Commun. Pure Appl. Math., 42 (1989), 577-685. https://doi.org/10.1002/cpa.3160420503
32. E. Nixdorf, M. Hannemann, M. Kreck, A. Schoßland, Hydrological records in 5 min
resolution of tributaries in the Mueglitz River Basin, Germany, Data Brief, 2021.
https://doi.org/10.1594/PANGAEA.927729
33. A. Nocon, W. Scott, An extension of the Whittaker-Henderson method of graduation, Scand.
Actuar. J.,2012 (2012), 70–79. https://doi.org/10.1080/03461238.2010.534257
34. K. R. O. Recio, R. G. Mendoza, Three-step approach to edge detection of texts, Philipp. J. Sci., 148 (2019), 193-211.
35. J. B. E. Riñon, R. Mendoza, A. A. de los Reyes V, V. Y. Belizario Jr., V. M. P. Mendoza, Management and control of schistosomiasis in Agusan del Sur, Philippines: A modeling study, Research Square, 2020.
36. T. J. Rivlin, Chebyshev polynomials, Courier Dover Publications, 2020.
37. S. Sharma, V. Kumar, Application of genetic algorithms in healthcare: A review, Next Gener. Healthc. Inform., 2021, 75-86. https://doi.org/10.1007/978-981-19-2416-3_5
38. W. I. Smirnow, Lehrgang der höheren Mathematik: Teil V, VEB Deutscher Verlag der Wissenschaften, Berlin, 1967.
39. A. N. Tikhonov, V. Y. Arsenin, Solutions of ill-posed problems, Wiley, New York, 1977.
40. L. Tribe, R. Smith, Modeling global outbreaks and proliferation of COVID-19, SIAM News, 2020.
41. G. Wahba, Spline models for observational data, Society for Industrial and Applied Mathematics,
1990.
42. H. Weinert, Efficient computation for Whittaker-Henderson smoothing, Comput. Stat. Data An., 52 (2007), 959-974. https://doi.org/10.1016/j.csda.2006.11.038
43. H. Yamada, A note on Whittaker-Henderson graduation: Bisymmetry of the smoother matrix,
Commun. Stat.-Theor. M.,49 (2020), 1629–1634. https://doi.org/10.1080/03610926.2018.1563183
44. H. Yamada, F. T. Jahra, Explicit formulas for the smoother weights of Whittaker-Henderson graduation of order 1, Commun. Stat.-Theor. M., 48 (2018), 3153-3161. https://doi.org/10.1080/03610926.2018.1476713
©2022 the Author(s), licensee AIMS Press. This
is an open access article distributed under the
terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0)