APPROXIMATION BY NEURAL NETWORK SAMPLING
OPERATORS IN MIXED LEBESGUE SPACES
ARPAN KUMAR DEY, A. SATHISH KUMAR, AND P. DEVARAJ
Abstract. In this paper, we prove a rate of approximation for neural network sampling operators activated by sigmoidal functions in the mixed Lebesgue norm, in terms of the averaged modulus of smoothness, for bounded measurable functions on a bounded domain. In order to achieve this result, we first establish that the averaged modulus of smoothness is finite for certain suitable subspaces of $L^{p,q}(\mathbb{R}\times\mathbb{R})$. Using the properties of the averaged modulus of smoothness, we estimate the rate of approximation of certain linear operators in the mixed Lebesgue norm. Then, as an application of these linear operators, we obtain a Jackson-type approximation theorem, which characterizes the rate of approximation of neural network operators in terms of the averaged modulus of smoothness in the mixed norm. Lastly, we discuss some examples of sigmoidal functions and, using these sigmoidal functions, we show the implementation of continuous and discontinuous functions by neural network operators.
1. Introduction and Preliminaries
In sampling and reconstruction, signals are usually defined on time domains or space fields. In practice, however, some signals are time-varying, which means that the signals live in time-space domains simultaneously. A suitable tool for analyzing such time-space signals is the mixed Lebesgue space, because in mixed Lebesgue spaces the integrability of each variable can be considered independently. This is a major difference from the classical Lebesgue spaces; see [8, 9, 23] and the references therein. Benedek initiated the study of mixed Lebesgue spaces, see [8, 9]. Inspired by Benedek's work, Fernandez [23] analysed singular integral operators for functions in mixed Lebesgue spaces. Further, Francia, Ruiz and Torrea [24] developed the theory of Calderón-Zygmund operators in the setting of mixed Lebesgue spaces. Recently, several researchers have analyzed sampling and reconstruction problems in mixed Lebesgue spaces, see [32, 30, 28, 31, 4, 29] and the references therein. In this paper, we analyze neural network operators in mixed Lebesgue spaces from the viewpoint of approximation theory.
The feed-forward neural network operators (FNNs) with one hidden layer are defined by
$$N_p(x) := \sum_{j=0}^{p} c_j\,\sigma(\langle \alpha_j \cdot x\rangle + \beta_j), \qquad x \in \mathbb{R}^n,\ n \in \mathbb{N},$$
2010 Mathematics Subject Classification. 41A35; 94A20; 41A25.
Key words and phrases. Linear Operators, Modulus of Smoothness, Degree of Approximation, Neural
Network Operators, Mixed-Lebesgue spaces.
where $\beta_j\in\mathbb{R}$, $0\le j\le p$, are called thresholds, $\alpha_j\in\mathbb{R}^n$ are the connection weights, and $c_j\in\mathbb{R}$ are the coefficients. Furthermore, $\langle \alpha_j\cdot x\rangle$ denotes the inner product of $\alpha_j$ and $x$, and $\sigma$ denotes the activation function of the neural network.
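As a quick illustration of how such a network is evaluated, the following minimal Python sketch (ours, not the authors' code) computes $N_p(x)$ for a single input $x\in\mathbb{R}^n$ with the logistic activation; the weights $\alpha_j$, thresholds $\beta_j$ and coefficients $c_j$ are random placeholder values, not quantities taken from the paper.

import numpy as np

def sigmoid(t):
    # logistic activation sigma(t) = 1 / (1 + e^{-t})
    return 1.0 / (1.0 + np.exp(-t))

def feedforward(x, alpha, beta, c, sigma=sigmoid):
    # Evaluate N_p(x) = sum_j c_j * sigma(<alpha_j, x> + beta_j).
    # alpha: (p+1, n) connection weights, beta: (p+1,) thresholds, c: (p+1,) coefficients.
    return float(np.dot(c, sigma(alpha @ x + beta)))

rng = np.random.default_rng(0)
p, n = 5, 3
alpha = rng.normal(size=(p + 1, n))
beta = rng.normal(size=p + 1)
c = rng.normal(size=p + 1)
print(feedforward(np.array([0.1, -0.2, 0.3]), alpha, beta, c))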
In the past three decades, approximation by NN operators has been widely analyzed by several researchers in various settings. Using tools from functional analysis, Cybenko analysed the approximation of continuous functions by NN operators in [11]. The approximation of continuous functions by NN operators with one and two hidden layers is discussed in [25]. Hornik et al. [27] approximated measurable functions by multilayer feedforward networks. Anastassiou [2] initiated the study of the rate of approximation by NN operators in terms of the modulus of smoothness. Further, he discussed some quantitative convergence results in [3]. Recently, various approximation results for neural network operators have been discussed in different settings, see [1, 20, 12, 13, 14, 16, 17, 18, 19, 15] and the references therein.
We begin with the definition of mixed Lebesgue spaces. For $1\le p,q<\infty$, let $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ denote the space of all complex-valued measurable functions $f$ defined on $\mathbb{R}\times\mathbb{R}^d$ such that
$$\|f\|_{L^{p,q}} := \left(\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}|f(x,y)|^q\,dy\right)^{\frac{p}{q}}dx\right)^{\frac1p}<\infty.$$
Further, let $L^{\infty,\infty}(\mathbb{R}\times\mathbb{R}^d)$ denote the set of all complex-valued measurable functions on $\mathbb{R}^{d+1}$ such that $\|f\|_{L^{\infty,\infty}}:=\operatorname{ess\,sup}|f|<\infty$. For $1\le p,q<\infty$, $\ell^{p,q}(\mathbb{Z}\times\mathbb{Z}^d)$ denotes the space of all complex sequences $c=c(k_1,k_2)$ ($k_1\in\mathbb{Z}$, $k_2\in\mathbb{Z}^d$) such that
$$\|c\|_{\ell^{p,q}}:=\left(\sum_{k_1\in\mathbb{Z}}\left(\sum_{k_2\in\mathbb{Z}^d}|c(k_1,k_2)|^q\right)^{\frac pq}\right)^{\frac1p}<\infty.$$
We denote by $\ell^{\infty,\infty}(\mathbb{Z}\times\mathbb{Z}^d)$ the space of all complex sequences on $\mathbb{Z}^{d+1}$ such that $\|c\|_{\ell^{\infty,\infty}}:=\sup_{k\in\mathbb{Z}^{d+1}}|c(k)|<\infty$. We note that $L^{p,p}(\mathbb{R}\times\mathbb{R}^d)=L^p(\mathbb{R}^{d+1})$ and $\ell^{p,p}(\mathbb{Z}\times\mathbb{Z}^d)=\ell^p(\mathbb{Z}^{d+1})$ for $1\le p<\infty$.
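For readers who wish to experiment numerically, the following short Python sketch (our own illustration, not part of the paper) approximates $\|f\|_{L^{p,q}}$ for $d=1$ by truncating the integrals to a box and sampling $f$ on a uniform grid; the test function, box and grid size are arbitrary choices. For $p=q$ the value approximates the usual $L^p$ norm, which serves as a quick sanity check.

import numpy as np

def mixed_norm(F, dx, dy, p, q):
    # F[i, j] ~ f(x_i, y_j) on a uniform grid with spacings dx, dy
    inner = np.sum(np.abs(F) ** q, axis=1) * dy        # approximates int |f(x_i, y)|^q dy
    outer = np.sum(inner ** (p / q)) * dx              # approximates int ( ... )^{p/q} dx
    return outer ** (1.0 / p)

x = np.linspace(-3.0, 3.0, 601)
y = np.linspace(-3.0, 3.0, 601)
X, Y = np.meshgrid(x, y, indexing="ij")
F = np.exp(-(X ** 2 + Y ** 2))                         # a concrete test function
print(mixed_norm(F, x[1] - x[0], y[1] - y[0], p=2, q=3))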
In this paper, we analyze neural network sampling operators in certain suitable subspaces of $L^{p,q}(\mathbb{R}\times\mathbb{R})$. We observe that the sampling operators are not well defined on all of $L^{p,q}(\mathbb{R}\times\mathbb{R})$, so we need to restrict them to certain suitable subspaces of $L^{p,q}(\mathbb{R}\times\mathbb{R})$. Motivated by the work of Bardaro et al. [6], we define the subspace $\Lambda^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ and analyze the sampling-type neural network operators in $\Lambda^{p,q}(\mathbb{R}\times\mathbb{R}^d)$. In order to define these subspaces, we need a certain special type of partition, called an admissible partition, which is defined below.
Definition 1. We say that a sequence $\Sigma_x=(x_k)_{k\in\mathbb{Z}}\subset\mathbb{R}$ is an admissible partition of $\mathbb{R}$ if
$$0<\underline{\Delta}:=\inf_{k\in\mathbb{Z}}(x_k-x_{k-1})\le\sup_{k\in\mathbb{Z}}(x_k-x_{k-1})=:\overline{\Delta}<\infty,$$
where $\overline{\Delta}$ and $\underline{\Delta}$ denote the upper and lower mesh size, respectively. Again, in $\mathbb{R}^d$, for every $i=1,2,\ldots,d$ we take an admissible partition along the $i$th axis, $\Sigma_i=(y_{i,j_i})_{j_i\in\mathbb{Z}}$, such that
$$0<\underline{\Delta}:=\min_{i=1,\ldots,d}\inf_{j_i\in\mathbb{Z}}(y_{i,j_i}-y_{i,j_i-1})\le\max_{i=1,\ldots,d}\sup_{j_i\in\mathbb{Z}}(y_{i,j_i}-y_{i,j_i-1})=:\overline{\Delta}<\infty.$$
A sequence $\Sigma_y=(y_j)_{j\in\mathbb{Z}^d}$, $y_j=(y_{1,j_1},\ldots,y_{d,j_d})$ and $j=(j_1,\ldots,j_d)$, is called an admissible sequence if it can be written as
$$\Sigma_y=\Sigma_1\times\Sigma_2\times\cdots\times\Sigma_d.$$
For $\Sigma_y$ we consider the cube
$$Q_j=\prod_{i=1}^{d}[y_{i,j_i-1},y_{i,j_i}),$$
with volume
$$\Delta_j:=\prod_{i=1}^{d}(y_{i,j_i}-y_{i,j_i-1}).$$
Now we introduce the notion of an admissible partition $\Sigma$ of $\mathbb{R}\times\mathbb{R}^d$ and the corresponding $l^{p,q}(\Sigma)$-norm of a function $f:\mathbb{R}\times\mathbb{R}^d\to\mathbb{C}$.

Definition 2. We say that a sequence $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ is an admissible sequence or partition if $\Sigma_x=(x_k)_{k\in\mathbb{Z}}$ and $\Sigma_y=(y_j)_{j\in\mathbb{Z}^d}$ are admissible partitions of $\mathbb{R}$ and $\mathbb{R}^d$, respectively. For the admissible partition $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ we construct
$$Q_{jk}=[x_{k-1},x_k)\times\prod_{i=1}^{d}[y_{i,j_i-1},y_{i,j_i})$$
with volume
$$\Delta_{jk}:=(x_k-x_{k-1})\times\prod_{i=1}^{d}(y_{i,j_i}-y_{i,j_i-1}).$$
The $l^{p,q}(\Sigma)$-norm of a function $f:\mathbb{R}\times\mathbb{R}^d\to\mathbb{C}$ is defined by
$$\|f\|_{l^{p,q}(\Sigma)}:=\left(\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|f(z)|^q\,\Delta_{jk}\right)^{\frac pq}\right)^{\frac1p},$$
where $z=(x,y)\in\mathbb{R}\times\mathbb{R}^d$.

Definition 3. Let $1\le p,q<\infty$. The space $\Lambda^{p,q}$ is defined by
$$\Lambda^{p,q}=\left\{f\in M(\mathbb{R}\times\mathbb{R}^d):\ \|f\|_{l^{p,q}(\Sigma)}<\infty\ \text{for every admissible partition }\Sigma\right\},$$
where $M(\mathbb{R}\times\mathbb{R}^d)$ denotes the class of bounded measurable functions on $\mathbb{R}\times\mathbb{R}^d$.
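To make the $l^{p,q}(\Sigma)$-norm concrete, here is a small Python sketch (our own, for $d=1$ and with the sums truncated to a finite box) that evaluates the cell suprema numerically; the test function, partition and number of sample points per cell are arbitrary choices, so this is only an approximation of the norm in Definition 2.

import numpy as np

def lpq_sigma_norm(f, x_nodes, y_nodes, p, q, samples_per_cell=20):
    # cells Q_{jk} = [x_{k-1}, x_k) x [y_{j-1}, y_j); the sup on each cell is approximated
    # by sampling, then weighted by the cell volume Delta_{jk}
    total = 0.0
    for k in range(1, len(x_nodes)):
        inner = 0.0
        for j in range(1, len(y_nodes)):
            xs = np.linspace(x_nodes[k - 1], x_nodes[k], samples_per_cell)
            ys = np.linspace(y_nodes[j - 1], y_nodes[j], samples_per_cell)
            X, Y = np.meshgrid(xs, ys, indexing="ij")
            cell_sup = np.max(np.abs(f(X, Y)))
            vol = (x_nodes[k] - x_nodes[k - 1]) * (y_nodes[j] - y_nodes[j - 1])
            inner += cell_sup ** q * vol
        total += inner ** (p / q)
    return total ** (1.0 / p)

f = lambda x, y: np.exp(-(x ** 2 + y ** 2))
nodes = np.linspace(-3.0, 3.0, 25)                     # a uniform admissible partition
print(lpq_sigma_norm(f, nodes, nodes, p=2, q=3))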
In this paper, we analyze the convergence behavior of linear operators in the setting of the mixed Lebesgue norm. In particular, we prove a rate of convergence for a family of linear operators $\{L_\eta:\Lambda^{p,q}\longrightarrow L^{p,q}(\mathbb{R}\times\mathbb{R}^d)\}_{\eta\in I}$, where $I$ is an index set. Furthermore, as an application of our analysis of these linear operators, we obtain the rate of convergence of the following neural network sampling operators in the mixed Lebesgue norm. Let $I=[-1,1]$. For $f:I\times I\to\mathbb{R}$, the neural network operator is given by
$$F_nf(x,y)=\frac{\displaystyle\sum_{k=-n}^{n}\sum_{j=-n}^{n}f\left(\frac kn,\frac jn\right)\psi_\sigma(nx-k,ny-j)}{\displaystyle\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)},$$
where $\psi_\sigma(x,y):=\phi_\sigma(x)\,\phi_\sigma(y)$ and $\phi_\sigma$ is defined by
$$\phi_\sigma(v):=\frac12\left[\sigma(v+1)-\sigma(v-1)\right],\qquad v\in\mathbb{R}.$$
Here, the sigmoidal function $\sigma$ is defined in the following way.

Definition 4. We say that a measurable function $\sigma:\mathbb{R}\to\mathbb{R}$ is a sigmoidal function if and only if $\lim_{x\to-\infty}\sigma(x)=0$ and $\lim_{x\to\infty}\sigma(x)=1$.

Throughout, $\sigma:\mathbb{R}\to\mathbb{R}$ is a non-decreasing sigmoidal function with $\sigma(2)>\sigma(0)$ satisfying the following assumptions:
$(A1)$ $\sigma(v)-\frac12$ is an odd function;
$(A2)$ $\sigma\in C^2(\mathbb{R})$ and $\sigma$ is concave for $v\ge 0$;
$(A3)$ $\sigma(v)=O(|v|^{-1-\alpha})$ as $v\to-\infty$, for some $\alpha>0$.
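The following Python sketch (ours, not the authors' implementation) makes the construction above explicit for the logistic sigmoidal function: it builds $\phi_\sigma$, the product kernel $\psi_\sigma$, and evaluates $F_nf$ at a single point; the test function is the one used later in Section 6, and the evaluation point is arbitrary.

import numpy as np

def sigma(v):
    # logistic sigmoidal function
    return 1.0 / (1.0 + np.exp(-v))

def phi(v):
    # phi_sigma(v) = [sigma(v + 1) - sigma(v - 1)] / 2
    return 0.5 * (sigma(v + 1.0) - sigma(v - 1.0))

def psi(u, v):
    # psi_sigma(u, v) = phi_sigma(u) * phi_sigma(v)
    return phi(u) * phi(v)

def F_n(f, n, x, y):
    # neural network sampling operator F_n f(x, y) on I x I = [-1, 1]^2
    k = np.arange(-n, n + 1)
    K, J = np.meshgrid(k, k, indexing="ij")
    w = psi(n * x - K, n * y - J)                      # kernel weights at the nodes (k/n, j/n)
    return np.sum(f(K / n, J / n) * w) / np.sum(w)

f = lambda x, y: x ** 2 * np.sin(x + y) + y ** 2 * np.cos(x * y)
print(F_n(f, n=20, x=0.3, y=-0.4), f(0.3, -0.4))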
The paper is organized as follows. Our main contribution is an estimate of the rate of approximation of certain linear operators in the mixed Lebesgue norm. To this end, in Section 2 we introduce the averaged modulus of smoothness for functions in $L^{p,q}(I)$, where $I\subset\mathbb{R}\times\mathbb{R}^d$, and analyze its properties in detail. In Section 3, we consider suitable subspaces of $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ for which the averaged modulus of smoothness is finite, and we prove several properties of these subspaces. Using these results, we prove the rate of convergence of linear operators in terms of the averaged modulus of smoothness in Section 4. As an application of Section 4, we obtain the rate of approximation of neural network operators in terms of the averaged modulus of smoothness in Section 5. In the last section, we discuss some examples of sigmoidal functions and, using these sigmoidal functions, we show the implementation of continuous and discontinuous functions by neural network operators.
2. The Averaged Modulus of Smoothness and its Properties
In this section, we introduce and study the properties of the averaged modulus of smoothness for functions in $L^{p,q}(I)$, where $I\subset\mathbb{R}\times\mathbb{R}^d$. Further, let
$$(\Delta^r_{h,\mathbf{h}}f)(x,y):=\sum_{j=0}^{r}(-1)^{r-j}\binom{r}{j}f(x+jh,\,y+j\mathbf{h}),$$
where $x\in\mathbb{R}$, $y\in\mathbb{R}^d$, $r\in\mathbb{N}$ and $\mathbf{h}=(h,\ldots,h)\in\mathbb{R}^d$.
Now we define the local modulus of smoothness for functions in $M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$ and in $M(\mathbb{R}\times\mathbb{R}^d)$. Let $x\in[a,b]$, $y_i\in[c_i,d_i]$, $0\le\delta^*\le\frac{b-a}{r}$ and $0\le\delta_i\le\frac{d_i-c_i}{r}$ for $i=1,2,\ldots,d$, and define $\delta:=\min_{i\in\{1,2,\ldots,d\}}\{\delta^*,\delta_i\}$.

Definition 5. Let $f\in M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$. Then the local modulus of smoothness is defined by
$$\omega_r\left(f,(x,y);\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right):=\sup\left\{|(\Delta^r_{h,\mathbf{h}}f)(t,s)|:\ t,\,t+rh\in\left[x-\tfrac{r\delta}{2},x+\tfrac{r\delta}{2}\right]\cap[a,b],\ s,\,s+r\mathbf{h}\in\left[y-\tfrac{r\delta}{2},y+\tfrac{r\delta}{2}\right]\cap\prod_{i=1}^{d}[c_i,d_i]\right\},$$
where the conditions on $s$ and $s+r\mathbf{h}$ are understood componentwise. For $f\in M(\mathbb{R}\times\mathbb{R}^d)$, the local modulus of smoothness at a point $(x,y)\in\mathbb{R}\times\mathbb{R}^d$ for $\delta\ge0$ is defined by
$$\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d)):=\sup\left\{|(\Delta^r_{h,\mathbf{h}}f)(t,s)|:\ t,\,t+rh\in\left[x-\tfrac{r\delta}{2},x+\tfrac{r\delta}{2}\right],\ s_i,\,s_i+rh\in\left[y_i-\tfrac{r\delta}{2},y_i+\tfrac{r\delta}{2}\right]\ \text{for all }i=1,\ldots,d\right\}.$$
Remark 1. For $f\in M(\mathbb{R}\times\mathbb{R}^d)$, the local modulus of smoothness coincides with the local modulus of smoothness of $f$ defined on suitable finite $(d+1)$-cells with enlarged boundary: for all $(x,y)\in[a,b]\times\prod_{i=1}^{d}[c_i,d_i]$ and all $c\ge\frac{r\delta}{2}$ with $\delta\ge0$ we have
$$\omega_r\left(f,(x,y);\delta;M\left([a-c,b+c]\times\prod_{i=1}^{d}[c_i-c,d_i+c]\right)\right)=\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d)).$$
The following observations can be made from the definition of the local modulus of smoothness:
(i) $\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))\le 2^r\sup_{(x,y)\in\mathbb{R}\times\mathbb{R}^d}|f(x,y)|$;
(ii) for $\delta\le\delta'$, with $\delta$ as given previously and $\delta'=\min_{i\in\{1,2,\ldots,d\}}\{\delta'^*,\delta'_i\}$,
$$\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))\le\omega_r(f,(x,y);\delta';M(\mathbb{R}\times\mathbb{R}^d)).$$
Now we define the $L^{p,q}$-averaged modulus of smoothness for functions in $M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$ and in $M(\mathbb{R}\times\mathbb{R}^d)$.
Definition 6. Let $1\le p,q<\infty$ and $f\in M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$. Then, for $0\le\delta\le\min_{i=1,\ldots,d}\left\{\frac{b-a}{r},\frac{d_i-c_i}{r}\right\}$, the $L^{p,q}$-averaged modulus of smoothness of order $r\in\mathbb{N}$ is defined by
$$\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}:=\left\|\omega_r\left(f;\cdot;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)\right\|_{L^{p,q}\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)}$$
$$=\left(\int_{[a,b]}\left(\int_{\prod_{i=1}^{d}[c_i,d_i]}\omega_r\left(f;(x,y);\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)^q dy\right)^{\frac pq}dx\right)^{\frac1p}.$$
Let $f\in M(\mathbb{R}\times\mathbb{R}^d)$ and $\delta\ge0$. Then the averaged modulus of smoothness is defined by
$$\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}:=\|\omega_r(f;\cdot;\delta;M(\mathbb{R}\times\mathbb{R}^d))\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}=\left(\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}\omega_r(f;(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))^q dy\right)^{\frac pq}dx\right)^{\frac1p}.$$
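Since the averaged modulus is the central quantity in what follows, a crude numerical sketch may help fix ideas. The Python snippet below (our own illustration, $d=1$, $r=1$) approximates $\omega_1(f,(x,y);\delta)$ on a uniform grid by the largest diagonal increment $|f(t+h,s+h)-f(t,s)|$ with both points inside the window of radius $\delta/2$, and then takes the discrete $L^{p,q}$-norm of the result; the grid, the box and $\delta$ are arbitrary choices, so the output is only an approximation of $\tau_1(f;\delta)_{p,q}$.

import numpy as np

def tau_1(F, step, delta, p, q):
    # F[i, j] ~ f(x_i, y_j) on a uniform grid with spacing `step`
    w = max(int(round(delta / (2 * step))), 1)         # half-window width in grid points
    n0, n1 = F.shape
    omega = np.zeros_like(F)
    for i in range(n0):
        for j in range(n1):
            i0, i1 = max(i - w, 0), min(i + w, n0 - 1)
            j0, j1 = max(j - w, 0), min(j + w, n1 - 1)
            best = 0.0
            for m in range(1, min(i1 - i0, j1 - j0) + 1):   # common diagonal offset h = m * step
                block = np.abs(F[i0 + m:i1 + 1, j0 + m:j1 + 1] - F[i0:i1 + 1 - m, j0:j1 + 1 - m])
                if block.size:
                    best = max(best, float(block.max()))
            omega[i, j] = best                              # approximate local modulus at (x_i, y_j)
    inner = np.sum(omega ** q, axis=1) * step               # mixed L^{p,q}-norm of the local modulus
    return (np.sum(inner ** (p / q)) * step) ** (1.0 / p)

x = np.linspace(-1.0, 1.0, 81)
X, Y = np.meshgrid(x, x, indexing="ij")
F = X ** 2 * np.sin(X + Y) + Y ** 2 * np.cos(X * Y)
print(tau_1(F, x[1] - x[0], delta=0.1, p=2, q=3))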
Proposition 1. Let $f\in M(\mathbb{R}\times\mathbb{R}^d)$, $r\in\mathbb{N}$, $1\le p,q<\infty$. Then we have
$$\lim_{n\to\infty}\tau_r\left(f;\delta;M\left([-n,n]\times[-n,n]^d\right)\right)_{p,q}=\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}.$$
Proof. Let $f\in M(\mathbb{R}\times\mathbb{R}^d)$ and $\delta\ge0$. For $n\ge r\delta$ and $j\in\mathbb{N}$ with $j\ge\frac{r\delta}{2}$ we have
$$\omega_r\left(f,(x,y);\delta;M\left([-n,n]\times[-n,n]^d\right)\right)\le\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))=\omega_r\left(f,(x,y);\delta;M\left([-n-j,n+j]\times[-n-j,n+j]^d\right)\right).$$
Taking the $L^{p,q}$-norm on both sides, we obtain
$$\tau_r\left(f;\delta;M\left([-n,n]\times[-n,n]^d\right)\right)_{p,q}=\left\|\omega_r\left(f,(x,y);\delta;M\left([-n,n]\times[-n,n]^d\right)\right)\right\|_{L^{p,q}}\le\left\|\omega_r\left(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d)\right)\right\|_{L^{p,q}}\le\tau_r\left(f;\delta;M\left([-n-j,n+j]\times[-n-j,n+j]^d\right)\right)_{p,q}.$$
We obtain the desired estimate by taking the limit as $n\to\infty$. Hence, the proof is completed. □
Lemma 1. Let $f\in M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$, $1\le q<p<\infty$, $\frac1p+\frac1{p'}=1$, $\frac1q+\frac1{q'}=1$. Then we have
(i) $\displaystyle\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{1,1}\le K_1\,\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}$,
(ii) $\displaystyle\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}\le K_2\left(\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{1,1}\right)^{\frac1p}$,
where
$$K_1:=(b-a)^{\frac1{p'}}\left(\prod_{i=1}^{d}(d_i-c_i)\right)^{\frac1{q'}},\qquad K_2:=\left(2^r\sup_{(x,y)\in[a,b]\times\prod_{i=1}^{d}[c_i,d_i]}|f(x,y)|\right)^{\frac1{p'}}.$$
Furthermore, from (i) and (ii) we have
$$\lim_{\delta\to0^+}\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}=0\iff\lim_{\delta\to0^+}\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{1,1}=0.$$
Proof. The lemma follows from the following two inequalities:
(a) $\displaystyle\|g\|_{1,1}\le(b-a)^{\frac1{p'}}\left(\prod_{i=1}^{d}(d_i-c_i)\right)^{\frac1{q'}}\|g\|_{p,q}$;
(b) $\displaystyle\|g\|_{p,q}\le\left(\sup_{(x,y)\in[a,b]\times\prod_{i=1}^{d}[c_i,d_i]}|g(x,y)|\right)^{\frac1{p'}}\|g\|_{1,1}^{\frac1p}$.
By applying Hölder's inequality twice, we obtain inequality (a). Now we prove the second inequality. By the definition of the mixed Lebesgue norm, we have
$$\|g\|_{p,q}=\left(\int_{[a,b]}\left(\int_{\prod_{i=1}^{d}[c_i,d_i]}|g(x,y)|^q dy\right)^{\frac pq}dx\right)^{\frac1p}.$$
Using Jensen's inequality, we obtain
$$\|g\|_{p,q}\le\left(\int_{[a,b]}\left(\int_{\prod_{i=1}^{d}[c_i,d_i]}|g(x,y)|^{q\cdot\frac pq}dy\right)dx\right)^{\frac1p}\le\left(\sup_{(x,y)\in[a,b]\times\prod_{i=1}^{d}[c_i,d_i]}|g(x,y)|\right)^{\frac{p-1}{p}}\left(\int_{[a,b]}\left(\int_{\prod_{i=1}^{d}[c_i,d_i]}|g(x,y)|\,dy\right)dx\right)^{\frac1p}\le\left(\sup_{(x,y)\in[a,b]\times\prod_{i=1}^{d}[c_i,d_i]}|g(x,y)|\right)^{\frac1{p'}}\|g\|_{1,1}^{\frac1p}.$$
Hence, we get the required estimate. □
In a similar way, we have the following lemma for $f\in M(\mathbb{R}\times\mathbb{R}^d)$.

Lemma 2. Let $f\in M(\mathbb{R}\times\mathbb{R}^d)$, $1\le q<p<\infty$, $\frac1p+\frac1{p'}=1$, $\frac1q+\frac1{q'}=1$. Then we have
$$\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}\le K\left(\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{1,1}\right)^{\frac1p},$$
where $K=\left(2^r\sup_{(x,y)\in\mathbb{R}\times\mathbb{R}^d}|f(x,y)|\right)^{\frac1{p'}}$. Furthermore, we have
$$\lim_{\delta\to0^+}\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{1,1}=0\implies\lim_{\delta\to0^+}\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}=0.$$
Proof. Following the lines of the proof of Lemma 1 and using Proposition 1, we can easily establish the lemma. □
Proposition 2. Let $1\le p,q<\infty$ and $f\in M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$. Then $f$ is Riemann integrable if and only if
$$\lim_{\delta\to0^+}\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}=0.$$
Proof. By Lemma 1, it is sufficient to show that $f$ is Riemann integrable if and only if
$$\lim_{\delta\to0^+}\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{1,1}=0.$$
Since the $L^{1,1}$ norm on $\mathbb{R}\times\mathbb{R}^d$ coincides with the $L^1$ norm on $\mathbb{R}^{d+1}$, the rest of the proof follows along the same lines as in [5]. □
Now, we define the integral moduli of smoothness in the mixed Lebesgue spaces.

Definition 7. (i) Let $1\le p,q<\infty$ and $f\in L^{p,q}\left((a,b)\times\prod_{i=1}^{d}(c_i,d_i)\right)$. Then the integral modulus of smoothness is defined by
$$\omega_r\left(f;\delta;L^{p,q}\left((a,b)\times\prod_{i=1}^{d}(c_i,d_i)\right)\right):=\sup_{0\le h\le\delta}\left(\int_{(a,b-rh)}\left(\int_{\prod_{i=1}^{d}(c_i,d_i-rh)}|(\Delta^r_{h,\mathbf{h}}f)(u,v)|^q dv\right)^{\frac pq}du\right)^{\frac1p}.$$
(ii) Similarly, for $f\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ the integral modulus of smoothness is defined by
$$\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d)):=\sup_{0\le h\le\delta}\left(\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}|(\Delta^r_{h,\mathbf{h}}f)(u,v)|^q dv\right)^{\frac pq}du\right)^{\frac1p}.$$
Proposition 3. If $f\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$, $r\in\mathbb{N}$ and $1\le p,q\le\infty$, then we have
$$\lim_{n\to\infty}\omega_r\left(f;\delta;L^{p,q}\left((-n,n)\times(-n,n)^d\right)\right)=\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d)).$$
Proof. From the definition of the integral modulus of smoothness, we have
$$\omega_r\left(f;\delta;L^{p,q}\left((-n,n)\times(-n,n)^d\right)\right)\le\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d)),\qquad\forall\,n\in\mathbb{N},$$
which implies that
$$\lim_{n\to\infty}\omega_r\left(f;\delta;L^{p,q}\left((-n,n)\times(-n,n)^d\right)\right)\le\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d)).\tag{2.1}$$
Again, from the definition of the integral modulus of smoothness, for $0\le h\le\delta$ we obtain
$$\lim_{n\to\infty}\omega_r\left(f;\delta;L^{p,q}\left((-n,n)\times(-n,n)^d\right)\right)=\lim_{n\to\infty}\sup_{0\le h\le\delta}\left(\int_{(-n,n-rh)}\left(\int_{(-n,n-rh)^d}|(\Delta^r_{h,\mathbf{h}}f)(u,v)|^q dv\right)^{\frac pq}du\right)^{\frac1p}$$
$$\ge\lim_{n\to\infty}\left(\int_{(-n,n-rh)}\left(\int_{(-n,n-rh)^d}|(\Delta^r_{h,\mathbf{h}}f)(u,v)|^q dv\right)^{\frac pq}du\right)^{\frac1p}=\left(\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}|(\Delta^r_{h,\mathbf{h}}f)(u,v)|^q dv\right)^{\frac pq}du\right)^{\frac1p}.$$
Taking the supremum over $0\le h\le\delta$, we get
$$\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d))\le\lim_{n\to\infty}\omega_r\left(f;\delta;L^{p,q}\left((-n,n)\times(-n,n)^d\right)\right).\tag{2.2}$$
We obtain the desired result by combining the estimates (2.1) and (2.2). □
Proposition 4. Let $f\in M\left((a,b)\times\prod_{i=1}^{d}(c_i,d_i)\right)$, $r\in\mathbb{N}$ and $1\le p,q\le\infty$. Then we have
$$\omega_r\left(f;\delta;L^{p,q}\left((a,b)\times\prod_{i=1}^{d}(c_i,d_i)\right)\right)\le\tau_r\left(f;\delta;M\left((a,b)\times\prod_{i=1}^{d}(c_i,d_i)\right)\right)_{p,q}.$$
Similarly, for $f\in M(\mathbb{R}\times\mathbb{R}^d)\cap L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ we have
$$\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d))\le\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}.$$
3. Subspaces of Lp,q(R×Rd)
In this section, we prove various properties of the subspaces of Lp,q(R×Rd).
Lemma 3. A function $f\in M(\mathbb{R}\times\mathbb{R}^d)$ belongs to $\Lambda^{p,q}$ if and only if there exists $\Delta^*>0$ such that $\|f\|_{l^{p,q}(\Sigma)}<\infty$ for each admissible sequence $\Sigma$ with lower mesh size $\underline{\Delta}\ge\Delta^*$.

Proof. By the definition of $\Lambda^{p,q}$, if $f\in\Lambda^{p,q}$ then $\|f\|_{l^{p,q}(\Sigma)}<\infty$ for any admissible partition $\Sigma$, so the forward implication is immediate. We now show the converse. Let $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ be an arbitrary admissible sequence with lower mesh size $\underline{\Delta}$, and choose $m\in\mathbb{N}$ such that $m\underline{\Delta}\ge\Delta^*$. Then we have
$$\Delta^*\le m\underline{\Delta}\le(x_{mk}-x_{m(k-1)})\le m\overline{\Delta},$$
$$\Delta^*\le m\underline{\Delta}\le(y_{i,mj_i}-y_{i,m(j_i-1)})\le m\overline{\Delta},\qquad\forall\,i=1,2,\ldots,d.$$
From the above estimates, we see that the subsequence $(x_{mk},y_{mj})_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ has lower mesh size greater than or equal to $\Delta^*$. Furthermore, for $N=(N_0,N_y)$ with $N_y=(N_1,\ldots,N_d)$, we get
$$\sum_{j\in\mathbb{Z}^d}\sup_{y\in Q_j}|f(z)|^q=\sum_{N_1=0}^{m-1}\cdots\sum_{N_d=0}^{m-1}\sum_{j\in\mathbb{Z}^d}\sup_{y\in Q_{mj-N_y}}|f(z)|^q.\tag{3.1}$$
For all such $N_y$ we have $Q_{mj-N_y}\subseteq\widetilde{Q}_{mj}:=\prod_{i=1}^{d}[y_{i,m(j_i-1)+N_i},y_{i,mj_i})$. Now, from relation (3.1), we get
$$\sum_{j\in\mathbb{Z}^d}\sup_{y\in Q_j}|f(z)|^q\le\sum_{N_1=0}^{m-1}\cdots\sum_{N_d=0}^{m-1}\sum_{j\in\mathbb{Z}^d}\sup_{y\in\widetilde{Q}_{mj}}|f(z)|^q=m^d\sum_{j\in\mathbb{Z}^d}\sup_{y\in\widetilde{Q}_{mj}}|f(z)|^q.$$
For $\infty>p\ge q\ge1$, we obtain
$$\left(\sum_{j\in\mathbb{Z}^d}\sup_{y\in Q_j}|f(z)|^q\right)^{\frac pq}\le m^{\frac{dp}{q}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{y\in\widetilde{Q}_{mj}}|f(z)|^q\right)^{\frac pq}.$$
Hence, we have
$$\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|f(z)|^q\,\Delta_{jk}\right)^{\frac pq}\le m^{\frac{dp}{q}+1}\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in\widetilde{Q}_{mjk}}|f(z)|^q\right)^{\frac pq}<\infty,$$
where $\widetilde{Q}_{mjk}=\widetilde{Q}_{mj}\times[x_{m(k-1)+N_0},x_{mk})$ and these cubes correspond to the admissible sequence $(x_{mk},y_{mj})_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$. By the definition of $\Lambda^{p,q}$, we obtain the converse part for $p\ge q$. Now, from [26], if $p_1>p_2>0$ then $\|f\|_{L^{p_2,p_1}}\le\|f\|_{L^{p_1,p_2}}$, so we get the result when $p<q$. Thus, the proof is completed. □
Proposition 5. If $1\le p,q<\infty$, then $\Lambda^{p,q}$ is a proper subspace of $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$.

Proof. For $f\in\Lambda^{p,q}$, we have to show that
$$\left(\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}|f(x,y)|^q dy\right)^{\frac pq}dx\right)^{\frac1p}<+\infty.$$
Let us take the admissible sequence $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$, where $x_k=k$ and $y_j=j$, $j=(j_1,j_2,\ldots,j_d)$. Then the corresponding cube is given by
$$Q_{jk}=[k-1,k)\times\prod_{i=1}^{d}[j_i-1,j_i)$$
with volume $\Delta_{jk}=1$ for all $k\in\mathbb{Z}$, $j\in\mathbb{Z}^d$. For $p>q\ge1$, we have
$$\left(\int_{\mathbb{R}^d}|f(x,y)|^q dy\right)^{\frac pq}\le\left(\sum_{j\in\mathbb{Z}^d}\sup_{y\in Q_j}|f(x,y)|^q\right)^{\frac pq}.$$
In view of the definition of $\Lambda^{p,q}$ and the above inequality, we obtain
$$\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}|f(x,y)|^q dy\right)^{\frac pq}dx\le\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|f(z)|^q\right)^{\frac pq}<\infty.$$
Following Corollary 3.5.2 in [26], we obtain the result for $p<q$. This completes the proof. □
Now we want to find some suitable subspaces of Lp,q(R×Rd) for which the τ−modulus
is finite. We consider the following subspaces.
Definition 8. Let $1\le p,q<\infty$ and $r\in\mathbb{N}$. Then we define the following spaces:
(i) the space $\mathcal{F}^{p,q}$ by
$$\mathcal{F}^{p,q}:=\left\{f\in M(\mathbb{R}\times\mathbb{R}^d):\ \text{there exist }\eta,\zeta>0\text{ such that }f(x,y)=O\left((1+|x|)^{-\frac1p-\eta}(1+\|y\|)^{-\frac1q-\zeta}\right)\right\};$$
(ii) the space $\Omega^{p,q}$ by
$$\Omega^{p,q}:=\left\{f\in M(\mathbb{R}\times\mathbb{R}^d):\ |f(x,y)|\le g(x,y),\ g\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d),\ g\ge0,\ \text{even and non-increasing}\right\};$$
(iii) the Sobolev space $W^r(L^{p,q}(\mathbb{R}\times\mathbb{R}^d))$ by
$$W^r(L^{p,q}(\mathbb{R}\times\mathbb{R}^d)):=\left\{f\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d):\ f(x,y)=\phi(x,y)\ \text{a.e.},\ \phi\in AC^r_{loc}(\mathbb{R}\times\mathbb{R}^d)\ \text{and}\ D^\beta\phi\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d)\ \forall\,|\beta|\le r\right\},$$
where $D^\beta=\dfrac{\partial^{\,n_0+|N|}}{\partial x^{n_0}\partial y_1^{n_1}\partial y_2^{n_2}\cdots\partial y_d^{n_d}}$, $\beta=(n_0,N)$, $N=(n_1,n_2,\ldots,n_d)$ and $|\beta|=n_0+\sum_{i=1}^{d}n_i$.
Proposition 6. The following statements hold:
(i) $\Omega^{p,q}$ and $W^r(L^{p,q}(\mathbb{R}\times\mathbb{R}^d))$ are linear spaces;
(ii) $\mathcal{F}^{p,q}\subsetneq\Omega^{p,q}\subsetneq\Lambda^{p,q}$.

Proof. By the definitions of $\Omega^{p,q}$ and $W^r(L^{p,q}(\mathbb{R}\times\mathbb{R}^d))$, it is clear that these two spaces are linear. Let $f\in\mathcal{F}^{p,q}$. Then $|f(x,y)|$ is dominated by a non-negative, even, non-increasing function belonging to $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$, and hence $f\in\Omega^{p,q}$. Let $f\in\Omega^{p,q}$. Then, by the definition of $\Omega^{p,q}$, $f\in M(\mathbb{R}\times\mathbb{R}^d)$ and there exists a function $g\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ such that $|f(x,y)|\le g(x,y)$. Now let $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ be an arbitrary admissible partition of $\mathbb{R}\times\mathbb{R}^d$. We have to show that $\|f\|_{l^{p,q}(\Sigma)}<\infty$. For $p\ge q$, we have
$$\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|f(z)|^q\,\Delta_{jk}\right)^{\frac pq}\le\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|g(z)|^q\,\Delta_{jk}\right)^{\frac pq}\le(\overline{\Delta})^{\frac pq}\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|g(z)|^q\right)^{\frac pq}\le(\overline{\Delta})^{\frac pq}\int_{\mathbb{R}}\left(\int_{\mathbb{R}^d}|g(z)|^q dy\right)^{\frac pq}dx<\infty.$$
A similar result holds for $p\le q$ by Corollary 3.5.2 in [26]. This implies that $\Omega^{p,q}\subset\Lambda^{p,q}$. □
Proposition 7. Let $f\in\Lambda^{p,q}$, $1\le p,q<\infty$ and $r\in\mathbb{N}$. Then we have
$$\omega_r(f,\cdot;\delta;M(\mathbb{R}\times\mathbb{R}^d))\in\Lambda^{p,q}\qquad\text{for every }\delta\ge0.$$
Proof. Let $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ be an admissible partition with $\underline{\Delta}\ge\frac{r\delta}{2}$. From Lemma 3, it is sufficient to show that
$$\|\omega_r(f,\cdot;\delta;M(\mathbb{R}\times\mathbb{R}^d))\|_{l^{p,q}(\Sigma)}<\infty.$$
Now, for $z=(x,y)$, $x\in\mathbb{R}$ and $y=(y_1,\ldots,y_d)\in\mathbb{R}^d$, we consider
$$C_\delta(z)=\left[x-\frac{r\delta}{2},x+\frac{r\delta}{2}\right]\times\prod_{i=1}^{d}\left[y_i-\frac{r\delta}{2},y_i+\frac{r\delta}{2}\right].$$
Then, by observation (i) on the local modulus of smoothness, we have
$$\omega_r(f,z;\delta;M(\mathbb{R}\times\mathbb{R}^d))\le2^r\sup_{\eta\in C_\delta(z)}|f(\eta)|.$$
There exists $P=(p_0,\mathbf{p})$, $\mathbf{p}=(p_1,\ldots,p_d)$, such that
$$C_\delta(z)\subset[x_{k-2},x_{k+1})\times\prod_{i=1}^{d}[y_{i,j_i-2},y_{i,j_i+1})=\bigcup_{p_0=-1}^{1}\bigcup_{p_1=-1}^{1}\cdots\bigcup_{p_d=-1}^{1}Q_{jk+P}.$$
Let us consider the case $p\ge q$. We have
$$\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}|\omega_r(f,z;\delta;M(\mathbb{R}\times\mathbb{R}^d))|^q\right)^{\frac pq}\le2^{rp}\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup\left\{|f(\eta)|^q:\ \eta\in\bigcup_{p_0=-1}^{1}\bigcup_{p_1=-1}^{1}\cdots\bigcup_{p_d=-1}^{1}Q_{jk+P}\right\}\right)^{\frac pq}$$
$$\le2^{rp}\sum_{p_0=-1}^{1}\sum_{p_1=-1}^{1}\cdots\sum_{p_d=-1}^{1}\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{\eta\in Q_{jk+P}}|f(\eta)|^q\right)^{\frac pq}\le2^{rp}\,3^{d+1}\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{\eta\in Q_{jk}}|f(\eta)|^q\right)^{\frac pq}<\infty.$$
Similarly, the result holds for $p\le q$ by Corollary 3.5.2 in [26]. This completes the proof of the assertion. □

Corollary 1. If $f\in\Lambda^{p,q}$, $1\le p,q<\infty$, $\delta\ge0$ and $r\in\mathbb{N}$, then we have
$$\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}<\infty.$$
Proof. The proof follows directly from the above proposition. □
Definition 9. The space $R_{loc}(\mathbb{R}\times\mathbb{R}^d)$ is defined by
$$R_{loc}(\mathbb{R}\times\mathbb{R}^d):=\{f:\mathbb{R}\times\mathbb{R}^d\to\mathbb{C}:\ f\ \text{is locally Riemann integrable on}\ \mathbb{R}\times\mathbb{R}^d\}.$$
Proposition 8. For $f\in\Lambda^{p,q}\cap R_{loc}(\mathbb{R}\times\mathbb{R}^d)$, $1\le p,q<\infty$, $r\in\mathbb{N}$, we have
$$\lim_{\delta\to0}\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}=0.$$
Proof. In view of Lemma 2, it is enough to show that the result holds for the $L^{1,1}$-norm. Let $\epsilon>0$ be given. For sufficiently large $N>0$, setting $A=(N,\infty)\cup(-\infty,-N)$, we have
$$\int_A\int_{A^d}\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))\,dy\,dx<\epsilon.\tag{3.2}$$
Then for all $0\le\delta\le\frac2r$, we get
$$\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{1,1}=\int_{[-N,N]}\int_{[-N,N]^d}\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))\,dy\,dx+\int_A\int_{A^d}\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))\,dy\,dx\le I_N+\epsilon,$$
where
$$I_N=\int_{[-N,N]}\int_{[-N,N]^d}\omega_r(f,(x,y);\delta;M(\mathbb{R}\times\mathbb{R}^d))\,dy\,dx.$$
From Remark 1 (applied with $c=1\ge\frac{r\delta}{2}$), we obtain
$$I_N\le\tau_r\left(f;\delta;M\left([-N-1,N+1]\times[-N-1,N+1]^d\right)\right)_{1,1}.$$
Using property (iii) of the averaged modulus of smoothness, we get
$$\lim_{\delta\to0}\tau_1(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{1,1}=0\implies\lim_{\delta\to0}\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{1,1}=0.$$
Since $f$ is locally Riemann integrable on $\mathbb{R}\times\mathbb{R}^d$, it is Riemann integrable on $[-N-1,N+1]\times[-N-1,N+1]^d$. Therefore, from Proposition 2, we have
$$\lim_{\delta\to0^+}\tau_r\left(f;\delta;M\left([-N-1,N+1]\times[-N-1,N+1]^d\right)\right)_{1,1}=0.$$
This completes the proof. □
Proposition 9. Let $f\in M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)$, $1\le q\le p<\infty$, $r\in\mathbb{N}$ and
$$\lim_{\delta\to0^+}\frac{\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}}{\delta^r}=0.$$
Then $f$ is a polynomial of degree $\le(d+1)(r-1)$.

Proof. Let us define
$$\phi(x):=f(x,y_1,\ldots,y_d),\ \text{where each }y_i,\ i=1,2,\ldots,d,\ \text{is held constant},$$
and similarly, for $j=1,2,\ldots,d$, we define
$$\phi(y_j):=f(x,y_1,\ldots,y_d),\ \text{where }x\ \text{and each }y_i,\ i\ne j,\ \text{are held constant}.$$
We show that each of $\phi(x)$ and $\phi(y_j)$, $j=1,2,\ldots,d$, is a polynomial of degree $\le r-1$. In view of Proposition 19 of [6], Lemma 2 and the hypothesis, we obtain
$$\lim_{\delta\to0^+}\frac{\tau_r\left(f;\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}}{\delta^r}=0\implies\lim_{\delta\to0^+}\frac{\tau_r\left(\phi(x);\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{p,q}}{\delta^r}=0$$
$$\implies\lim_{\delta\to0^+}\frac{\tau_r\left(\phi(x);\delta;M\left([a,b]\times\prod_{i=1}^{d}[c_i,d_i]\right)\right)_{1,1}}{\delta^r}=0\implies\lim_{\delta\to0^+}\frac{\tau_r(\phi(x);\delta;M([a,b]))_{1}}{\delta^r}=0\implies\phi(x)\ \text{is a polynomial of degree}\le r-1.$$
Similarly, each $\phi(y_j)$ is a polynomial of degree $\le r-1$. Thus, $f$ is a polynomial of degree $\le\deg\phi(x)+\sum_{j=1}^{d}\deg\phi(y_j)\le(d+1)(r-1)$. □

Proposition 10. If $f\in\Lambda^{p,q}$, $1\le p,q<\infty$, $\delta\ge0$, $r\in\mathbb{N}$ and
$$\lim_{\delta\to0^+}\frac{\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}}{\delta^r}=0,$$
then $f(x,y)=0$ for all $(x,y)\in\mathbb{R}\times\mathbb{R}^d$.

Proof. From Proposition 1, we have
$$\lim_{n\to\infty}\tau_r\left(f;\delta;M\left([-n,n]\times[-n,n]^d\right)\right)_{p,q}=\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}.$$
In view of Proposition 9, this implies that $f$ is a polynomial on $[-n,n]\times[-n,n]^d$ for arbitrary $n$. Since $f\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$, we get $f(x,y)=0$ for all $(x,y)\in[-n,n]\times[-n,n]^d$. Since $n$ is arbitrary, we obtain $f(x,y)=0$ for all $(x,y)\in\mathbb{R}\times\mathbb{R}^d$. □
4. Error Estimation and an Interpolation Theorem

We define the modified Steklov function for $f\in\Lambda^{p,q}$ as follows:
$$f_{r,h}=-\sum_{j=1}^{r}(-1)^j\binom{r}{j}F_{r,h_j},$$
where $h_j=\frac{jh}{r}$, $h>0$, $r\in\mathbb{N}$, and the usual Steklov function is
$$F_{r,h}(x,y)=\frac{1}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}f\left(x+\sum_{i=1}^{r}u_i,\ y+\sum_{i=1}^{r}v_i\right)dv_1\cdots dv_r\,du_1\cdots du_r,$$
where $u_i\in\mathbb{R}$ and $v_i\in\mathbb{R}^d$ for $i=1,2,\ldots,r$.
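A simple Monte Carlo sketch in Python (our own illustration, $d=1$) may clarify these averages: $F_{r,h}$ averages $f$ over $r$ independent shifts drawn from $[0,h]$ in each variable, and $f_{r,h}$ is the alternating binomial combination of $F_{r,jh/r}$; the test function, the point of evaluation and the sample size are arbitrary choices.

import numpy as np
from math import comb

rng = np.random.default_rng(1)

def steklov_F(f, r, h, x, y, n_samples=4000):
    # Monte Carlo estimate of F_{r,h}(x, y): average of f(x + u_1 + ... + u_r, y + v_1 + ... + v_r)
    u = rng.uniform(0.0, h, size=(n_samples, r)).sum(axis=1)
    v = rng.uniform(0.0, h, size=(n_samples, r)).sum(axis=1)
    return float(np.mean(f(x + u, y + v)))

def modified_steklov(f, r, h, x, y):
    # f_{r,h} = - sum_{j=1}^{r} (-1)^j binom(r, j) F_{r, j h / r}
    return -sum((-1) ** j * comb(r, j) * steklov_F(f, r, j * h / r, x, y)
                for j in range(1, r + 1))

f = lambda x, y: x ** 2 * np.sin(x + y) + y ** 2 * np.cos(x * y)
print(f(0.2, 0.5), modified_steklov(f, r=2, h=0.05, x=0.2, y=0.5))

For small $h$ the value of $f_{r,h}$ is close to $f$ itself, which is consistent with the error estimates proved below.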
Proposition 11. If $f\in\Lambda^{p,q}$, $1\le p\le q<\infty$, $r\in\mathbb{N}$ and $h>0$, then we have $F_{r,h}\in\Lambda^{p,q}$.

Proof. By using Jensen's inequality, we have
$$|F_{r,h}(x,y)|^q\le\frac{1}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}\left|f\left(x+\sum_{i=1}^{r}u_i,\ y+\sum_{i=1}^{r}v_i\right)\right|^q dv_1\cdots dv_r\,du_1\cdots du_r$$
$$\le\frac{2^q}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}\left|f\left(x+\sum_{i=1}^{r}u_i,\ y+\sum_{i=1}^{r}v_i\right)-f(x,y)\right|^q dv_1\cdots dv_r\,du_1\cdots du_r+\frac{2^q}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}|f(x,y)|^q dv_1\cdots dv_r\,du_1\cdots du_r=:I_1+I_2.$$
Now, for any admissible sequence $(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ we obtain
$$I_1=\frac{2^q}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}\left|\Delta^1_{\sum_{i=1}^{r}u_i,\,\sum_{i=1}^{r}v_i}f(x,y)\right|^q dv_1\cdots dv_r\,du_1\cdots du_r$$
$$\le\frac{2^q}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}\left[\omega_1(f,(x,y);2(r+1)h;M(\mathbb{R}\times\mathbb{R}^d))\right]^q dv_1\cdots dv_r\,du_1\cdots du_r\le2^q\sup_{(x,y)\in Q_{jk}}\left[\omega_1(f,(x,y);2(r+1)h;M(\mathbb{R}\times\mathbb{R}^d))\right]^q.$$
For $I_2$, we have
$$I_2\le2^q\sup_{(x,y)\in Q_{jk}}|f(x,y)|^q.$$
Combining $I_1$ and $I_2$, we obtain
$$\sup_{(x,y)\in Q_{jk}}|F_{r,h}(x,y)|^q\le2^q\sup_{(x,y)\in Q_{jk}}\left[\omega_1(f,(x,y);2(r+1)h;M(\mathbb{R}\times\mathbb{R}^d))\right]^q+2^q\sup_{(x,y)\in Q_{jk}}|f(x,y)|^q.$$
Now, taking the summation over $j\in\mathbb{Z}^d$ on both sides and using the triangle inequality, we get
$$\left(\sum_{j\in\mathbb{Z}^d}\sup_{(x,y)\in Q_{jk}}|F_{r,h}(x,y)|^q\right)^{\frac pq}\le2^p\left(\sum_{j\in\mathbb{Z}^d}\sup_{(x,y)\in Q_{jk}}\left[\omega_1(f,(x,y);2(r+1)h;M(\mathbb{R}\times\mathbb{R}^d))\right]^q\right)^{\frac pq}+2^p\left(\sum_{j\in\mathbb{Z}^d}\sup_{(x,y)\in Q_{jk}}|f(x,y)|^q\right)^{\frac pq}.$$
Again, summing over all $k\in\mathbb{Z}$ on both sides and in view of Proposition 7, we obtain $F_{r,h}\in\Lambda^{p,q}$. □

We obtain the following corollary as an immediate consequence of the foregoing proposition.

Corollary 2. If $f\in\Lambda^{p,q}$, $1\le p\le q<\infty$, $r\in\mathbb{N}$ and $h>0$, then $f_{r,h}\in\Lambda^{p,q}$.
In order to prove Lemma 4, we recall the following theorem from [7].

Theorem 1. For each $\beta\in\mathbb{N}^d$ there are natural numbers $N$, $b_k\in\mathbb{N}$ and vectors $u^k,v^k\in\mathbb{R}^d$, $u^k=(u^k_1,\ldots,u^k_d)$, $v^k=(v^k_1,\ldots,v^k_d)$, $k=1,2,\ldots,N$, such that
$$v^k_i,\ v^k_i+|\beta|\,u^k_i\le|\beta_i|,\qquad i=1,\ldots,d,$$
and for all $h\in X^d$ we have
$$\Delta^\beta(h)=\sum_{k=1}^{N}b_k\,\Delta^{|\beta|}\!\left(\sum_{i=1}^{d}u^k_i h_i\right)T\!\left(\sum_{i=1}^{d}v^k_i h_i\right),$$
where $T:Y^X\to Y^X$ is the translation operator defined by $T(h)f(x)=f(x+h)$ and $X,Y$ are linear spaces.
Lemma 4. If $f\in L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$, $1\le p\le q<\infty$, $r\in\mathbb{N}$ and $h>0$, then there exists a function $f_{r,h}$ with the following estimates:
(a) $\|f-f_{r,h}\|_{p,q}\le K(r,d)\,\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d))$;
(b) $\delta^r\|D^\beta f_{r,h}\|_{p,q}\le K(r,d)\,\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d))$ for $|\beta|=r$;
(c) $\delta^{|\beta|}\|D^\beta f_{r,h}\|_{p,q}\le K(r,d)\,\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d))$,
where $K(r,d)$ is a positive constant depending only on $r$ and $d$.

Proof. Let us take
$$f_{r,h}(x,y)=-\frac{1}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}\sum_{j=1}^{r}(-1)^j\binom{r}{j}f\left(x+\frac jr\sum_{i=1}^{r}u_i,\ y+\frac jr\sum_{i=1}^{r}v_i\right)dv_1\cdots dv_r\,du_1\cdots du_r.$$
Then, we have
$$\|f-f_{r,h}\|_{p,q}\le\frac{1}{h^{r(d+1)}}\int_{[0,h]}\!\!\cdots\!\int_{[0,h]}\int_{[0,h]^d}\!\!\cdots\!\int_{[0,h]^d}\left\|\Delta^r_{\frac{\sum_{i=1}^{r}u_i}{r},\,\frac{\sum_{i=1}^{r}v_i}{r}}f(x,y)\right\|_{p,q}dv_1\cdots dv_r\,du_1\cdots du_r.$$
Using the properties of the integral modulus of smoothness, we get
$$\|f-f_{r,h}\|_{p,q}\le\omega_r(f;\sqrt{d}\,\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d))\le K(r,d)\,\omega_r(f;\delta;L^{p,q}(\mathbb{R}\times\mathbb{R}^d)),$$
where $K(r,d)$ is a positive constant depending only on $r$ and $d$. To prove the remaining two estimates, we represent the usual forward difference in terms of mixed differences as follows:
$$\Delta^\beta_{h,\mathbf{h}}=\Delta^{n_0}_{h_0}\Delta^{|N|}_{\mathbf{h}}=\Delta^{n_0}_{h_0}\Delta^{n_1}_{h_1}\cdots\Delta^{n_d}_{h_d},$$
where $h_0=(h,\mathbf{0})$, $h_i=(0,\mathbf{h}_i)$, $\mathbf{0}=(0,\ldots,0)\in\mathbb{R}^d$ and $\mathbf{h}_i=(0,\ldots,h,\ldots,0)\in\mathbb{R}^d$ with $h$ in the $i$th position. We know that
$$f_{r,h}=-\sum_{j=1}^{r}(-1)^j\binom{r}{j}F_{r,h_j},$$
where $F_{r,h}$ is the usual Steklov function. Now, we claim that
$$D^\beta F_{r,h_j}(x,y)=h^{-|\beta|}\,\Delta^\beta_{\frac{jh}{r},\,\frac{j\mathbf{h}}{r}}f(x,y).$$
The proof of this claim follows from the above representation of the finite difference together with Proposition 21(i); the representation allows us to apply the one-dimensional estimate recursively, $(d+1)$ times, in the different variables to get the required estimate. Thus, we obtain
$$D^\beta f_{r,h}(x,y)=-h^{-|\beta|}\sum_{j=1}^{r}(-1)^j\binom{r}{j}\Delta^\beta_{\frac{jh}{r},\,\frac{j\mathbf{h}}{r}}f(x,y).$$
Applying Theorem 1, we have the desired estimate. □

Proposition 12. If $f\in\Lambda^{p,q}$, $1\le p\le q<\infty$, $r\in\mathbb{N}$ and $h>0$, then there exists a function $f_{r,h}\in\Lambda^{p,q}$ with the following properties:
(a) $f_{r,h}\in W^r(L^{p,q}(\mathbb{R}\times\mathbb{R}^d))\cap C(\mathbb{R}\times\mathbb{R}^d)$ and for its $r$th partial derivatives $D^\beta f_{r,h}$ with $|\beta|=r$ we have
$$\|D^\beta f_{r,h}\|_{p,q}\le K(r,d)\,h^{-r}\,\tau_r(f;\delta;M(\mathbb{R}\times\mathbb{R}^d))_{p,q};$$
(b) $|f(x,y)-f_{r,h}(x,y)|\le\omega_r(f;(x,y);2(r+1)h;M(\mathbb{R}\times\mathbb{R}^d))$;
(c) $\|f-f_{r,h}\|_{p,q}\le K(r,d)\,\tau_r(f;2(r+1)h;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}$.

Proof. Applying Lemma 4 and Proposition 4, we immediately obtain part (a). Using an analysis similar to that in Lemma 4, we obtain (b). Taking the $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ norm on both sides of (b), we obtain (c). □
Proposition 13. Let $f\in\Lambda^{p,q}$, $1\le q\le p<\infty$, $r\in\mathbb{N}$ and let $\Sigma$ be an admissible partition with upper mesh size $\overline{\Delta}$. Then we have
$$\|\omega_r(f;\cdot;2h;M(\mathbb{R}\times\mathbb{R}^d))\|_{l^{p,q}(\Sigma)}\le c(r)\,\tau_r\left(f;h+\frac{\overline{\Delta}}{r};M(\mathbb{R}\times\mathbb{R}^d)\right)_{p,q},$$
where $c(r)$ is a constant depending only on $r$.

Proof. Let $\Sigma=(x_k,y_j)_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}$ be an admissible sequence with upper mesh size $\overline{\Delta}$. Fix $k\in\mathbb{Z}$, $j\in\mathbb{Z}^d$ and $z=(x,y)$, $y=(y_1,\ldots,y_d)$. Let $\xi=(u,\eta)$, $\eta=(\eta_1,\ldots,\eta_d)\in\mathbb{R}^d$ and $u\in\mathbb{R}$ be such that $z,\xi\in Q_{jk}$. Then we have
$$[x-rh,x+rh]\times\prod_{i=1}^{d}[y_i-rh,y_i+rh]\subseteq[u-rh-\overline{\Delta},u+rh+\overline{\Delta}]\times\prod_{i=1}^{d}[\eta_i-rh-\overline{\Delta},\eta_i+rh+\overline{\Delta}]$$
and
$$\omega_r(f;z;2h;M(\mathbb{R}\times\mathbb{R}^d))\le\omega_r\left(f;\xi;2\left(h+\frac{\overline{\Delta}}{r}\right);M(\mathbb{R}\times\mathbb{R}^d)\right),\qquad\text{for }z,\xi\in Q_{jk}.$$
Then, we have
$$\sup_{z\in Q_{jk}}\omega_r(f;z;2h;M(\mathbb{R}\times\mathbb{R}^d))^q\,\Delta_{jk}\le\int_{Q_{jk}}\omega_r\left(f;\xi;2\left(h+\frac{\overline{\Delta}}{r}\right);M(\mathbb{R}\times\mathbb{R}^d)\right)^q d\xi,$$
which implies that
$$\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}\omega_r(f;z;2h;M(\mathbb{R}\times\mathbb{R}^d))^q\,\Delta_{jk}\le\sum_{j\in\mathbb{Z}^d}\int_{Q_{jk}}\omega_r\left(f;\xi;2\left(h+\frac{\overline{\Delta}}{r}\right);M(\mathbb{R}\times\mathbb{R}^d)\right)^q d\xi.$$
Then we have
$$\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\sup_{z\in Q_{jk}}\omega_r(f;z;2h;M(\mathbb{R}\times\mathbb{R}^d))^q\,\Delta_{jk}\right)^{\frac pq}\le\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}^d}\int_{Q_{jk}}\omega_r\left(f;\xi;2\left(h+\frac{\overline{\Delta}}{r}\right);M(\mathbb{R}\times\mathbb{R}^d)\right)^q d\xi\right)^{\frac pq}.$$
Therefore, we have
$$\|\omega_r(f;\cdot;2h;M(\mathbb{R}\times\mathbb{R}^d))\|_{l^{p,q}(\Sigma)}\le c(r)\,\tau_r\left(f;h+\frac{\overline{\Delta}}{r};M(\mathbb{R}\times\mathbb{R}^d)\right)_{p,q}.$$
This completes the proof of the proposition. □
Now we discuss the basic interpolation theorem for the $L^{p,q}$-averaged modulus of smoothness in $M(\mathbb{R}\times\mathbb{R}^d)$. Let $\{L_\eta:\Lambda^{p,q}\longrightarrow L^{p,q}(\mathbb{R}\times\mathbb{R}^d)\}_{\eta\in I}$ be a family of linear operators, $I$ being an index set. Further, let $\{\Sigma_\eta\}_{\eta\in I}$ be a family of admissible partitions $\left((x_{k,\eta},y_{j,\eta})_{k\in\mathbb{Z},\,j\in\mathbb{Z}^d}\right)$ with upper and lower mesh sizes $\overline{\Delta}_\eta$ and $\underline{\Delta}_\eta$, respectively, and with $\Delta_{jk,\eta}$ as defined previously. We have the following theorem.

Theorem 2. Let $\{\Sigma_\eta\}_{\eta\in I}$ be a family of admissible partitions with upper mesh size $\overline{\Delta}_\eta$. Let $\{L_\eta\}_{\eta\in I}$ be a family of linear operators from $\Lambda^{p,q}$ into $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$, $1\le q\le p<\infty$, with the following properties:
(i) $\|L_\eta f\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C_1\|f\|_{l^{p,q}(\Sigma_\eta)}$;
(ii) for $g\in W^r(L^{p,q}(\mathbb{R}\times\mathbb{R}^d))\cap C(\mathbb{R}\times\mathbb{R}^d)$, we have
$$\|L_\eta g-g\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C_2\,\overline{\Delta}_\eta^{\,s}\sum_{|\beta|=r}\|D^\beta g\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)},$$
for some fixed $r,s\in\mathbb{N}$ with $s\le r$, where $C_1,C_2$ are constants. Then for each $f\in\Lambda^{p,q}$ with $\overline{\Delta}_\eta\le r$, the following estimate holds:
$$\|L_\eta f-f\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le K(r,d)\,\tau_r\left(f;\overline{\Delta}_\eta^{\,s/r};M(\mathbb{R}\times\mathbb{R}^d)\right)_{p,q}.$$
Proof. Let us take the modified Steklov function $f_{r,h}$ and consider
$$\|L_\eta f-f\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le\|L_\eta(f-f_{r,h})\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}+\|L_\eta f_{r,h}-f_{r,h}\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}+\|f_{r,h}-f\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}.\tag{4.1}$$
From Proposition 12(c), we get
$$\|f-f_{r,h}\|_{p,q}\le K(r,d)\,\tau_r(f;2h;M(\mathbb{R}\times\mathbb{R}^d))_{p,q}.$$
Since $\|L_\eta g-g\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C_2\,\overline{\Delta}_\eta^{\,s}\sum_{|\beta|=r}\|D^\beta g\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}$, and by Proposition 12(a), we obtain
$$\|L_\eta f_{r,h}-f_{r,h}\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C(r,d)\,\overline{\Delta}_\eta^{\,s}\sum_{|\beta|=r}\|D^\beta f_{r,h}\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C'(r,d)\,\overline{\Delta}_\eta^{\,s}\,h^{-r}\,\tau_r(f;h;M(\mathbb{R}\times\mathbb{R}^d))_{p,q},$$
where $C'(r,d)=K(r,d)\,C(r,d)$. In view of Proposition 13 and Proposition 12(b), and by assumption (i), we obtain
$$\|L_\eta(f-f_{r,h})\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C\|f-f_{r,h}\|_{l^{p,q}(\Sigma_\eta)}\le C\|\omega_r(f,\cdot;2h;M(\mathbb{R}\times\mathbb{R}^d))\|_{l^{p,q}(\Sigma_\eta)}\le C\,\tau_r(f;(h+\overline{\Delta}_\eta);M(\mathbb{R}\times\mathbb{R}^d))_{p,q}.$$
Putting $h=\overline{\Delta}_\eta^{\,s/r}$ in (4.1) and combining the above estimates, we get
$$\|L_\eta f-f\|_{L^{p,q}(\mathbb{R}\times\mathbb{R}^d)}\le C\left\{\tau_r\left(f;\overline{\Delta}_\eta^{\,s/r};M(\mathbb{R}\times\mathbb{R}^d)\right)_{p,q}+\tau_r\left(f;2\overline{\Delta}_\eta^{\,s/r};M(\mathbb{R}\times\mathbb{R}^d)\right)_{p,q}\right\}\le K(r,d)\,\tau_r\left(f;\overline{\Delta}_\eta^{\,s/r};M(\mathbb{R}\times\mathbb{R}^d)\right)_{p,q},$$
where $K(r,d)$ is a positive constant depending only on $r$ and $d$. This completes the proof of our theorem. □
5. Rate of Convergence of Neural Network Sampling Operators in Mixed Lebesgue Spaces

In this section, we establish the rate of approximation for neural network sampling operators in mixed Lebesgue spaces. Before discussing the approximation results, we define the algebraic and absolute moments as follows. Let $\psi:\mathbb{R}^2\to\mathbb{R}$. Then for any $\eta\in\mathbb{N}_0=\mathbb{N}\cup\{0\}$ and $n=(n_1,n_2)\in\mathbb{N}_0^2$ with $|n|=n_1+n_2=\eta$, we define the algebraic moments of order $\eta$ as
$$m_{(n_1,n_2)}(\psi,u,v):=\sum_{k=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\psi(u-k,v-j)\,(k-u)^{n_1}(j-v)^{n_2},$$
and the absolute moments are defined by
$$M_{(n_1,n_2)}(\psi):=\sum_{k=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}|\psi(u-k,v-j)|\,|k-u|^{n_1}|j-v|^{n_2}.$$
We define $M_\eta(\psi):=\max_{|n|=\eta}M_{(n_1,n_2)}(\psi)$. We note that $M_\eta(\psi_\sigma)<\infty$ for every $0\le\eta\le\alpha$, in view of Section 6 of [22]. One can easily see that, for $\xi,\eta\in\mathbb{N}_0$ with $\xi<\eta$, $M_\eta(\psi)<\infty$ implies $M_\xi(\psi)<\infty$.
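The absolute moments are straightforward to estimate numerically. The Python sketch below (our own illustration) truncates the double series at $|k|,|j|\le K$ for the logistic-based kernel $\psi_\sigma=\phi_\sigma\otimes\phi_\sigma$ and takes the maximum over a grid of points $(u,v)$ in a unit cell (the series is $1$-periodic in each variable); the truncation range and the grid are arbitrary choices.

import numpy as np

def sigma(v):
    return 1.0 / (1.0 + np.exp(-v))

def phi(v):
    return 0.5 * (sigma(v + 1.0) - sigma(v - 1.0))

def absolute_moment(n1, n2, K=200, grid=21):
    # truncated M_{(n1,n2)}(psi_sigma); psi_sigma factorises, so the double sum does too
    k = np.arange(-K, K + 1)
    vals = []
    for u in np.linspace(0.0, 1.0, grid):
        for v in np.linspace(0.0, 1.0, grid):
            mk = np.sum(np.abs(phi(u - k)) * np.abs(k - u) ** n1)
            mj = np.sum(np.abs(phi(v - k)) * np.abs(k - v) ** n2)
            vals.append(mk * mj)
    return max(vals)

for n1, n2 in [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2), (1, 1)]:
    print((n1, n2), absolute_moment(n1, n2))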
Let $\sigma$ be a sigmoidal function as above and let $I=[-1,1]$. For $f:I\times I\to\mathbb{R}$, the neural network sampling operators are defined by
$$F_nf(x,y)=\frac{\displaystyle\sum_{k=-n}^{n}\sum_{j=-n}^{n}f\left(\frac kn,\frac jn\right)\psi_\sigma(nx-k,ny-j)}{\displaystyle\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}.$$
Now we recall some of the crucial properties enjoyed by $\psi_\sigma(x,y)$ (see [21]).

Lemma 5.
(i) For every $u,v\in\mathbb{R}$, we have
$$\sum_{k=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\psi_\sigma(u-k,v-j)=1.$$
(ii) The series $\displaystyle\sum_{k=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\psi_\sigma(u-k,v-j)$ converges uniformly on the compact subsets of $\mathbb{R}\times\mathbb{R}$.
(iii) For $u,v\in I$ and $n\in\mathbb{N}$, we have
$$\frac{1}{\displaystyle\sum_{k=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\psi_\sigma(nu-k,nv-j)}\le\frac{1}{\psi_\sigma(1,1)}.$$
In view of the above lemma, it is easy to see that the NN operators $F_n$ are well defined. Indeed, we have
$$|F_nf(x,y)|\le\frac{\displaystyle\sum_{k=-n}^{n}\sum_{j=-n}^{n}\left|f\left(\frac kn,\frac jn\right)\right|\psi_\sigma(nx-k,ny-j)}{\displaystyle\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}\le\frac{\|f\|_\infty}{\psi_\sigma(1,1)}\,M_{(0,0)}(\psi_\sigma)<+\infty.$$
Lemma 6. Let $\psi_\sigma(x,y)$ be the multivariate sigmoidal kernel defined earlier and let $f:I\times I\to\mathbb{R}$ be a bounded measurable function. Then
$$\|F_nf\|_{p,q}\le\|f\|_{\ell^{p,q}(\Sigma)}\,(\psi_\sigma(1,1))^{-\frac12\left(\frac1p+\frac1q\right)},$$
where $\Sigma=\Sigma_x\times\Sigma_y$ is the admissible partition with $\Sigma_x=\left\{\frac kn:k=-n,\ldots,n\right\}$ and $\Sigma_y=\left\{\frac jn:j=-n,\ldots,n\right\}$.

Proof. By the definition of the $L^{p,q}$ norm and using Jensen's inequality several times, we obtain
$$\|F_nf\|_{p,q}^p=\int_{-1}^{1}\left(\int_{-1}^{1}\left|\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}f\left(\frac kn,\frac jn\right)\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}\right|^q dy\right)^{\frac pq}dx$$
$$\le\frac{1}{(\phi_\sigma(1))^{\frac pq}}\int_{-1}^{1}\frac{\sum_{k=-n}^{n}\left(\sum_{j=-n}^{n}\left|f\left(\frac kn,\frac jn\right)\right|^q\int_{-1}^{1}\phi_\sigma(ny-j)\,dy\right)^{\frac pq}\phi_\sigma(nx-k)}{\sum_{k=-n}^{n}\phi_\sigma(nx-k)}\,dx$$
$$\le\frac{1}{(\phi_\sigma(1))^{\frac pq+1}}\sum_{k=-n}^{n}\left(\sum_{j=-n}^{n}\left|f\left(\frac kn,\frac jn\right)\right|^q\frac1n\right)^{\frac pq}\frac1n\le\frac{\|f\|_{\ell^{p,q}(\Sigma)}^p}{(\psi_\sigma(1,1))^{\frac12\left(\frac pq+1\right)}}.$$
Taking the $p$th root on both sides of the above expression, we get the desired estimate. □
Now, we estimate the approximation error in terms of the $\tau$-modulus of smoothness in the $L^{p,q}$ norm.

Theorem 3. Let $\sigma$ be a sigmoidal function satisfying $(A3)$ with $\alpha\ge2\max(p,q)$, $1\le q\le p<\infty$. Then for any $f\in M(I\times I)$, we have
$$\|F_nf-f\|_{p,q}\le C_1\,\tau_1\left(f,\frac1n\right)_{p,q}+C_2\,\tau_2\left(f,\frac1n\right)_{p,q},$$
where $n\in\mathbb{N}$ and $C_1,C_2$ are positive constants.
Proof. Let $0<h\le1$. Putting $r=2$ in Proposition 12, we have
$$\|F_nf-f\|_{p,q}\le\|F_nf-F_nf_{2,h}\|_{p,q}+\|F_nf_{2,h}-f_{2,h}\|_{p,q}+\|f_{2,h}-f\|_{p,q}.$$
Now, using Lemma 6, we obtain
$$\|F_nf-F_nf_{2,h}\|_{p,q}\le\|F_n(f-f_{2,h})\|_{p,q}\le\|f-f_{2,h}\|_{\ell^{p,q}(\Sigma)}\,(\psi_\sigma(1,1))^{-\frac12\left(\frac1p+\frac1q\right)}$$
$$\le(\psi_\sigma(1,1))^{-\frac12\left(\frac1p+\frac1q\right)}\left(\sum_{k\in\mathbb{Z}}\left(\sum_{j\in\mathbb{Z}}\omega_2\left(f;\left(\frac kn,\frac jn\right);2h\right)^q\frac1n\right)^{\frac pq}\frac1n\right)^{\frac1p}\le2^{\frac1q+6}\,(\psi_\sigma(1,1))^{-\frac12\left(\frac1p+\frac1q\right)}\,\tau_2\left(f,h+\frac1{2n}\right)_{p,q}.$$
Next, in view of Proposition 12(c), we get
$$\|f_{2,h}-f\|_{p,q}\le K(r)\,\tau_r(f,2h)_{p,q}.$$
Further, $f_{2,h}\in W^2(L^{p,q}(\mathbb{R}\times\mathbb{R}))$. Using the local Taylor formula for $f_{2,h}$ at $(x,y)$, there exists a bounded function $\zeta$ with $\lim_{(t,v)\to(0,0)}\zeta(t,v)=0$ such that
$$f_{2,h}\left(\frac kn,\frac jn\right)=f_{2,h}(x,y)+\frac{\partial f_{2,h}}{\partial x}(x,y)\left(\frac kn-x\right)+\frac{\partial f_{2,h}}{\partial y}(x,y)\left(\frac jn-y\right)+\frac12\frac{\partial^2f_{2,h}}{\partial x^2}(x,y)\left(\frac kn-x\right)^2+\frac12\frac{\partial^2f_{2,h}}{\partial y^2}(x,y)\left(\frac jn-y\right)^2$$
$$+\frac{\partial^2f_{2,h}}{\partial x\partial y}(x,y)\left(\frac kn-x\right)\left(\frac jn-y\right)+\zeta\left(\frac kn-x,\frac jn-y\right)\left[\left(\frac kn-x\right)^2+\left(\frac jn-y\right)^2\right].$$
From the definition of the NN operator, we have
$$F_nf_{2,h}(x,y)=f_{2,h}(x,y)+\frac{\partial f_{2,h}}{\partial x}(x,y)\,\frac1n\,\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}(k-nx)\,\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}+\frac{\partial f_{2,h}}{\partial y}(x,y)\,\frac1n\,\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}(j-ny)\,\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}$$
$$+\frac12\frac{\partial^2f_{2,h}}{\partial x^2}(x,y)\,\frac1{n^2}\,\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}(k-nx)^2\,\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}+\frac12\frac{\partial^2f_{2,h}}{\partial y^2}(x,y)\,\frac1{n^2}\,\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}(j-ny)^2\,\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}$$
$$+\frac{\partial^2f_{2,h}}{\partial x\partial y}(x,y)\,\frac1{n^2}\,\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}(k-nx)(j-ny)\,\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}+\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\zeta\left(\frac kn-x,\frac jn-y\right)\left[\left(\frac kn-x\right)^2+\left(\frac jn-y\right)^2\right]\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}$$
$$=f_{2,h}(x,y)+A_1+A_2+A_3+A_4+A_5+R.$$
Using the Minkowski inequality in the above expression, we get
$$\|F_nf_{2,h}-f_{2,h}\|_{p,q}\le\sum_{i=1}^{5}\|A_i\|_{p,q}+\|R\|_{p,q}.$$
One can easily obtain the following estimates by applying Jensen's inequality to each $A_i$, $i=1,\ldots,5$, together with Proposition 12:
$$\|A_1\|_{p,q}\le\frac{M_{(1,0)}(\psi_\sigma)}{\psi_\sigma(1,1)}\,\frac1n\,\left\|\frac{\partial f_{2,h}}{\partial x}\right\|_{p,q}\le C'_1\,\frac1n\,h^{-1}\,\tau_1(f,h)_{p,q},$$
$$\|A_2\|_{p,q}\le\frac{M_{(0,1)}(\psi_\sigma)}{\psi_\sigma(1,1)}\,\frac1n\,\left\|\frac{\partial f_{2,h}}{\partial y}\right\|_{p,q}\le C'_2\,\frac1n\,h^{-1}\,\tau_1(f,h)_{p,q},$$
$$\|A_3\|_{p,q}\le\frac{M_{(2,0)}(\psi_\sigma)}{\psi_\sigma(1,1)}\,\frac1{n^2}\,\left\|\frac{\partial^2f_{2,h}}{\partial x^2}\right\|_{p,q}\le C'_3\,\frac1{n^2}\,h^{-2}\,\tau_2(f,h)_{p,q},$$
$$\|A_4\|_{p,q}\le\frac{M_{(0,2)}(\psi_\sigma)}{\psi_\sigma(1,1)}\,\frac1{n^2}\,\left\|\frac{\partial^2f_{2,h}}{\partial y^2}\right\|_{p,q}\le C'_4\,\frac1{n^2}\,h^{-2}\,\tau_2(f,h)_{p,q},$$
$$\|A_5\|_{p,q}\le\frac{M_{(1,1)}(\psi_\sigma)}{\psi_\sigma(1,1)}\,\frac1{n^2}\,\left\|\frac{\partial^2f_{2,h}}{\partial x\partial y}\right\|_{p,q}\le C'_5\,\frac1{n^2}\,h^{-2}\,\tau_2(f,h)_{p,q}.$$
Now, we estimate the norm of the remainder as follows:
$$\|R\|_{p,q}=\left\|\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\zeta\left(\frac kn-x,\frac jn-y\right)\left[\left(\frac kn-x\right)^2+\left(\frac jn-y\right)^2\right]\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}\right\|_{p,q}\le D_1+D_2,$$
where $D_1$ is estimated by
$$D_1=\left\|\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\zeta\left(\frac kn-x,\frac jn-y\right)\left(\frac kn-x\right)^2\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}\right\|_{p,q}\le\|\zeta\|_\infty\,\frac1{n^2}\,\left\|\frac{\sum_{k=-n}^{n}\sum_{j=-n}^{n}(k-nx)^2\,\psi_\sigma(nx-k,ny-j)}{\sum_{k=-n}^{n}\sum_{j=-n}^{n}\psi_\sigma(nx-k,ny-j)}\right\|_{p,q}\le\|\zeta\|_\infty\,\frac1{n^2}\,\frac{M_{(2,0)}(\psi_\sigma)}{\psi_\sigma(1,1)}.$$
Similarly, we obtain
$$D_2\le\|\zeta\|_\infty\,\frac1{n^2}\,\frac{M_{(0,2)}(\psi_\sigma)}{\psi_\sigma(1,1)}.$$
Since the bounds for $D_1$ and $D_2$ do not depend on $f$, we obtain $D_1,D_2\to0$ as $n\to\infty$, and hence $R\to0$ as $n\to\infty$. Combining all of the above estimates and choosing $h=\frac1n$, we get the desired estimate. □
6. Examples and Applications

In this section, we discuss some well-known sigmoidal functions which satisfy our conditions. As a first example, we consider the logistic sigmoidal function, defined by $\sigma_\ell(u):=\dfrac{1}{1+e^{-u}}$, $u\in\mathbb{R}$. It is well known that $\sigma_\ell$ satisfies the conditions $(A1)$–$(A3)$ (see [10]). Since $\psi_{\sigma_\ell}(u,v):=\sigma_\ell(u)\,\sigma_\ell(v)$, the kernel $\psi_{\sigma_\ell}(u,v)$ also inherits the conditions $(A1)$–$(A3)$.
Next, we consider the hyperbolic tangent sigmoidal function:
$$\psi_{\sigma_H}(u,v):=\frac12(\tanh u+1)\cdot\frac12(\tanh v+1),\qquad\text{where}\qquad\tanh z=\frac{e^{2z}-1}{e^{2z}+1}.$$
By [21], $\sigma_H(z):=\frac12(\tanh z+1)$ satisfies the conditions $(A1)$–$(A3)$, and hence $\psi_{\sigma_H}(u,v)$ also satisfies the conditions $(A1)$–$(A3)$.
6.1. Implementation Results. We first consider the continuous function
$$f(x,y)=x^2\sin(x+y)+y^2\cos(xy),$$
whose graph is shown in Figure 1.

Figure 1. The function $f$.

The accuracy of the approximation, measured by $\|F_nf-f\|_{p,q}$, using the logistic activation kernel $\sigma_\ell(x)\sigma_\ell(y)$ and the hyperbolic tangent kernel $\frac14(\tanh x+1)(\tanh y+1)$, is presented in Table 1 and in Figures 2 and 3.

Figure 2. Approximation $F_{20}f$ of $f$ using the sigmoidal function $\sigma_\ell$.
Figure 3. Approximation $F_{20}f$ of $f$ using the hyperbolic tangent function.

Next, we consider the discontinuous function
$$g(x,y)=\begin{cases}x^2\sin(x+y)+y^2\cos(xy)+1, & \text{if } x>0,\\ -x^2\sin(x+y)-y^2\cos(xy)-1, & \text{if } x\le0.\end{cases}$$

Table 1. Errors $\|F_nf-f\|_{p,q}$ and $\|F_ng-g\|_{p,q}$ in the approximation.

 n   Activation function                  (p,q)   ||F_n f - f||_{p,q}   ||F_n g - g||_{p,q}
 20  sigma_l(x) sigma_l(y)                (2,3)   0.0642                0.8129
 30  sigma_l(x) sigma_l(y)                (2,3)   0.0385                0.7062
 20  (1/4)(tanh x + 1)(tanh y + 1)        (2,3)   0.0263                0.7638
 30  (1/4)(tanh x + 1)(tanh y + 1)        (2,3)   0.0161                0.6597

The graph of $g$ is shown in Figure 4. The accuracy of the approximation $\|F_ng-g\|_{p,q}$ using the logistic activation kernel $\sigma_\ell(x)\sigma_\ell(y)$ and the hyperbolic tangent kernel $\frac14(\tanh x+1)(\tanh y+1)$ is presented in Table 1 and in Figures 5 and 6.

Figure 4. The function $g$.
Figure 5. Approximation $F_{20}g$ of $g$ using the sigmoidal function $\sigma_\ell$.
Figure 6. Approximation $F_{20}g$ of $g$ using the hyperbolic tangent function.
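For completeness, the following self-contained Python sketch (our own code, not the authors' implementation) reproduces the spirit of this experiment: it builds $F_{20}f$ on a uniform grid over $[-1,1]^2$ with the logistic-based kernel and evaluates the discrete $L^{2,3}$ error. The grid size and the quadrature rule are arbitrary choices, so the resulting numbers need not coincide with those reported in Table 1.

import numpy as np

def sigma(v):
    return 1.0 / (1.0 + np.exp(-v))

def phi(v):
    # phi_sigma(v) = [sigma(v + 1) - sigma(v - 1)] / 2
    return 0.5 * (sigma(v + 1.0) - sigma(v - 1.0))

def F_n_grid(f, n, x, y):
    # evaluate F_n f on the tensor grid x (x) y; the kernel is separable, so the
    # double sums reduce to two matrix products
    k = np.arange(-n, n + 1)
    Wx = phi(n * x[:, None] - k[None, :])              # shape (len(x), 2n+1)
    Wy = phi(n * y[:, None] - k[None, :])              # shape (len(y), 2n+1)
    S = f(*np.meshgrid(k / n, k / n, indexing="ij"))   # samples f(k/n, j/n)
    num = Wx @ S @ Wy.T
    den = Wx.sum(axis=1)[:, None] * Wy.sum(axis=1)[None, :]
    return num / den

def mixed_error(E, dx, dy, p, q):
    inner = np.sum(np.abs(E) ** q, axis=1) * dy
    return (np.sum(inner ** (p / q)) * dx) ** (1.0 / p)

f = lambda x, y: x ** 2 * np.sin(x + y) + y ** 2 * np.cos(x * y)
x = np.linspace(-1.0, 1.0, 201)
X, Y = np.meshgrid(x, x, indexing="ij")
approx = F_n_grid(f, 20, x, x)
print(mixed_error(approx - f(X, Y), x[1] - x[0], x[1] - x[0], p=2, q=3))

Replacing f by the discontinuous function g above gives the corresponding error $\|F_ng-g\|_{p,q}$.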
7. Conclusion
The rate of approximation of neural network operators in terms of the averaged modulus of smoothness is proved in this paper. In order to achieve this, the properties of the averaged modulus of smoothness have been discussed in the mixed norm. Further, suitable subspaces $\Lambda^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ of $L^{p,q}(\mathbb{R}\times\mathbb{R}^d)$ are identified, and it is established that the averaged modulus of smoothness is finite for all functions in $\Lambda^{p,q}(\mathbb{R}\times\mathbb{R}^d)$. Using the properties of the averaged modulus of smoothness, the rate of approximation of certain linear operators in the mixed Lebesgue norm is estimated. As an application of these linear operators, the rate of approximation of neural network operators in terms of the averaged modulus of smoothness in the mixed norm has been proved. Finally, some examples of sigmoidal functions have been discussed. Using these sigmoidal functions, the implementation of continuous and discontinuous functions by neural network operators has been provided.
References
[1] G.A. Anastassiou, Multivariate sigmoidal neural network approximation, Neural Netw. 24 (2011)
378–386.
[2] G. A. Anastassiou, Rate of convergence of some neural network operators to the unit-univariate case,
J. Math. Anal. Appli. 212, 237–262 (1997).
[3] G. A. Anastassiou, Quantitative Approximations, Chapman&Hall/CRC, Boca Raton (2001).
[4] S. Arati, P. Devaraj, A. K. Garg, Random average sampling and reconstruction in shift-invariant
subspaces of mixed Lebesgue spaces, Results Math. 77 (2022), no. 6, Paper No. 223, 38 pp.
[5] C. Bardaro, I. Mantellini, R. Stens, J. Vautz and G. Vinti, Generalized sampling approximation
for multivariate discontinuous signals and application to image processing, New Perspectives on
Approximation and Sampling Theory, Appl. Numer. Harmon. Anal., Birkhäuser, Springer, Cham,
(2014), 87-114.
[6] C. Bardaro, P.L. Butzer, R.L. Stens and G. Vinti, Approximation error of the Whittaker cardinal
series in terms of an averaged modulus of smoothness covering discontinuous signals, J. Math. Anal.
Appl., 316 (2006) 269-306.
[7] P. G. Binev and K. G. Ivanov, On a representation of mixed finite differences, SERDICA Bul. Math.
Publ (1985): 11-259.
[8] A. Benedek, R. Panzone, The space Lpwith mixed norm, Duke Math. J. 28 (1961) 301-324.
[9] A. Benedek, A.P. Calderón, R. Panzone, Convolution operators on Banach space valued functions,
Proc. Natl. Acad. Sci. USA 48 (1962) 356-365.
[10] F. Cao, Z. Chen, The approximation operators with sigmoidal functions, Comput. Math. Appl. 58
(4) (2009) 758–765
[11] G. Cybenko, Approximation by superpositions of sigmoidal function, Math. Control Signals Systems.
2, 303-314 (1989).
[12] L. Coroianu, D. Costarelli, Best approximation and inverse results for neural network operators,
Results Math., 79 (2024) 193.
[13] L. Coroianu, D. Costarelli, U. Kadak, Quantitative estimates for neural network operators implied
by the asymptotic behaviour of the sigmoidal activation functions, Mediterr. J. Math., 19(5) (2022),
Article 211.
[14] L. Coroianu, D. Costarelli, M. Natale, A. Pantis, The approximation capabilities of Durrmeyer-type
neural network operators, J. Appl. Math. Comput., 70 (2024), 4581-4599.
[15] D. Costarelli, Approximation error for neural network operators by an averaged modulus of smoothness, J. Approx. Theory 294 (2023) 105944.
[16] D. Costarelli, Interpolation by neural network operators activated by ramp functions, J. Math. Anal.
Appl., 419 (2014), 574-582.
[17] D. Costarelli, Neural network operators: Constructive interpolation of multivariate functions, Neural
Networks, 67 (2015), 28-36.
[18] D. Costarelli, Density results by deep neural network operators with integer weights, Math. Modell.
Anal., 27 (2022), 547-560.
[19] D. Costarelli, Convergence and high order of approximation by Steklov sampling operators, Banach
J. Math. Anal. (2024) 18 (2024) 70.
[20] D. Costarelli, R. Spigler, Approximation results for neural network operators activated by sigmoidal
functions, Neural Netw. 44 (2013) 101–106.
[21] D. Costarelli, R. Spigler, Multivariate neural network operators with sigmoidal activation functions,
Neural Netw. 48 (2013) 72–77.
[22] D. Costarelli and G. Vinti, Rate of approximation for multivariate sampling Kantorovich operators on some function spaces, J. Integral Equations Applications 26(4): 455-481.
[23] D.L. Fernandez, Vector-valued singular integral operators on Lp-spaces with mixed norms and applications, Pacific J. Math. 129 (2) (1987) 257-275.
[24] J. L. Francia, F. J. Ruiz, J. L. Torrea, Calderón-Zygmund theory for operator-valued kernels, Adv.
Math. 62(1), 7-48 (1986).
[25] K. I. Funahashi, On the approximate realization of continuous mappings by neural networks, Neural
Networks. 2, 183-192 (1989).
[26] W. R. Grey, Inclusions Among Mixed-Norm Lebesgue Spaces (2015). Electronic Thesis and Dissertation Repository.
[27] K. Hornik, M. Stinchombe, H. White, Multilayer feedforward networks are universal approximators,
Neural Networks. 2, 359-366 (1989).
[28] Y. Jiang, J. Kou, Semi-Average Sampling for Shift-Invariant Signals in a Mixed Lebesgue Space,
Numer. Funct. Anal. Optim. 41:9 (2020) 1045-1064.
[29] Y. Jiang, W. Li, Random sampling in multiply generated shift-invariant subspaces of mixed Lebesgue
spaces Lp,q(R×Rd), J. Comput. Appl. Math. 386, 113237 (2021).
[30] R. Li, B. Liu, R. Liu, Q. Zhang, Nonuniform sampling in principal shift-invariant subspaces of mixed
Lebesgue spaces Lp,q (Rd+1),J. Math. Anal. Appl. 453 (2017) 928–941.
[31] G. Prashant, D. Patel, S. Sivananthan, Random sampling and reconstruction in reproducing kernel
subspace of mixed Lebesgue spaces, Math. Methods Appl. Sci. 46 (2023), no. 5, 5119-5138.
[32] R. Torres and E. Ward, Leibniz’s rule, sampling and wavelets on mixed Lebesgue spaces, J. Fourier
Anal. Appl. 21(5) (2015) 1053-1076.
Department of Mathematics, Indian Institute of Technology Madras, Chennai-600036,
Tamil Nadu, India
Email address:arpanmathematics@gmail.com
Department of Mathematics, Indian Institute of Technology Madras, Chennai-600036,
Tamil Nadu, India
Email address:sathishkumar@iitm.ac.in, mathsatish9@gmail.com
School of Mathematics, Indian Institute of Science Education and Research, Thiruvananthapuram, India
Email address:devarajp@iisertvm.ac.in