On copula-based conditional quantile estimators
Bruno Rémillard^a, Bouchra Nasri^{b,*}, Taoufik Bouezmarni^c
^a GERAD, CRM, and Department of Decision Sciences, HEC Montréal, 3000 chemin de la Côte Sainte-Catherine, Montréal (Qc), H3T 2A7, Canada
^b Department of Water Sciences, Institut national de recherche scientifique, Eau-Terre-Environnement, 490 rue de la Couronne, Québec, G1K 9A9, Canada
^c Department of Statistics, Université de Sherbrooke, 2500 boul. de l'Université, Sherbrooke, J1K 2R1, Canada
Abstract
Recently, two different copula-based approaches have been proposed to estimate the conditional quantile function of a variable $Y$ with respect to a vector of covariates $\mathbf{X}$: the first estimator is related to quantile regression weighted by the conditional copula density, while the second estimator is based on the inverse of the conditional distribution function written in terms of margins and the copula. Using empirical processes, we show that even if the two estimators look quite different, their estimation errors have the same limiting distribution. Also, we propose a bootstrap procedure for the limiting process in order to construct uniform confidence bands around the conditional quantile function.
Keywords: Conditional quantile function, copula, quantile regression, bootstrap
1. Introduction
Copulas, or dependence functions, are very popular for modeling the dependence between variables, because one can remove the effect of the marginal distributions, provided the latter are continuous. This is why dependence measures based on the copula are so robust compared to the traditional Pearson correlation coefficient. Copulas also enter naturally when computing the conditional distribution function of a random variable $Y$ given covariates $\mathbf{X} = (X_1, \ldots, X_d)$; see, e.g., Bouyé and Salmon [5] when $d = 1$. This relation between the conditional distribution of $Y$ given $\mathbf{X} = \mathbf{x}$ and the associated copula was used recently to propose conditional quantile estimators, as an alternative to the quantile regression methods [11] or the parametric approaches [6,15,16].
* Corresponding author
Preprint submitted to Statistics and Probability Letters
A first copula-based estimator of the conditional quantile was proposed by Noh et al. [19] and is based on a weighted quantile regression method. Its limiting distribution was proved to be Gaussian. More recently, a more intuitive estimator of the plug-in type was proposed in Kraus and Czado [12] and Nasri and Bouezmarni [14], who compared the estimated MISE of various competitors, including the estimator proposed by Noh et al. [19]. From the simulations performed in Kraus and Czado [12] and Nasri and Bouezmarni [14], it seems that the plug-in estimator performs better than the other copula-based estimator. However, the asymptotic behavior of this estimator was not discussed.
In Section 2, we describe the estimators of Noh et al. [19] and Kraus and Czado [12] and we discuss their implementation. Another closely related parametric estimator proposed in Nasri and Bouezmarni [14] is also discussed. In Section 3, we study the limiting distribution of the estimators viewed as stochastic processes over $(0,1)$ and we show that the two semi-parametric estimators have the same limiting distribution. We also propose a bootstrapping method for constructing uniform confidence bands for the conditional quantile functions.
2. Estimation of conditional quantiles
One way to model the dependence between a variable of interest $Y$ and covariates $\mathbf{X} = (X_1, \ldots, X_d)$ is to use dependence functions called copulas; see, e.g., Nelsen [17]. More precisely, suppose that $(Y_1, \mathbf{X}_1), \ldots, (Y_n, \mathbf{X}_n)$ are i.i.d. observations of $(Y, \mathbf{X})$ with (unconditional) continuous margins $F_0, F_1, \ldots, F_d$, and copula $C$ with density $c$. Set $\mathbf{F}(\mathbf{x}) = (F_1(x_1), \ldots, F_d(x_d))$.
By definition, a copula is a joint distribution function of uniform random variables. According to Sklar's theorem [17], since the margins are assumed to be continuous, there exists a unique copula $C$ so that the joint distribution function of $(Y, \mathbf{X})$ can be written in terms of the copula and the margins, viz.
$$P(Y \le y, \mathbf{X} \le \mathbf{x}) = C\{F_0(y), \mathbf{F}(\mathbf{x})\}, \quad y \in \mathbb{R},\ \mathbf{x} \in \mathbb{R}^d. \tag{1}$$
Note that the copula $C$ is the cdf of $(U, \mathbf{V})$, where $U = F_0(Y)$ and $\mathbf{V} = \mathbf{F}(\mathbf{X})$.
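As an illustration of (1), the following sketch simulates from a bivariate copula ($d = 1$) and compares the empirical joint distribution function with $C\{F_0(y), F_1(x)\}$. The Clayton family, the exponential and logistic margins, the parameter value and all function names are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C_theta(u, v), theta > 0 (assumed family)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def clayton_sample(n, theta, rng):
    """Draw (U, V) from the Clayton copula by inverting the law of U given V."""
    v = rng.uniform(size=n)
    t = rng.uniform(size=n)
    u = ((t ** (-theta / (1.0 + theta)) - 1.0) * v ** -theta + 1.0) ** (-1.0 / theta)
    return u, v

# Hypothetical margins: Y ~ Exp(1), X ~ standard logistic.
F0 = lambda y: 1.0 - np.exp(-y)
F0_inv = lambda u: -np.log1p(-u)
F1 = lambda x: 1.0 / (1.0 + np.exp(-x))
F1_inv = lambda v: np.log(v / (1.0 - v))

rng = np.random.default_rng(0)
theta = 2.0
u, v = clayton_sample(200_000, theta, rng)
y, x = F0_inv(u), F1_inv(v)          # Y_i = F0^{-1}(U_i), X_i = F1^{-1}(V_i)

y0, x0 = 1.0, 0.5
empirical = np.mean((y <= y0) & (x <= x0))      # estimate of P(Y <= y0, X <= x0)
sklar = clayton_cdf(F0(y0), F1(x0), theta)      # right-hand side of (1)
print(empirical, sklar)
```

The two printed values should agree up to Monte Carlo error, whatever continuous margins are plugged in.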
2.1. Copula-based conditional quantiles
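Denote by $H(y, \mathbf{x})$ the conditional distribution function of $Y$ given $\mathbf{X} = \mathbf{x}$. The expression of the conditional distribution function $H$ in terms of the copula function and the marginal distributions appeared explicitly first in a preliminary version of Bouyé and Salmon [5] in the case $d = 1$. However, it is easy to extend it to any $d \ge 1$, and one can easily show that
$$H(y, \mathbf{x}) = P(Y \le y \mid \mathbf{X} = \mathbf{x}) = \mathcal{C}\{F_0(y), \mathbf{F}(\mathbf{x})\}, \quad y \in \mathbb{R},\ \mathbf{x} \in \mathbb{R}^d, \tag{2}$$
where $\mathcal{C}(u, \mathbf{v})$ is the conditional distribution function of $U$ given $\mathbf{V} = \mathbf{F}(\mathbf{X}) = \mathbf{v}$, with $\mathbf{v} = \mathbf{F}(\mathbf{x})$. In fact, according to Rémillard [20, Proposition 8.6.2], for $u \in [0,1]$ and $\mathbf{v} = (v_1, \ldots, v_d) \in (0,1)^d$,
$$\mathcal{C}(u, \mathbf{v}) = \frac{\partial_{v_1} \cdots \partial_{v_d} C(u, v_1, \ldots, v_d)}{\partial_{v_1} \cdots \partial_{v_d} C(1, v_1, \ldots, v_d)},$$
and $\partial_u \mathcal{C}(u, \mathbf{v}) = c(u, \mathbf{v}) / \int_0^1 c(z, \mathbf{v})\,dz$, so $\mathcal{C}(u, \mathbf{v}) = \int_0^u c(z, \mathbf{v})\,dz / \int_0^1 c(z, \mathbf{v})\,dz$.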
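When $d = 1$, the denominator in the ratio above equals 1, so the conditional distribution function reduces to $\partial_v C(u, v)$. The sketch below checks this numerically for a Clayton copula, comparing a closed-form derivative with a central finite difference and with the integral of the copula density. The copula family, the parameter values and the function names are assumptions chosen for illustration.

```python
import numpy as np

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula (assumed family)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def clayton_cond_cdf(u, v, theta):
    """Closed form of dC/dv, i.e. P(U <= u | V = v)."""
    return v ** (-theta - 1.0) * (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta - 1.0)

def clayton_density(u, v, theta):
    """Copula density c(u, v)."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-1.0 / theta - 2.0)

theta, v, eps = 2.0, 0.3, 1e-6
u_grid = np.linspace(0.05, 0.95, 19)

# Finite-difference approximation of the partial derivative with respect to v.
closed_form = clayton_cond_cdf(u_grid, v, theta)
finite_diff = (clayton_cdf(u_grid, v + eps, theta)
               - clayton_cdf(u_grid, v - eps, theta)) / (2.0 * eps)
max_gap = np.max(np.abs(closed_form - finite_diff))

# Trapezoidal approximation of the integral of c(z, v) over z in (0, 0.5).
z = np.linspace(1e-9, 0.5, 200_001)
dens = clayton_density(z, v, theta)
integral = np.sum((dens[1:] + dens[:-1]) * np.diff(z)) / 2.0
print(max_gap, integral, clayton_cond_cdf(0.5, v, theta))
```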
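Now, the associated conditional quantile function $Q(\alpha, \mathbf{x})$, $\alpha \in (0,1)$, is given by
$$Q(\alpha, \mathbf{x}) = \inf\{y \in \mathbb{R} : H(y, \mathbf{x}) \ge \alpha\}. \tag{3}$$
Using (2), we get that $Q$ depends only on the margins $F_0$, $\mathbf{F}$ and the copula $C$, viz.
$$Q(\alpha, \mathbf{x}) = F_0^{-1}[\Gamma\{\alpha, \mathbf{F}(\mathbf{x})\}], \tag{4}$$
where $\Gamma(\alpha, \mathbf{v})$ is the quantile of order $\alpha$ of the distribution function $\mathcal{C}(u, \mathbf{v})$, $u \in [0,1]$, with $\mathbf{v} \in (0,1)^d$ fixed. Note that (4) is the basic equation for defining the plug-in estimator.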
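To make (4) concrete, here is a minimal sketch for $d = 1$ in which $\Gamma$ has a closed form. It assumes a Clayton copula, an exponential margin for $Y$ and a logistic margin for $X$; these choices and all function names are hypothetical. The check at the end verifies that $H\{Q(\alpha, x), x\} = \alpha$.

```python
import math

def clayton_cond_cdf(u, v, theta):
    """P(U <= u | V = v) under the Clayton copula (assumed family)."""
    return v ** (-theta - 1.0) * (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta - 1.0)

def clayton_cond_quantile(alpha, v, theta):
    """Gamma(alpha, v): inverse of u -> clayton_cond_cdf(u, v, theta)."""
    return ((alpha ** (-theta / (1.0 + theta)) - 1.0) * v ** -theta + 1.0) ** (-1.0 / theta)

# Hypothetical margins: Y ~ Exp(1), X ~ standard logistic.
def F0(y):
    return 1.0 - math.exp(-y)

def F0_inv(u):
    return -math.log1p(-u)

def F1(x):
    return 1.0 / (1.0 + math.exp(-x))

def conditional_quantile(alpha, x, theta):
    """Q(alpha, x) = F0^{-1}[Gamma{alpha, F1(x)}], as in (4)."""
    return F0_inv(clayton_cond_quantile(alpha, F1(x), theta))

def conditional_cdf(y, x, theta):
    """H(y, x) = conditional copula evaluated at (F0(y), F1(x)), as in (2)."""
    return clayton_cond_cdf(F0(y), F1(x), theta)

theta, x0 = 2.0, 0.5
levels = [0.1, 0.5, 0.9]
quantiles = [conditional_quantile(a, x0, theta) for a in levels]
round_trip = [conditional_cdf(q, x0, theta) for q in quantiles]
print(quantiles, round_trip)
```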
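Next, using (2), one gets that $Q(\alpha, \mathbf{x})$ is also a solution of
$$\arg\min_a E\left[\mathfrak{c}\{F_0(Y), \mathbf{F}(\mathbf{x})\}\,\rho_\alpha(Y - a)\right], \tag{5}$$
where $\mathfrak{c}(u, \mathbf{v}) = \partial_u \mathcal{C}(u, \mathbf{v})$ is the conditional density of $U$ given $\mathbf{V} = \mathbf{v}$, $\rho_\alpha(y) = y\{\alpha - I(y < 0)\} = (1 - \alpha)|y|\,I(y < 0) + \alpha y\,I(y \ge 0)$, $y \in \mathbb{R}$, and $I$ is the indicator function. The latter equation is used by Noh et al. [19] to construct an estimator of $Q(\alpha, \mathbf{x})$.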
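The sample analogue of (5) is a weighted check-loss minimization, and its solution is a weighted quantile. The sketch below illustrates this on simulated data with arbitrary positive weights standing in for the copula weights; the data, the weights and the function names are assumptions for illustration only.

```python
import numpy as np

def check_loss(y, alpha):
    """rho_alpha(y) = y * (alpha - 1{y < 0})."""
    y = np.asarray(y, dtype=float)
    return y * (alpha - (y < 0))

def weighted_check_objective(a, y, w, alpha):
    """Sample version of the criterion in (5) with weights w."""
    return np.sum(w * check_loss(y - a, alpha))

def weighted_quantile(y, w, alpha):
    """Smallest observation whose cumulative normalized weight reaches alpha."""
    order = np.argsort(y)
    cum = np.cumsum(w[order]) / np.sum(w)
    idx = min(int(np.searchsorted(cum, alpha, side="left")), len(y) - 1)
    return y[order][idx]

rng = np.random.default_rng(1)
y = rng.normal(size=500)
w = rng.uniform(0.5, 2.0, size=500)   # arbitrary positive weights (hypothetical)
alpha = 0.3

q = weighted_quantile(y, w, alpha)
# The objective is convex and piecewise linear, so a minimizer exists among the data points.
objective_at_data = np.array([weighted_check_objective(a, y, w, alpha) for a in y])
print(q, weighted_check_objective(q, y, w, alpha), objective_at_data.min())
```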
2.2. Estimation of the copula and the margins
To estimate the conditional quantile using copulas, one needs to estimate the copula $C$ associated with $(Y, \mathbf{X})$ or $(U, \mathbf{V})$, and the margins $F_0$, $\mathbf{F}$. First, one can assume that $Y_i = F_0^{-1}(U_i)$ and $X_{ij} = F_j^{-1}(V_{ij})$, where $(U_1, \mathbf{V}_1), \ldots, (U_n, \mathbf{V}_n)$ are i.i.d. observations from copula $C$.
2.2.1. Estimation of the copula
For the sake of simplicity, we assume that the copula belongs to a parametric family $\{C_\theta : \theta \in \mathcal{O}\}$, so the copula is estimated by $C_{\theta_n}$, where $\theta_n$ is a rank-based consistent estimator [7] of the true parameter $\theta_0$. One can use the pseudo-MLE method proposed by Genest et al. [8]. Consequently, the quantile function $\Gamma(\alpha, \mathbf{v}) = \Gamma_\theta(\alpha, \mathbf{v})$ can be estimated by $\Gamma_{\theta_n}(\alpha, \mathbf{v})$, $\alpha \in (0,1)$, $\mathbf{v} \in (0,1)^d$. The parametric family approach is also what Noh et al. [19] and Kraus and Czado [12] considered. In fact, in the case of several covariates, Kraus and Czado [12] used a particular case of a parametric copula family, namely a D-vine model [2,1], which is a construction of a copula using a given set of parametric bivariate copula families. Note that instead of considering a parametric family of copulas, one could estimate the density of the copula non-parametrically, so that all the conditional quantile estimators discussed here could also be computed. However, the convergence is slower and the estimation often suffers from the curse of dimensionality [4,9], with the possible exception of pair-copula constructions [13]. The next step is to estimate the margins.
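A rank-based pseudo-likelihood estimator of this kind can be sketched as follows for $d = 1$. The Clayton family, the grid search used in place of a numerical optimizer, the simulated data and the function names are all assumptions made to keep the example short and dependency-free; this is not the implementation used in the cited papers.

```python
import numpy as np

def pseudo_observations(z):
    """Ranks rescaled by n + 1, so that the values lie strictly inside (0, 1)."""
    z = np.asarray(z)
    return (np.argsort(np.argsort(z)) + 1) / (len(z) + 1.0)

def clayton_log_density(u, v, theta):
    """Log-density of the bivariate Clayton copula (assumed family)."""
    s = u ** -theta + v ** -theta - 1.0
    return (np.log1p(theta) - (theta + 1.0) * (np.log(u) + np.log(v))
            - (1.0 / theta + 2.0) * np.log(s))

def pseudo_mle_clayton(y, x, grid=None):
    """Maximize the pseudo log-likelihood over a grid of theta values."""
    if grid is None:
        grid = np.linspace(0.05, 10.0, 400)
    u, v = pseudo_observations(y), pseudo_observations(x)
    loglik = np.array([np.sum(clayton_log_density(u, v, t)) for t in grid])
    return grid[np.argmax(loglik)]

# Simulated data: Clayton copula with theta = 2 and arbitrary continuous margins.
rng = np.random.default_rng(5)
theta0, n = 2.0, 2000
v = rng.uniform(size=n)
t = rng.uniform(size=n)
u = ((t ** (-theta0 / (1.0 + theta0)) - 1.0) * v ** -theta0 + 1.0) ** (-1.0 / theta0)
y, x = -np.log1p(-u), np.log(v / (1.0 - v))

theta_hat = pseudo_mle_clayton(y, x)
print(theta_hat)
```

Because only the ranks enter the criterion, the estimate does not depend on the margins used to generate the data.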
2.2.2. Estimation of the margins
Motivated by the two-step inference function for margins (IFM) method [10], one could use parametric families to estimate each of the margins. This would make sense in several applications. For copula-based quantile estimators, this approach was suggested in Nasri and Bouezmarni [14], where a parametric copula-based estimator was proposed. Note that, as discussed in Noh et al. [18], if the estimation of the margins is incorrect, the estimation of the copula parameter $\theta$ can be biased. One can also consider non-parametric estimators, namely for any $y \in \mathbb{R}$ and any $\mathbf{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d$,
$$F_{n0}(y) = \frac{1}{n+1} \sum_{i=1}^n I(Y_i \le y), \qquad F_{nj}(x_j) = \frac{1}{n+1} \sum_{i=1}^n I(X_{ij} \le x_j), \quad j \in \{1, \ldots, d\}, \tag{6}$$
and set $\mathbf{F}_n(\mathbf{x}) = (F_{n1}(x_1), \ldots, F_{nd}(x_d))$. Further note that $F_{n0}(y) = D_n \circ F_0(y)$, where $D_n$ is the empirical distribution function of the $U_i$'s, and $\mathbf{F}_n(\mathbf{x}) = \mathbf{B}_n \circ \mathbf{F}(\mathbf{x})$, where $\mathbf{B}_n$ is the vector of empirical marginal distribution functions of $V_1, \ldots, V_d$. Noh et al. [19] propose a kernel-based estimator $\hat F_{n0}$ for $F_0$ such that $n^{1/2} \sup_y |\hat F_{n0}(y) - F_{n0}(y)| \xrightarrow{Pr} 0$ as $n \to \infty$. This was also used in Kraus and Czado [12]. Even if $\hat F_{n0}$ is continuous, the precision of the estimation might not be better and there is always the question of the choice of the bandwidth. This is why we will use the estimators given by (6). For the rest of the section, let $\mathbf{x}$ be given and set $\mathbf{v} = \mathbf{F}(\mathbf{x})$. It then follows that $\mathbf{F}_n(\mathbf{x}) = \mathbf{B}_n(\mathbf{v})$. For the sake of simplicity, $\mathbf{x}$ or $\mathbf{v}$ might be omitted. We now present the copula-based estimators that we will study.
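The rescaled empirical distribution functions in (6) are straightforward to compute. A minimal sketch, with an illustrative function name and toy data:

```python
import numpy as np

def rescaled_ecdf(sample):
    """Return the map y -> #{i : sample_i <= y} / (n + 1), as in (6)."""
    sorted_sample = np.sort(np.asarray(sample, dtype=float))
    n = sorted_sample.size

    def ecdf(y):
        # side="right" counts the observations that are <= y
        return np.searchsorted(sorted_sample, y, side="right") / (n + 1.0)

    return ecdf

Fn0 = rescaled_ecdf([3.0, 1.0, 2.0, 5.0])
values = Fn0(np.array([0.0, 1.0, 2.5, 5.0, 9.0]))
print(values)   # counts 0, 1, 2, 4, 4 divided by n + 1 = 5
```

Dividing by $n + 1$ rather than $n$ keeps the values at the observations strictly below 1, which matters when they are plugged into a copula density that may be unbounded at the boundary.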
2.3. Weighted quantile regression estimator
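Surprisingly, the natural plug-in estimator did not appear first in the literature. In fact, Noh et al. [19] proposed a copula-based model mixed with a quantile regression approach using (5), viz.
$$Q_{n,wqr}(\alpha, \mathbf{x}) = \arg\min_a \left[\sum_{i=1}^n \rho_\alpha(Y_i - a)\,\mathfrak{c}_{\theta_n}\{F_{n0}(Y_i), \mathbf{F}_n(\mathbf{x})\}\right], \tag{7}$$
even if the solution is not necessarily unique. In fact, they take the copula density $c_{\theta_n}(u, \mathbf{v})$ instead of the conditional density $\mathfrak{c}_{\theta_n}(u, \mathbf{v}) = \partial_u \mathcal{C}_{\theta_n}(u, \mathbf{v})$, but it leads to the same estimator; see, e.g., (8). However, a unique way to define a solution to (7) is by using the empirical weighted distribution function $H_n$ defined for any $y \in \mathbb{R}$ by
$$H_n(y, \mathbf{x}) = \sum_{i=1}^n I(Y_i \le y)\,w_{i,n} = G_n\{F_0(y), \mathbf{v}\}, \quad \text{with} \quad G_n(u, \mathbf{v}) = \sum_{i=1}^n I(U_i \le u)\,w_{i,n},$$
where, for any $i \in \{1, \ldots, n\}$,
$$w_{i,n} = \frac{\mathfrak{c}_{\theta_n}\{F_{n0}(Y_i), \mathbf{F}_n(\mathbf{x})\}}{\sum_{j=1}^n \mathfrak{c}_{\theta_n}\{F_{n0}(Y_j), \mathbf{F}_n(\mathbf{x})\}} = \frac{c_{\theta_n}\{F_{n0}(Y_i), \mathbf{F}_n(\mathbf{x})\}}{\sum_{j=1}^n c_{\theta_n}\{F_{n0}(Y_j), \mathbf{F}_n(\mathbf{x})\}} = \frac{c_{\theta_n}\{D_n(U_i), \mathbf{B}_n(\mathbf{v})\}}{\sum_{j=1}^n c_{\theta_n}\{D_n(U_j), \mathbf{B}_n(\mathbf{v})\}}. \tag{8}$$
The estimator $Q_{n,wqr}(\alpha, \mathbf{x})$ is then defined as the quantile of level $\alpha$ of $H_n$, i.e.,
$$Q_{n,wqr}(\alpha, \mathbf{x}) = H_n^{-1}(\alpha, \mathbf{x}) = F_0^{-1} \circ G_n^{-1}(\alpha, \mathbf{v}), \quad \alpha \in (0,1). \tag{9}$$
If $\hat a = \arg\min_a \left[\sum_{i=1}^n \rho_\alpha(Y_i - a)\,\mathfrak{c}_{\theta_n}\{F_{n0}(Y_i), \mathbf{F}_n(\mathbf{x})\}\right]$, then $H_n(\hat a-, \mathbf{x}) \le \alpha \le H_n(\hat a, \mathbf{x})$. Hence $H_n^{-1}(\alpha, \mathbf{x})$ satisfies (7).
It is easy to show that $H_n$ is a consistent estimator of the distribution function $H(y, \mathbf{x}) = \mathcal{C}\{F_0(y), \mathbf{v}\}$, $y \in \mathbb{R}$. Also, $G_n$ is a consistent and asymptotically unbiased estimator of the distribution function $\mathcal{C}(u, \mathbf{v})$, $u \in [0,1]$.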
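The construction in (8) and (9) can be sketched as follows for $d = 1$. The example assumes a Clayton copula, simulated data with exponential and logistic margins, and replaces $\theta_n$ by the true parameter for brevity; all of these choices and the function names are hypothetical. The estimate is compared with the true conditional quantile, which is available in closed form for this model.

```python
import numpy as np

def clayton_density(u, v, theta):
    """Density of the bivariate Clayton copula (assumed family)."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-1.0 / theta - 2.0)

def wqr_quantile(alpha, x0, y, x, theta_n):
    """Weighted quantile regression estimate at level alpha and covariate value x0."""
    n = len(y)
    u_hat = (np.argsort(np.argsort(y)) + 1) / (n + 1.0)   # F_n0(Y_i)
    v_hat = np.sum(x <= x0) / (n + 1.0)                   # F_n(x0); x0 assumed inside the data range
    w = clayton_density(u_hat, v_hat, theta_n)
    w = w / w.sum()                                       # weights w_{i,n} of (8)
    order = np.argsort(y)
    cum = np.cumsum(w[order])                             # H_n at the order statistics
    idx = min(int(np.searchsorted(cum, alpha, side="left")), n - 1)
    return y[order][idx]                                  # quantile of level alpha of H_n, as in (9)

# Simulated data: Clayton copula (theta = 2), Y ~ Exp(1), X ~ standard logistic.
rng = np.random.default_rng(2)
theta, n = 2.0, 5000
v = rng.uniform(size=n)
t = rng.uniform(size=n)
u = ((t ** (-theta / (1.0 + theta)) - 1.0) * v ** -theta + 1.0) ** (-1.0 / theta)
y = -np.log1p(-u)
x = np.log(v / (1.0 - v))

alpha, x0 = 0.5, 0.5
estimate = wqr_quantile(alpha, x0, y, x, theta)

# True conditional quantile from (4), using the closed-form Gamma of the Clayton copula.
v0 = 1.0 / (1.0 + np.exp(-x0))
gamma = ((alpha ** (-theta / (1.0 + theta)) - 1.0) * v0 ** -theta + 1.0) ** (-1.0 / theta)
truth = -np.log1p(-gamma)
print(estimate, truth)
```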
2.4. Plug-in estimators
Expression (4) provides a natural way for estimating the conditional quantile. We now describe
both parametric and semi-parametric estimators of Q(α, x).
2.4.1. Parametric estimator
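In the parametric approach, we assume that the marginal distributions $F_0$ and $\mathbf{F}$ belong to parametric families denoted by $F_{0\beta_0}(\cdot)$ and $\mathbf{F}_\beta(\cdot)$ respectively. If $\beta_{n0}$ and $\beta_n$ are consistent estimators of $\beta_0$ and $\beta$, and if $\mathcal{C}_\theta$, $F_{0\beta_0}(\cdot)$ and $\mathbf{F}_\beta(\cdot)$ are continuous functions of the parameters, then for any $y \in \mathbb{R}$, $\check H_n(y, \mathbf{x}) = \mathcal{C}_{\theta_n}\{F_{0\beta_{n0}}(y), \mathbf{F}_{\beta_n}(\mathbf{x})\}$ is clearly a consistent estimator of $H(y, \mathbf{x})$, yielding
$$Q_{n,p}(\alpha, \mathbf{x}) = \check H_n^{-1}(\alpha, \mathbf{x}) = F_{0\beta_{n0}}^{-1}[\Gamma_{\theta_n}\{\alpha, \mathbf{F}_{\beta_n}(\mathbf{x})\}], \quad \alpha \in (0,1). \tag{10}$$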
2.4.2. Semiparametric estimator
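Here, the marginal distributions are estimated using (6). Next, $H(y, \mathbf{x})$ is estimated by
$$\tilde H_n(y, \mathbf{x}) = \mathcal{C}_{\theta_n}\{F_{n0}(y), \mathbf{F}_n(\mathbf{x})\} = \tilde G_n\{F_0(y), \mathbf{v}\}, \quad y \in \mathbb{R}, \tag{11}$$
where $\tilde G_n(u, \mathbf{v}) = \mathcal{C}_{\theta_n}\{D_n(u), \mathbf{B}_n(\mathbf{v})\}$, which is a consistent estimator of $\mathcal{C}_{\theta_0}(u, \mathbf{v})$, $u \in [0,1]$. As a result, the estimator of $Q(\alpha, \mathbf{x})$ is defined for any $\alpha \in (0,1)$ by
$$Q_{n,sp}(\alpha, \mathbf{x}) = \tilde H_n^{-1}(\alpha, \mathbf{x}) = F_{n0}^{-1}[\Gamma_{\theta_n}\{\alpha, \mathbf{F}_n(\mathbf{x})\}] = F_0^{-1} \circ \tilde G_n^{-1}(\alpha, \mathbf{v}). \tag{12}$$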
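A minimal sketch of the semiparametric plug-in estimator (12) for $d = 1$ follows. As before, the Clayton family, the simulated data, the use of the true parameter in place of $\theta_n$ and the function names are assumptions made for illustration. The generalized inverse $F_{n0}^{-1}$ is computed from the order statistics.

```python
import numpy as np

def clayton_cond_quantile(alpha, v, theta):
    """Gamma_theta(alpha, v) for the bivariate Clayton copula (assumed family)."""
    return ((alpha ** (-theta / (1.0 + theta)) - 1.0) * v ** -theta + 1.0) ** (-1.0 / theta)

def plugin_quantile(alpha, x0, y, x, theta_n):
    """Semiparametric plug-in estimate F_n0^{-1}[Gamma_{theta_n}{alpha, F_n(x0)}]."""
    n = len(y)
    v_hat = np.sum(x <= x0) / (n + 1.0)                    # F_n(x0); x0 assumed inside the data range
    gamma = clayton_cond_quantile(alpha, v_hat, theta_n)
    # F_n0(Y_(k)) = k / (n + 1), so the generalized inverse at gamma is Y_(k)
    # with k the smallest integer such that k >= gamma * (n + 1), clipped to {1, ..., n}.
    k = int(np.ceil(gamma * (n + 1.0)))
    k = min(max(k, 1), n)
    return np.sort(y)[k - 1]

# Simulated data: Clayton copula (theta = 2), Y ~ Exp(1), X ~ standard logistic.
rng = np.random.default_rng(4)
theta, n = 2.0, 5000
v = rng.uniform(size=n)
t = rng.uniform(size=n)
u = ((t ** (-theta / (1.0 + theta)) - 1.0) * v ** -theta + 1.0) ** (-1.0 / theta)
y = -np.log1p(-u)
x = np.log(v / (1.0 - v))

alpha, x0 = 0.5, 0.5
estimate = plugin_quantile(alpha, x0, y, x, theta)
v0 = 1.0 / (1.0 + np.exp(-x0))
truth = -np.log1p(-clayton_cond_quantile(alpha, v0, theta))
print(estimate, truth)
```

Unlike the weighted estimator, no weights have to be evaluated at each observation: a single evaluation of $\Gamma_{\theta_n}$ and one order statistic suffice.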
3. Asymptotic behavior of the copula-based estimators
In this section we find the asymptotic distribution of the conditional quantile functions for the proposed estimators, extending the results of Noh et al. [19]. As a result, we obtain that the estimation errors of the plug-in estimator and of the weighted quantile regression estimator converge to the same limiting distribution. We also propose, in Section 3.4, a different bootstrap algorithm that can be used to construct uniform confidence bands about the conditional quantile function.
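As before, $\mathbf{x}$ is fixed and $\mathbf{v} = \mathbf{F}(\mathbf{x})$. Throughout this section, we assume that the density $f_0 = F_0'$ exists and is positive everywhere. If the support is not $\mathbb{R}$, just transform $Y$ accordingly. This way $F_0(y) \in (0,1)$ for any $y \in \mathbb{R}$. Also suppose that the density $c$ of the $(d+1)$-dimensional copula $C$ is positive on $(0,1)^{d+1}$. Then $H(\cdot, \mathbf{x})$ is continuously differentiable with density $h$ satisfying $h(y, \mathbf{x}) = f_0(y)\,\mathfrak{c}(u, \mathbf{v}) > 0$ with $u = F_0(y)$, for any $y \in \mathbb{R}$, where $\mathfrak{c}(u, \mathbf{v}) = \partial_u \mathcal{C}(u, \mathbf{v})$. Further set $Q(u, \mathbf{x}) = H^{-1}(u, \mathbf{x})$ and $\Gamma(u, \mathbf{v}) = \mathcal{C}^{-1}(u, \mathbf{v})$, $u \in (0,1)$.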
3.1. Convergence of the parametric estimator
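In what follows, $\nabla_{\beta_0} F_{0\beta_0}(y)$ is a $p_0$-dimensional column vector, $\nabla_\beta \mathbf{F}_\beta$ is a $p \times d$ matrix, $\nabla_{\mathbf{v}} \mathcal{C}_\theta(u, \mathbf{v})$ is a $d$-dimensional column vector, and $\nabla_\theta \mathcal{C}_\theta(u, \mathbf{v}) = \dot{\mathcal{C}}_\theta(u, \mathbf{v})$ is a $q$-dimensional column vector; they represent the partial derivatives with respect to $\beta_0$, $\beta$, $\mathbf{v}$ and $\theta$ of $F_{0\beta_0}$, $\mathbf{F}_\beta$, $\mathcal{C}_\theta$ and $\mathcal{C}_\theta$ respectively. Throughout this section, we assume that these derivatives are continuous, and that the conditional density $\mathfrak{c}_\theta(u, \mathbf{v}) = \partial_u \mathcal{C}_\theta(u, \mathbf{v})$ is continuously differentiable with respect to $u \in (0,1)$.
Set $\mathcal{B}_{n0} = n^{1/2}(\beta_{n0} - \beta_0)$, $\mathcal{B}_n = n^{1/2}(\beta_n - \beta)$, and $\Theta_n = n^{1/2}(\theta_n - \theta_0)$. Finally, define $\check{\mathbb{H}}_n(y) = n^{1/2}\{\check H_n(y, \mathbf{x}) - H(y, \mathbf{x})\}$ for any $y \in \mathbb{R}$, and $\mathbb{Q}_{n,p}(u) = n^{1/2}\{Q_{n,p}(u, \mathbf{x}) - Q(u, \mathbf{x})\}$, $u \in (0,1)$. The proof of the following theorem, giving the asymptotic behavior of the parametric quantile process, follows readily from the Delta method [21]. To simplify notations, set $\dot{\mathcal{C}}(u, \mathbf{v}) = \nabla_\theta \mathcal{C}_\theta(u, \mathbf{v})|_{\theta = \theta_0}$ and $\nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v}) = \nabla_{\mathbf{v}} \mathcal{C}_{\theta_0}(u, \mathbf{v})$.
Theorem 1. Assume that $(\mathcal{B}_{n0}, \mathcal{B}_n, \Theta_n)$ converges in law to a centered Gaussian vector $(\mathcal{B}_0, \mathcal{B}, \Theta)$.$^1$ Then, as $n \to \infty$, $\check{\mathbb{H}}_n$ converges in $D(\mathbb{R})$$^2$ to a continuous centered Gaussian process $\check{\mathbb{H}}$, denoted $\check{\mathbb{H}}_n \rightsquigarrow \check{\mathbb{H}} = \check{\mathbb{G}} \circ F_{0\beta_0}$, where
$$\check{\mathbb{G}}(u) = \Theta^\top \dot{\mathcal{C}}(u, \mathbf{v}) + \mathcal{B}^\top \nabla_\beta \mathbf{F}_\beta\{\mathbf{F}_\beta^{-1}(\mathbf{v})\}\,\nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v}) + \mathfrak{c}_{\theta_0}(u, \mathbf{v})\,\mathcal{B}_0^\top \nabla_{\beta_0} F_{0\beta_0}\{F_{0\beta_0}^{-1}(u)\}, \quad u \in [0,1].$$
Furthermore, $\mathbb{Q}_{n,p} \rightsquigarrow \mathbb{Q}_p$ in $D(0,1)$, where $\mathbb{Q}_p(u) = -\dfrac{\check{\mathbb{H}}\{Q(u, \mathbf{x})\}}{h\{Q(u, \mathbf{x}), \mathbf{x}\}}$, $u \in (0,1)$. In particular, for any $[a, b] \subset (0,1)$, $n^{1/2} \sup_{u \in [a,b]} |Q_{n,p}(u, \mathbf{x}) - Q(u, \mathbf{x})|$ converges in law to $\sup_{u \in [a,b]} \left| \dfrac{\check{\mathbb{H}}\{Q(u, \mathbf{x})\}}{h\{Q(u, \mathbf{x}), \mathbf{x}\}} \right|$.
$^1$ See, e.g., Joe [10] for sufficient regularity conditions.
$^2$ Convergence in $D(I)$ means that for any closed interval $[a, b] \subset I$, the process converges in law in the Skorokhod topology on $D([a, b])$. In particular, continuous functions of the process converge in law. See, e.g., Billingsley [3].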
3.2. Convergence of the semiparametric estimator
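We now study the convergence of the process $\mathbb{Q}_{n,sp}(u) = n^{1/2}\{Q_{n,sp}(u, \mathbf{x}) - Q(u, \mathbf{x})\}$, $u \in (0,1)$. Before stating the theorem, define $\mathbb{D}_n(u) = n^{1/2}\{D_n(u) - u\}$ and $\mathbb{B}_n(\mathbf{v}) = n^{1/2}\{\mathbf{B}_n(\mathbf{v}) - \mathbf{v}\}$, $u \in [0,1]$, $\mathbf{v} \in (0,1)^d$, and let $\tilde{\mathbb{G}}_n(u) = n^{1/2}\{\tilde G_n(u, \mathbf{v}) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\}$. The proof of this theorem follows from the Delta method [21].
Theorem 2. Assume that $(\mathbb{D}_n, \mathbb{B}_n, \Theta_n)$ converges in $D[0,1]^{1+d} \times \mathbb{R}^q$ to $(\mathbb{D}, \mathbb{B}, \Theta)$, where $\mathbb{D}$ and $\mathbb{B}$ are centered Gaussian processes and $\Theta$ is a centered random vector.$^3$ Then, as $n \to \infty$, $\tilde{\mathbb{G}}_n$ converges in $D([0,1])$ to $\tilde{\mathbb{G}} = \mathbb{H} + \mathbb{D}\,\mathfrak{c}_{\theta_0}(\cdot, \mathbf{v})$, where $\mathbb{H}(u) = \Theta^\top \dot{\mathcal{C}}(u, \mathbf{v}) + \mathbb{B}(\mathbf{v})^\top \nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v})$, $u \in [0,1]$. Furthermore, $\mathbb{Q}_{n,sp} \rightsquigarrow \mathbb{Q}_{sp}$ in $D(0,1)$, where $\mathbb{Q}_{sp}(u) = -\dfrac{\tilde{\mathbb{G}}\{\Gamma(u, \mathbf{v})\}}{h\{Q(u, \mathbf{x}), \mathbf{x}\}}$, $u \in (0,1)$. In particular, for any $[a, b] \subset (0,1)$, $n^{1/2} \sup_{u \in [a,b]} |Q_{n,sp}(u, \mathbf{x}) - Q(u, \mathbf{x})|$ converges in law to $\sup_{u \in [a,b]} \left| \dfrac{\tilde{\mathbb{G}}\{\Gamma(u, \mathbf{v})\}}{h\{Q(u, \mathbf{x}), \mathbf{x}\}} \right|$.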
3.3. Convergence of the weighted quantile regression estimator
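We now study the convergence of the process $\mathbb{Q}_{n,wqr}(u) = n^{1/2}\{Q_{n,wqr}(u, \mathbf{x}) - Q(u, \mathbf{x})\}$. It extends the results in Noh et al. [19], where only the convergence at a single value was proven. In order to formulate the result, we need to define another sequence of stochastic processes, namely $\bar{\mathbb{G}}_n(u) = n^{-1/2} \sum_{i=1}^n \{I(U_i \le u)\,\mathfrak{c}_{\theta_0}(U_i, \mathbf{v}) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\}$, $u \in [0,1]$, where $\mathfrak{c}_\theta = \partial_u \mathcal{C}_\theta$. It follows from the theory of stochastic processes [21] that $(\mathbb{D}_n, \mathbb{B}_n, \bar{\mathbb{G}}_n)$ converges in $D[0,1]^{2+d}$ to centered Gaussian processes $(\mathbb{D}, \mathbb{B}, \bar{\mathbb{G}})$.
We can now show that the two estimators have the same limiting distribution.
Theorem 3. Assume that $(\mathbb{D}_n, \mathbb{B}_n, \bar{\mathbb{G}}_n, \Theta_n)$ converges in $D[0,1]^{2+d} \times \mathbb{R}^q$ to centered Gaussian processes $(\mathbb{D}, \mathbb{B}, \bar{\mathbb{G}}, \Theta)$. Then, as $n \to \infty$, $\mathbb{G}_n$ converges in $D([0,1])$ to $\mathbb{G} = \tilde{\mathbb{G}}$. Furthermore, $\mathbb{Q}_{n,wqr} \rightsquigarrow \mathbb{Q}_{wqr}$ in $D(0,1)$, where $\mathbb{Q}_{wqr}(u) = -\dfrac{\mathbb{G}\{\Gamma(u, \mathbf{v})\}}{h\{Q(u, \mathbf{x}), \mathbf{x}\}}$, $u \in (0,1)$. In particular, for any $[a, b] \subset (0,1)$, $n^{1/2} \sup_{u \in [a,b]} |Q_{n,wqr}(u, \mathbf{x}) - Q(u, \mathbf{x})|$ converges in law to $\sup_{u \in [a,b]} \left| \dfrac{\mathbb{G}\{\Gamma(u, \mathbf{v})\}}{h\{Q(u, \mathbf{x}), \mathbf{x}\}} \right|$.
$^3$ This assumption is satisfied for most well-behaved rank-based estimators of $\theta$. See, e.g., Genest and Rémillard [7].
Proof. Set $\dot{\mathfrak{c}}(u, \mathbf{v}) = \nabla_\theta \mathfrak{c}_\theta(u, \mathbf{v})|_{\theta = \theta_0}$ and $\nabla_{\mathbf{v}} \mathfrak{c}(u, \mathbf{v}) = \nabla_{\mathbf{v}} \mathfrak{c}_{\theta_0}(u, \mathbf{v})$. It suffices to prove the convergence of $\mathbb{G}_n(u) = \sqrt{n}\,\{G_n(u) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\}$. Write $G_n(u) = \frac{1}{n} \sum_{i=1}^n I(U_i \le u)\,\mathfrak{c}_{\theta_n}\{D_n(U_i), \mathbf{B}_n(\mathbf{v})\}/s_n$, where $s_n = \frac{1}{n} \sum_{i=1}^n \mathfrak{c}_{\theta_n}\{D_n(U_i), \mathbf{B}_n(\mathbf{v})\}$.
Set
$$r_n(u) = \mathfrak{c}_{\theta_n}\{D_n(u), \mathbf{B}_n(\mathbf{v})\} - \mathfrak{c}_{\theta_0}(u, \mathbf{v}) - \frac{\Theta_n^\top \dot{\mathfrak{c}}(u, \mathbf{v}) + \partial_u \mathfrak{c}_{\theta_0}(u, \mathbf{v})\,\mathbb{D}_n(u) + \nabla_{\mathbf{v}} \mathfrak{c}(u, \mathbf{v})^\top \mathbb{B}_n(\mathbf{v})}{n^{1/2}}, \quad u \in [0,1].$$
By hypothesis, as $n \to \infty$, $n^{1/2} \sup_{u \in [0,1]} |r_n(u)|$ converges in probability to 0. It follows that
$$\begin{aligned}
\mathbb{G}_n(u) &= \frac{n^{-1/2}}{s_n} \sum_{i=1}^n I(U_i \le u)\left[\mathfrak{c}_{\theta_n}\{D_n(U_i), \mathbf{B}_n(\mathbf{v})\} - \mathfrak{c}_{\theta_0}(U_i, \mathbf{v})\right] + \bar{\mathbb{G}}_n(u)/s_n - \mathcal{C}_{\theta_0}(u, \mathbf{v})\,n^{1/2}(1 - 1/s_n) \\
&= \left\{L_n(u) + \bar{\mathbb{G}}_n(u) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\,L_n(1) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\,\bar{\mathbb{G}}_n(1)\right\}/s_n,
\end{aligned}$$
where $L_n(u) = n^{-1/2} \sum_{i=1}^n I(U_i \le u)\left[\mathfrak{c}_{\theta_n}\{D_n(U_i), \mathbf{B}_n(\mathbf{v})\} - \mathfrak{c}_{\theta_0}(U_i, \mathbf{v})\right]$. Now,
$$\begin{aligned}
L_n(u) &= \Theta_n^\top \left\{\frac{1}{n} \sum_{i=1}^n I(U_i \le u)\,\dot{\mathfrak{c}}(U_i, \mathbf{v})\right\} + \frac{1}{n} \sum_{i=1}^n I(U_i \le u)\,\mathbb{D}_n(U_i)\,\partial_u \mathfrak{c}_{\theta_0}(U_i, \mathbf{v}) + \mathbb{B}_n(\mathbf{v})^\top \left\{\frac{1}{n} \sum_{i=1}^n I(U_i \le u)\,\nabla_{\mathbf{v}} \mathfrak{c}(U_i, \mathbf{v})\right\} + o_P(1) \\
&= \Theta_n^\top \dot{\mathcal{C}}(u, \mathbf{v}) + \int_0^u \mathbb{D}_n(z)\,\partial_z \mathfrak{c}_{\theta_0}(z, \mathbf{v})\,dz + \mathbb{B}_n(\mathbf{v})^\top \nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v}) + o_P(1).
\end{aligned}$$
Next, assuming that $u\,\mathfrak{c}_{\theta_0}(u, \mathbf{v}) \to 0$ as $u \to 0$, we have
$$\begin{aligned}
\int_0^u \mathbb{D}_n(z)\,\partial_z \mathfrak{c}_{\theta_0}(z, \mathbf{v})\,dz &= n^{-1/2} \sum_{i=1}^n \int_0^u \partial_z \mathfrak{c}_{\theta_0}(z, \mathbf{v})\{I(U_i \le z) - z\}\,dz \\
&= n^{-1/2} \sum_{i=1}^n I(U_i \le u)\{\mathfrak{c}_{\theta_0}(u, \mathbf{v}) - \mathfrak{c}_{\theta_0}(U_i, \mathbf{v})\} - n^{1/2}\{u\,\mathfrak{c}_{\theta_0}(u, \mathbf{v}) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\} \\
&= \mathfrak{c}_{\theta_0}(u, \mathbf{v})\,\mathbb{D}_n(u) - \bar{\mathbb{G}}_n(u).
\end{aligned}$$
As a result, $L_n \rightsquigarrow \mathbb{H} + \mathbb{D}\,\mathfrak{c}_{\theta_0}(\cdot, \mathbf{v}) - \bar{\mathbb{G}} = \tilde{\mathbb{G}} - \bar{\mathbb{G}}$. Since $L_n(1) + \bar{\mathbb{G}}_n(1) \rightsquigarrow \tilde{\mathbb{G}}(1) = 0$ and $s_n \to 1$ in probability, it follows that $\mathbb{G}_n \rightsquigarrow \mathbb{G} = \tilde{\mathbb{G}}$ in $D([0,1])$.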
Remark 1. Note that Theorems 2 and 3 are still valid if we choose kernel estimators of the marginal distribution functions instead of the empirical distribution functions, since their asymptotic behavior is the same. This is also true for the bootstrapping procedure defined next.
3.4. Bootstrapping
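Algorithm 1 (Bootstrapping $\tilde{\mathbb{G}}$). First, estimate $\theta$ using a regular rank-based estimator $\theta_n$ of the form $\theta_n = T_n(U_{1,n}, \mathbf{V}_{1,n}, \ldots, U_{n,n}, \mathbf{V}_{n,n})$ in the sense of Genest and Rémillard [7], and set $\mathbf{v}_n = \mathbf{F}_n(\mathbf{x})$. Then, for each $k \in \{1, \ldots, N\}$, repeat the following steps:
- Generate $(U_i^\star, \mathbf{V}_i^\star) \sim C_{\theta_n}$, $i \in \{1, \ldots, n\}$, and compute the empirical margins $D_n^\star$, $\mathbf{F}_n^\star$;
- Calculate the pseudo-observations $U_{i,n}^\star = D_n^\star(U_i^\star)$, $\mathbf{V}_{i,n}^\star = \mathbf{F}_n^\star(\mathbf{V}_i^\star)$, $i \in \{1, \ldots, n\}$;
- Estimate $\theta_n^\star = T_n(U_{1,n}^\star, \mathbf{V}_{1,n}^\star, \ldots, U_{n,n}^\star, \mathbf{V}_{n,n}^\star)$;
- Set $\tilde{\mathbb{G}}_n^{(k)}(u) = n^{1/2}\left[\mathcal{C}_{\theta_n^\star}\{D_n^\star(u), \mathbf{B}_n^\star(\mathbf{v}_n)\} - \mathcal{C}_{\theta_n}(u, \mathbf{v}_n)\right]$, $u \in [0,1]$.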
The next theorem shows the consistency of the proposed bootstrap.
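Theorem 4. Under the conditions of Theorem 2, as $n \to \infty$, $\tilde{\mathbb{G}}_n^{(1)}, \ldots, \tilde{\mathbb{G}}_n^{(N)}$ converge to independent copies of $\tilde{\mathbb{G}}$.
Proof. From Genest and Rémillard [7], $(\mathbb{D}_n^\star, \mathbb{B}_n, \mathbb{B}_n^\star, \Theta_n, \Theta_n^\star) \rightsquigarrow (\mathbb{D}^\perp, \mathbb{B}, \mathbb{B}^\perp, \Theta, \Theta^\perp + \Theta)$, where $(\mathbb{D}^\perp, \mathbb{B}^\perp, \Theta^\perp)$ is an independent copy of $(\mathbb{D}, \mathbb{B}, \Theta)$. Hence, since $n^{1/2}\{\mathbf{B}_n^\star(\mathbf{v}_n) - \mathbf{v}\} = \mathbb{B}_n^\star(\mathbf{v}_n) + \mathbb{B}_n(\mathbf{v})$, it follows from the Delta method and Theorem 2 that
$$\begin{aligned}
n^{1/2}\left[\mathcal{C}_{\theta_n^\star}\{D_n^\star(u), \mathbf{B}_n^\star(\mathbf{v}_n)\} - \mathcal{C}_{\theta_0}(u, \mathbf{v})\right] &= \dot{\mathcal{C}}(u, \mathbf{v})^\top \Theta_n^\star + \nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v})^\top \{\mathbb{B}_n^\star(\mathbf{v}_n) + \mathbb{B}_n(\mathbf{v})\} + \mathfrak{c}_{\theta_0}(u, \mathbf{v})\,\mathbb{D}_n^\star(u) + o_P(1) \\
&\rightsquigarrow \dot{\mathcal{C}}(u, \mathbf{v})^\top (\Theta^\perp + \Theta) + \nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v})^\top \{\mathbb{B}^\perp(\mathbf{v}) + \mathbb{B}(\mathbf{v})\} + \mathfrak{c}_{\theta_0}(u, \mathbf{v})\,\mathbb{D}^\perp(u) \\
&= \dot{\mathcal{C}}(u, \mathbf{v})^\top \Theta^\perp + \nabla_{\mathbf{v}} \mathcal{C}(u, \mathbf{v})^\top \mathbb{B}^\perp(\mathbf{v}) + \mathfrak{c}_{\theta_0}(u, \mathbf{v})\,\mathbb{D}^\perp(u) + \mathbb{H}(u) = \tilde{\mathbb{G}}^\perp(u) + \mathbb{H}(u),
\end{aligned}$$
where $\tilde{\mathbb{G}}^\perp$ is an independent copy of $\tilde{\mathbb{G}}$, while $n^{1/2}\{\mathcal{C}_{\theta_n}(u, \mathbf{v}_n) - \mathcal{C}_{\theta_0}(u, \mathbf{v})\} \rightsquigarrow \mathbb{H}$. Subtracting the latter from the former gives $\tilde{\mathbb{G}}_n^{(k)} \rightsquigarrow \tilde{\mathbb{G}}^\perp$. As a result, $\tilde{\mathbb{G}}_n^{(1)}, \ldots, \tilde{\mathbb{G}}_n^{(N)}$ converge to independent copies of $\tilde{\mathbb{G}}$.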
Remark 2. Note that as shown in Genest and Rémillard [7], most interesting estimators are regular. In particular, this is the case for estimators of the class $\mathcal{R}_1$: this means that there exists a continuously differentiable function $J$ so that $E[J(U, \mathbf{V})] = 0$ and $\Theta_n = n^{-1/2} \sum_{i=1}^n J\{D_n(U_i), \mathbf{B}_n(\mathbf{V}_i)\} + o_P(1)$. For example, pseudo-maximum likelihood estimators, as defined in Genest et al. [8], belong to this class.
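The steps of Algorithm 1 can be sketched as follows for $d = 1$. This is only an illustration under stated assumptions: the copula family is Clayton, the rank-based estimator $T_n$ is taken to be the inversion of Kendall's tau (a stand-in chosen for brevity; the paper does not prescribe it), the values of $\theta_n$ and $v_n$ are treated as given, and all function names are hypothetical.

```python
import numpy as np

def clayton_cond_cdf(u, v, theta):
    """P(U <= u | V = v) under the bivariate Clayton copula (assumed family)."""
    return v ** (-theta - 1.0) * (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta - 1.0)

def clayton_sample(n, theta, rng):
    """Draw (U, V) from the Clayton copula by conditional inversion."""
    v = rng.uniform(size=n)
    t = rng.uniform(size=n)
    u = ((t ** (-theta / (1.0 + theta)) - 1.0) * v ** -theta + 1.0) ** (-1.0 / theta)
    return u, v

def pseudo_obs(z):
    """Ranks divided by n + 1."""
    return (np.argsort(np.argsort(z)) + 1) / (len(z) + 1.0)

def theta_from_kendall(u, v):
    """Stand-in for T_n: Clayton parameter obtained by inverting Kendall's tau."""
    n = len(u)
    signs = np.sign(u[:, None] - u[None, :]) * np.sign(v[:, None] - v[None, :])
    tau = np.sum(signs) / (n * (n - 1))
    return max(2.0 * tau / (1.0 - tau), 1e-3)

def bootstrap_G(theta_n, v_n, n, N, u_grid, rng):
    """Return an (N, len(u_grid)) array whose k-th row is the k-th bootstrap replicate."""
    lo, hi = 1.0 / (n + 1.0), n / (n + 1.0)
    reps = np.empty((N, len(u_grid)))
    for k in range(N):
        u_star, v_star = clayton_sample(n, theta_n, rng)                      # step 1
        theta_star = theta_from_kendall(pseudo_obs(u_star), pseudo_obs(v_star))  # steps 2-3
        d_star = np.searchsorted(np.sort(u_star), u_grid, side="right") / (n + 1.0)
        b_star = np.sum(v_star <= v_n) / (n + 1.0)
        # Clipping keeps the arguments inside (0, 1); it is a numerical safeguard only.
        d_star, b_star = np.clip(d_star, lo, hi), np.clip(b_star, lo, hi)
        reps[k] = np.sqrt(n) * (clayton_cond_cdf(d_star, b_star, theta_star)
                                - clayton_cond_cdf(u_grid, v_n, theta_n))     # step 4
    return reps

rng = np.random.default_rng(3)
u_grid = np.linspace(0.1, 0.9, 9)
reps = bootstrap_G(theta_n=2.0, v_n=0.6, n=300, N=100, u_grid=u_grid, rng=rng)
print(reps.shape, reps.std(axis=0))
```

Note that the replicates are generated directly on the copula scale: no bootstrap sample of the original observations is needed.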
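3.4.1. Construction of the uniform $100(1 - \alpha)\%$ confidence band for $Q$
To construct the uniform confidence band on $[a, b] \subset (0,1)$, we generate $N$ processes $\tilde{\mathbb{G}}^{(k)}$, $k \in \{1, \ldots, N\}$, and they are evaluated at $u \in \mathcal{A} = \{a + j(b - a)/m;\ j = 0, \ldots, m\}$, where $m$ is fixed but large enough (say $m = 1000$). The density $f_0$ is estimated with a Gaussian kernel estimator $f_{n0}$, so $h(u) = h\{Q(u, \mathbf{x}), \mathbf{x}\}$ is estimated by $h_n(u) = f_{n0}\{Q_{n,sp}(u, \mathbf{x})\}\,\mathfrak{c}_{\theta_n}(u, \mathbf{v}_n)$, where $\mathbf{v}_n = \mathbf{F}_n(\mathbf{x})$. One then computes $b_{k,n} = \max_{u \in \mathcal{A}} \left|\tilde{\mathbb{G}}^{(k)}(u)/h_n(u)\right|$, $k \in \{1, \ldots, N\}$, and lets $b_n(\alpha)$ be the associated quantile of order $1 - \alpha$. The uniform confidence band about $Q(\cdot, \mathbf{x})$ is given by $Q_{n,sp}(u, \mathbf{x}) \pm n^{-1/2}\,b_n(\alpha)$, $u \in [a, b]$. A 95% confidence interval about a single point $Q(u, \mathbf{x})$ is given by $Q_{n,sp}(u, \mathbf{x}) \pm n^{-1/2}\,1.96\,\hat\sigma/h_n(u)$, where $\hat\sigma^2$ is the sample variance of the values $\tilde{\mathbb{G}}^{(k)}(u)$, $k \in \{1, \ldots, N\}$.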
Remark 3. Using our notations, the bootstrap algorithm proposed in Noh et al. [19] yields values $Q^{(k)}_{n,wqr}$, $k \in \{1, \ldots, N\}$, so that $\mathbb{Q}^{(k)}_{n,wqr} = n^{1/2}\{Q^{(k)}_{n,wqr} - Q\}$ converges to $\mathbb{Q}^{(k)}_{wqr} + \mathbb{Q}_{wqr}$, where $\mathbb{Q}^{(k)}_{wqr}$ is an independent copy of $\mathbb{Q}_{wqr}$. It then follows that their algorithm works for estimating the asymptotic variance $\sigma^2_\alpha$, in the sense that what they call $\hat\sigma^2_{boot}$ satisfies $\hat\sigma^2_{boot} \approx \sigma^2_\alpha / n$ if $n$ and $N$ are large. However, their procedure is slower than the one we propose since we do not need to compute $Y_i^\star = F_{n0}^{-1}(U_i^\star)$ and $\mathbf{X}_i^\star = \mathbf{F}_n^{-1}(\mathbf{V}_i^\star)$, $i \in \{1, \ldots, n\}$. In addition, computing $\tilde H_n$ is faster than computing $H_n$.
4. Conclusion
We have shown that two seemingly different estimators for the conditional quantile function have in
fact the same limiting distribution. However, the plug-in estimator is easier and faster to implement,
in addition to being more accurate for small samples, as shown by simulations in Kraus and Czado
[12] and Nasri and Bouezmarni [14]. Therefore, this is the one we recommend.
Acknowledgements
Funding in support of this work was provided by the Natural Sciences and Engineering Research Council of Canada and the Fonds de recherche du Québec – Nature et technologies. The authors thank the editor and three anonymous reviewers for their constructive comments, which helped them to improve the manuscript.
References
[1] K. Aas, C. Czado, A. Frigessi, and H. Bakken. Pair-copula constructions of multiple dependence. Insurance: Mathematics and Economics, 44(2):182–198, 2009.
[2] T. Bedford and R. M. Cooke. Vines - a new graphical model for dependent random variables. Ann. Statist., 30:1031–1068, 2002.
[3] P. Billingsley. Convergence of Probability Measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons, Inc., New York, second edition, 1999.
[4] T. Bouezmarni, A. El Ghouch, and A. Taamouti. Bernstein estimator for unbounded copula densities. Stat. Risk Model., 30(4):343–360, 2013.
[5] E. Bouyé and M. Salmon. Dynamic copula quantile regressions and tail area dynamic dependence in Forex markets. The European Journal of Finance, 15(7-8):721–750, 2009.
[6] V. Chavez-Demoulin and A. Davison. Generalized additive modeling of sample extremes. Applied Statistics, 54:207–222, 2005.
[7] C. Genest and B. Rémillard. Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l'Institut Henri Poincaré. Probabilités et Statistiques, 44:337–366, 2008.
[8] C. Genest, K. Ghoudi, and L. Rivest. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika, 82:543–552, 1995.
[9] P. Janssen, J. Swanepoel, and N. Veraverbeke. Bernstein estimation for a copula derivative with application to conditional distribution and regression functionals. TEST, 25(2):351–374, 2016.
[10] H. Joe. Multivariate Models and Dependence Concepts. Chapman and Hall, London, 1997.
[11] R. Koenker. Quantile Regression. Cambridge University Press, 2005. ISBN 0-521-60827-9.
[12] D. Kraus and C. Czado. D-vine copula based quantile regression. Computational Statistics & Data Analysis, 110:1–18, 2017.
[13] T. Nagler and C. Czado. Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. J. Multivariate Anal., 151:69–89, 2016.
[14] B. Nasri and T. Bouezmarni. Copula-based conditional quantiles and inference. In Bouchra Nasri, editor, Méthodes d'estimation des quantiles conditionnels en hydro-climatologie, PhD Thesis, chapter 3, pages 115–150. INRS-ETE, 2017.
[15] B. Nasri, S. El-Adlouni, and T. B. M. J. Ouarda. Bayesian estimation for GEV-B-spline model. Open Journal of Statistics, 3:118–128, 2013.
[16] B. Nasri, Y. Tramblay, S. El Adlouni, E. Hertig, and T. B. M. J. Ouarda. Atmospheric predictors for annual maximum precipitation in North Africa. Journal of Applied Meteorology and Climatology, 55:1063–1076, 2016.
[17] R. B. Nelsen. An Introduction to Copulas. Springer New York, 2006.
[18] H. Noh, A. El Ghouch, and T. Bouezmarni. Copula-based regression estimation and inference. Journal of the American Statistical Association, 108:676–688, 2013.
[19] H. Noh, A. El Ghouch, and I. Van Keilegom. Semiparametric conditional quantile estimation through copula-based multivariate models. Journal of Business & Economic Statistics, 33(2):167–178, 2015.
[20] B. Rémillard. Statistical Methods for Financial Engineering. Chapman and Hall/CRC Financial Mathematics Series. Taylor & Francis, 2013.
[21] A. W. van der Vaart and J. A. Wellner. Weak convergence and empirical processes. Springer Series in Statistics. Springer-Verlag, New York, 1996.
... Furthermore, in 2013, Wang and al. [4] proposed another estimation method by relaxing the linearity assumption while Noh et al. [5] focused on a semi-parametric method for estimating conditional quantiles based on quantile regression and copulas. In 2017, Nasri & Bouezmarni [6] proposed two parametric and semi-parametric estimation methods based on copulas. ...
... Authors such as Kraus and Czado [20], Nasri and Bouezmarni [21] have used copulas based on the plug-in method. More recently, Remillard, Nasri and Bouezmarni [6] have used copulas and the direct estimation method of conditional quantiles to estimate the conditional quantile. The asymptotic normality of their estimator has been studied. ...
... This relation was established in its first version, i.e. in the case m = 1 in Bouyé & Salomon 14. In the case of dimension m ≥ 2 [6], the relation is given for all real x and y by: ...
Article
Full-text available
The value at risk (VaR) plays a fundamental role in modeling risk in financial studies. We propose a approach in estimating the VaR for heavy-tailed distribution by taking into account the effects of certain covariates on the variable of interest. This method, involves estimating the extreme conditional quantiles by using the assciated copula. Morever, we use Bernstein copulas to estimate the intermediate conditional quantile in a non-parametric approach of the direct method.Then, the extreme conditional quantile is also estimated and we study the asymptotic properties of this new estimator.
... Recently, the connection between marginal distribution functions and copulas have been used to find an explicit expression for the conditional quantile function [18,29]. As a result, the conditional quantile function Q t depends only on the margins G t , F t and the copula C t viz. ...
... The asymptotic behavior of the parametric quantile process, follows readily from the Delta method [33]. It is also similar to the proofs in Rémillard et al. [29]. Appendix B.3. ...
Article
Full-text available
In this paper, we consider non-stationary random vectors, where the marginal distributions and the associated copula may be time-dependent. We propose estimators for the unknown parameters and we establish the limiting distribution of the estimators of the copula and the conditional copula, together with a parametric bootstrap method for constructing confidence bands around the estimator and for testing the adequacy of the model. We also consider three examples of functionals of the copula-based model under non-stationarity: conditional quantiles, conditional means, and conditional expected shortfalls. The asymptotic distribution of the estimation errors is shown to be Gaussian, and bootstrapping methods are proposed to estimate their asymptotic variances. The finite-sample performance of our estimators is investigated through Monte Carlo experiments, and we give three examples of implementation of the proposed methodology.
... Recently, the connection between marginal distribution functions and copulas have been used to find an explicit expression for the conditional quantile function [18,29]. As a result, the conditional quantile function Q t depends only on the margins G t , F t and the copula C t viz. ...
... The asymptotic behavior of the parametric quantile process, follows readily from the Delta method [33]. It is also similar to the proofs in Rémillard et al. [29]. Appendix B.3. ...
... where Q(·, x) is the conditional quantile function, defined as the inverse of H(·, x). For more details on copula-based conditional quantiles, see the recent articles [14], [16]. The function g can also be expressed as ...
... Note that B-vines can be particularly useful in conditional mean regression (OLS, GAM, GLM, etc,) and conditional quantile settings, where the distribution of the covariates is often given; see, e.g., [14]. It can also be used in our replication context when the target S exists; in this case, we could look at B-vines constructed from popular bivariate families like Clayton, Gumbel, Frank, Gaussian and Student, and find the ones that fit best the data, in the same spirit as the choice of vines for copula models in the R packages CDVine or VineCopula. ...
... While they are often used as convenience tools to glue together arbitrary marginal distributions, copulas have also been applied to solve regression problems (Song, 2000, Oakes and Ritz, 2000, Pitt et al., 2006, Kolev and Paiva, 2009, Song et al., 2009, Yin and Yuan, 2009, Noh et al., 2013, Cooke et al., 2015. Another recent development is in quantile regression (Koenker, 2005), with contributions from the econometrics and statistics literatures, respectively in the context of univariate quantile auto-regression (Bouyé andSalmon, 2009, Chen et al., 2009) and conditional quantile estimation (Noh et al., 2015, Rémillard et al., 2017. A problem with those approaches is that they use parametric copula families, which only allow for monotonic regression functions (Dette et al., 2014). ...
... Noh et al. (2015) and De Backer et al. (2017) provide similar results for quantile regression, the latter using a semiparametric method for estimating the copula density. Rémillard et al. (2017) establish weak convergence of the conditional quantile estimate as a process of the quantile level and prove validity of a parametric bootstrap procedure. ...
Preprint
Full-text available
Thanks to their aptitude to capture complex dependence structures, copulas are frequently used to glue random variables into a joint model with arbitrary one-dimensional margins. More recently, they have been applied to solve statistical learning problems such as regressions or classification. Framing such approaches as solutions of estimating equations, we generalize them in a unified framework. We derive consistency, asymptotic normality, and validity of the bootstrap for copula-based Z-estimators. We further illustrate the versatility such estimators through theoretical and simulated examples.
... Quantile regression: Copula-based regression has been extended to the quantile regression case, both for IID data and for time series by Noh et al. (2015). The quantile regression case is also examined by Rémillard et al. (2017). ...
Preprint
Full-text available
Predictions and forecasts of machine learning models should take the form of probability distributions, aiming to increase the quantity of information communicated to end users. Although applications of probabilistic prediction and forecasting with machine learning models in academia and industry are becoming more frequent, related concepts and methods have not been formalized and structured under a holistic view of the entire field. Here, we review the topic of predictive uncertainty estimation with machine learning algorithms, as well as the related metrics (consistent scoring functions and proper scoring rules) for assessing probabilistic predictions. The review covers a time period spanning from the introduction of early statistical (linear regression and time series models, based on Bayesian statistics or quantile regression) to recent machine learning algorithms (including generalized additive models for location, scale and shape, random forests, boosting and deep learning algorithms) that are more flexible by nature. The review of the progress in the field, expedites our understanding on how to develop new algorithms tailored to users' needs, since the latest advancements are based on some fundamental concepts applied to more complex algorithms. We conclude by classifying the material and discussing challenges that are becoming a hot topic of research.
... Here, an alternative method based on the bootstrap approach, described in Algorithm 1 (deferred to the Supplementary Materials), is proposed. The validity (consistency) of the bootstrap can be proven using arguments similar to those in Rémillard et al. (2017). Note that the algorithm can be adapted for estimating the asymptotic variance of the parametric estimator π_n(x). ...
Article
The paper proposes a new class of link functions for generalized binary regression based on copula models. The idea consists of writing the predictive probability of success (PPOS) in terms of marginal distributions and the conditional distribution for the copula. The proposed link functions provide flexible models and include the probit regression. A remarkable relationship with the logistic regression is also established in the case of a single covariate. To model the PPOS, a parametric family for the copula is considered and either a parametric or a nonparametric estimator for the marginal distributions is used. The asymptotic properties of these estimators are established and a simulation study is carried out to evaluate their performance. Finally, the methodology is illustrated by analyzing a data set on burn injury.
Article
Composite quantile regression (CQR) is becoming increasingly popular due to the robustness it inherits from quantile regression. Recently, the CQR method has been studied extensively with single-index models. However, the numerical inference of CQR methods for single-index models must involve iteration. In this study, we propose a non-iterative CQR (NICQR) estimation algorithm and derive the asymptotic distribution of the proposed estimator. Moreover, we extend the NICQR method to the analysis of massive datasets via a divide-and-conquer strategy. The proposed approach significantly reduces the computing time and the required primary memory. Simulation studies and two real data applications are conducted to illustrate the finite sample performance of the proposed methods.
Article
The existing literature of copula-based regression assumes that complete data are available, but this assumption is violated in many real applications. The present paper allows the regressand and regressors to be missing at random (MAR). We formulate a generalized regression model which unifies many prominent cases such as the conditional mean and quantile regressions. A semiparametric copula and the target regression curve are estimated via the calibration approach. The consistency and asymptotic normality of the estimated regression curve are proved. We show via Monte Carlo simulations that the proposed approach operates well in finite samples, while a benchmark equal-weight approach fails with substantial bias under MAR. An empirical application on revenues and R&D expenses of German manufacturing firms highlights a practical use of our approach.
Article
The high precipitation variability over North Africa presents a major challenge for the population and the infrastructure in the region. The last decades have seen many flood events caused by extreme precipitation in this area. There is a strong need to identify the most relevant atmospheric predictors to model these extreme events. In the present work, the effect of 14 different predictors calculated from NCEP-NCAR reanalysis, with daily to seasonal time steps, on the maximum annual precipitation (MAP) is evaluated at six coastal stations located in North Africa (Larache, Tangier, Melilla, Algiers, Tunis, and Gabes). The generalized extreme value (GEV) B-spline model was used to detect this influence. This model considers all continuous dependence forms (linear, quadratic, etc.) between the covariates and the variable of interest, thus providing a very flexible framework to evaluate the covariate effects on the GEV model parameters. Results show that no single set of covariates is valid for all stations. Overall, a strong dependence between the NCEP-NCAR predictors and MAP is detected, particularly with predictors describing large-scale circulation (geopotential height) or moisture (humidity). This study can therefore provide insights for developing extreme precipitation downscaling models that are tailored for North African conditions.
Article
We consider a new approach in quantile regression modeling based on the copula function that defines the dependence structure between the variables of interest. The key idea of this approach is to rewrite the characterization of a regression quantile in terms of a copula and marginal distributions. After the copula and the marginal distributions are estimated, the new estimator is obtained as the weighted quantile of the response variable in the model. The proposed conditional estimator has three main advantages: it applies to both iid and time series data, it is automatically monotonic across quantiles, and, unlike other copula-based methods, it can be directly applied to the multiple covariates case without introducing any extra complications. We show the asymptotic properties of our estimator when the copula is estimated by maximizing the pseudo-log-likelihood and the margins are estimated nonparametrically, including the case where the copula family is misspecified. We also present the finite sample performance of the estimator and illustrate the usefulness of our proposal by an application to the historical volatilities of Google and Yahoo.
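The weighted-quantile idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single covariate, a bivariate Gaussian copula whose correlation `rho` is taken as known (the abstract estimates the copula by pseudo-likelihood instead), and margins replaced by rescaled empirical distribution functions. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the quantile transform


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho) for u, v in (0, 1)."""
    a = _STD_NORMAL.inv_cdf(u)
    b = _STD_NORMAL.inv_cdf(v)
    s = 1.0 - rho * rho
    expo = -(rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * s)
    return math.exp(expo) / math.sqrt(s)


def pseudo_obs(sample):
    """Rescaled ranks rank / (n + 1), which stay strictly inside (0, 1)."""
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative normalized weight reaches tau."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for y, w in pairs:
        cum += w
        if cum >= tau * total:
            return y
    return pairs[-1][0]


def copula_conditional_quantile(y, x, x0, tau, rho):
    """Weighted tau-quantile of y, with weights c(F_Y(y_i), F_X(x0); rho).

    Assumption: rho is supplied by the caller rather than estimated.
    """
    n = len(y)
    u_y = pseudo_obs(y)
    # Empirical CDF of x at x0, shifted so that it never equals 0 or 1.
    v0 = (sum(1 for xi in x if xi <= x0) + 0.5) / (n + 1)
    weights = [gaussian_copula_density(u, v0, rho) for u in u_y]
    return weighted_quantile(y, weights, tau)


def simulate(n, rho, seed=0):
    """Draw n pairs from a bivariate normal with correlation rho."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rho * xi + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = simulate(2000, 0.8)
    for tau in (0.1, 0.5, 0.9):
        print(tau, copula_conditional_quantile(y, x, 1.0, tau, 0.8))
```

Because the same weights are reused for every level `tau`, the estimated quantiles in this sketch cannot cross, which matches the monotonicity property claimed in the abstract. In the simulated bivariate normal example, the conditional median at `x0 = 1.0` should be close to `rho * x0 = 0.8`.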
Article
The stationarity hypothesis is essential in hydrological frequency analysis and statistical inference. This assumption is often not fulfilled for large observed datasets, especially in the case of hydro-climatic variables. The Generalized Extreme Value distribution with covariates makes it possible to model data in the presence of non-stationarity and/or dependence on covariates. Linear and non-linear dependence structures have been proposed with the corresponding fitting approach. The objective of the present study is to develop the GEV model with B-Spline in a Bayesian framework. A Markov Chain Monte Carlo (MCMC) algorithm has been developed to estimate quantiles and their posterior distributions. The methods are tested and illustrated using simulated data and applied to meteorological data. Results indicate the better performance of the proposed Bayesian method for rainfall quantile estimation according to BIAS and RMSE criteria, especially for high return period events.
Article
Quantile regression, that is, the prediction of a random variable's quantiles conditioned on other random variables taking on certain values, has steadily gained importance in statistical modeling and financial applications. We introduce a new quantile regression method based on sequentially fitting a likelihood optimal D-vine copula to given data, resulting in highly flexible models with easily extractable conditional quantiles. As a subclass of regular vine copulas, D-vines enable the modeling of multivariate copulas in terms of bivariate building blocks, a so-called pair-copula construction (PCC). The proposed algorithm is fast and accurate even in high dimensions and incorporates an automatic variable selection. In a simulation study, the improved accuracy and saved computational time of the approach in comparison with established quantile regression methods are highlighted. An extensive financial application to international credit default swap (CDS) data including stress testing and Value at Risk (VaR) prediction demonstrates the usefulness of the proposed method.
Article
Bernstein estimators attracted considerable attention as smooth nonparametric estimators for distribution functions, densities, copulas and copula densities. The present paper adds a parallel result for the first-order derivative of a copula function. This result then leads to Bernstein estimators for a conditional distribution function and its important functionals such as the regression and quantile functions. Results of independent interest have been derived such as an almost sure oscillation behavior of the empirical copula process and a Bahadur-type almost sure asymptotic representation for the Bernstein estimator of a regression quantile function. Simulations demonstrate the good performance of the proposed estimators.
Article
Practical applications of nonparametric density estimators in more than three dimensions suffer a great deal from the well-known curse of dimensionality: convergence slows down as dimension increases. We show that one can evade the curse of dimensionality by assuming a simplified vine copula model for the dependence between variables. We formulate a general nonparametric estimator for such a model and show under high-level assumptions that the speed of convergence is independent of dimension. We further discuss a particular implementation for which we validate the high-level assumptions and establish asymptotic normality. Simulation experiments illustrate a large gain in finite sample performance when the simplifying assumption is at least approximately true. But even when it is severely violated, the vine copula based approach proves advantageous as soon as more than a few variables are involved. Lastly, we give an application of the estimator to a classification problem from astrophysics.