
Dependence modelling with regular vine copula models: A case-study for car crash simulation data



The analysis of car crash output parameters such as firewall intrusion points assists the overall engineering process. Such data are nowadays collected from many numerical simulations, and it is not possible for the engineer to analyse this growing amount of data by hand. Therefore, data mining and statistical methods are needed. Here, we propose to use the flexible class of regular vine (R-vine) copulas for modelling the dependence between such output variables. R-vine copulas are multivariate copulas constructed hierarchically from bivariate copulas as building blocks. We introduce the concept of such constructions and their graphical tree representation. Applied to simulated frontal crash data of a Ford Taurus, such graphs help us to illustrate the dependence structure among different firewall intrusion locations. The big advantage of R-vines compared with standard approaches such as the multivariate normal distribution or the multivariate Gaussian copula is the ability to model asymmetries and dependence in the tails. Our application demonstrates the strong potential of R-vines in the engineering context and opens further application areas.
Dependence modeling with R-vine copula models: A
case study for car crash simulation data
Ulf Schepsmeier
Lehrstuhl für Mathematische Statistik, Technische Universität München, Parkring 13,
85748 Garching-Hochbrück, Germany.
Claudia Czado
Lehrstuhl für Mathematische Statistik, Technische Universität München, Parkring 13,
85748 Garching-Hochbrück, Germany.
Summary. The analysis of car crash output parameters such as fire wall intrusion
points assists the overall engineering process. Such data are nowadays collected
from many numerical simulations and it is not possible for the engineer to analyze the
growing amount of data by hand. Therefore, data mining and statistical methods are
needed. Here, we propose the flexible class of regular vine (R-vine) copula models
for modeling the dependence among such output variables.
R-vine copulas are multivariate copulas constructed hierarchically from bivariate cop-
ulas as building blocks. We introduce the concept of such constructions and their
graphical tree representation. Applied to simulated frontal crash data of a Ford Taurus,
such graphs assist the engineering process by showing the dependence
structure among different fire wall intrusion locations. The big advantage of
R-vines compared to standard approaches like the multivariate normal distribution or
the multivariate Gaussian copula is the ability to model different tail dependencies.
Our example demonstrates the strong potential of R-vines in the engineering context
and opens further application areas.
Keywords: car crash simulation, dependence modeling, FEM model, R-vine
2Schepsmeier and Czado
1. Introduction
Preventing fatal injuries to car drivers in a crash and virtual product development
are probably the main research and innovation areas in the automotive industry.
Nowadays, research is based on hundreds of computer simulation runs with
(stochastically) varied input parameters such as speed or the thickness of different plates.
Finite element models (FEM; see for example Chaskalovic, 2008) are essential tools
to analyze the influence of design parameters on the functional properties of the car
structure. Since CPU costs are falling exponentially while FEM crash model sizes
are increasing, the amount of data is growing fast (Blumhardt, 2001). Therefore,
data mining and statistical methods are increasingly in the focus of engineers
(Mei and Thole, 2008). Here we introduce and apply a new statistical model class
to analyze such high-dimensional data in the engineering context.
Regular vine (R-vine) copula models are a flexible class of multivariate copulas and
are especially suited for dependence modeling in high dimensions.
One aspect of the analysis concerns so-called output parameters such as the Head
Injury Criterion (HIC) index or the fire wall intrusion. The fire wall is a group of
metal plates between the engine and the passenger space, shielding the car passengers
from noise, smell, emissions and, most importantly, from fire and deformed metal
pieces in case of a frontal crash. The fire wall intrusion is measured at different
FEM nodes and indicates the displacement of parts in a crash. Thus, the analysis
of the fire wall intrusion provides important information about the structural behavior
of the car during a crash and the safety of the passengers. It thereby significantly
assists the overall engineering process.
Such an evaluation is performed in this case study using R-vine copula models.
In general, copulas allow us to model univariate effects and joint dependence structures
separately. By Sklar's Theorem (Sklar, 1959) we can decompose a continuous
multivariate cumulative distribution function (cdf) $F(x_1,\dots,x_d)$ of random variables
$X_1,\dots,X_d$ with marginal cdfs $F_1(x_1),\dots,F_d(x_d)$ into a unique copula distribution
Dependence modeling with R-vines 3
function $C$ and the marginals, i.e.
\[ F(x_1,\dots,x_d) = C(F_1(x_1),\dots,F_d(x_d)). \tag{1} \]
The problem of standard multivariate copulas such as the multivariate Gaussian
copula is that they are less flexible in higher dimensions. In the bivariate case,
parametric copulas are very flexible and are able to model most of the data's
features, e.g. tail dependence and asymmetries. Pair-copula constructions (PCC)
use bivariate copulas as building blocks to decompose a multivariate
distribution function. Joe (1996) and later Bedford and Cooke (2001, 2002)
independently constructed multivariate densities using $d(d-1)/2$ bivariate copulas.
Bedford and Cooke (2001) introduced a set of nested trees to help organize
the decomposition and called the resulting graph structure a vine. Aas et al.
(2009) developed statistical inference for this construction, and their work forms
the basis for subsequent work. In particular, Czado et al. (2012) developed model
selection algorithms for a sub-class, while Dißmann et al. (2013) consider the
general class. Parsimony can be achieved through truncation (Brechmann et al., 2012),
and analytic expressions for standard errors of the parametric estimates have been derived
(Stöber and Schepsmeier, 2013). Surveys on estimation and model selection are also
available (Czado, 2010; Czado et al., 2013). The books by Kurowicka and Cooke
(2006) and Kurowicka and Joe (2010) provide details and further properties.
Because of their flexible and easy construction, R-vine copula models have become quite
popular in many fields. For example, Hanea et al. (2006) use them for Bayesian
belief nets. So far, however, the main application areas of vine copulas are
finance (Brechmann and Czado, 2013), agricultural science and electricity loads
(Smith et al., 2010). Applications in the engineering context are not known to
the authors. We will show that vine copula models are very useful for analyzing car
crash simulations, providing additional knowledge about model behavior with regard
to safety aspects. They improve the engineer's understanding of the car structure
and its behavior in a frontal crash.
In Section 2 we describe the fire wall intrusion data set extracted from a car crash
simulation of a Ford Taurus, while our proposed model, the R-vine, is introduced
in Section 3. The analysis applying R-vines and our results are given in Section
4. The final Section 5 summarizes our findings and suggests further research areas for
R-vine copula models.
2. Fire wall intrusion data of a Ford Taurus
We consider a frontal crash simulation of a Ford Taurus, a model from the National
Crash Analysis Center from 2001, with around 875,000 FE nodes and 300 time
steps. With LS-DYNA we computed 289
simulations of the whole crash, varying selected parameters. In Figure 1 the Taurus
is illustrated for time step $t = 300$, i.e. after the crash.
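With $n_1,\dots,n_M \in \mathbb{N}$ we denote the finite element nodes, whose positions in
three-dimensional space are ${}_r x_l^{t_i} \in \mathbb{R}^3$, where $r = 1,\dots,R = 289$ denotes
the simulation index and $t_i \in \{1,\dots,T = 300\}$ the current time step. The fire wall
intrusion is now defined as the displacements of the 23 finite element nodes listed
in Table A of the Appendix, minus the displacement of a fixed point in the back of
the car, i.e.
\[ {}_r d_l^{t_i,t_j} := \left({}_r x_l^{t_j} - {}_r x_l^{t_i}\right) - \left({}_r x_{\mathrm{fix}}^{t_j} - {}_r x_{\mathrm{fix}}^{t_i}\right), \qquad l = 1,\dots,23. \tag{2} \]
The fire wall intrusion points are illustrated in Figure 1, right panel. For each
intrusion point the displacement vector (2) can be calculated for each time step $t_i$
to $t_j$ and stored in a data matrix
\[ \begin{pmatrix}
\|{}_1 d_1^{t_i,t_j}\|_2 & \dots & \|{}_R d_1^{t_i,t_j}\|_2 \\
\|{}_1 d_2^{t_i,t_j}\|_2 & \dots & \|{}_R d_2^{t_i,t_j}\|_2 \\
\vdots & & \vdots \\
\|{}_1 d_M^{t_i,t_j}\|_2 & \dots & \|{}_R d_M^{t_i,t_j}\|_2
\end{pmatrix} \in \mathbb{R}^{23 \times 289} \tag{3} \]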
The data is provided by the Fraunhofer Institute (SCAI), Schloss Birlinghoven, 53754
Sankt Augustin, Germany, in a joint project supported by the German Minister for Educa-
tion and Research (BMBF).
Fig. 1. Ford Taurus from the right (left panel) for time step t= 300 and the fire wall with 23
marked intrusion points (right panel).
with $M = 23$ and $R = 289$. Here $\|x\|_2$ denotes the Euclidean norm of a vector $x$.
Keeping $t_i = 1$ fixed we get an array of dimension $23 \times 289 \times 300$ as our fire wall
intrusion data set. Preliminary analysis shows no significant displacement in time
up to $t_j = 149$; therefore we concentrate on the time horizon $t_j = 150,\dots,300$.
3. Regular vine copula model
Considering $X_1,\dots,X_d$ to be continuous random variables with marginal distributions
$F_1(x_1),\dots,F_d(x_d)$, it can be quite challenging to state their joint
cumulative distribution function (cdf) $F(x_1,\dots,x_d)$. Copulas are a popular
statistical tool to solve this problem. The famous Theorem of Sklar (Sklar,
1959) given in (1) decomposes the problem into modeling the margins and the
joint dependence structure modeled by the copula. In the bivariate case we have
$F(x_1,x_2) = C(F_1(x_1),F_2(x_2))$. If $C$ is twice differentiable, the bivariate copula
density is $c(u_1,u_2) = \frac{\partial^2 C(u_1,u_2)}{\partial u_1 \partial u_2}$, where
$(u_1 = F_1(x_1), u_2 = F_2(x_2))$ are so-called copula data on $[0,1]^2$. This allows us
to write $f(x_1,x_2) = c(F_1(x_1),F_2(x_2))\, f_1(x_1)\, f_2(x_2)$.
Classical one- or two-parameter copulas are the Gaussian or Student's t copula
arising from the elliptical copula family, or Archimedean copulas like the Clayton,
Gumbel or Frank copula (Joe, 1997).
A well-known dependence measure for copulas is the rank correlation
coefficient Kendall's $\tau$. Let $U_1$ and $U_2$ be uniformly distributed random variables on
$[0,1]$; then Kendall's $\tau$ is defined as
\[ \tau = 4 \int_0^1 \int_0^1 C(u_1,u_2)\, dC(u_1,u_2) - 1, \]
where $C(\cdot,\cdot)$ is the copula distribution function. For the Archimedean copulas a
closed form expression of Kendall's $\tau$ is available based on the copula specific
generator function (Embrechts et al., 2003), while for the Gauss and Student's t copula
the calculation is more involved (Frahm et al., 2003). An overview of Kendall's $\tau$
for different parametric copula families is given in Table A in the Appendix.
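To make the link between copula parameters and Kendall's $\tau$ concrete, the following sketch codes the closed-form relations for three common families. The formulas are the standard textbook ones rather than a transcription of Table A, and all function names are illustrative.

```python
import math


def tau_gaussian(rho):
    """Kendall's tau of a Gaussian (or Student's t) copula with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)


def tau_clayton(theta):
    """Kendall's tau of a Clayton copula with parameter theta > 0."""
    return theta / (theta + 2.0)


def tau_gumbel(theta):
    """Kendall's tau of a Gumbel copula with parameter theta >= 1."""
    return 1.0 - 1.0 / theta


def clayton_theta_from_tau(tau):
    """Invert the Clayton relation for 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)


def gaussian_rho_from_tau(tau):
    """Invert the Gaussian relation for -1 <= tau <= 1."""
    return math.sin(math.pi * tau / 2.0)


# The same tau corresponds to very different parameter values per family.
for tau in (0.2, 0.5, 0.8):
    print(tau, clayton_theta_from_tau(tau), gaussian_rho_from_tau(tau))
```

Inverting these relations is a common way to obtain starting values for the copula parameters from an empirical Kendall's $\tau$.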
Besides Kendall's $\tau$ there are several other dependence measures studied in the
literature, e.g. Spearman's $\rho$. But for dependence among extreme values the concept
of tail dependence is more adequate. Bivariate tail dependence
concerns the amount of dependence in the upper-quadrant tail or lower-quadrant
tail of a bivariate distribution (Joe, 1997). The upper tail dependence coefficient
$\lambda_U$ and lower tail dependence coefficient $\lambda_L$ are defined as
\[ \lambda_U = \lim_{u \to 1^-} \frac{1 - 2u + C(u,u)}{1 - u} \quad \text{and} \quad \lambda_L = \lim_{u \to 0^+} \frac{C(u,u)}{u}, \]
respectively. Both coefficients take values in $[0,1]$. Again, Table A in the Appendix
lists the tail behavior of some bivariate copula families.
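The two limits can be checked numerically by evaluating the ratios close to the boundary. The sketch below does this for the Clayton and Gumbel copulas on the diagonal. The diagonal formulas and the limiting values $2^{-1/\theta}$ and $2 - 2^{1/\theta}$ are standard results, while the function names and the evaluation points are arbitrary choices for illustration.

```python
def clayton_diag(u, theta):
    """Clayton copula on the diagonal, C(u, u), for theta > 0."""
    return (2.0 * u ** (-theta) - 1.0) ** (-1.0 / theta)


def gumbel_diag(u, theta):
    """Gumbel copula on the diagonal, C(u, u) = u ** (2 ** (1 / theta)), theta >= 1."""
    return u ** (2.0 ** (1.0 / theta))


def lower_tail(diag, theta, u=1e-6):
    """Approximate lambda_L by the ratio C(u, u) / u at a small u."""
    return diag(u, theta) / u


def upper_tail(diag, theta, u=1.0 - 1e-6):
    """Approximate lambda_U by (1 - 2u + C(u, u)) / (1 - u) at u close to 1."""
    return (1.0 - 2.0 * u + diag(u, theta)) / (1.0 - u)


theta = 2.0
print("Clayton lower:", lower_tail(clayton_diag, theta), "limit:", 2.0 ** (-1.0 / theta))
print("Gumbel upper: ", upper_tail(gumbel_diag, theta), "limit:", 2.0 - 2.0 ** (1.0 / theta))
print("Clayton upper:", upper_tail(clayton_diag, theta), "limit: 0")
```

The Clayton copula thus has lower but no upper tail dependence, and the Gumbel copula the reverse, which is the asymmetry an R-vine can assign pair by pair.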
Multivariate copulas extended from bivariate elliptical and Archimedean copulas
lack flexibility in higher dimensions, in particular in modeling different tail
dependencies for different pairs of variables. Vine copula models overcome this
problem. Using only bivariate (conditional) copulas as building blocks, vine copula
models are very flexible in capturing the underlying dependence and tail dependence
structure. First discussed by Joe (1996) and later by Bedford and Cooke (2001,
2002), a multivariate density is constructed from $d(d-1)/2$ bivariate (conditional)
copulas. This process is called a pair-copula construction (PCC) (Aas et al., 2009),
which we will illustrate in a 3-dimensional example. In the following we denote
by $f_{x|v}$ ($F_{x|v}$) the conditional density (distribution) function of $X$ given $V = v$.
Further, let $c_{xy;v}$ be the copula density associated with the bivariate conditional
distribution of $(X,Y)$ given $V = v$, while $c_{xy}$ denotes the bivariate copula density
corresponding to $(X,Y)$. In general $c_{xy;v}$ can depend on the conditioning values $v$.
To allow for tractable statistical inference we assume that $c_{xy;v}$ is independent of
$v$. This is called the simplifying assumption (see for example Stöber et al. (2013)
for details on this restriction).
Example 1 (3-dim pair-copula construction)
Let $X_1 \sim F_1$, $X_2 \sim F_2$ and $X_3 \sim F_3$. We can decompose the joint density as
\[ f(x_1,x_2,x_3) = f_{3|12}(x_3|x_1,x_2)\, f_{2|1}(x_2|x_1)\, f_1(x_1). \]
By Sklar's Theorem applied to bivariate densities it follows that
\[ \begin{aligned}
f_{2|1}(x_2|x_1) &= c_{12}(F_1(x_1),F_2(x_2))\, f_2(x_2), \\
f_{3|12}(x_3|x_1,x_2) &= c_{13;2}(F_{1|2}(x_1|x_2),F_{3|2}(x_3|x_2))\, f_{3|2}(x_3|x_2), \\
f_{3|2}(x_3|x_2) &= c_{23}(F_2(x_2),F_3(x_3))\, f_3(x_3).
\end{aligned} \]
Thus, we can represent the joint density as a product of pair-copulas and marginal
densities, i.e.
\[ \begin{aligned}
f(x_1,x_2,x_3) = {} & f_3(x_3)\, f_2(x_2)\, f_1(x_1)\, c_{12}(F_1(x_1),F_2(x_2))\, c_{23}(F_2(x_2),F_3(x_3)) \\
& \cdot c_{13;2}(F_{1|2}(x_1|x_2),F_{3|2}(x_3|x_2)),
\end{aligned} \]
where the arguments of the conditional pair-copula are of the form $F(x|v)$. For
every $v_j$ in the vector $v$ we can express $F(x|v)$ as
\[ F(x|v) = \frac{\partial C_{x,v_j;v_{-j}}\{F(x|v_{-j}), F(v_j|v_{-j})\}}{\partial F(v_j|v_{-j})} \]
with $v_{-j} = v \setminus \{v_j\}$ in the general case.
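The three-dimensional construction can be sketched in a few lines for the special case in which all three pair-copulas are Gaussian. This is an illustrative choice; the case study allows many other families. The closed-form Gaussian pair-copula density and its conditional cdf are standard results, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gauss_pair_density(u, v, rho):
    """Density c(u, v) of the bivariate Gaussian copula with correlation rho."""
    x, y = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    r2 = rho * rho
    exponent = -(r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2))
    return math.exp(exponent) / math.sqrt(1.0 - r2)


def gauss_h(u, v, rho):
    """Conditional cdf F(u | v) of the bivariate Gaussian copula."""
    x, y = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    return STD_NORMAL.cdf((x - rho * y) / math.sqrt(1.0 - rho * rho))


def vine_density_3d(u1, u2, u3, rho12, rho23, rho13_2):
    """Copula density c12 * c23 * c13;2 of the 3-dim construction on copula data."""
    c12 = gauss_pair_density(u1, u2, rho12)
    c23 = gauss_pair_density(u2, u3, rho23)
    # Arguments of the conditional pair-copula: F(u1 | u2) and F(u3 | u2).
    u1_given_2 = gauss_h(u1, u2, rho12)
    u3_given_2 = gauss_h(u3, u2, rho23)
    c13_2 = gauss_pair_density(u1_given_2, u3_given_2, rho13_2)
    return c12 * c23 * c13_2


print(vine_density_3d(0.3, 0.6, 0.8, rho12=0.5, rho23=-0.3, rho13_2=0.4))
```

Multiplying the result by the three marginal densities gives the joint density $f(x_1,x_2,x_3)$. Swapping `gauss_pair_density` and `gauss_h` for another family's density and conditional cdf on individual edges is what gives the construction its flexibility.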
Note that the construction is not unique. There is a huge number of possible
constructions (Morales-Nápoles, 2010). A set of nested trees $T_i = (V_i, E_i)$ is used
to illustrate and order all these possible constructions. Each edge in $E_i$
corresponds to a pair-copula in the PCC, while the nodes $V_i$ identify the pair-copula
arguments. This set of trees is called a vine (Bedford and Cooke, 2001). The vine
tree structure for the 3-dimensional example is illustrated in Figure 2.
T1:   1 ----(1,2)---- 2 ----(2,3)---- 3
T2:   1,2 ----(1,3|2)---- 2,3
Fig. 2. Tree structure of the 3-dimensional example.
In general, a nested set of trees is a regular vine if and only if the trees fulfill the
following conditions (Bedford and Cooke, 2001):
(a) $T_1$ is a tree with nodes $V_1 = \{1,\dots,d\}$ and edges $E_1$.
(b) For $i \ge 2$, $T_i$ is a tree with nodes $V_i = E_{i-1}$ and edges $E_i$.
(c) If two nodes in $T_{i+1}$ are joined by an edge, the corresponding edges in $T_i$ must
share a common node (proximity condition).
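Conditions (a) to (c) can be checked mechanically. The sketch below is one possible encoding, chosen only for illustration: each edge is a two-element `frozenset` of the nodes it joins, so the edges of one tree can serve directly as the nodes of the next. The function names are hypothetical.

```python
def is_tree(nodes, edges):
    """True if (nodes, edges) is connected and has exactly len(nodes) - 1 edges."""
    nodes = set(nodes)
    if len(edges) != len(nodes) - 1:
        return False
    if any(len(edge) != 2 or not edge <= nodes for edge in edges):
        return False
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first search from an arbitrary node
        current = stack.pop()
        for edge in edges:
            if current in edge:
                (other,) = edge - {current}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen == nodes


def is_regular_vine(d, trees):
    """Check conditions (a)-(c) for a list of d - 1 edge lists."""
    if len(trees) != d - 1:
        return False
    nodes = set(range(1, d + 1))  # condition (a): V_1 = {1, ..., d}
    for level, edges in enumerate(trees):
        if not is_tree(nodes, edges):
            return False
        if level > 0:
            # condition (c): the two joined edges of the previous tree share one node
            for edge in edges:
                a, b = tuple(edge)
                if len(a & b) != 1:
                    return False
        nodes = set(edges)  # condition (b): V_{i+1} = E_i
    return True


# The 3-dimensional example of Figure 2.
e12, e23 = frozenset({1, 2}), frozenset({2, 3})
print(is_regular_vine(3, [[e12, e23], [frozenset({e12, e23})]]))
```

For $d = 4$, joining the first-tree edges $\{1,2\}$ and $\{3,4\}$ in the second tree would violate the proximity condition, and the check returns `False`.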
A PCC is called a regular vine (R-vine) copula if all marginal densities are
uniform. We denote the R-vine structure given by the nested set of trees by $\mathcal{V}$.
Since every pair-copula in the composition can be selected independently from a set
of bivariate copula families, we get a set of pair-copulas $\mathcal{B}(\mathcal{V})$ with corresponding
parameter set $\theta(\mathcal{B}(\mathcal{V}))$. Thus the R-vine is fully specified by
$RV(\mathcal{V}, \mathcal{B}(\mathcal{V}), \theta(\mathcal{B}(\mathcal{V})))$.
For further details on vine copulas, their construction, estimation and properties we
refer to the work of Aas et al. (2009), Czado (2010), Chapter 5 of Mai and Scherer
(2012) and Dißmann et al. (2013).
To select the “best” fitting R-vine tree structure we follow the approach of
Dißmann et al. (2013), maximizing the sum of absolute Kendall's $\tau$ tree-wise by
a Maximum Spanning Tree (MST) algorithm. Here the main idea is to capture
most of the dependence in the first trees. Due to its sequential construction
by nested trees, the vine can be selected tree by tree. The copula family of each
pair-copula, represented by an edge, is chosen by the Akaike Information
Criterion (AIC). The associated copula parameters can be estimated either
sequentially or in a joint maximum likelihood (ML) approach (Aas et al., 2009).
Alternatively, a Bayesian approach can be followed (Smith et al., 2010; Min and Czado,
2010; Gruber and Czado, 2014).
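The first step of this tree-wise selection can be sketched as a maximum spanning tree over the matrix of absolute Kendall's $\tau$ values. The code below uses Prim's algorithm on a small made-up matrix. It covers the first tree only: later trees would additionally need Kendall's $\tau$ of the conditional copula data and candidate edges restricted by the proximity condition. The function name and the numbers are purely illustrative.

```python
def maximum_spanning_tree(weights):
    """Prim's algorithm for a symmetric d x d weight matrix (list of lists).

    Returns a list of (i, j) index pairs forming a spanning tree of maximal weight.
    """
    d = len(weights)
    in_tree = {0}
    edges = []
    while len(in_tree) < d:
        # Pick the heaviest edge leaving the current tree.
        best = max(
            ((i, j) for i in in_tree for j in range(d) if j not in in_tree),
            key=lambda edge: weights[edge[0]][edge[1]],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Hypothetical pairwise Kendall's tau values for four variables.
tau = [
    [1.0, 0.6, -0.2, 0.1],
    [0.6, 1.0, 0.5, -0.3],
    [-0.2, 0.5, 1.0, 0.4],
    [0.1, -0.3, 0.4, 1.0],
]
abs_tau = [[abs(value) for value in row] for row in tau]
print(maximum_spanning_tree(abs_tau))
```

Using absolute values means that strong negative dependence is treated like strong positive dependence when edges are chosen.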
4. Analysis and Results
Given the proposed fire wall intrusion data three modeling scenarios are possible.
First, each simulation run can be modeled separately given the data from time
t= 151,...,300. Further, each time point can be modeled separately given all
simulations, and third all time points and simulations can be modeled in a joint
approach. We will concentrate on the second one to characterize the dynamic
behavior. Further, in an additional analysis we will first reduce the dimensionality
of the data set using Principal Component Analysis (PCA) and apply our R-vine
copula models to this processed data.
4.1. R-vine models for separate time steps
Keeping the time fixed, one data set to be analyzed is of dimension $289 \times 23$, given
289 simulation runs for the 23 fire wall intrusion points. Assuming independently
generated simulation runs, we use the empirical distribution function as an estimate
of the marginal distribution function and define copula data $u_1,\dots,u_{23}$ using
the corresponding probability integral transform for each component. Fitting an
R-vine copula model $(\mathcal{V}_t, \mathcal{B}_t(\mathcal{V}_t), \theta_t(\mathcal{B}_t(\mathcal{V}_t)))$, $t = 151,\dots,300$, to each of the 150
data sets, with the fitted models denoted as
$(\hat{\mathcal{V}}_t, \hat{\mathcal{B}}_t(\hat{\mathcal{V}}_t), \hat{\theta}_t(\hat{\mathcal{B}}_t(\hat{\mathcal{V}}_t)))$, we get a dynamic picture of the
underlying dependence structure by extracting the joint distribution of the fire wall
intrusion points during the frontal car crash. For the selection of the pair-copula
families $\mathcal{B}_t(\mathcal{V}_t)$ we allowed the elliptical bivariate Gaussian and t-copula, as well
as the Archimedean bivariate Clayton, Gumbel, Frank and Joe copulas and their
rotated versions to fit negative dependencies. The selection of $\mathcal{V}_t$ and $\mathcal{B}_t(\mathcal{V}_t)$ and
the estimation of $\theta_t(\mathcal{B}_t(\mathcal{V}_t))$ were performed with the R package VineCopula of
Schepsmeier et al. (2012).
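The transformation to copula data can be sketched as a rank transform of each column. Dividing the ranks by $n + 1$ instead of $n$ is an assumption made here to keep the values strictly inside $(0,1)$; the text above only states that the empirical distribution function is used. The function name is illustrative, and ties are assumed not to occur.

```python
def pseudo_observations(column):
    """Map a sample to (0, 1) via rank / (n + 1); assumes there are no ties."""
    n = len(column)
    order = sorted(range(n), key=lambda index: column[index])
    u = [0.0] * n
    for rank, index in enumerate(order, start=1):
        u[index] = rank / (n + 1)
    return u


# One hypothetical intrusion point observed in five simulation runs.
displacements = [12.4, 9.8, 15.1, 11.0, 13.7]
print(pseudo_observations(displacements))
```

Applying this to each of the 23 columns of a $289 \times 23$ data set would give the copula data to which the R-vine is fitted.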
In the following we analyze the estimated R-vine copula models in detail
and justify why they are preferable to much simpler but limited models such
as the multivariate Gauss copula. The multivariate normal distribution and the
multivariate Gauss copula are frequently used models for multivariate data, assuming
a linear dependence structure and no tail dependencies. The multivariate Gauss
copula can be represented as any R-vine with Gaussian pair-copulas, where the
parameters are determined by the associated partial correlations.
Log-likelihood, AIC and BIC: The log-likelihood of the estimated R-vine
copula models for each time point, illustrated in Figure 3, is one commonly used
measure of goodness of fit. The AIC and BIC are classical model comparison
measures that take the model complexity into account. The log-likelihoods of the R-vine
models are higher than those achieved by the Gaussian copula by up to 150 points
over time. Further, due to the significantly smaller number of model parameters in
the R-vine, on average 150 model parameters against $23 \cdot 22/2 = 253$ in the
multivariate Gauss copula, the corresponding AIC and BIC of the R-vine copula models
are much better. On average the R-vine copula outperforms the multivariate Gauss
copula by 260 AIC points. The maximum difference in the AIC is 520. Similarly large
differences can be seen in a BIC-based comparison.
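As a small worked example of how the model complexity enters, the sketch below applies the standard definitions $\mathrm{AIC} = -2\ell + 2k$ and $\mathrm{BIC} = -2\ell + k \ln n$ to the rounded log-likelihood and parameter count reported for the first PCA component in Table 2. Taking the number of simulation runs, $n = 289$, as the sample size is an assumption made here.

```python
import math


def aic(loglik, n_params):
    """Akaike Information Criterion: -2 * loglik + 2 * n_params."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: -2 * loglik + log(n_obs) * n_params."""
    return -2.0 * loglik + math.log(n_obs) * n_params


def gauss_copula_parameters(d):
    """Number of correlation parameters of a d-dimensional Gauss copula."""
    return d * (d - 1) // 2


# Rounded values reported for PCA component 1 in Table 2.
print(aic(8830, 159))
print(bic(8830, 159, 289))
print(gauss_copula_parameters(23))
```

The AIC matches the tabulated value of -17342, and the BIC agrees with the tabulated -16760 up to the rounding of the log-likelihood.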
Dependence strength modeling: Another aspect of our analysis is the dependence
behavior over time. A single measure of dependence at each time point could
be the relative sum of absolute pair-wise Kendall's $\tau$ values, aggregating the estimated
bivariate dependence, i.e.
\[ \bar{\tau}_t := \frac{1}{\binom{23}{2}} \sum_{i=1}^{22} \sum_{j=i+1}^{23} |\hat{\tau}^t_{i,j}|, \]
where $\hat{\tau}^t_{i,j}$ is the empirical Kendall's $\tau$ value based on the observed data
$(u^t_i, u^t_j)$ at time point $t$. We also plotted $\bar{\tau}_t$ in Figure 3 (top row) and find that
the log-likelihood follows $\bar{\tau}_t$, thus capturing the time-dependent overall dependence.
Thus the proposed R-vines are able to model the mean dependence. But much
simpler models such as the multivariate Gauss model achieve this too (see dotted
line in Figure 3 (top row)).
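A minimal sketch of this aggregate measure is given below. It assumes that "relative sum" means the average over all variable pairs, in line with the caption of Figure 3, and it uses a simple $O(n^2)$ estimator of Kendall's $\tau$ that assumes no ties. The function names and toy data are illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Empirical Kendall's tau for two samples of equal length without ties."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def mean_abs_tau(columns):
    """Average of |tau_ij| over all pairs i < j of the given columns."""
    pairs = list(combinations(range(len(columns)), 2))
    total = sum(abs(kendall_tau(columns[i], columns[j])) for i, j in pairs)
    return total / len(pairs)


# Three toy variables observed in four runs.
columns = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 1, 4, 3]]
print(mean_abs_tau(columns))
```

Evaluating `mean_abs_tau` on the 23 columns at every time step would yield a curve of the kind shown in the top row of Figure 3.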
Tail dependence: R-vines are particularly well suited to account for tail
dependence of different pairs of variables. This can be seen from the percentage of
non-Gaussian pair-copulas (Figure 4) over time. About 20% of the selected
pair-copulas have either only lower or only upper tail dependence, such as the Clayton,
Gumbel or Joe copula (tail asymmetric copulas). A further 3-5% have upper and lower tail
dependence modeled with a Student's t-copula.
Furthermore, the tail dependence is significant since the tail dependence
coefficient calculated from the estimated parameters is on average always greater than 0.15.
The maximum tail dependence per time step is between 0.7 and 0.95, and
especially high for the last time steps ($t > 225$). An analysis of the degrees-of-freedom
parameters of the chosen Student's t copulas reveals a similar picture. The estimated
degrees-of-freedom parameter $\nu$ is always smaller than 20. A high $\nu$ indicates
weak tail dependence and convergence towards the Gauss copula, which is the
limiting case for $\nu \to \infty$.
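The role of $\nu$ can be made explicit with the standard expression for the tail dependence coefficient of a bivariate Student's t copula with correlation $\rho$ and $\nu$ degrees of freedom. The formula is quoted from the general copula literature rather than derived in this case study; $t_{\nu+1}$ denotes the cdf of the univariate Student's t distribution with $\nu + 1$ degrees of freedom.

```latex
\[
  \lambda_L = \lambda_U
  = 2\, t_{\nu+1}\!\left( -\sqrt{\nu + 1}\, \sqrt{\frac{1 - \rho}{1 + \rho}} \right)
\]
```

For fixed $\rho < 1$ the argument tends to $-\infty$ as $\nu \to \infty$, so the coefficient tends to zero, which agrees with the absence of tail dependence in the Gauss copula.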
In Figure 4 (bottom row) we have a closer look at the very important first R-vine
tree. The ratio of Gaussian to non-Gaussian pair-copulas remains the same
as in Figure 4 (top row), which considers all trees. However, a somewhat higher volatility
is observed in the first tree compared to all trees overall. No independence copulas
are chosen in the first R-vine trees.
Dependence structure: The vine tree structure visualizes the dependencies
between the 23 intrusion points. In Figure 5 we can clearly recognize how the
connections change as the crash progresses. Some of the edges of the first trees do
not vary at all, while others are only chosen at one time point. It is quite interesting
that between time step $t = 200$ and $t = 250$ only one edge, i.e. pair-copula, differs.
When we analyze all R-vine copula models and in particular their first tree structure
[Figure 3: plots over time steps 150 to 300; axis labels: time step, rel. sum of Kendall's tau, # parameter.]
Fig. 3. Top row: average absolute empirical Kendall's $\tau$ and the log-likelihood; bottom row:
Akaike Information Criterion (AIC) and number of parameters (R-vine) of the fitted R-vine
copula and multivariate Gauss copula models over time.
[Figure 4: four panels over time steps 150 to 300; vertical axis: percentage of used pair-copulas; panel title: Gaussian vs. non-Gaussian; legend entries: independence copula, tail asymmetric copulas.]
Fig. 4. Top row: all trees; bottom row: only the first tree; left column: percentage of
Gaussian and non-Gaussian pair-copulas selected; right column: composition of the
non-Gaussian case.
we can recognize great similarities in the selected edges. The estimated probability
of appearance of a specific edge is illustrated in Figure 6. Here the thickness of an
edge corresponds to its probability of appearance. Only 42 of 253 possible edges are
selected at all. Nine of them appear in all 150 R-vine tree structures, while 17 still
appear with a probability greater than 75%. More than half of all used edges (22
of 42) appear in more than 75 of 150 cases. As an interpretation we can conclude that
the 23 intrusion points deform more or less in parallel over time, in particular between
time step $t = 200$ and $t = 250$.
Table 1. Percentage of Gaussian, Student's t and tail asymmetric copulas of the 9 most selected edges in the
edge      Gauss   Student's t   asym. copula
(22,23)   78      3             20
(8,22)    84      10            6
(1,9)     58      20            22
(10,17)   70      3             27
(2,3)     87      11            3
(3,4)     94      4             1
(11,12)   78      16            5
(5,13)    88      12            0
(7,15)    97      3             0
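Appearance probabilities of this kind can be obtained by counting how often each edge occurs across the fitted first trees. The sketch below assumes each first tree is available as a list of node pairs; the four tiny trees are made up for illustration and do not reproduce the frequencies of the case study.

```python
from collections import Counter


def edge_frequencies(first_trees):
    """Relative frequency of each undirected edge across a list of first trees."""
    counts = Counter()
    for edges in first_trees:
        # Sort each pair so that (23, 22) and (22, 23) count as the same edge.
        counts.update({tuple(sorted(edge)) for edge in edges})
    total = len(first_trees)
    return {edge: count / total for edge, count in counts.items()}


# Hypothetical first trees for four time steps (three edges each).
trees = [
    [(22, 23), (8, 22), (1, 9)],
    [(23, 22), (8, 22), (2, 3)],
    [(22, 23), (22, 8), (1, 9)],
    [(22, 23), (8, 22), (3, 4)],
]
for edge, frequency in sorted(edge_frequencies(trees).items()):
    print(edge, frequency)
```

Edges with frequency 1 correspond to the connections that never change during the crash, while low-frequency edges mark the time-varying part of the structure.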
Furthermore, we can recognize that almost every intrusion location has its strongest
dependencies with locations in its closest neighborhood. The locations 21 and 23
are not connected since they are not on the same metal plate and thus differ in their
z-coordinates, which cannot be seen in the two-dimensional plots. But their
physical Euclidean distance would be the smallest among the neighbors of location 23.
The same is true for the locations 19 and 22.
Table 1 lists the percentage of the selected copula families on the nine most
frequently chosen edges. In particular, the edges (22,23), (1,9) and (10,17) have a high
percentage of tail asymmetric copulas. Thus, these points and their correlation may
be of higher interest to the engineer due to their clearly non-Gaussian behavior.
The time points $t = 211$ and $t = 264$ are chosen due to their values of the relative sum of
empirical Kendall's $\tau$ ($\bar{\tau}_t$). We highlighted two local maxima and two local
minima in Figure 3. The first peak is at $t = 211$, the second peak is at $t = 224$, while
the local minimum between them is at $t = 219$. Interestingly, the R-vine structure of
the first tree is the same for all three time points (see Figure 5, top right panel). The
local minimum of $\bar{\tau}_t$ between $t = 200,\dots,300$ is the last chosen point ($t = 264$).
Fig. 5. Fire wall intrusion points modeled by an R-vine. First R-vine tree plot of $t = 150$ (top
left), $t = 211$ ("first peak", top right), $t = 264$ ("local minimum", bottom left) and $t = 300$
(bottom right).
Fig. 6. Selected edges in the first R-vine tree for all 150 R-vines. The size of an edge
indicates its appearance probability.
4.2. R-vine models for PCA reduced data
Principal Component Analysis (PCA) is the standard dimensionality reduction
method in the car engineering context to identify the intrinsic geometry of car crash
data or to reveal bifurcations in the time-dependent behavior of beams (Bohn et al.,
2013). Here, we apply this easy but powerful method to each of the 23 matrices of
the fire wall intrusion point array defined at the end of Section 2. Each matrix
represents the information for one intrusion point given the 289 simulation runs and the
last 150 time steps ($t = 151,\dots,300$). The first three principal components explain
96% of the data's variability (averaged over the 23 locations: first component: 79%,
first two components: 92%, first three components: 96%) and are thus sufficient
to identify the underlying structure. Thereby, we can reduce the dimension from
$150 \times 289$ (or, considering all 300 time steps, $300 \times 289$) to $3 \times 289$ for each of the
23 intrusion points. Thus, the analysis effort reduces from 300 models (or 150 models
when only the last 150 time steps are considered, as in the previous subsection)
to only three models.
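Before we come to the analysis of the PCA reduced data, we introduce some notation
and background on PCA. Let $X \in \mathbb{R}^{n \times m}$ be our (standardized) data matrix and
$R := X^T X \in \mathbb{R}^{m \times m}$ its empirical correlation matrix. Further, let
$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m$ be the ordered eigenvalues of $R$ with their corresponding
orthonormal eigenvectors $T := (v_1,\dots,v_m) \in \mathbb{R}^{m \times m}$. Then $R\,T = T\,\Lambda$ with
$\Lambda := \mathrm{diag}(\lambda_1,\dots,\lambda_m)$, and $H := X\,T$ leads to the principal component decomposition
\[ X = H\,T^T = F\,L^T, \]
where $F := H\Lambda^{-1/2} \in \mathbb{R}^{n \times m}$ are the principal components and $L := T\Lambda^{1/2}$
the loading matrix (Fahrmeir et al., 1996).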
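The share of variability explained by the leading components follows from the eigenvalues of the correlation matrix. The sketch below is a dependency-free illustration that uses power iteration with deflation; it is not the routine used for the analysis, and in practice a linear algebra library would be used. The function names, the fixed random seed and the toy data are assumptions.

```python
import math
import random


def correlation_matrix(rows):
    """Empirical correlation matrix of a data matrix given as a list of rows."""
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in rows) / n) for j in range(m)]
    z = [[(row[j] - means[j]) / sds[j] for j in range(m)] for row in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(m)] for a in range(m)]


def leading_eigenvalues(matrix, k, iterations=1000, seed=0):
    """Approximate the k largest eigenvalues of a symmetric PSD matrix."""
    rng = random.Random(seed)
    m = len(matrix)
    a = [row[:] for row in matrix]
    values = []
    for _component in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]  # random start vector
        value = 0.0
        for _step in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining matrix is numerically zero
                value = 0.0
                break
            v = [x / norm for x in w]
            value = norm
        values.append(value)
        for i in range(m):  # deflation: remove the component just found
            for j in range(m):
                a[i][j] -= value * v[i] * v[j]
    return values


def explained_variance(rows, k):
    """Cumulative share of total variance explained by the first k components."""
    eigenvalues = leading_eigenvalues(correlation_matrix(rows), k)
    total = len(rows[0])  # trace of a correlation matrix
    return [sum(eigenvalues[: i + 1]) / total for i in range(k)]


# Toy data: the second column is an exact multiple of the first.
print(explained_variance([[1, 2], [2, 4], [3, 6], [4, 8]], 2))
```

In the setting described above, `rows` would hold the 289 simulation runs of one intrusion point and the columns its 150 time steps.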
As before we assume independent entries of each PCA component vector
$f_m^{PCA_j} = ({}_1 f_m^{PCA_j},\dots,{}_R f_m^{PCA_j})^T$, $j \in \{1,\dots,150\}$, $m \in \{1,\dots,M = 23\}$,
$R = 289$ (principal component matrix $F_m = (f_m^{PCA_1},\dots,f_m^{PCA_{150}})$), inherited from
independently generated simulation runs. We select the first three components.
Again we use the empirical distribution function for the marginal modeling, resulting
in copula data $u_1^{PCA_j},\dots,u_{23}^{PCA_j}$, $j \in \{1,2,3\}$, and fit an R-vine copula model
$(\mathcal{V}_j, \mathcal{B}_j(\mathcal{V}_j), \theta_j(\mathcal{B}_j(\mathcal{V}_j)))$ to each of the three new data sets.
Table 2. Log-likelihood ($\ell$), AIC, BIC, number of parameters (#par), percentage of
non-Gaussian (%NG), t (%t), independence (%I) and tail-dependence (%Tail) pair-copulas
selected of the three R-vine copula models applied to the PCA reduced data
sets (top) and the min, mean and max of their analogs given the full data set (bottom).
PCA    log-lik   AIC      BIC      #par   %NG   %t   %I   %Tail
1      8830      -17342   -16760   159    75    4    41   19
2      11030     -21671   -20957   195    73    8    31   20
3      8704      -17072   -16456   168    79    5    38   22
R-vine models for t = 151,...,300
min    7367      -17610   -17020   132    68    1    37   11
mean   8259      -16210   -15660   151    73    3    43   16
max    8966      -14450   -13930   168    80    5    50   22
Results: The quantitative results including the log-likelihood ($\ell$), AIC, BIC,
number of parameters (#par) and the composition of the selected bivariate copula
families are given in Table 2. All these results are comparable to the results of
the full data set analyzed before due to the same dimensionality of the (processed)
data sets. In particular, the log-likelihoods as well as the relative sums of Kendall's
$\tau$ values ($\bar{\tau}_{PCA_1} = 0.31$, $\bar{\tau}_{PCA_2} = 0.56$, $\bar{\tau}_{PCA_3} = 0.47$) fit in the range of their
analogs given in Figure 3, except for $PCA_2$, which has a higher mean sum of Kendall's
$\tau$. Considering the log-likelihood, AIC and BIC, again $PCA_2$ with $\ell = 11030$ lies
beyond the maximum of the log-likelihoods given the full data set ($\ell = 8966$).
Consequently its AIC and BIC are lower than their counterparts. The tail
asymmetry is comparably pronounced as in the full data set. Thus, R-vine copula
models are preferable to multivariate Gaussian copulas here, too. Although there are
some minor departures from the original data results, the three PCA based models
can represent the underlying structure and properties of the fire wall intrusion data.
Furthermore, the most interesting dependence structure among the fire wall
intrusion points is still recognizable in the R-vine tree structures of the three PCA
based R-vine copula models. As analyzed before, only 42 of 253 possible edges in
the first tree are selected at all in the full data set. The three PCA based models
need only 33.
5. Summary and future work
The analysis of the fire wall intrusion points revealed several interesting findings
which are very helpful for understanding the car structure and its behavior in
a frontal crash. The engineer can draw important conclusions for the improvement
of passenger safety or car design.
First of all, we recognized that the overall dependence of the fire wall intrusion
points varies over time. Furthermore, the relative sum of absolute Kendall's $\tau$
estimates reveals a time-dependent structure. The special R-vine construction allowed
us to detect more details. As expected, many of the pair-wise dependencies are of an
elliptical shape. But plenty of them show tail dependence, violating the multivariate
normality assumption. Simpler models such as the multivariate Gauss copula cannot
model such behavior. The vine copula is much more flexible, since it allows
for different bivariate dependence structures.
Additionally, the R-vine tree structure allows us to investigate the whole
dependence structure over time. It turned out that the dependence structure varies
during the frontal crash, but not much, since only few edges of the first R-vine tree
structure change. Furthermore, the tree structure identifies which pairs of intrusion
points are most strongly related.
If the engineer is only interested in a rough picture of a crash or wants to identify
the intrinsic geometry of a car crash, the combination of PCA and R-vine copula
models is very helpful. Here, the R-vine tree structure can reveal the dependence
structure as well. Furthermore, all the major dependency properties characterized
in the full analysis of the fire wall intrusion data set can be found in the reduced
PCA data, too.
This shows that the vine copula framework is useful to analyze the dynamic
behavior of crash simulation data. It is expected that for side crashes the vine
methodology will be even more successful than a Gaussian dependence model. The
R-vine tree structures show clear spatial features and approaches as proposed by
Erhardt et al. (2014) will be good starting points to capture this behavior. Since
these approaches were formulated in a 2D context, the extension to 3D will be nec-
essary in addition to incorporating the plate structures of the car. These extensions
will be part of future research.
The model used was developed by the National Crash Analysis Center (NCAC)
of The George Washington University under a contract with the FHWA and NHTSA
of the US DOT (
Aas, K., C. Czado, A. Frigessi, and H. Bakken (2009). Pair-copula construction of
multiple dependence. Insurance: Mathematics and Economics 44, 182–198.
Bedford, T. and R. Cooke (2001). Probability density decomposition for condition-
ally dependent random variables modeled by vines. Ann. Math. Artif. Intell. 32,
Bedford, T. and R. Cooke (2002). Vines - a new graphical model for dependent
random variables. Annals of Statistics 30, 1031–1068.
Blumhardt, R. (2001). FEM - crash simulation and optimisation. International
Journal of Vehicle Design 26 (4), 331–347.
Bohn, B., J. Garcke, R. Iza Teran, A. Paprotny, B. Peherstorfer, U. Schepsmeier,
and C.-A. Thole (2013). Analysis of car crash simulation data with nonlinear ma-
chine learning methods. In Proceedings of the ICCS 2013, Volume 18 of Procedia
Computer Science, pp. 621–630. Elsevier.
Brechmann, E. and C. Czado (2013). Risk Management with High-Dimensional
Vine Copulas: An Analysis of the Euro Stoxx 50. Statistics & Risk Model-
ing 30 (4), 307–342.
Brechmann, E., C. Czado, and K. Aas (2012). Truncated regular vines in high
dimensions with applications to financial data. Canadian Journal of Statis-
tics 40 (1), 68–85.
Chaskalovic, J. (2008). Finite Element Methods for Engineering Sciences: Theoret-
ical Approach and Problem Solving Techniques. Springer.
Czado, C. (2010). Pair-Copula Constructions of Multivariate Copulas. In
P. Jaworski, F. Durante, W. Härdle, and T. Rychlik (Eds.), Copula Theory and Its
Application, Number 198 in Lecture Notes in Statistics, pp. 93–109. Springer.
Czado, C., E. Brechmann, and L. Gruber (2013). Selection of vine copulas. In
P. Jaworski, F. Durante, and W. Härdle (Eds.), Copulae in Mathematical and
Quantitative Finance: Proceedings of the Workshop Held in Cracow, 10-11 July
2012, Number 213 in Lecture Notes in Statistics. Springer.
Czado, C., U. Schepsmeier, and A. Min (2012). Maximum likelihood estimation of
mixed C-vines with application to exchange rates. Statistical Modelling 12 (3),
Dißmann, J., E. Brechmann, C. Czado, and D. Kurowicka (2013). Selecting and es-
timating regular vine copulae and application to financial returns. Computational
Statistics and Data Analysis 59 (0), 52–69.
Embrechts, P., F. Lindskog, and A. McNeil (2003). Modelling dependence with
copulas and applications to risk management. In S. Rachev (Ed.), Handbook of
heavy tailed distributions in finance, pp. 329–384. Amsterdam: Elsevier/North-
Erhardt, T., C. Czado, and U. Schepsmeier (2014). R-vine models for spatial time
series with an application to daily mean temperature. submitted for publication.
Fahrmeir, L., W. Brachinger, A. Hamerle, and G. Tutz (1996). Multivariate statis-
tische verfahren. Walter de Gruyter.
Frahm, G., M. Junker, and A. Szimayer (2003). Elliptical copulas: applicability
and limitations. Stat. Probab. Lett. 63 (3), 275–286.
Gruber, L. and C. Czado (2014). Sequential bayesian model selection of regular
vine copulas. submitted for publication.
Hanea, A. M., D. Kurowicka, and R. M. Cooke (2006). Hybrid method for quan-
tifying and analyzing bayesian belief nets. Quality and Reliability Engineering
International 22 (6), 709–729.
Joe, H. (1996). Families of m-variate distributions with given margins and m(m-1)/2
bivariate dependence parameters. In L. Rüschendorf, B. Schweizer, and M. D.
Taylor (Eds.), Distributions with Fixed Marginals and Related Topics, Volume 28,
Hayward, CA, pp. 120–141. Inst. Math. Statist.
Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman & Hall,
Kurowicka, D. and R. Cooke (2006). Uncertainty Analysis with High Dimensional
Dependence Modelling. John Wiley & Sons Ltd, Chichester.
Kurowicka, D. and H. Joe (2010). Dependence Modeling: Vine Copula Handbook.
World Scientific Publishing Company, Incorporated.
Mai, J.-F. and M. Scherer (2012). Simulating Copulas: Stochastic Models, Sampling
Algorithms, and Applications. Singapore: World Scientific Publishing Co.
Mei, L. and C. Thole (2008). Data analysis for parallel car-crash simulation results
and model optimization. Simulation Modelling Practice and Theory 16 (3), 329 –
Min, A. and C. Czado (2010). Bayesian inference for multivariate copulas using
pair-copula constructions. Journal of Financial Econometrics 8 (4), 511–546.
Morales-Nápoles, O. (2010). Counting Vines. In D. Kurowicka and H. Joe (Eds.),
Dependence Modeling-Handbook on Vine Copulas, Singapore, pp. 189–218. World
Scientific Publishing.
Schepsmeier, U., J. Stoeber, and E. C. Brechmann (2012). VineCopula: Statistical
inference of vine copulas. R package version 1.0.
Sklar, M. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ.
Inst. Statist. Univ. Paris 8, 229–231.
Smith, M., A. Min, C. Almeida, and C. Czado (2010). Modelling longitudinal
data using a pair-copula decomposition of serial dependence. The Journal of the
American Statistical Association 105 (492), 1467–1479.
Stöber, J., C. Czado, and H. Joe (2013). Simplified pair copula constructions -
limitations and extensions. Journal of Multivariate Analysis 119, 101–118.
Stöber, J. and U. Schepsmeier (2013). Estimating standard errors in regular vine
copula models. Computational Statistics 28 (6), 2679–2707.
Table 3. Coordinates of the 23 fire wall intrusion points.
nr.  node-id        x       y      z    nr.  node-id        x       y      z
  1  3184699  -1544.4   578.9  239.7     13  3188401  -1329.7  -191.2  400.5
  2  3183912  -1520.7   489.5  223.1     14  3188442  -1330.5  -423.6  401.2
  3  3183367  -1526.0   312.1  222.7     15  3179751  -1426.6  -511.1  372.7
  4  3182514  -1522.8   146.5  227.6     16  3263455  -1649.4  -735.5  371.2
  5  3180860  -1530.4  -185.4  239.0     17  3186460  -1311.8   475.1  537.5
  6  3180230  -1543.1  -339.0  229.1     18  3186733  -1312.3   345.8  501.2
  7  3179978  -1537.0  -479.6  224.6     19  3186555  -1310.8   469.5  688.9
  8  3156178  -1649.7   737.0  366.8     20  3187573  -1297.5   223.7  613.5
  9  3184686  -1486.9   571.2  344.5     21  3190789  -1280.1  -330.6  610.0
 10  3186626  -1330.6   426.4  389.2     22  3405632  -1656.5   336.3  828.9
 11  3187198  -1329.9   161.9  401.0     23  3405257  -1698.2  -298.4  740.0
 12  3187318  -1331.3     0.9  386.2
A. Appendix
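Table 4. Properties of parametric bivariate copula families. Tail dependence is
given as a single coefficient where the lower and upper coefficients coincide,
and as the pair (lower, upper) otherwise.
Copula      Parameter range                          Kendall's τ                                                                  Tail dependence
Gaussian    $\rho \in (-1,1)$                        $\frac{2}{\pi}\arcsin(\rho)$                                                 $0$
Student-t   $\rho \in (-1,1),\ \nu > 1$              $\frac{2}{\pi}\arcsin(\rho)$                                                 $2\,t_{\nu+1}\left(-\sqrt{\nu+1}\,\sqrt{\frac{1-\rho}{1+\rho}}\right)$
Clayton     $\theta > 0$                             $\frac{\theta}{\theta+2}$                                                    $(2^{-1/\theta},\ 0)$
Gumbel      $\theta \ge 1$                           $1-\frac{1}{\theta}$                                                         $(0,\ 2-2^{1/\theta})$
Frank (a)   $\theta \in \mathbb{R}\setminus\{0\}$    $1-\frac{4}{\theta}+\frac{4\,D_1(\theta)}{\theta}$                           $(0,\ 0)$
Joe (b)     $\theta > 1$                             $\frac{2\theta-4+2\gamma+2\log 2+\Psi(1/\theta)+\Psi\left(\frac{2+\theta}{2\theta}\right)}{\theta-2}$   $(0,\ 2-2^{1/\theta})$
(a) $D_1(\theta) = \int_0^{\theta} \frac{x/\theta}{\exp(x)-1}\,dx$ (Debye function)
(b) $\gamma = \lim_{n\to\infty}\left(\sum_{i=1}^{n}\frac{1}{i}-\log n\right) \approx 0.57721$ (Euler's constant), $\Psi(x) = \frac{d}{dx}\log(\Gamma(x))$
(Digamma function)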
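The closed-form entries of Table 4 can be evaluated numerically. The following Python sketch does so with NumPy and SciPy. All function names are hypothetical and chosen for this example; they are not the interface of the VineCopula R package. The Joe expression has a removable singularity at θ = 2, which the sketch does not treat.

```python
import numpy as np
from scipy import integrate, special, stats


def tau_gaussian(rho):
    """Kendall's tau of the Gaussian (and Student-t) copula."""
    return 2.0 / np.pi * np.arcsin(rho)


def tau_clayton(theta):
    return theta / (theta + 2.0)


def tau_gumbel(theta):
    return 1.0 - 1.0 / theta


def debye1(theta):
    """Debye function D_1(theta) = (1/theta) * int_0^theta x / (exp(x) - 1) dx."""
    integrand = lambda x: x / np.expm1(x) if x != 0.0 else 1.0
    value, _ = integrate.quad(integrand, 0.0, theta)
    return value / theta


def tau_frank(theta):
    return 1.0 - 4.0 / theta + 4.0 * debye1(theta) / theta


def tau_joe(theta):
    """Joe copula; not defined by this expression at theta == 2."""
    numerator = (2.0 * theta - 4.0 + 2.0 * np.euler_gamma + 2.0 * np.log(2.0)
                 + special.digamma(1.0 / theta)
                 + special.digamma((2.0 + theta) / (2.0 * theta)))
    return numerator / (theta - 2.0)


def taildep_student(rho, nu):
    """Common lower and upper tail dependence coefficient of the Student-t copula."""
    arg = -np.sqrt(nu + 1.0) * np.sqrt((1.0 - rho) / (1.0 + rho))
    return 2.0 * stats.t.cdf(arg, df=nu + 1.0)


def taildep_clayton(theta):
    """(lower, upper) tail dependence of the Clayton copula."""
    return (2.0 ** (-1.0 / theta), 0.0)


def taildep_gumbel(theta):
    """(lower, upper) tail dependence of the Gumbel copula; the Joe copula has the same form."""
    return (0.0, 2.0 - 2.0 ** (1.0 / theta))


print("Gaussian tau, rho = 0.5:", tau_gaussian(0.5))
print("Frank tau, theta = 5:", tau_frank(5.0))
print("Joe tau, theta = 3:", tau_joe(3.0))
print("Student-t tail dependence, rho = 0, nu = 4:", taildep_student(0.0, 4.0))
```

The Student-t value is strictly positive even for ρ = 0, whereas the Gaussian copula has tail dependence 0 for every ρ in (−1, 1). This is the distinction the discussion draws between the Gaussian copula and more flexible pair copulas.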