Holger Dette

Ruhr University Bochum | RUB

About

740
Publications
58,009
Reads
11,158
Citations

Publications (740)
Article
Full-text available
We consider the problem of detecting gradual changes in the sequence of mean functions from a not necessarily stationary functional time series. Our approach is based on the maximum deviation (calculated over a given time interval) between a benchmark function and the mean functions at different time points. We speak of a gradual change of size , i...
Preprint
Multivariate locally stationary functional time series provide a flexible framework for modeling complex data structures exhibiting both temporal and spatial dependencies while allowing for a time-varying data generating mechanism. In this paper, we introduce a specialized portmanteau-type test tailored for assessing white noise assumptions for multi...
Article
Full-text available
Injuries to the lower extremity joints are often debilitating, particularly for professional athletes. Understanding the onset of stressful conditions on these joints is, therefore, important in order to ensure prevention of injuries as well as individualised training for enhanced athletic performance. We study the biomechanical joint angles from t...
Preprint
In this paper we take a different look at the problem of testing the independence of two infinite dimensional random variables using the distance correlation. Instead of testing if the distance correlation vanishes exactly, we are interested in the null hypothesis that it does not exceed a certain threshold. Our formulation of the testing problem i...
Chapter
We propose a reproducing kernel Hilbert space approach for testing relevant hypotheses regarding the slope function in function-on-function linear regression for time series. In contrast to exact nullity of the slope function, relevant hypotheses refer to a null hypothesis that the slope function vanishes only approximately and deviations from null...
Article
The identification of similar patient pathways is a crucial task in healthcare analytics. A flexible tool to address this issue are parametric competing risks models, where transition intensities may be specified by a variety of parametric distributions, thus in particular being possibly time‐dependent. We assess the similarity between two such mod...
Preprint
We consider a linear regression model with complex-valued response and predictors from a compact and connected Lie group. The regression model is formulated in terms of eigenfunctions of the Laplace-Beltrami operator on the Lie group. We show that the normalized Haar measure is an approximate optimal design with respect to all Kiefer's $\Phi_p$-cri...
Preprint
Testing for change points in sequences of high-dimensional covariance matrices is an important and equally challenging problem in statistical methodology with applications in various fields. Motivated by the observation that even in cases where the ratio between dimension and sample size is as small as $0.05$, tests based on a fixed-dimension asymp...
Article
Analyzing the covariance structure of data is a fundamental task of statistics. While this task is simple for low‐dimensional observations, it becomes challenging for more intricate objects, such as multi‐variate functions. Here, the covariance can be so complex that just saving a non‐parametric estimate is impractical and structural assumptions ar...
Preprint
Full-text available
We consider the problem of detecting gradual changes in the sequence of mean functions from a not necessarily stationary functional time series. Our approach is based on the maximum deviation (calculated over a given time interval) between a benchmark function and the mean functions at different time points. We speak of a gradual change of size $\D...
Preprint
In the common partially linear single-index model we establish a Bahadur representation for a smoothing spline estimator of all model parameters and use this result to prove the joint weak convergence of the estimator of the index link function at a given point, together with the estimators of the parametric regression coefficients. We obtain the s...
Preprint
Full-text available
This paper addresses the problem of deciding whether the dose response relationships between subgroups and the full population in a multi-regional trial are similar to each other. Similarity is measured in terms of the maximal deviation between the dose response curves. We consider a parametric framework and develop two powerful bootstrap tests for...
Article
Full-text available
Background The conduct of rare disease clinical trials is still hampered by methodological problems. The number of patients suffering from a rare condition is variable, but may be very small and unfortunately statistical problems for small and finite populations have received less consideration. This paper describes the outline of the iSTORE projec...
Article
In this paper, we compare two regression curves by measuring their difference by the area between the two curves, represented by their \(L^1\)-distance. We develop asymptotic confidence intervals for this measure and statistical tests to investigate the similarity/equivalence of the two curves. Bootstrap methodology specifically designed for equiva...
Article
Full-text available
We consider the problem of predicting values of a random process or field satisfying a linear model y(x)=θ⊤f(x)+ε(x)...
Preprint
In the case where the dimension of the data grows at the same rate as the sample size we prove a central limit theorem for the difference of a linear spectral statistic of the sample covariance and a linear spectral statistic of the matrix that is obtained from the sample covariance matrix by deleting a column and the corresponding row. Unlike prev...
Preprint
Full-text available
In this paper we develop a novel bootstrap test for the comparison of two multinomial distributions. The two distributions are called {\it equivalent} or {\it similar} if a norm of the difference between the class probabilities is smaller than a given threshold. In contrast to most of the literature our approach does not require differentiability o...
Preprint
Full-text available
In this paper we compare two regression curves by measuring their difference by the area between the two curves, represented by their $L^1$-distance. We develop asymptotic confidence intervals for this measure and statistical tests to investigate the similarity/equivalence of the two curves. Bootstrap methodology specifically designed for equivalen...
Article
Full-text available
The portmanteau test provides the vanilla method for detecting serial correlations in classical univariate time series analysis. The method is extended to the case of observations from a locally stationary functional time series. Asymptotic critical values are obtained by a suitable block multiplier bootstrap procedure. The test is shown to asympto...
Article
For a spatiotemporal process $\{X_j(s,t) \mid s \in S,\ t \in T\}_{j=1,\ldots,n}$, where $S$ denotes the set of spatial locations and $T$ the time domain, we consider the problem of testing for a change in the sequence of mean functions $\{\mu_j(s,t) \mid s \in S,\ t \in T\}_{j=1,\ldots,n}$. In contrast to most of the literature, we are not interested in arbitrarily small changes but only in c...
Preprint
For a given $p\times n$ data matrix $\textbf{X}_n$ with i.i.d. centered entries and a population covariance matrix $\bf{\Sigma}$, the corresponding sample precision matrix $\hat{\bf\Sigma}^{-1}$ is defined as the inverse of the sample covariance matrix $\hat{\bf{\Sigma}} = (1/n) \bf{\Sigma}^{1/2} \textbf{X}_n\textbf{X}_n^\top \bf{\Sigma}^{1/2}$. We...
Preprint
Full-text available
This paper takes a different look at the problem of testing the mutual independence of the components of a high-dimensional vector. Instead of testing if all pairwise associations (e.g. all pairwise Kendall's $\tau$) between the components vanish, we are interested in the (null)-hypothesis that all pairwise associations do not exceed a certain thre...
Chapter
The comparison of multivariate population means is a central task of statistical inference. While statistical theory provides a variety of analysis tools, they usually do not protect individuals’ privacy. This knowledge can create incentives for participants in a study to conceal their true data (especially for outliers), which might result in a d...
Preprint
We present a general theory to quantify the uncertainty from imposing structural assumptions on the second-order structure of nonstationary Hilbert space-valued processes, which can be measured via functionals of time-dependent spectral density operators. The second-order dynamics are well-known to be elements of the space of trace-class operators,...
Article
Full-text available
We focus on estimating daily integrated volatility (IV) by realized measures based on intraday returns following a discrete-time stochastic model with a pronounced intraday periodicity (IP). We demonstrate that neglecting the IP-impact on realized estimators may lead to invalid statistical inference concerning IV for a common finite number of int...
Article
Full-text available
The recent availability of routine medical data, especially in a university‐clinical context, may enable the discovery of typical healthcare pathways, that is, typical temporal sequences of clinical interventions or hospital readmissions. However, such pathways are heterogeneous in a large provider such as a university hospital, and it is important...
Article
For the class of Gauss–Markov processes we study the problem of asymptotic equivalence of the nonparametric regression model with errors given by the increments of the process and the continuous time model, where a whole path of a sum of a deterministic signal and the Gauss–Markov process can be observed. We derive sufficient conditions which imply...
Preprint
Statistical inference for large data panels is omnipresent in modern economic applications. An important benefit of panel analysis is the possibility to reduce noise and thus to guarantee stable inference by intersectional pooling. However, it is well known that pooling can lead to a biased analysis if individual heterogeneity is too strong. In clas...
Article
Full-text available
With the goal of improving data based materials design, it is shown that by a sequential design of experiment scheme the process of generating and learning from the data can be combined to discover the relevant sections of the parameter space. The application is the energy of grain boundaries as a function of their geometric degrees of freedom, cal...
Article
With an increasing number of novel therapeutic options for lower urinary tract symptoms (LUTS), the spectrum of potential treatment pathways resulting from different combinations of treatment decisions is expanding and evolving. Treatment decisions are frequently made with little or no evidence from randomized controlled trials (RCTs) and thus requ...
Preprint
For a spatiotemporal process $\{X_j(s,t) | ~s \in S~,~t \in T \}_{j =1, \ldots , n} $, where $S$ denotes the set of spatial locations and $T$ the time domain, we consider the problem of testing for a change in the sequence of mean functions. In contrast to most of the literature we are not interested in arbitrarily small changes, but only in change...
Preprint
We develop methodology for testing hypotheses regarding the slope function in functional linear regression for time series via a reproducing kernel Hilbert space approach. In contrast to most of the literature, which considers tests for the exact nullity of the slope function, we are interested in the null hypothesis that the slope function vanishe...
Preprint
Most of the popular dependence measures for two random variables $X$ and $Y$ (such as Pearson's and Spearman's correlation, Kendall's $\tau$ and Gini's $\gamma$) vanish whenever $X$ and $Y$ are independent. However, neither does a vanishing dependence measure necessarily imply independence, nor does a measure equal to 1 imply that one variable is a...
Preprint
Full-text available
Frequency domain methods form a ubiquitous part of the statistical toolbox for time series analysis. In recent years, considerable interest has been given to the development of new spectral methodology and tools capturing dynamics in the entire joint distributions and thus avoiding the limitations of classical, $L^2$-based spectral methods. Most of...
Article
Full-text available
We consider the problem of designing experiments for the comparison of two regression curves describing the relation between a predictor and a response in two groups, where the data between and within the group may be dependent. In order to derive efficient designs we use results from stochastic analysis to identify the best linear unbiased estimat...
Preprint
Full-text available
Data based materials science is the new promise to accelerate materials design. Especially in computational materials science, data generation can easily be automatized. Usually, the focus is on processing and evaluating the data to derive rules or to discover new materials, while less attention is being paid to the strategy to generate the data. I...
Preprint
The comparison of multivariate population means is a central task of statistical inference. While statistical theory provides a variety of analysis tools, they usually do not protect individuals' privacy. This knowledge can create incentives for participants in a study to conceal their true data (especially for outliers), which might result in a di...
Preprint
Full-text available
The recent availability of routine medical data, especially in a university-clinical context, may enable the discovery of typical healthcare pathways, i.e., typical temporal sequences of clinical interventions or hospital readmissions. However, such pathways are heterogeneous in a large provider such as a university hospital, and it is important to...
Preprint
The problem of constructing a simultaneous confidence band for the mean function of a locally stationary functional time series $ \{ X_{i,n} (t) \}_{i = 1, \ldots, n}$ is challenging as these bands cannot be built on classical limit theory. On the one hand, for a fixed argument $t$ of the functions $ X_{i,n}$, the maximum absolute deviation betwee...
Preprint
In this work we introduce a new approach for statistical quantification of differential privacy in a black box setting. We present estimators and confidence intervals for the optimal privacy parameter of a randomized algorithm A, as well as other key variables (such as the novel "data-centric privacy level"). Our estimators are based on a local cha...
Preprint
In this paper we consider the linear regression model $Y =S X+\varepsilon $ with functional regressors and responses. We develop new inference tools to quantify deviations of the true slope $S$ from a hypothesized operator $S_0$ with respect to the Hilbert--Schmidt norm $\| S- S_0\|^2$, as well as the prediction error $\mathbb{E} \| S X - S_0 X \|^...
Preprint
Independent $p$-dimensional vectors with independent complex or real valued entries such that $\mathbb{E} [\mathbf{x}_i] = \mathbf{0}$, ${\rm Var } (\mathbf{x}_i) = \mathbf{I}_p$, $i=1, \ldots,n$, let $\mathbf{T }_n$ be a $p \times p$ Hermitian nonnegative definite matrix and $f $ be a given function. We prove that an appropriately standardized vers...
Article
In this paper we propose statistical inference tools for the covariance operators of functional time series in the two sample and change point problem. In contrast to most of the literature, the focus of our approach is not testing the null hypothesis of exact equality of the covariance operators. Instead, we propose to formulate the null hypothese...
Article
Optimal portfolio selection problems are determined by the (unknown) parameters of the data generating process. If an investor wants to realize the position suggested by the optimal portfolios, he/she needs to estimate the unknown parameters and to account for the parameter uncertainty in the decision process. Most often, the parameters of interest...
Article
Clinical trials often aim to compare two groups of patients for efficacy and/or toxicity depending on covariates such as dose. Examples include the comparison of populations from different geographic regions or age classes or, alternatively, of different treatment groups. Similarity of these groups can be claimed if the difference in average outcom...
Article
In this note we consider the optimal design problem for estimating the slope of a polynomial regression with no intercept at a given point, say z. In contrast to previous work, we investigate the model on the non-symmetric interval.
Article
Full-text available
In this paper we construct optimal designs for frequentist model averaging estimation. We derive the asymptotic distribution of the model averaging estimate with fixed weights in the case where the competing models are non-nested. A Bayesian optimal design minimizes an expectation of the asymptotic mean squared error of the model averaging estimate...
Article
Diurnal fluctuations in volatility are a well-documented stylized fact of intraday price data. This warrants an investigation how this intraday periodicity (IP) affects both finite sample as well as asymptotic properties of several popular realized estimators of daily integrated volatility which are based on functionals of a finite number of intrad...
Preprint
We show that polynomials do not belong to the reproducing kernel Hilbert space of infinitely differentiable translation-invariant kernels whose spectral measures have moments corresponding to a determinate moment problem. Our proof is based on relating this question to the problem of best linear estimation in continuous time one-parameter regressio...
Preprint
Full-text available
We consider the problem of designing experiments for the comparison of two regression curves describing the relation between a predictor and a response in two groups, where the data between and within the group may be dependent. In order to derive efficient designs we use results from stochastic analysis to identify the best linear unbiased estimat...
Article
In the common time series model $X_{i,n} = \mu(i/n) + \varepsilon_{i,n}$ with non-stationary errors we consider the problem of detecting a significant deviation of the mean function $\mu$ from a benchmark $g(\mu)$ (such as the initial value $\mu(0)$ or the average trend $\int_0^1 \mu(t)\,dt$). The problem is motivated by a more realistic modelling of change point analysis, where one is inter-...
Article
In traditional pharmacokinetic (PK) bioequivalence analysis, two one-sided tests (TOST) are conducted on the area under the concentration-time curve and the maximal concentration derived using a non-compartmental approach. When rich sampling is unfeasible, a model-based (MB) approach, using nonlinear mixed effect models (NLMEM) is possible. However...
Article
We study the problem of testing the equivalence of functional parameters, such as the mean or variance function, in the two sample functional data problem. In contrast to previous work, which reduces the functional problem to a multiple testing problem for the equivalence of scalar data by comparing the functions at each point, our approach is base...
Article
This article studies the problem whether two convex (concave) regression functions modelling the relation between a response and covariate in two samples differ by a shift in the horizontal and/or vertical axis. We consider a nonparametric situation assuming only smoothness of the regression functions. A graphical tool based on the derivatives of t...
Preprint
In this note we consider the optimal design problem for estimating the slope of a polynomial regression with no intercept at a given point, say z. In contrast to previous work, which considers symmetric design spaces we investigate the model on the interval $[0, a]$ and characterize those values of $z$, where an explicit solution of the optimal des...
Preprint
The Portmanteau test provides the vanilla method for detecting serial correlations in classical univariate time series analysis. The method is extended to the case of observations from a locally stationary functional time series. Asymptotic critical values are obtained by a suitable block multiplier bootstrap procedure. The test is shown to asympto...
Article
Full-text available
We propose a new sequential monitoring scheme for changes in the parameters of a multivariate time series. In contrast to procedures proposed in the literature which compare an estimator from the training sample with an estimator calculated from the remaining data, we suggest to divide the sample at each time point after the training sample. Estima...
Article
The classical approach to analyze pharmacokinetic (PK) data in bioequivalence studies aiming to compare two different formulations is to perform noncompartmental analysis (NCA) followed by two one-sided tests (TOST). In this regard, the PK parameters area under the curve (AUC) and $C_{\max}$ are obtained for both treatment groups and their geometri...
Article
This paper considers the problem of estimating a change point in the covariance matrix in a sequence of high-dimensional vectors, where the dimension is substantially larger than the sample size. A two-stage approach is proposed to efficiently estimate the location of the change point. The first step consists of a reduction of the dimension to iden...
Preprint
In this paper we propose statistical inference tools for the covariance operators of functional time series in the two sample and change point problem. In contrast to most of the literature the focus of our approach is not testing the null hypothesis of exact equality of the covariance operators. Instead we propose to formulate the null hypotheses...
Preprint
Change point detection in high dimensional data has found considerable interest in recent years. Most of the literature designs methodology for a retrospective analysis, where the whole sample is already available when the statistical inference begins. This paper develops monitoring schemes for the online scenario, where high dimensional d...
Preprint
In the common time series model $X_{i,n} = \mu (i/n) + \varepsilon_{i,n}$ with non-stationary errors we consider the problem of detecting a significant deviation of the mean function $\mu$ from a benchmark $g (\mu )$ (such as the initial value $\mu (0)$ or the average trend $\int_{0}^{1} \mu (t) dt$). The problem is motivated by a more realistic mo...
Article
We develop methodology for testing relevant hypotheses about functional time series in a tuning‐free way. Instead of testing for exact equality, e.g. for the equality of two mean functions from two independent time series, we propose to test the null hypothesis of no relevant deviation. In the two‐sample problem this means that an L2‐distance betwe...
Article
Full-text available
In this paper we study the theoretical properties of the simultaneous multiscale change point estimator (SMUCE) in piecewise‐constant signal models with dependent error processes. Empirical studies suggest that in this case the change point estimate is inconsistent, but it is not known if alternatives suggested in the literature for correlated data...
Preprint
We study the problem of testing the equivalence of functional parameters (such as the mean or variance function) in the two sample functional data problem. In contrast to previous work, which reduces the functional problem to a multiple testing problem for the equivalence of scalar data by comparing the functions at each point, our approach is base...
Preprint
Motivated by the need to statistically quantify differences between modern (complex) data-sets which commonly result as high-resolution measurements of stochastic processes varying over a continuum, we propose novel testing procedures to detect relevant differences between the second order dynamics of two functional time series. In order to take th...
Preprint
The estimation of covariance operators of spatio-temporal data is in many applications only computationally feasible under simplifying assumptions, such as separability of the covariance into strictly temporal and spatial factors. Powerful tests for this assumption have been proposed in the literature. However, as real world systems, such as climate...
Article
In this paper, we investigate the asymptotic distribution of likelihood ratio tests in models with several groups, when the number of groups converges with the dimension and sample size to infinity. We derive central limit theorems for the logarithm of various test statistics and compare our results with the approximations obtained from a central l...
Preprint
Full-text available
The classical approach to analyze pharmacokinetic (PK) data in bioequivalence studies aiming to compare two different formulations is to perform noncompartmental analysis (NCA) followed by two one-sided tests (TOST). In this regard the PK parameters $AUC$ and $C_{max}$ are obtained for both treatment groups and their geometric mean ratios are consi...
Preprint
Classical change point analysis aims at (1) detecting abrupt changes in the mean of a possibly non-stationary time series and at (2) identifying regions where the mean exhibits a piecewise constant behavior. In many applications however, it is more reasonable to assume that the mean changes gradually in a smooth way. Those gradual changes may eithe...
Preprint
Full-text available
We develop an estimator for the high-dimensional covariance matrix of a locally stationary process with a smoothly varying trend and use this statistic to derive consistent predictors in non-stationary time series. In contrast to the currently available methods for this problem the predictor developed here does not rely on fitting an autoregressive...
Article
Permutation tests were first introduced in Eden and Yates (1933), Fisher (1935) and Pitman (1937a, 1937b, 1938), and are popular nowadays due to several nice properties they possess and the cheap availability of computation power of modern computers. In this paper, we demonstrate potential power loss of permutation tests using the prototype permuta...
Preprint
Detecting structural changes in functional data is a prominent topic in statistical literature. However, not all trends in the data are important in applications, but only those of large enough influence. In this paper we address the problem of identifying relevant changes in the eigenfunctions and eigenvalues of covariance kernels of $L^2[0,1]$-val...
Article
In this paper, we consider the optimal design problem for extrapolation and estimation of the slope at a given point, say z, in a polynomial regression with no intercept. We provide explicit solutions of these problems in many cases and characterize those values of z, where this is not possible.
