Article

Adaptive Hausdorff estimation of density level sets

08/2009; DOI: 10.1214/08-AOS661
Source: arXiv

ABSTRACT Consider the problem of estimating the $\gamma$-level set $G^*_{\gamma}=\{x:f(x)\geq\gamma\}$ of an unknown $d$-dimensional density function $f$ based on $n$ independent observations $X_1,\ldots,X_n$ from the density. This problem has been addressed under global error criteria related to the symmetric set difference. However, in certain applications a spatially uniform mode of convergence is desirable to ensure that the estimated set is close to the target set everywhere. The Hausdorff error criterion provides this degree of uniformity and, hence, is more appropriate in such situations. It is known that the minimax optimal rate of error convergence for the Hausdorff metric is $(n/\log n)^{-1/(d+2\alpha)}$ for level sets with boundaries that have a Lipschitz functional form, where the parameter $\alpha$ characterizes the regularity of the density around the level of interest. However, the estimators proposed in previous work are nonadaptive to the density regularity and require knowledge of the parameter $\alpha$. Furthermore, previously developed estimators achieve the minimax optimal rate for rather restricted classes of sets (e.g., the boundary fragment and star-shaped sets) that effectively reduce the set estimation problem to a function estimation problem. This characterization precludes level sets with multiple connected components, which are fundamental to many applications. This paper presents a fully data-driven procedure that is adaptive to unknown regularity conditions and achieves near minimax optimal Hausdorff error control for a class of density level sets with very general shapes and multiple connected components. Comment: Published at http://dx.doi.org/10.1214/08-AOS661 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
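The contrast the abstract draws between a global criterion (symmetric set difference) and the spatially uniform Hausdorff criterion can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's estimator: the toy Gaussian density, the level 0.1, the grid spacing, the stray point, and the helper names `hausdorff_distance` and `symmetric_difference_size` are all hypothetical.

```python
import math

def hausdorff_distance(a, b):
    """Hausdorff distance between two non-empty finite point sets."""
    def directed(src, dst):
        # Largest distance from a point of src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))

def symmetric_difference_size(a, b):
    """Number of grid points that lie in exactly one of the two sets."""
    return len(set(a) ^ set(b))

def toy_density(x):
    # Assumed example: standard bivariate Gaussian density.
    return math.exp(-0.5 * (x[0] ** 2 + x[1] ** 2)) / (2.0 * math.pi)

# Discretize [-2, 2]^2 with spacing 0.25 (an arbitrary illustrative choice).
step = 0.25
grid = [(i * step, j * step) for i in range(-8, 9) for j in range(-8, 9)]

gamma = 0.1
true_set = [x for x in grid if toy_density(x) >= gamma]

# A hypothetical estimate: the true set plus one spurious far-away grid point.
estimated_set = true_set + [(2.0, 2.0)]

print("symmetric difference (grid points):",
      symmetric_difference_size(true_set, estimated_set))
print("Hausdorff distance:", hausdorff_distance(true_set, estimated_set))
```

In this toy setting the two sets differ in a single grid point, so a symmetric-difference criterion barely registers the error, while the Hausdorff distance is governed by how far the stray point lies from the true set. That is the sense in which the Hausdorff criterion requires the estimate to be close to the target everywhere.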

  • Article: Nonparametric Estimation of Regression Level Sets
    ABSTRACT: Let $Y_i = f(X_i) + \xi_i$, $i = 1, \ldots, n$, where $f$ is an unknown regression function and $(\xi_1, \ldots, \xi_n)$ are i.i.d. centered Gaussian variables independent of the design $(X_1, \ldots, X_n)$. Consider the problem of estimating the level set $G$ of $f$ from $(X_1, Y_1), \ldots, (X_n, Y_n)$ under certain assumptions on the boundary smoothness of $G$. We propose piecewise-polynomial estimators based on the maximization of local empirical excess masses. Under assumptions on the design, we show that these estimators attain optimal rates of convergence in an asymptotically minimax sense within the studied classes of regressions. For “bad” designs we obtain other, non-optimal rates. We generalize these results to the N-dimensional case, N ≠ 2.
    Statistics: A Journal of Theoretical and Applied Statistics. 01/1997; 29(2):131-160.
  • Article: PLUG‐IN ESTIMATION OF GENERAL LEVEL SETS
    ABSTRACT: Given an unknown function $f$ (e.g. a probability density, a regression function, …) and a constant $c$, the problem of estimating the level set $L(c) = \{f \geq c\}$ is considered. This problem is tackled in a very general framework, which allows $f$ to be defined on a metric space different from $\mathbb{R}^d$. Such a degree of generality is motivated by practical considerations and, in fact, an example with astronomical data is analyzed where the domain of $f$ is the unit sphere. A plug-in approach is followed; that is, $L(c)$ is estimated by $L_n(c) = \{f_n \geq c\}$, where $f_n$ is an estimator of $f$. Two results are obtained concerning consistency and convergence rates, with respect to the Hausdorff metric, of the boundaries $\partial L_n(c)$ towards $\partial L(c)$. Also, the consistency of $L_n(c)$ to $L(c)$ is shown, under mild conditions, with respect to the $L_1$ distance. Special attention is paid to the particular case of spherical data.
    Australian & New Zealand Journal of Statistics. 03/2006; 48(1):7-19. · 0.44 Impact Factor
  • Article: Wedgelets: nearly minimax estimation of edges
    ABSTRACT: We study a simple “horizon model” for the problem of recovering an image from noisy data; in this model the image has an edge with $\alpha$-Hölder regularity. Adopting the viewpoint of computational harmonic analysis, we develop an overcomplete collection of atoms called wedgelets, dyadically organized indicator functions with a variety of locations, scales and orientations. The wedgelet representation provides nearly optimal representations of objects in the horizon model, as measured by minimax description length. We show how to rapidly compute a wedgelet approximation to noisy data by finding a special edgelet-decorated recursive partition which minimizes a complexity-penalized sum of squares. This estimate, using sufficient subpixel resolution, achieves nearly the minimax mean-squared error in the horizon model. In fact, the method is adaptive in the sense that it achieves nearly the minimax risk for any value of the unknown degree of regularity of the horizon, $1 \leq \alpha \leq 2$. Wedgelet analysis and denoising may be used successfully outside the horizon model. We study images modelled as indicators of star-shaped sets with smooth boundaries and show that complexity-penalized wedgelet partitioning achieves nearly the minimax risk in that setting also.
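
The plug-in rule from the second related abstract above, which estimates the level set by thresholding an estimator f_n of f at the constant c, can be sketched in one dimension. This is only an illustration under stated assumptions: the abstract does not say which estimator f_n is used, so the Gaussian kernel estimator, the bandwidth 0.3, the six-point sample, the threshold 0.3, and all function names here are hypothetical.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density_estimate(sample, bandwidth):
    """Return f_n, a kernel density estimate built from a 1-D sample."""
    n = len(sample)
    def f_n(x):
        total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in sample)
        return total / (n * bandwidth)
    return f_n

def plug_in_level_set(f_n, c, grid):
    """Grid approximation of the plug-in set {f_n >= c}."""
    return [x for x in grid if f_n(x) >= c]

# Assumed toy data: a cluster near 0 and one isolated observation at 3.
sample = [-0.2, -0.1, 0.0, 0.1, 0.2, 3.0]
f_n = kernel_density_estimate(sample, bandwidth=0.3)

grid = [k / 10 for k in range(-10, 41)]  # evaluation points on [-1, 4]
estimated_set = plug_in_level_set(f_n, c=0.3, grid=grid)

print("plug-in level set spans", min(estimated_set), "to", max(estimated_set))
```

With these assumed values the estimated set covers the cluster near 0 and excludes the isolated observation, whose kernel bump alone stays below the threshold. The choice of bandwidth drives this behaviour, which is why the result should be read as a sketch and not as a recommended configuration.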

Keywords

$\gamma$-level
characterization precludes level sets
data-driven procedure
density level sets
density regularity
function estimation problem
general shapes
global error criteria
Hausdorff error criterion
level sets
Lipschitz functional form
minimax optimal Hausdorff error control
minimax optimal rate
parameter $\alpha$
parameter $\alpha$ characterizes
set estimation problem
sets
star-shaped sets
unknown $d$-dimensional density function $f$
unknown regularity conditions