Multivariate Chebyshev Inequalities

The Annals of Mathematical Statistics, December 1960, 31(4). DOI: 10.1214/aoms/1177705673


If $X$ is a random variable with $EX^2 = \sigma^2$, then by Chebyshev's inequality,
\begin{equation*}\tag{1.1}
P\{|X| \geqq \epsilon\} \leqq \sigma^2/\epsilon^2.
\end{equation*}
If in addition $EX = 0$, one obtains a corresponding one-sided inequality
\begin{equation*}\tag{1.2}
P\{X \geqq \epsilon\} \leqq \sigma^2/(\epsilon^2 + \sigma^2)
\end{equation*}
(see, e.g., [8], p. 198). In each case a distribution for $X$ is known that results in equality, so that the bounds are sharp. By a change of variable we can take $\epsilon = 1$.

There are many possible multivariate extensions of (1.1) and (1.2). Those providing bounds for $P\{\max_{1 \leqq j \leqq k} |X_j| \geqq 1\}$ and $P\{\max_{1 \leqq j \leqq k} X_j \geqq 1\}$ have been investigated in [3, 5, 9] and [4], respectively. We consider here various inequalities involving (i) the minimum component or (ii) the product of the components of a random vector. Derivations and proofs of sharpness for these two classes of extensions show remarkable similarities. Some of each type occur as special cases of a general theorem in Section 3. Bounds are given under various assumptions concerning variances, covariances and independence.

Notation. We denote the vector $(1, \cdots, 1)$ by $e$ and $(0, \cdots, 0)$ by $0$; the dimensionality will be clear from the context. If $x = (x_1, \cdots, x_k)$ and $y = (y_1, \cdots, y_k)$, we write $x \geqq y$ ($x > y$) to mean $x_j \geqq y_j$ ($x_j > y_j$), $j = 1, 2, \cdots, k$. If $\Sigma = (\sigma_{ij})$ is a $k \times k$ moment matrix, for convenience we write $\sigma_{jj} = \sigma_j^2$, $j = 1, \cdots, k$. Unless otherwise stated, we assume that $\Sigma$ is positive definite.
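
As an aside (not part of the original abstract), the sharpness claim for (1.2) can be checked against the standard two-point distribution that attains equality; a minimal sketch, assuming $\epsilon > 0$ and $\sigma^2 > 0$:
\begin{equation*}
P\Bigl\{X = -\tfrac{\sigma^2}{\epsilon}\Bigr\} = \frac{\epsilon^2}{\epsilon^2 + \sigma^2}, \qquad
P\{X = \epsilon\} = \frac{\sigma^2}{\epsilon^2 + \sigma^2},
\end{equation*}
for which
\begin{equation*}
EX = \frac{-\sigma^2\epsilon + \epsilon\sigma^2}{\epsilon^2 + \sigma^2} = 0, \qquad
EX^2 = \frac{\sigma^4 + \epsilon^2\sigma^2}{\epsilon^2 + \sigma^2} = \sigma^2, \qquad
P\{X \geqq \epsilon\} = \frac{\sigma^2}{\epsilon^2 + \sigma^2},
\end{equation*}
so the bound in (1.2) is met with equality.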

Citing articles (context excerpts and abstracts):
    • "A direct way to reformulate (SVM-RCCP) into SDP model is to use multivariate Chebyshev inequality (Bhattacharyya et al. 2004; Shivaswamy et al. 2006). Let˜x ∼ (μ, Σ) denote random vector˜x with mean μ and convariance matrix Σ, the multivariate Chebyshev inequality (Marshall and Olkin 1960; Bertsimas and Popescu 2005) states that for an arbitrary closed convex set S, the supremum of the probability that˜x takes a value in S is "
    [Show abstract] [Hide abstract]
    ABSTRACT: Support vector machines (SVM) is one of the well known supervised classes of learning algorithms. Basic SVM models are dealing with the situation where the exact values of the data points are known. This paper studies SVM when the data points are uncertain. With some properties known for the distributions, chance-constrained SVM is used to ensure the small probability of misclassification for the uncertain data. As infinite number of distributions could have the known properties, the robust chance-constrained SVM requires efficient transformations of the chance constraints to make the problem solvable. In this paper, robust chance-constrained SVM with second-order moment information is studied and we obtain equivalent semidefinite programming and second order cone programming reformulations. The geometric interpretation is presented and numerical experiments are conducted. Three types of estimation errors for mean and covariance information are studied in this paper and the corresponding formulations and techniques to handle these types of errors are presented.
    Annals of Operations Research 10/2015; DOI:10.1007/s10479-015-2039-6 · 1.22 Impact Factor
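
The excerpt above breaks off before stating the bound. For reference, the result it refers to (usually attributed to Marshall and Olkin 1960 and restated by Bertsimas and Popescu 2005; this paraphrase is not part of the citing article) can be written as follows: for a random vector $\tilde{x} \sim (\mu, \Sigma)$ and a closed convex set $S$,
\begin{equation*}
\sup_{\tilde{x} \sim (\mu, \Sigma)} P\{\tilde{x} \in S\} = \frac{1}{1 + d^2}, \qquad
d^2 = \inf_{x \in S} (x - \mu)'\Sigma^{-1}(x - \mu),
\end{equation*}
where the supremum is taken over all distributions with mean $\mu$ and covariance matrix $\Sigma$.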
  • "The history of classical inequalities can be found in [69], and some generalizations in [14] and [147]; in the latter works, the connection between Čebyšev inequalities and optimization theory is developed based on the work of Mulholland and Rogers [96], Godwin [47], Isii [59] [60] [61], Olkin and Pratt [104], Marshall and Olkin [92], and the classical Markov–Krein Theorem [69, pages 82 & 157], among others. We also refer to the field of majorization, as discussed in Marshall and Olkin [93], the inequalities of Anderson [4], Hoeffding [53], Joe [63], Bentkus et al. [11], Bentkus [9] [10], Pinelis [116] [117], and Boucheron, Lugosi and Massart [19]."
    Abstract: The past century has seen a steady increase in the need to estimate and predict complex systems and to make (possibly critical) decisions with limited information. Although computers have made possible the numerical evaluation of sophisticated statistical models, these models are still designed by humans because there is currently no known recipe or algorithm for dividing the design of a statistical model into a sequence of arithmetic operations. Indeed, enabling computers to "think" as humans do when faced with uncertainty is challenging in several major ways: (1) finding optimal statistical models remains to be formulated as a well-posed problem when information on the system of interest is incomplete and comes in the form of a complex combination of sample data, partial knowledge of constitutive relations, and a limited description of the distribution of input random variables; (2) the space of admissible scenarios, along with the space of relevant information, assumptions, and/or beliefs, tends to be infinite-dimensional, whereas calculus on a computer is necessarily discrete and finite. With this purpose, this paper explores the foundations of a rigorous framework for the scientific computation of optimal statistical estimators/models and reviews their connections with Decision Theory, Machine Learning, Bayesian Inference, Stochastic Optimization, Robust Optimization, Optimal Uncertainty Quantification and Information Based Complexity.
  • "That is, under all possible choices of class-conditional densities with a given mean and covariance matrix, the worst-case probability of misclassification of new data is minimized. For doing so, the authors exploited generalized Chebyshev inequalities [12] and, in particular, a theorem according to which the probability of misclassifying a point is bounded. Shivaswamy et al. [13], who extended Bhattacharyya et al. [14], also adopted a second-order cone programming formulation and used generalized Chebyshev inequalities to design robust classifiers dealing with uncertain observations." (a sketch of how the one-sided bound yields such a cone constraint follows below)
    Abstract: In this paper, we propose a maximum-margin classifier that deals with uncertainty in the data input. Specifically, we reformulate the SVM framework such that each input training entity is not solely a feature-vector representation, but a multi-dimensional Gaussian distribution with given probability density, i.e., with a given mean and covariance matrix; the latter expresses the uncertainty. We arrive at a convex optimization problem, which is solved in the primal form using a gradient descent approach. The resulting classifier, which we name SVM with Gaussian Sample Uncertainty (SVM-GSU), is tested on synthetic data, as well as on the problem of event detection in video using the large-scale TRECVID MED 2014 dataset, and the problem of image classification using the MNIST dataset of handwritten digits. Experimental results verify the effectiveness of the proposed classifier.
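
To illustrate the connection mentioned in the excerpts above, here is a brief sketch (not taken from any of the cited papers; the notation $w$, $b$, $\kappa$ is introduced only for this illustration) of how the one-sided inequality (1.2) turns a moment-based chance constraint into a second-order cone constraint. Consider a linear classifier $w'x + b$ and a random feature vector $\tilde{x} \sim (\mu, \Sigma)$ with only the mean and covariance known, and write $Y = w'\tilde{x} + b$, so that $EY = w'\mu + b$ and $\operatorname{Var} Y = w'\Sigma w$. If $w'\mu + b \geqq 0$, applying (1.2) to $-(Y - EY)$ gives
\begin{equation*}
P\{Y \leqq 0\} = P\{-(Y - EY) \geqq w'\mu + b\} \leqq \frac{w'\Sigma w}{w'\Sigma w + (w'\mu + b)^2},
\end{equation*}
so the worst-case requirement $P\{Y \leqq 0\} \leqq \kappa$ holds for every distribution with these moments whenever
\begin{equation*}
w'\mu + b \geqq \sqrt{\tfrac{1-\kappa}{\kappa}}\,\sqrt{w'\Sigma w},
\end{equation*}
which is a second-order cone constraint in $(w, b)$.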