In this paper the concept of a metric in the space of random variables defined on a probability space is introduced. The principle of three stages in the study of approximation problems is formulated, in particular problems of approximating distributions.
Various facts connected with the use of metrics in these three stages are presented and proved. In the second part of the paper a series of results is introduced which are related to stability problems in characterizing distributions and to problems of estimating the remainder terms in limiting approximations of distributions of sums of independent random variables.
Both the account of properties of metrics and the application of these facts in the second part of the paper are presented under the assumption that the random variables take values in a general space (metric, Banach or Hilbert space).
Bibliography: 11 titles.

... random variables, using Gnedenko's Transfer Theorem (see [10] and [12]). The rates of convergence in weak limit theorems for geometric random sums will also be estimated via the Zolotarev probability metric (see [3, 29, 30, 31], and [24]). The results obtained relate to the class of heavy-tailed distributions, well-known members of which include the exponential, Laplace and Linnik distributions (see [18]). ...

... random variables. Moreover, this metric can be compared with well-known metrics such as the Kolmogorov metric, the total variation metric, the Lévy-Prokhorov metric and the probability metric based on the Trotter operator, etc. (see [3, 13, 29, 30, 31], and [14]). ...

... Definition 2.2 (Zolotarev [29]). Let X, Y ∈ X. The Zolotarev probability metric on X between two random variables X and Y, denoted by d_s(X, Y), is defined by ... (see [3], Manou-Abi [24], Zolotarev [29] and Zolotarev [31]) ...

... random variables. Moreover, this metric can be compared with well-known metrics such as the Kolmogorov metric, the total variation metric, the Lévy-Prokhorov metric and the probability metric based on Trotter's operator, etc. (see [3], [30], [31], [32], [10] and [11] for more details). ...

... Before stating the main results we first recall the definition of Zolotarev's probability metric and provide some auxiliary results which will be used in this paper. For a deeper discussion of Zolotarev's probability metric and its applications we refer the reader to [3], [30], [31], [32], [33], and [12]. We denote by X the set of all random variables defined on a probability space (Ω, A, P). ...

... 2. Let d_Z(X_n, X) → 0 as n → ∞. Then X_n → X in distribution as n → ∞ (see for instance [30], p. 424). ...

Let X_1, X_2, ... be a sequence of independent, identically distributed random variables. Let ν_p be a geometric random variable with parameter p ∈ (0, 1), independent of all X_j, j ≥ 1. Assume that Φ : N → R_+ is a positive normalizing function such that Φ(n) = o(1) as n → +∞. The paper deals with the rate of convergence for distributions of randomly normalized geometric random sums Φ(ν_p) Σ_{j=1}^{ν_p} X_j to symmetric stable laws in terms of Zolotarev's probability metric.
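The limit behaviour described in this abstract can be sketched numerically. In the symmetric finite-variance case, the geometric sum normalized by √p converges to a Laplace law (the symmetric geometric stable law with α = 2). The sketch below, with Rademacher summands and p = 0.01 chosen purely for illustration (these choices are ours, not the paper's), estimates the Kolmogorov distance to that limit:

```python
import math
import random

random.seed(0)

p = 0.01               # geometric parameter
reps = 10000           # Monte Carlo replications
b = 1 / math.sqrt(2)   # Laplace scale giving unit variance (Var = 2 b^2)

def laplace_cdf(x, b=b):
    return 0.5 * math.exp(x / b) if x < 0 else 1 - 0.5 * math.exp(-x / b)

samples = []
for _ in range(reps):
    # geometric number of summands, P(nu = k) = p (1 - p)^(k - 1), k >= 1
    nu = int(math.log(random.random()) / math.log(1 - p)) + 1
    # sum of nu Rademacher (+/-1) variables, normalized by sqrt(p)
    s = sum(1 if random.random() < 0.5 else -1 for _ in range(nu))
    samples.append(math.sqrt(p) * s)

samples.sort()
# Kolmogorov distance between the empirical CDF and the limiting Laplace CDF
d_K = max(
    max((i + 1) / reps - laplace_cdf(x), laplace_cdf(x) - i / reps)
    for i, x in enumerate(samples)
)
print(f"Kolmogorov distance to Laplace: {d_K:.4f}")
```

The distance shrinks as p → 0; the paper's contribution is to quantify such rates, in the Zolotarev metric rather than the Kolmogorov one.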

... Introducing Zolotarev's ζ distances (dual to smooth-function norms). Leaving aside for a moment the problem of nonideal exponents in bounds like (29) with (50), it turns out that bounds with κ_3 on M_{3,2} replaced by an even weaker norm, namely Zolotarev's (1976) ζ_3, can be obtained easily, given Zolotarev's (1973) paper. We may, as Christoph (1979) essentially did, apparently independently of Zolotarev (1976), introduce ζ_3, and more generally ζ_r with, in this paper for simplicity, r ∈ N_0, similarly to κ_r in (45) above, roughly speaking by performing r integrations by parts on ∫ x^r dM(x), rather than just zero as in (43) or one as in (45). This leads to the expression ∫ |h_{M,r}| dλ in (69), with h_{M,r} defined by (55, 56), and to an alternative representation in (65) via (58). ...

The classical Berry-Esseen error bound, for the normal approximation to the law of a sum of independent and identically distributed random variables, is here improved by replacing the standardised third absolute moment by a weak norm distance to normality. We thus sharpen and simplify two results of Ulyanov (1976) and of Senatov (1998), each of them previously optimal, in the line of research initiated by Zolotarev (1965) and Paulauskas (1969). Our proof is based on a seemingly incomparable normal approximation theorem of Zolotarev (1986), combined with our main technical result: The Kolmogorov distance (supremum norm of difference of distribution functions) between a convolution of two laws and a convolution of two Lipschitz laws is bounded homogeneously of degree 1 in the pair of the Wasserstein distances (L$^1$ norms of differences of distribution functions) of the corresponding factors, and also, inessentially for the present application, in the pair of the Lipschitz constants. Side results include a short introduction to $\zeta$ norms on the real line, simpler inequalities for various probability distances, slight improvements of the theorem of Zolotarev (1986) and of a lower bound theorem of Bobkov, Chistyakov and G\"otze (2012), an application to sampling from finite populations, auxiliary results on rounding and on winsorisation, and computations of a few examples. The introductory section in particular is aimed at analysts in general rather than specialists in probability approximations.
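The two distances contrasted in this abstract, Kolmogorov (sup norm of differences of distribution functions) and Wasserstein (L¹ norm of differences of distribution functions), can be computed directly for empirical measures; a minimal sketch, with sample values chosen arbitrarily:

```python
import bisect

def ecdf(sorted_xs, t):
    """Right-continuous empirical CDF of a sorted sample, evaluated at t."""
    return bisect.bisect_right(sorted_xs, t) / len(sorted_xs)

def kolmogorov(xs, ys):
    """Sup-norm distance between the two empirical distribution functions.
    The sup is attained at a data point, so checking the union suffices."""
    xs, ys = sorted(xs), sorted(ys)
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in xs + ys)

def wasserstein1(xs, ys):
    """L^1 distance between the empirical CDFs, for equal sample sizes:
    the mean absolute difference of the sorted (quantile-matched) samples."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.5, 1.5, 2.5, 3.5]
print(kolmogorov(xs, ys))    # 0.25
print(wasserstein1(xs, ys))  # 0.5
```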

... Definition 3. Zolotarev's λ-metric d_Z between two cumulative distribution functions P, Q on the real line is defined by [70] d_Z(P, Q) = min ...

... Denote by b̃ := e^{2γ + 4e^{4γ+1}ξÃ}. If 4e^{4γ+1}Ãξ ≤ 1, then Corollary 2 implies that ‖log p/p*‖_∞ ≤ 2γ + 4e^{4γ+1}ξÃ (70) and for all δ ∈ (0, 1) such that (4eb̃Ã)^2 m ≤ δk, Corollary 3 implies the existence of the maximum entropy densities p and q with probability at least 1 − δ, and it holds that ...

Domain adaptation algorithms are designed to minimize the misclassification risk of a discriminative model for a target domain with little training data by adapting a model from a source domain with a large amount of training data. Standard approaches measure the adaptation discrepancy based on distance measures between the empirical probability distributions in the source and target domain. In this setting, we address the problem of deriving generalization bounds under practice-oriented general conditions on the underlying probability distributions. As a result, we obtain generalization bounds for domain adaptation based on finitely many moments and smoothness conditions.

... The distances ζ_s defined by (1.5) are particularly useful for our purposes. Such distances occur very naturally in connection with Lindeberg's proof of the CLT, they are used as a tool in bounding other distances (the sup-norms over convex sets, the bounded Lipschitz distance, etc.), and they were advocated in [54]. In particular, the following fact is straightforward: if X_1, . . . ...

... However, in some special cases, in particular, in the case of random vectors with independent components, one can improve bounds on ζ s -distance rather substantially. The following fact is very simple and well known (see [54] for similar statements). ...

Let $X^{(n)}$ be an observation sampled from a distribution $P_{\theta}^{(n)}$ with an unknown parameter $\theta,$ $\theta$ being a vector in a Banach space $E$ (most often, a high-dimensional space of dimension $d$). We study the problem of estimation of $f(\theta)$ for a functional $f:E\mapsto {\mathbb R}$ of some smoothness $s>0$ based on an observation $X^{(n)}\sim P_{\theta}^{(n)}.$ Assuming that there exists an estimator $\hat \theta_n=\hat \theta_n(X^{(n)})$ of parameter $\theta$ such that $\sqrt{n}(\hat \theta_n-\theta)$ is sufficiently close in distribution to a mean zero Gaussian random vector in $E,$ we construct a functional $g:E\mapsto {\mathbb R}$ such that $g(\hat \theta_n)$ is an asymptotically normal estimator of $f(\theta)$ with $\sqrt{n}$ rate provided that $s>\frac{1}{1-\alpha}$ and $d\leq n^{\alpha}$ for some $\alpha\in (0,1).$ We also derive general upper bounds on Orlicz norm error rates for estimator $g(\hat \theta)$ depending on smoothness $s,$ dimension $d,$ sample size $n$ and the accuracy of normal approximation of $\sqrt{n}(\hat \theta_n-\theta).$ In particular, this approach yields asymptotically efficient estimators in some high-dimensional exponential models.

... We call ζ u, * v and ζ u,v second order Zolotarev-type distances since they are simple metric distances in the sense of Zolotarev (Zolotarev [1976]) and can be seen as the discrete counterparts of the distance ζ 2 defined on the set of real probability distributions (the distance ζ 2 , introduced in Zolotarev [1976] and further studied in Rio [1998], is associated to the set of continuously differentiable functions on R whose derivative is Lipschitz). The contraction property reads as follows: ...

The object of this thesis is the study of some analytical and asymptotic properties of Markov processes, and their applications to Stein's method. The point of view consists in the development of functional inequalities in order to obtain upper bounds on the distance between probability distributions. The first part is devoted to the asymptotic study of time-inhomogeneous Markov processes through Poincaré-like inequalities, established by precise estimates on the spectrum of the transition operator. The first investigation takes place within the framework of the Central Limit Theorem, which states the convergence of the renormalized sum of random variables towards the normal distribution. It results in the statement of a Berry-Esseen bound allowing one to quantify this convergence with respect to the chi-square distance, a natural quantity which had not been investigated in this setting. It therefore extends similar results relative to other distances (Kolmogorov, total variation, Wasserstein). Keeping with the non-homogeneous framework, we consider a weakly mixing process linked to a stochastic algorithm for median approximation. This process evolves by jumps of two sorts (to the right or to the left) with time-dependent size and intensity. An upper bound on the Wasserstein distance of order 1 between the marginal distribution of the process and the normal distribution is provided when the latter is invariant under the dynamics, and extended to examples where only asymptotic normality holds. The second part concerns intertwining relations between (homogeneous) Markov processes and gradients, which can be seen as refinements of the Bakry-Émery criterion, and their application to Stein's method, a collection of techniques to estimate the distance between two probability distributions. Second order intertwinings for birth-death processes are stated, going one step further than the existing first order relations. These relations are then exploited to construct an original and universal method of evaluation of discrete Stein factors, a key component of the Stein-Chen method.

... These properties classify d_s as an ideal probability metric in the sense of Zolotarev [38]. A few years after the publication of [24], Baringhaus and Grübel [2], in connection with the study of convex combinations of random variables with random coefficients, considered a Fourier metric similar to (3.1), defined by ...

... A second interesting functional that has been shown to be monotonically decreasing along the solution to Fokker-Planck type equations [23] is the Hellinger distance. For any given pair of probability densities f and h defined on R, the Hellinger distance d_H(f, h) is defined by [38] (3.26) ...
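As an aside, the Hellinger distance mentioned in this excerpt is easy to evaluate numerically. Conventions differ by a constant factor; the sketch below uses d_H²(f, h) = 1 − ∫√(f h), under which two unit-variance Gaussians with means 0 and 1 satisfy d_H² = 1 − e^{−1/8} (a standard closed form, not taken from the cited paper):

```python
import math

def gauss(x, mu):
    """Density of N(mu, 1)."""
    return math.exp(-(x - mu) ** 2 / 2) / math.sqrt(2 * math.pi)

def hellinger2(f, h, lo=-12.0, hi=13.0, n=50000):
    """Numerical d_H^2(f, h) = 1 - integral of sqrt(f h), trapezoid rule."""
    dx = (hi - lo) / n
    s = 0.0
    for i in range(n + 1):
        x = lo + i * dx
        w = 0.5 if i in (0, n) else 1.0   # trapezoid endpoint weights
        s += w * math.sqrt(f(x) * h(x)) * dx
    return 1.0 - s

f = lambda x: gauss(x, 0.0)
h = lambda x: gauss(x, 1.0)
h2 = hellinger2(f, h)
exact = 1.0 - math.exp(-1.0 / 8.0)   # closed form for N(0,1) vs N(1,1)
print(h2, exact)
```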

We consider here a Fokker--Planck equation with variable coefficient of diffusion which appears in the modeling of the wealth distribution in a multi-agent society. In contrast with previous studies, to describe a society in which agents can have debts, we allow the wealth variable to be negative. It is shown that, even starting with debts, if the initial mean wealth is assumed positive, the solution of the Fokker--Planck equation is such that debts are absorbed in time, and a unique equilibrium density located in the positive part of the real axis will be reached.

... We call ζ u, * v and ζ u,v second order Zolotarev-type distances since they are simple metric distances in the sense of Zolotarev (Zolotarev [1976]) and can be seen as the discrete counterparts of the distance ζ 2 defined on the set of real probability distributions (the distance ζ 2 , introduced in Zolotarev [1976] and further studied in Rio [1998], is associated to the set of continuously differentiable functions on R whose derivative is Lipschitz). The contraction property reads as follows: ...

This article investigates second order intertwinings between semigroups of birth-death processes and discrete gradients on the space of natural integers N. It goes one step beyond a recent work of Chafa{\"i} and Joulin which establishes and applies to the analysis of birth-death semigroups a first order intertwining. Similarly to the first order relation, the second order intertwining involves birth-death and Feynman-Kac semigroups and weighted gradients on N, and can be seen as a second derivative relation. As our main application, we provide new quantitative bounds on the Stein factors of discrete distributions. To illustrate the relevance of this approach, we also derive approximation results for the mixture of Poisson and geometric laws.

... These properties classify d s as an ideal probability metric in the sense of Zolotarev [48]. ...

The Luria--Delbr\"uck mutation model is a cornerstone of evolution theory and has been mathematically formulated in a number of ways. In this paper we illustrate how this model of mutation rates can be derived by means of classical statistical mechanics tools, in particular by modeling the phenomenon resorting to methodologies borrowed from classical kinetic theory of rarefied gases. The aim is to construct a linear kinetic model that can reproduce the Luria--Delbr\"uck distribution starting from the elementary interactions that qualitatively and quantitatively describe the variation of mutated cells. The kinetic description is easily adaptable to different situations and makes it possible to clearly identify the differences between the elementary variations leading to the formulations of Luria--Delbr\"uck, Lea--Coulson, and Kendall, respectively. The kinetic approach additionally emphasizes basic principles which not only help to unify existing results but also allow for useful extensions.

... Estimate ρ_0 by the CNF and samples Q = {q_j}_{j=1}^m from ρ_1 to determine if their corresponding distributions are close in some sense. To measure the discrepancy we use a particular integral probability metric [75, 51, 38] known as the maximum mean discrepancy (MMD), defined as follows [24]: Let x and y be random vectors in R^d with distributions μ_x and μ_y, respectively, and let H be a reproducing kernel Hilbert space of functions on R^d with Gaussian kernel (see [49] for an introduction) ...
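The MMD mentioned in this excerpt has a standard unbiased sample estimator; the sketch below illustrates it in the scalar case (the text works in R^d), with the Gaussian kernel bandwidth σ = 1 and the sample sizes chosen arbitrarily:

```python
import math
import random

def gaussian_kernel(x, y, sigma=1.0):
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def mmd2_unbiased(xs, ys, k=gaussian_kernel):
    """Unbiased estimator of the squared maximum mean discrepancy:
    off-diagonal within-sample kernel means minus twice the cross mean."""
    m, n = len(xs), len(ys)
    kxx = sum(k(a, b) for i, a in enumerate(xs)
              for j, b in enumerate(xs) if i != j)
    kyy = sum(k(a, b) for i, a in enumerate(ys)
              for j, b in enumerate(ys) if i != j)
    kxy = sum(k(a, b) for a in xs for b in ys)
    return kxx / (m * (m - 1)) + kyy / (n * (n - 1)) - 2 * kxy / (m * n)

random.seed(1)
m = 300
xs  = [random.gauss(0.0, 1.0) for _ in range(m)]
ys1 = [random.gauss(0.0, 1.0) for _ in range(m)]  # same law as xs
ys2 = [random.gauss(1.0, 1.0) for _ in range(m)]  # shifted law

mmd_same = mmd2_unbiased(xs, ys1)   # should be near 0
mmd_diff = mmd2_unbiased(xs, ys2)   # should be clearly positive
print(mmd_same, mmd_diff)
```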

A normalizing flow (NF) is a mapping that transforms a chosen probability distribution to a normal distribution. Such flows are a common technique used for data generation and density estimation in machine learning and data science. The density estimate obtained with a NF requires a change of variables formula that involves the computation of the Jacobian determinant of the NF transformation. In order to tractably compute this determinant, continuous normalizing flows (CNF) estimate the mapping and its Jacobian determinant using a neural ODE. Optimal transport (OT) theory has been successfully used to assist in finding CNFs by formulating them as OT problems with a soft penalty for enforcing the standard normal distribution as a target measure. A drawback of OT-based CNFs is the addition of a hyperparameter, $\alpha$, that controls the strength of the soft penalty and requires significant tuning. We present JKO-Flow, an algorithm to solve OT-based CNF without the need of tuning $\alpha$. This is achieved by integrating the OT CNF framework into a Wasserstein gradient flow framework, also known as the JKO scheme. Instead of tuning $\alpha$, we repeatedly solve the optimization problem for a fixed $\alpha$ effectively performing a JKO update with a time-step $\alpha$. Hence we obtain a "divide and conquer" algorithm by repeatedly solving simpler problems instead of solving a potentially harder problem with large $\alpha$.

... These properties classify d s as an ideal probability metric in the sense of Zolotarev [29]. ...

Inequality indices are quantitative scores that take values in the unit interval, with a zero score denoting complete equality. They were originally created to measure the heterogeneity of wealth metrics. In this study, we focus on a new inequality index based on the Fourier transform that demonstrates a number of intriguing characteristics and shows great potential for applications. By extension, it is demonstrated that other inequality measures, such as the Gini and Pietra indices, can be usefully stated in terms of the Fourier transform, allowing us to illuminate characteristics in a novel and straightforward manner.

... Each probability measure π from C(μ_1, μ_2) is also called a coupling measure, or coupling, of μ_1 and μ_2. More accurately, W_p with p ∈ (0, 1) should be called the Zolotarev distance (see [35]). Refer to [24, Chapter 7] and [6, Chapter 5] for further details on the L_p-Wasserstein distance. ...

The asymptotic behavior of empirical measures has been studied extensively. In this paper, we consider empirical measures of given subordinated processes on complete (not necessarily compact) and connected Riemannian manifolds with possibly nonempty boundary. We obtain rates of convergence for empirical measures to the invariant measure of the subordinated process under the Wasserstein distance. The results, established for more general subordinated processes than (arXiv:2107.11568), generalize the recent ones in Wang (Stoch Process Appl 144:271–287, 2022) and are shown to be sharp by a typical example. The proof is motivated by the aforementioned works.

... Indeed, much of the theory of linear programming including the simplex algorithm has been motivated by findings for OT, with early contributions by Hitchcock (1941), Kantorovich (1942), Dantzig (1948) and Koopmans (1951). Since then a surprisingly rich theory has emerged, with important contributions by Kantorovich & Rubinstein (1958), Zolotarev (1976), Sudakov (1979), Kellerer (1984), Rachev (1985), Brenier (1987), Smith & Knott (1987), McCann (1997), Jordan et al. (1998), Ambrosio et al. (2008) and Lott & Villani (2009), among many others. We also refer to the excellent monographs by Rachev & Rüschendorf (1998), Villani (2008) and Santambrogio (2015) for further details. ...

We consider a general linear program in standard form whose right-hand side constraint vector is subject to random perturbations. For the corresponding random linear program, we characterize under general assumptions the random fluctuations of the empirical optimal solutions around their population quantities after standardization by a distributional limit theorem. Our approach is geometric in nature and further relies on duality and the collection of dual feasible basic solutions. The limiting random variables are driven by the amount of degeneracy inherent in linear programming. In particular, if the corresponding dual linear program is degenerate the asymptotic limit law might not be unique and is determined from the way the empirical optimal solution is chosen. Furthermore, we include consistency and convergence rates of the Hausdorff distance between the empirical and the true optimality sets as well as a limit law for the empirical optimal value involving the set of all dual optimal basic solutions. Our analysis is motivated from statistical optimal transport that is of particular interest here and distributional limit laws for empirical optimal transport plans follow by a simple application of our general theory. The corresponding limit distribution is usually non-Gaussian which stands in strong contrast to recent findings for empirical entropy regularized optimal transport solutions.

... psychology (p < 1) [27, 36], the Zolotarev distance in spaces of random variables [40], and the d_ε-distance in machine learning [13]. The terminology has not yet stabilized within the many generalizations of metric spaces, and therefore let us determine which one we will work with here. ...

We construct a family of quasimetric spaces in generalized potential theory containing m-subharmonic functions with finite (p, m)-energy. These quasimetric spaces will be viewed both in $\mathbb{C}^n$ and in compact Kähler manifolds, and their convergence will be used to improve known stability results for the complex Hessian equations.

... The expressions of the non-central moments, as well as the cumulative distribution function (CDF) F(x), survival function S(x), and hazard function h(x), are also determined under different available information on moments; see [26]. The rest of the article is organized as follows: Section 2 presents minimum chi-squared divergence probability distributions. Section 3 presents the exponential distribution and the available information on its moments. ...

In this paper, new characterization theorems for the exponential distribution based on the minimum chi-squared divergence principle are presented. We study minimum chi-squared divergence probability distributions obtained from this principle given a prior exponential distribution and the available information on moments. Some illustrative examples are included for special values of the parameters. We tabulate the results, and the corresponding characteristics of the distributions are compared graphically.

... where A is a constant depending on the distribution of X_1. Zolotarev provided in [32], Theorem 1, a lower estimate for A. If X_1 = ξ_1, that is, for Rademacher variables, one has A(ξ_1) ≥ 1/2. This implies that for g(x) = x and f ≡ 0 we have ...

... In particular, this defines a distance on the space of probability measures on X (see e.g., Villani [53,Chapter 6]; or Zolotarev [58] for an early reference). To define barycenters with respect to this distance we need to choose an ambient space Y containing X , since any reasonable choice of such a barycenter should be allowed to have mass at positions that may differ from the support points of the N data measures. ...

We propose a hybrid resampling method to approximate finitely supported Wasserstein barycenters on large-scale datasets, which can be combined with any exact solver. Nonasymptotic bounds on the expected error of the objective value, as well as of the barycenters themselves, allow one to calibrate computational cost against statistical accuracy. The rate of these upper bounds is shown to be optimal and independent of the underlying dimension, which appears only in the constants. Using a simple modification of the subgradient descent algorithm of Cuturi and Doucet, we showcase the applicability of our method on a myriad of simulated datasets, as well as a real-data example, which are out of reach for state-of-the-art algorithms for computing Wasserstein barycenters.
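As an aside, on the real line the Wasserstein-2 barycenter of equal-size empirical measures has a closed form: average the quantile functions, i.e. the sorted samples. The paper's resampling scheme targets the much harder general finitely supported case; the sketch below covers only this 1-D special case, with toy data of our own choosing:

```python
def barycenter_1d(samples, weights=None):
    """W2 barycenter of equal-size 1-D empirical measures: the weighted
    average of their quantile functions, i.e. of the sorted samples."""
    k = len(samples)
    n = len(samples[0])
    assert all(len(s) == n for s in samples)
    weights = weights or [1.0 / k] * k
    sorted_samples = [sorted(s) for s in samples]
    return [sum(w * s[i] for w, s in zip(weights, sorted_samples))
            for i in range(n)]

xs = [0.0, 2.0, 4.0]
ys = [2.0, 4.0, 6.0]
print(barycenter_1d([xs, ys]))  # [1.0, 3.0, 5.0]
```

Each support point of the barycenter is the average of the corresponding order statistics, which is why the result sits "halfway" between the two inputs in the W2 geometry.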

... The Zolotarev distance is also defined for α > 1, but we do not consider it here (cf. [22,23]). ...

We explore upper bounds on Kantorovich transport distances between probability measures on the Euclidean spaces in terms of their Fourier-Stieltjes transforms, with focus on non-Euclidean metrics. The results are illustrated on empirical measures in the optimal matching problem on the real line.

... The relationship between elevation and landslides reported in the literature is that areas at relatively higher elevations are more susceptible to landslides (Gökçeoğlu and Ercanoğlu, 2001). According to the study by Koukis and Ziourkas (1991), the two main reasons for this susceptibility are that high areas, particularly in mountainous regions, receive more precipitation, and that on sections steeper than the valleys the horizontal component of seismic acceleration acts 1.2 to 1.5 times more strongly (Zolotarev, 1976; Nagarajan et al., 2000; Görüm, 2006). As a preparatory factor, the slope parameter has been addressed in many studies (e.g., Gökçeoğlu and Ercanoğlu, 2001; Nefeslioğlu et al., 2011; Görüm, 2017; Görüm, 2019). ...

... Let us define the global distance [28] between the estimates {X_r(s)} and {X_r} as ...

The paper is devoted to the optimal state filtering of the finite-state Markov jump processes, given indirect continuous-time observations corrupted by Wiener noise. The crucial feature is that the observation noise intensity is a function of the estimated state, which breaks forthright filtering approaches based on the passage to the innovation process and Girsanov’s measure change. We propose an equivalent observation transform, which allows usage of the classical nonlinear filtering framework. We obtain the optimal estimate as a solution to the discrete–continuous stochastic differential system with both continuous and counting processes on the right-hand side. For effective computer realization, we present a new class of numerical algorithms based on the exact solution to the optimal filtering given the time-discretized observation. The proposed estimate approximations are stable, i.e., have non-negative components and satisfy the normalization condition. We prove the assertions characterizing the approximation accuracy depending on the observation system parameters, time discretization step, the maximal number of allowed state transitions, and the applied scheme of numerical integration.

... where the infimum is taken over all collections of {X_i, i ≤ n} and {π_i(·), i ≤ n} defined on a common probability space. It is worth noting that this distance belongs to the class of so-called minimal distances between distributions of B-valued stochastic processes with paths in the above-mentioned Banach space (e.g., see Zolotarev, 1976). Along with (3), we also need the total variation distance between the distributions L(Y_i) of arbitrary B-valued random variables Y_i, i = 1, 2: ...

We investigate the approximation of a Bernoulli partial-sum process by the accompanying Poisson process in the non-i.i.d. case. The rate of closeness is studied in terms of the minimal distance in probability.

... The effectiveness of those approximation methods is demonstrated with numerical examples, in which the methods are compared with another simple method of constructing discrete distributions based on random sampling. Besides the metrics treated in this paper, there are various others, such as the Lévy distance [9] ...

This paper studies approximation of probabilistic distributions behind discrete-time linear systems with stochastic dynamics for advances in control of the systems. We discuss two approximation methods leading to optimal discrete distributions whose errors from the probabilistic distribution behind the system are minimal in the sense of the L∞ and L1 norms of the differences between associated cumulative distribution functions. The effectiveness of those approximation methods is demonstrated with numerical examples, in which the methods are compared with another simple method of constructing discrete distributions based on random sampling.
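For a continuous CDF F, the L∞ (Kolmogorov) criterion discussed in this abstract admits a well-known closed-form optimum: an n-point distribution placing mass 1/n at the quantiles F⁻¹((2i − 1)/(2n)) achieves sup-error exactly 1/(2n). A sketch for the Exp(1) distribution (the target law here is our choice of example, not the paper's):

```python
import math

n = 10
# Exp(1): F(x) = 1 - exp(-x), with inverse F^{-1}(u) = -log(1 - u)
atoms = [-math.log(1 - (2 * i - 1) / (2 * n)) for i in range(1, n + 1)]

def F(x):
    return 1 - math.exp(-x)

# The step CDF jumps from (i-1)/n to i/n at atom i; since F(atom_i) = (2i-1)/(2n)
# by construction, the sup of |F - F_n| is attained at the atoms and equals 1/(2n).
sup_err = max(
    max(F(x) - (i - 1) / n, i / n - F(x))
    for i, x in enumerate(atoms, start=1)
)
print(sup_err)  # ~ 0.05 = 1/(2n)
```

Between consecutive atoms F rises by exactly 1/n while the step function is constant, so no point off the atom set can do worse than 1/(2n), which is what makes this placement optimal.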

... Thus, let us assume that k ≥ 4. Applying Zolotarev's theorem (cf. [3]) on qualitative stability, one can show that our problem enjoys qualitative stability, i.e., as ε → 0, φ(x) → (a/k!)e^{−x}. It follows from this that c_k → a/k!, and c_m → 0 for m = 1, ..., k − 1. Hence there exists an ε₀(k, a) such that for ...

A series of papers is devoted to the investigation of the stability of characterizations of the exponential distribution. A characterization of the exponential distribution of the type of persistence in the mean of the k-th order was obtained by O. M. Sakhobov and A. A. Geshev, and we study its stability.

... Note that under the conditions of our theorem the criterion of qualitative stability by V. M. Zolotarev [6] applies, where R^(1) is a regularization of R in both t_1 and t_2. For ... we have ...

The abstract appears on the first page of the scanned version of our article; it is a review of Donald R. Jensen. The review is also included as an uploaded figure of this article.

... We will use the words "metric" and "distance" for mappings M × M → [0, ∞] in a loose sense. Since all our results concern concrete metrics, there is no need to give a general definition (as, e.g., Definition 1 in Zolotarev [32]). For the sake of completeness, we include a proof that W ∞ satisfies the classical properties of a metric. ...

Strassen's theorem asserts that a stochastic process is increasing in convex order if and only if there is a martingale with the same one-dimensional marginal distributions. Such processes, respectively families of measures, are nowadays known as peacocks. We extend this classical result in a novel direction, relaxing the requirement on the martingale. Instead of equal marginal laws, we just require them to be within closed balls, defined by some metric on the space of probability measures. In our main result, the metric is the infinity Wasserstein distance. Existence of a peacock within a prescribed distance is reduced to a countable collection of rather explicit conditions. We also solve this problem when the underlying metric is the stop-loss distance, the Lévy distance, or the Prokhorov distance. This result has a financial application (developed in a separate paper), as it allows one to check European call option quotes for consistency. The distance bound on the peacock then takes the role of a bound on the bid-ask spread of the underlying.
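For distributions on the real line, the infinity Wasserstein distance admits the quantile representation W∞(μ, ν) = sup_{u∈(0,1)} |F_μ^{-1}(u) − F_ν^{-1}(u)|. A minimal sketch for empirical measures (the function name is illustrative; it assumes equally many, equally weighted atoms):

```python
import numpy as np

def w_infinity(x, y):
    # Infinity Wasserstein distance between two empirical measures with
    # equally many, equally weighted atoms: on the real line the monotone
    # (quantile) coupling is optimal, so the distance is the largest gap
    # between sorted atoms.
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if len(x) != len(y):
        raise ValueError("this sketch assumes equally many atoms")
    return float(np.max(np.abs(x - y)))
```

For example, the uniform measures on {0, 1} and on {0, 3} are at W∞-distance 2, determined by the single pair of matched quantiles that are farthest apart.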

... of Borel subsets of the Euclidean space R^k, and let D_k be the set of infinitely divisible distributions from it. For F, G ∈ D_k, ε > 0, we set ..., where {y : ||y − x|| < ε} is used for the ε-neighborhood of the set X (see [3, 4]). We also need the distance in variation Var(F, G) = sup_X |F{X} − G{X}|. The inequalities (4), (8) refine the analogous estimates obtained in [5]. ...

Fix a container polygon $P$ in the plane and consider the convex hull $P_n$ of $n\geq 3$ independent random points uniformly distributed in $P$. The focus of this paper is the vertex number of the random polygon $P_n$. The precise variance expansion for the vertex number is determined up to the constant-order term, a result which can be considered as a second-order analogue of the classical expansion for the expectation of Rényi and Sulanke (1963). Moreover, a sharp Berry-Esseen bound is derived for the vertex number of the random polygon $P_n$, which is of the same order as the square root of the variance. The main idea behind the proof of both results is a decomposition of the boundary of the random polygon $P_n$ into random convex chains and a careful merging of the variance expansions and Berry-Esseen bounds for the vertex numbers of the individual chains.

We study the problem of approximately recovering a probability distribution on the unit interval from its first k moments. As the main result we obtain an upper bound on the L1 reconstruction error under the regularity assumption that the log-density function has square-integrable derivatives up to some natural order r > 1. Our bound is of order O(k^{−r}). A comparative study relates our findings to alternative conditions on the distributions.

During the last several decades, results on geometric random sums of independent identically distributed random variables have attracted interest in probability theory as well as in insurance, risk theory, queuing theory, stochastic finance, etc. Negative–binomial random sums are extensions of geometric random sums. Up to the present, negative–binomial random sums of independent identically distributed random variables have received much attention since they appear in many fields. However, for strictly stationary m-dependent summands, such negative–binomial random sums are rarely studied. It seems that the complexity of the dependence structure has limited the use of classical tools based on independent random variables, such as the characteristic function. Up to now, a number of methods have been used to overcome the dependency, such as the truncation method, the Stein method, and Heinrich's method. The paper deals with the order of approximation in weak limit theorems for normalized negative–binomial sums of strictly stationary m-dependent random variables generated by a sequence of independent, identically distributed random variables, using moving average techniques. The orders of approximation in the desired theorems are established in terms of the so-called Zolotarev ζ-metric. The obtained results extend and generalize several known results on negative–binomial random sums and geometric random sums of independent, identically distributed random variables.

This thesis contributes to the mathematical foundation of domain adaptation as emerging field in machine learning. In contrast to classical statistical learning, the framework of domain adaptation takes into account deviations between probability distributions in the training and application setting. Domain adaptation applies for a wider range of applications as future samples often follow a distribution that differs from the ones of the training samples. A decisive point is the generality of the assumptions about the similarity of the distributions. Therefore, in this thesis we study domain adaptation problems under as weak similarity assumptions as can be modelled by finitely many moments.

In this paper, we consider a strictly stationary sequence of m-dependent random variables constructed from a compatible sequence of independent and identically distributed random variables via moving average processes. Using the Zolotarev distance, we estimate rates of convergence in weak limit theorems for normalized geometric random sums of the strictly stationary sequence of m-dependent random variables. The obtained results extend and generalize several known results on geometric random sums of independent and identically distributed random variables.

In the present paper we consider the application of the method of multiple transformations to problems of the stability of characterization by the property of identical distribution and by the property of constant regression, and we revisit questions related to estimating the stability of the characterization of the normal law by the property of identical distribution of the monomial and of the statistics.

We present quantitative stability estimates for the problem of characterizing the normal law by the property of independence of the sample mean and of a certain tubular statistic.

The Wasserstein probability metric has received much attention from the machine learning community. Unlike the Kullback-Leibler divergence, which strictly measures change in probability, the Wasserstein metric reflects the underlying geometry between outcomes. The value of being sensitive to this geometry has been demonstrated, among others, in ordinal regression and generative modelling. In this paper we describe three natural properties of probability divergences that reflect requirements from machine learning: sum invariance, scale sensitivity, and unbiased sample gradients. The Wasserstein metric possesses the first two properties but, unlike the Kullback-Leibler divergence, does not possess the third. We provide empirical evidence suggesting that this is a serious issue in practice. Leveraging insights from probabilistic forecasting we propose an alternative to the Wasserstein metric, the Cramér distance. We show that the Cramér distance possesses all three desired properties, combining the best of the Wasserstein and Kullback-Leibler divergences. To illustrate the relevance of the Cramér distance in practice we design a new algorithm, the Cramér Generative Adversarial Network (GAN), and show that it performs significantly better than the related Wasserstein GAN.
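For distributions on the real line, both distances compared in this abstract reduce to norms of the difference of cumulative distribution functions: the 1-Wasserstein distance is the L1 norm and the Cramér distance the L2 norm of F_P − F_Q. A minimal grid-based sketch for empirical distributions (function names are illustrative):

```python
import numpy as np

def empirical_cdf(samples, grid):
    # Fraction of samples <= each grid point.
    s = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(s, grid, side="right") / len(s)

def wasserstein_1(x, y, grid):
    # 1-Wasserstein distance: L1 norm of the CDF difference
    # (left-rectangle approximation of the integral over the grid).
    diff = np.abs(empirical_cdf(x, grid) - empirical_cdf(y, grid))
    return float(np.sum(diff[:-1] * np.diff(grid)))

def cramer_distance(x, y, grid):
    # Cramer distance: L2 norm of the CDF difference.
    diff2 = (empirical_cdf(x, grid) - empirical_cdf(y, grid)) ** 2
    return float(np.sqrt(np.sum(diff2[:-1] * np.diff(grid))))

grid = np.linspace(-1.0, 3.0, 4001)
# Unit point masses at 0 and at 1: both distances equal 1.
w1 = wasserstein_1([0.0], [1.0], grid)
c = cramer_distance([0.0], [1.0], grid)
```

Scale sensitivity is visible directly in this form: moving the second point mass from 1 to 2 doubles the 1-Wasserstein distance, whereas a geometry-blind divergence such as Kullback-Leibler would not register the change in distance between the outcomes.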

In this book the authors consider so-called ill-posed problems and stability in statistics. Ill-posed problems are results where arbitrarily small changes in the assumptions lead to unpredictably large changes in the conclusions. In a companion volume published by Nova, the authors explain that ill-posed problems are not a mere curiosity in the field of contemporary probability. The same situation holds in statistics. The objective of the authors of this book is to (1) identify statistical problems of this type, (2) find their stable variants, and (3) propose alternative versions of numerous theorems in mathematical statistics. The layout of the book is as follows. The authors begin by reviewing the central pre-limit theorem, providing a careful definition and characterization of the limiting distributions. Then they consider the pre-limiting behavior of extreme order statistics and the connection of this theory to survival analysis. A study of statistical applications of the pre-limit theorems follows. Based on these theorems, the authors develop a correct version of the theory of statistical estimation and show its connection with the problem of choosing an appropriate loss function. As it turns out, a loss function should not be chosen arbitrarily. As they explain, the availability of certain mathematical conveniences (including the correctness of the formulation of the estimation problem) leads to rigid restrictions on the choice of the loss function. Questions about the correctness or incorrectness of certain statistical problems may be resolved through an appropriate choice of the loss function and/or metric on the space of random variables and their characteristics (including distribution functions, characteristic functions, and densities). Some auxiliary results from the theory of generalized functions are provided in an appendix.

In this book we study important new classes of probability metrics. These metrics are convenient, as they allow for embedding a set of probability measures into Hilbert or smooth Banach spaces. In addition, these metrics aid in obtaining relatively simple solutions to the uniqueness problem in the recovery of a measure from a potential. This in turn permits obtaining new statistical criteria for verification of nonparametric hypotheses, as well as finding new characterizations of probability distributions and investigating their stability. Subclasses of these metrics are essential in the derivation of a new point of view on robust estimation methods.
Based on these distances, we give some constructions of multivariate distribution-free two-sample tests. Such constructions allow further generalizations that provide statistical test procedures for testing multivariate normality, multivariate stability, and some other hypotheses.
This book has been written on the basis of my lectures at the Department of Probability and Statistics of Charles University in 2002-2005, and I thank all faculty members of the department for providing very good working conditions. This work was also partially supported by MSM grant 002160839 and by the Academy of Sciences of the Czech Republic.

The subject of this chapter is the application of the theory of probability metrics to limit theorems arising from summing independent and identically distributed (i.i.d.) random variables (RVs).

In this chapter, we discuss the problem of estimating the rate of convergence in limit theorems arising from the maxima scheme of independent and identically distributed (i.i.d.) random elements.

In the middle of the 1940s it seemed that the topic of limit theorems of classical type (that is, problems of the limiting behaviour of distributions of sums of a large number of terms that are either independent or connected in Markov chains) was basically exhausted. The monograph written by me together with B.V. Gnedenko [1] was intended to summarize the results of the previous years. In reality, however, starting from the late 1940s more papers in these classical fields appeared. This can be explained by several circumstances. First, it became clear that from a practical viewpoint the accuracy of the remainder terms obtained so far was far from sufficient. Secondly, certain problems that had been solved earlier only under complicated and very restrictive conditions unexpectedly received very simple complete solutions (in the sense of necessary and sufficient conditions). These include, for example, the problem of “localization” of limit theorems, which turned out to have an exhaustive solution both for the case of identically distributed independent summands and for the case of the distribution of the number of separate states visited in a homogeneous Markov chain.

A metric-topological approach to stability problems of characterization of distributions is proposed. General conditions of stability are formulated and their applications are illustrated on certain well-known problems of characterization.

First an integral representation of a continuous linear functional dominated by a support function in integral form is given (Theorem 1). From this the theorem of Blackwell-Stein-Sherman-Cartier [2], [20], [4] is deduced, as well as a result on capacities alternating of order 2 in the sense of Choquet [5], which includes Satz 4.3 of [23] and a result of Kellerer [10], [12] under somewhat stronger assumptions. Next (Theorem 7), the existence of probability distributions with given marginals is studied under typically weaker assumptions than those required by the use of Theorem 1. As applications we derive necessary and sufficient conditions for a sequence of probability measures to be the sequence of distributions of a martingale (Theorem 8), an upper semi-martingale (Theorem 9), or of partial sums of independent random variables (Theorem 10). Moreover, an alternative definition of the Lévy–Prokhorov distance between probability measures in a complete separable metric space is obtained (corollary of Theorem 11). Section 6 can be read independently of the former sections.