About
224 Publications
48,937 Reads
6,561 Citations
Introduction
Current institution
Additional affiliations
July 2019 - present
August 1988 - June 2019
January 1970 - December 2002
Publications (224)
In general, the clustering problem is NP-hard, and global optimality cannot be established for non-trivial instances. For high-dimensional data, distance-based methods for clustering or classification face an additional difficulty, the unreliability of distances in very high-dimensional spaces. We propose a probabilistic, distance-based, iterative...
In general, the clustering problem is NP-hard, and global optimality cannot be established for non-trivial instances. For high-dimensional data, distance-based methods for clustering or classification face an additional difficulty, the unreliability of distances in very high-dimensional spaces. We propose a distance-based iterative method for clust...
An insurance model, with realistic assumptions about coverage, deductible and premium, is studied. Insurance is shown to decrease the variance of the cost to the insured, but increase the expected cost, a tradeoff that places our model in the Markowitz mean-variance model.
The Cauchy distribution
$$\mathfrak {C}(a,b)(x)=\frac{1}{\pi b(1+(\frac{x-a}{b})^2)},\quad -\infty < x <\infty,$$
with a,b real, b>0, has no moments (expected value, variance, etc.), because the defining integrals diverge. An obvious way to “concentrate” the Cauchy distribution, in order to get finite moments, is by truncation, restricting it to...
The probabilistic distance clustering method (called the PDQ method) of the authors [2, 8] computes the cluster membership probabilities using the distances of the data points from the cluster centers, and the cluster sizes. The method is based on the joint distance function (JDF), a weighted harmonic mean of the above distances, that approximates the da...
Given a function u : R → R, the inverse Newton transform of u, denoted \(N^{-1}u\), is the function \(f(x) = \exp\left\{\int \frac{dx}{x - u(x)}\right\}\), wherever existing. The iterations x := u(x) coincide with the Newton iterates for \(N^{-1}u\), and in this sense every iteration (for which \(N^{-1}u\) exists) is Newton. The correspondence u ←→ f = \(N^{-1}u\) may be useful since the zeros of...
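A small worked example may make the correspondence concrete. The choice u(x) = x/2 on x > 0 is an illustrative assumption, not taken from the abstract, and the integration constant is dropped:

$$N^{-1}u(x) = \exp\left\{\int \frac{dx}{x - x/2}\right\} = \exp\{2\ln x\} = x^2, \qquad x - \frac{f(x)}{f'(x)} = x - \frac{x^2}{2x} = \frac{x}{2} = u(x).$$

So halving x at each step is exactly the Newton iteration for f(x) = x², whose only zero is the limit 0 of the iterates.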
Let f be a convex function bounded below with infimum \(f_{\min}\) attained. A bracket is an interval [L, U] containing \(f_{\min}\). The Newton Bracketing (NB) method for minimizing f, introduced in [Levin and Ben-Israel, Comput. Optimiz. Appl. 21, 213–229 (2002)], is an iterative method that at each iteration transforms a bracket [L, U] into a strictly smaller...
An iterative method is proposed for the K facilities location problem. The problem is relaxed using probabilistic assignments, depending on the distances to the facilities. The probabilities, that decompose the problem into K single-facility location problems, are updated at each iteration together with the facility locations. The proposed method i...
Given a dataset D partitioned in clusters, the joint distance function (JDF) J(x) at any point x is the harmonic mean of the distances between x and the cluster centers. The JDF is a continuous function, capturing the data points in its lower level sets (a property called contour approximation), and is a useful concept in probabilistic clustering a...
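The definition above can be sketched in a few lines. This is a minimal illustration that takes the abstract's wording literally (a plain harmonic mean of Euclidean distances); any weights or scaling constants used in the paper are not reproduced, and the function name `joint_distance` and the sample centers are hypothetical.

```python
import math

def joint_distance(point, centers):
    """Harmonic mean of the Euclidean distances from `point` to each center.

    Assumption: unweighted harmonic mean, as the abstract's wording suggests.
    """
    distances = [math.dist(point, c) for c in centers]
    # The harmonic mean tends to 0 as any distance tends to 0,
    # so the function is defined as 0 at the centers themselves.
    if any(d == 0.0 for d in distances):
        return 0.0
    return len(distances) / sum(1.0 / d for d in distances)

centers = [(0.0, 0.0), (4.0, 0.0)]               # hypothetical cluster centers
jdf_at_center = joint_distance((0.0, 0.0), centers)
jdf_midpoint = joint_distance((2.0, 0.0), centers)
```

The value is zero at every center and grows away from them, which is consistent with the lower level sets gathering around the clusters.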
First-order optimality conditions for convex programming are developed using a feasible directions approach. Numerical implementations and applications are discussed. The concepts of constancy directions and minimal index set of binding constraints, central to our theory, prove useful also in studying the stability of perturbed convex programs.
The probabilistic distance clustering method of [1] works well if the cluster sizes are approximately equal. We modify that method to deal with clusters of arbitrary size and for problems where the cluster sizes are themselves unknowns that need to be estimated. In the latter case, our method is a viable alternative to the expectation-maximization...
Consider a problem of minimizing a separable, strictly convex, monotone and differentiable function on a convex polyhedron generated by a system of m linear inequalities. The problem has a series-parallel structure, with the variables divided serially into n disjoint subsets, whose elements are considered in parallel. This special structure is expl...
The Newton Bracketing method [9] for the minimization of convex functions \(f : \mathbb{R}^n \to \mathbb{R}\) is extended to affinely constrained convex minimization problems. The results are illustrated for affinely constrained Fermat–Weber location problems.
Intensity-modulated radiation therapy (IMRT) gives rise to systems of linear inequalities, representing the effects of radiation on the irradiated body. These systems are often infeasible, in which case one settles for an approximate solution, such as an {α, β}-relaxation, meaning that no more than α percent of the inequalities are violated by no m...
We present a new iterative method for probabilistic clustering of data. Given clusters, their centers and the distances of data points from these centers, the probability of cluster membership at any point is assumed inversely proportional to the distance from (the center of) the cluster in question. This assumption is our working principle.
The me...
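The working principle stated above can be illustrated with a short sketch. It covers only the membership-probability step for fixed centers (the center updates of the full iterative method are elided in the abstract and are not reconstructed here); the function name and the sample data are hypothetical.

```python
import math

def membership_probabilities(point, centers):
    """Cluster membership probabilities inversely proportional to distance."""
    distances = [math.dist(point, c) for c in centers]
    # Assumption: a point lying exactly on a center belongs to that cluster
    # with probability 1 (split evenly if several centers coincide).
    on_center = [d == 0.0 for d in distances]
    if any(on_center):
        count = sum(on_center)
        return [1.0 / count if hit else 0.0 for hit in on_center]
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]

# A point at distance 1 from the first center and 3 from the second.
probs = membership_probabilities((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

With distances 1 and 3 the weights 1 and 1/3 normalize to probabilities 0.75 and 0.25, so the nearer cluster is three times as likely.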
Semi-supervised clustering is an attempt to reconcile clustering (unsupervised learning) and classification (supervised learning, using prior information on the data). These two modes of data analysis are combined in a parameterized model; the parameter θ ∈ [0, 1] is the weight attributed to the prior information, θ = 0 corresponding to clu...
We study the geometry of datasets, using an extension of the Fisher linear discriminant to the case of singular covariance, and a new regularization procedure. A dataset is called linearly separable if its different clusters can be reliably separated by a linear hyperplane. We propose a measure of linear separability, easily computed as an angle th...
Purpose: IMRT has been widely adopted to create conformal dose distributions. This technology is particularly useful in situations where critical structures push against the target or targets to create a concavity in the PTVs. It is difficult to develop a set of dose constraints that will work in all cases, and current IMRT inverse planning has beco...
A heuristic method for solving large-scale multi-facility location problems is presented. The method is analogous to Cooper's method (SIAM Rev. 6 (1964) 37), using the authors' single facility location method (Comput. Optim. Appl. 21 (2002) 213) as a parallel subroutine, and reassigning customers to facilities using the heuristic of nearest center...
A directional Newton method is proposed for solving systems of m equations in n unknowns. The method does not use the inverse, or generalized inverse, of the Jacobian, and applies to systems of arbitrary m, n. Quadratic convergence is established under typical assumptions (first derivative "not too small", second derivative "not too large"). The m...
A summary and restatement, in plain English and modern notation, of the results of E.H. Moore on the generalized inverse that bears his name.
An iterative method for the minimization of convex functions \(f : \mathbb{R}^n \to \mathbb{R}\), called a Newton Bracketing (NB) method, is presented. The NB method proceeds by using Newton iterations to improve upper and lower bounds on the minimum value. The NB method is valid for n = 1, and in some cases for n > 1 (sufficient conditions given here). The NB method is...
Consider m functions \(f_i(x_1,\dots,x_n)\), the system of equations \(f_i = 0\), i = 1,…,m, and the Newton iterations for this system that use the Moore–Penrose inverse of the Jacobian matrix. Under standard assumptions, the Newton iterations converge quadratically to a stationary point of the sum-of-squares \(\sum f_i^2\). Approximating derivatives ẋ as differences Δx/Δt with Δ...
Directional Newton methods for functions f of n variables are shown to converge, under standard assumptions, to a solution of f(x) = 0. The rate of convergence is quadratic, for near-gradient directions, and directions along components of the gradient of f with maximal modulus. These methods are applied to solving systems of equations without rever...
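A minimal sketch of one such iteration, for the gradient direction only: the step is x := x − f(x) ∇f(x) / ‖∇f(x)‖². The test function, starting point, tolerances and the name `directional_newton` are illustrative assumptions, not taken from the paper.

```python
def directional_newton(f, grad, x, tol=1e-12, max_iter=50):
    """Newton step along the gradient direction for a single equation f(x) = 0."""
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        g = grad(x)
        # Directional derivative of f along d = grad f is ||grad f||^2.
        denom = sum(gi * gi for gi in g)
        if denom == 0.0:
            raise ZeroDivisionError("gradient vanished; the step is undefined")
        x = [xi - fx * gi / denom for xi, gi in zip(x, g)]
    return x

# Hypothetical example: find a point on the unit circle x^2 + y^2 = 1.
f = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0
grad = lambda x: [2.0 * x[0], 2.0 * x[1]]
root = directional_newton(f, grad, [2.0, 1.0])
```

For this particular f the iterates stay on the ray through the starting point, so the method reduces to a scalar Newton iteration on the radius.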
Symbolic computation for solving mathematical problems is experiencing a new upswing through the use of software systems developed specifically for this purpose. After the success of Mathematica and Maple, MACSYMA is now conquering the market. MACSYMA has been developed at MIT since the late 1960s and has been improved and extended over the years. It offers...
Two functions f and g are tangent at a point x0 if their graphs almost coincide, near x0, in the sense of Definition 4.1 below. A function f can have, at a given point x0, at most one tangent which is a linear function, say l(x) = f(x0) + m(x − x0), in which case l is called the tangent line, or simply the tangent, of f at x0, the slope m of l i...
The most basic concept in calculus is the limit. It is used in the study of continuity, derivatives, integrals, and all other important topics in calculus. Indeed, one cannot use calculus intelligently without understanding limits.
This chapter is devoted to the study of convergence of sequences (a0, a1, a2, ...) and series \(\sum_{n=0}^{\infty} a_n\) of real numbers \(a_n\). We use the notation N0 := {0, 1, 2, 3, ...} and N := {1, 2, 3, ...} throughout.
The derivative of a function f at a point ξ
$$f'\left( \xi \right) = \mathop {\lim }\limits_{\Delta x \to 0} {\rm{ }}{{f\left( {\xi + \Delta x} \right) - f\left( \xi \right)} \over {\Delta x}},$$
is the slope of the line tangent to the graph of f at the point P = (ξ ,f (ξ)). Restricting to Δx > 0 we see that f′(ξ) is the limit of the slopes of seca...
The following example is an illustration of the problems studied in this chapter.
Example 6.1. You are the president of the XYZ Widget Company, which is in the business of producing and selling widgets. You must decide how many widgets, say x, to produce in the coming season. The information available to you is:
the company can produce no more than...
The concept of function is central to mathematics. A function is a rule assigning values to certain objects. If a function is called f, the value it assigns to x is denoted by f (x). A well-defined function f assigns to each such x a single value f (x). However, several objects x1, x2,. . . , xn
may get the same value f (x1) = f (x2) = . . . =f(xn...
We study methods for computing antiderivatives, methods known collectively as integration techniques. Three such methods are covered here:
the change of variables (or substitution) method (Sect. 10.1),
integration by parts (Sect. 10.2),
the partial fractions expansion method (Sect. 10.3).
The first two are more than techniques. They are an essentia...
What do area, length, volume, work, and hydrostatic force have in common? All of these (and many other important concepts in science and engineering) can be modelled as Riemann sums (8.6)
$$\sum\limits_{k = 1}^n {f\left( {{\xi _k}} \right)} {\rm{ }}\Delta {x_k},$$
and computed as integrals (8.28),
$$\int\limits_{a}^{b} {f(x)dx: = \mathop{{\lim }}\l...
If f is differentiable, its derivative f′ can be computed using the limit (4.11),
$$f'\left( x \right) = \mathop {\lim }\limits_{\xi \to x} {\rm{ }}{{f\left( \xi \right) - f\left( x \right)} \over {\xi - x}},$$
which is often difficult. However, sometimes f has a special structure that allows differentiating it without evaluating the limit (4.11)....
Chapter 12 dealt with infinite series of numbers and criteria for their convergence. In the same way one can study infinite series of functions
$$\sum\limits_{k = 0}^\infty {{u_k}\left( x \right)} $$
where the functions uk (x) are all defined on a common interval.
In this chapter we study functions of the form
$$F\left( x \right) = \int\limits_a^x {f\left( t \right)} {\rm{ d}}t$$
called indefinite integrals of f. If f is continuous, then F is an antiderivative of f, see Theorem 9.13. Indefinite integrals allow an easy computation of definite integrals as follows,
$$\int\limits_a^b {f\left( x \right)} {\rm{...
Calculus is based on the concept of limit and on two limiting operations:
integration, computing integrals which are limits of appropriate sums;
differentiation, computing derivatives which are limits of appropriate differences.
This chapter is a brief introduction to the exponential, trigonometric, and hyperbolic functions, and their inverses. These functions, together with the polynomial and rational functions of Chap. 1, are used throughout calculus.
This is the third supplementary volume to Kluwer's highly acclaimed twelve-volume Encyclopaedia of Mathematics. This additional volume contains nearly 500 new entries written by experts and covers developments and topics not included in the previous volumes. These entries are arranged alphabetically throughout and a detailed index is included. This...
The problem is to predict a value y ∈ Y (output, class) from an observed value of a vector x ∈ X (predictors, inputs, attributes), the relations between y and x given in (empirical) data D = {(xi, yi) : i = 1,...,N}, listing N observed pairs. We propose an estimation algorithm using a classification of D in clusters {1,...,m}, based on a distance fu...
The classical Newton–Kantorovich method for solving systems of equations f (x) = 0 uses the inverse of the Jacobian of f at each iteration. If the number of equations is different than the number of variables, or if the Jacobian cannot be assumed nonsingular, a generalized inverse of the Jacobian can be used in a Newton method whose limit points ar...
A heuristic method for solving large-scale location-allocation problems is presented. The method uses the authors' single facility location method [19] as a parallel subroutine, and updates the assignments of customers to facilities using the heuristic of Nearest Center Reclassification. Numerical results are reported. (RRR 36-2001)
A directional Halley method for functions f of n variables is shown to converge, at a cubic rate, to a solution. To avoid the second derivative needed in the Halley method we propose a directional quasi-Halley method, with one more function evaluation per iteration than the directional Newton method, but with convergence rates comparable to the Halley...
Consider m functions \(f_i(x_1,\dots,x_n)\), the system of equations \(f_i = 0\), i = 1,…,m, and the Newton iterations for this system that use the Moore–Penrose inverse of the Jacobian matrix. Under standard assumptions, the Newton iterations converge quadratically to a stationary point of the sum-of-squares \(\sum f_i^2\)...
The Envelope Theorem is a statement about derivatives along an optimal trajectory. In Dynamic Programming the Envelope Theorem can be used to characterize and compute the Optimal Value Function from its derivatives. We illustrate this here for the Linear-Quadratic Control Problem, the Resource Allocation Problem, and the Inverse Problem of Dynamic...
Markov decision processes are solved recursively, using the Bellman optimality principle,
$$V(s,t): = \mathop {\max }\limits_{a \in A(s)} \left\{ {r(s,a) + \alpha \sum\limits_{j \in S} {{p_{s,j}}(a)} V(j,t + 1)} \right\}$$ (A)
where V(s, t) is the optimal value of state s at stage t, r(s, a) is the instantaneous profit from action a at state s, S i...
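The recursion (A) can be sketched as backward induction over a finite horizon. This is a minimal illustration, assuming zero terminal values V(s, T) = 0 and a toy two-state process; all names and data below are hypothetical and not from the paper.

```python
def backward_induction(states, actions, reward, transition, horizon, discount=1.0):
    """Return V where V[t][s] is the optimal value of state s at stage t.

    Assumption: terminal values V[horizon][s] are zero.
    """
    V = [dict.fromkeys(states, 0.0) for _ in range(horizon + 1)]
    for t in range(horizon - 1, -1, -1):
        for s in states:
            # Bellman step: immediate profit plus discounted expected future value.
            V[t][s] = max(
                reward(s, a)
                + discount * sum(transition(s, a, j) * V[t + 1][j] for j in states)
                for a in actions(s)
            )
    return V

# Toy process: staying in state 1 earns 1 per stage; "move" switches state.
states = [0, 1]

def actions(s):
    return ["stay", "move"]

def reward(s, a):
    return 1.0 if (s == 1 and a == "stay") else 0.0

def transition(s, a, j):
    target = s if a == "stay" else 1 - s
    return 1.0 if j == target else 0.0

V = backward_induction(states, actions, reward, transition, horizon=3)
```

Starting in state 0 with three stages left, the best plan is to move once and then stay twice, for a total of 2; starting in state 1 the total is 3.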
Given an n-dimensional random variable X with a joint density fX(x1,…,xn), the density of Y=h(X) is computed as a surface integral of fX in two cases: (a) h linear, and (b) h sum of squares. The integrals use the volume of the Jacobian matrix in a change-of-variables formula.
A mapping \(\mathbb{R}^n \to \mathbb{R}^m\), n ≤ m, with Jacobian of full column rank, has a local inverse that is analogous to the Moore–Penrose inverse of linear mappings.
... satisfying P ⟺ ¬Q (¬ denotes negation), in words: either P or Q but never both. Relations between (a)–(f). (a) and (b) are equivalent representations. Indeed, (a) and (b) can be written as \((A, -A, I)\begin{pmatrix} x^+ \\ x^- \\ s \end{pmatrix} = b,\ \begin{pmatrix} x^+ \\ x^- \\ s \end{pmatrix} \ge 0\) and \(\begin{pmatrix} A \\ -A \\ -I \end{pmatrix} x \le \begin{pmatrix} b \\ -b \\ 0 \end{pmatrix}\), respectively. The remaining systems involve strict inequalities or nontr...
Several well-known examples in Probability and Operations Research are analyzed using simulation with Maple.
A continuous nested sequence of similar triangles converging to the Brocard point of a given triangle is investigated. All these triangles have the same Brocard point. For polygons, the Brocard point need not exist, but there is always a limit object for an analogously defined nested sequence of inner polygons. This limit object is a Brocard point...
The product of ratios that equals 1 in Ceva's Theorem is analyzed in the case of non-concurrent Cevians, for triangles as well as arbitrary convex polygons. A general lemma on complementary systems of inequalities is proved, and used to classify the possible cases of non-concurrent Cevians. In the concurrent case, particular consideration is given...
The matrix volume is a generalization, to rectangular matrices, of the absolute value of the determinant. In particular, the matrix volume can be used in change-of-variables formulæ, instead of the determinant (if the Jacobi matrix of the underlying transformation is rectangular). This result is applicable to integration on surfaces, illustrated he...
Introduction. Let \(f : \mathbb{C} \to \mathbb{C}\) be analytic. A solution (existence assumed) of f(z) = 0 (1) can be approximated by the Newton method \(z_{k+1} := N_f(z_k)\), k = 0, 1, ... (2) using the iteration \(N_f(z) := z - \frac{f(z)}{f'(z)}\), (3) with \(z_0\) sufficiently close to the sought solution. For (local) convergence conditions, see [15, Chapter 7]. Although the...
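Iterations (2)–(3) translate directly into code using Python's built-in complex numbers. The sketch below is illustrative only; the test function z² + 1, the starting point, and the stopping rule are assumptions, not taken from the paper.

```python
def newton(f, df, z, tol=1e-12, max_iter=50):
    """Iterate z := N_f(z) = z - f(z)/f'(z) until the step is below tol."""
    for _ in range(max_iter):
        step = f(z) / df(z)
        z = z - step
        if abs(step) < tol:
            break
    return z

# Hypothetical example: f(z) = z^2 + 1 has the zeros i and -i.
root = newton(lambda z: z ** 2 + 1, lambda z: 2 * z, 1 + 1j)
```

Starting in the upper half-plane, the iterates approach the zero i; a start in the lower half-plane would approach −i instead.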
The Brocard point of a triangle can be viewed as the limit point of a continuous nested sequence of inner triangles that are all similar to the original triangle. All these triangles have the same Brocard point. For polygons, the Brocard point need not exist, but there is always a limit object for an analogously defined nested sequence of inner p...
Consider the inequalities (a) |Ax| ⩽ b, \(A \in \mathbb{R}^{r \times n}_r\), r < n, b a positive vector (here |y| denotes the vector of absolute values of components of the vector y), and (b) \(x^T A x ⩽ α\), A positive semi-definite \(\in \mathbb{R}^{n \times n}_r\), r < n, α > 0. Both inequalities are guaranteed a nonzero integer solution x for every positive right-hand side (b, α respectively). Such solutions will generally have a...
A random variable (RV) X is given a minimum selling price
$$S_U \left( X \right): = \mathop {\sup }\limits_x \left\{ {x + EU\left( {X - x} \right)} \right\}$$ (S) and a maximum buying price
$$B_P \left( X \right): = \mathop {\inf }\limits_x \left\{ {x + EP\left( {X - x} \right)} \right\}$$ (B) where U(·) and P(·) are appropriate functions. These prices...
Let \(\cos\{L, M\} := \prod_{i} \cos\theta_i\) denote the product of the cosines of the principal angles {θi} between the subspaces L and M. The direction cosines of an r-dimensional subspace L are the \(\binom{n}{r}\) numbers \(\{\cos\{L, \mathbb{R}^n_J\} : J \in Q_{r,n}\}\), where \(Q_{r,n}\) := the set of increasing sequences of r elements from {1, …, n}, and \(\mathbb{R}^n_J := \{x = (x_k) \in \mathbb{R}^n : x_k = 0 \text{ for } k \notin J\}\). The basic...
Singular values and maximum rank minors of generalized inverses are studied. Proportionality of maximum rank minors is explained in terms of space equivalence. The Moore–Penrose inverse \(A^\dagger\) is characterized as the {1}-inverse of A with minimal volume.
The basic solutions of the linear equations Ax = b are the solutions of subsystems corresponding to maximal nonsingular submatrices of A. The convex hull of the basic solutions is denoted by C = C(A, b). Given 1 ≤ p ≤ ∞, the lp-approximate solutions of Ax = b, denoted x{p}, are minimizers of ∥Ax − b∥p. Given M ∈ Dm, the set of positive diagonal m ×...
We solve here the problem of making change, using a minimal number of coins. This problem is a special case of the so-called knapsack problem.
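One standard way to solve this problem is dynamic programming over amounts. The sketch below shows that textbook approach, which is not necessarily the method used in the paper; the function name and the coin system in the example are hypothetical.

```python
def min_coins(amount, denominations):
    """Return a shortest list of coins summing to `amount`, or None if impossible."""
    INF = float("inf")
    best = [0] + [INF] * amount      # best[v] = fewest coins needed to make v
    choice = [None] * (amount + 1)   # choice[v] = last coin used in an optimal solution
    for value in range(1, amount + 1):
        for coin in denominations:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
                choice[value] = coin
    if best[amount] == INF:
        return None
    # Walk back through the recorded choices to recover the coins.
    coins = []
    while amount > 0:
        coins.append(choice[amount])
        amount -= choice[amount]
    return coins

# With coins {1, 3, 4}, the greedy answer for 6 is 4 + 1 + 1 (three coins);
# the optimal answer uses two coins.
change = min_coins(6, [1, 3, 4])
```

The example is chosen because the greedy rule (always take the largest coin) fails on it, which is why an exact method is needed for general coin systems.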
For 1 ⩽ p ⩽ ∞, the lp-approximate solutions of Ax = b are the minimizers of ‖Ax − b‖p, where ‖ · ‖p is the lp-norm. We consider the special case where the null space of \(A^T\) is one-dimensional. Sample results: (a) If 1 ⩽ p ⩽ ∞ and A is m × (m − 1) of rank m − 1, then there is a matrix A{p} (depending on A and p) such that, for every \(b \in \mathbb{R}^m\), the vector A...
Let \(Q_{k,n} = \{\alpha = (\alpha_1, \dots, \alpha_k) : 1 \le \alpha_1 < \cdots < \alpha_k \le n\}\) denote the strictly increasing sequences of k elements from 1, ..., n. For \(\alpha, \beta \in Q_{k,n}\) we denote by \(A(\alpha, \beta)\) the submatrix of A with rows indexed by α, columns by β. The submatrix obtained by deleting the α-rows and β-columns is denoted by \(A(\alpha', \beta')\). For nonsingular \(A \in \mathbb{R}^{n \times n}\), the Jac...
Definition 3.1 (Real functions) A real function f is a rule that assigns to each real number x ∈ D of the domain D ⊂ ℝ a real number f(x). The number f(x) is called the value of f at the point x, or the image of x under the function f. Correspondingly, x is called a preimage of the point f(x). In this situation we write...
If we consider a sequence of functions fn: [a, b] → ℝ on a closed interval [a, b], we can examine, pointwise for each x ∈ [a, b], the convergence of the sequence of numbers (fn(x))n. If pointwise convergence holds, one obtains a limit function f: [a, b] → ℝ by the rule
$$f\left( x \right): = \begin{array}{*{20}{c}} {\lim } \\...
In this chapter we consider the problem of approximating the definite integral
$$\int\limits_a^b {f\left( x \right)dx}$$ (7.5)
for given f, a, and b.
In this section we present some technical theorems that make global statements possible. The most important proof technique is the application of the maximum theorem for continuous functions (Theorem 6.6).
So far we have constructed the rational functions from the linear functions by applying the algebraic operations of addition, subtraction, multiplication, and division, and have furthermore defined algebraic functions such as the root functions, essentially by considering inverse functions.
Calculus is based on the concept of the limit and on two particular limit operations: integration and differentiation. It will turn out that the two concepts are, in a certain sense, inverse operations of each other. But more on that later.
We have defined the indefinite integral of a function f by means of the limit process of the Riemann integral, and its antiderivative by means of the inverse operation of differentiation, antidifferentiation. We now show that these two concepts essentially correspond to each other. This is significant in two respects: first, it shows us,...
In the last section we considered series. The geometric series \(\sum\limits_{k = 0}^\infty {{x^k}}\) proved especially important there as a comparison series. This series belongs to the class of power series, series with a very special structure, which are of particular importance above all because they...
Numbers play an important role in mathematics. Numbers are collected into sets. Thus one speaks, for example, of the set of real numbers, which is considered in § 1.3.
This appendix explains the basic features of Derive, to prepare the reader for the first steps.
Just as the real numbers geometrically represent a number line, pairs of real numbers represent points in a plane.
In this section we consider limits of functions. It will turn out that this concept can ultimately be reduced to the concept of limits of sequences.
One of the most important questions in calculus is how to determine the tangent at a given point of the graph of a real function. We examine this question, which often arises in applications and whose significance we will soon come to appreciate, in this chapter. We first consider the following example.
On the occasion of a research stay in 1988/1989 by Bob Gilbert (University of Delaware, USA) at the Department of Mathematics of the Freie Universität Berlin, he drew my attention to the use of symbolic mathematics programs, specifically the computer algebra system MACSYMA, in mathematical research. From that point on...
Let L, M be subspaces in \(\mathbb{R}^n\), dim L = l ≤ dim M = m. Then the principal angles between L and M, 0 ≤ θ1 ≤ θ2 ≤ ⋯ ≤ θl ≤ π/2, are given by \(\cos\theta_i = \frac{\langle x_i, y_i\rangle}{\|x_i\|\,\|y_i\|} = \max\left\{\frac{\langle x, y\rangle}{\|x\|\,\|y\|} : x \in L,\ y \in M,\ x \perp x_k,\ y \perp y_k,\ k = 1, \dots, i-1\right\}\), where (xi, yi) ∈ L × M, i = 1,...,l, are the corresponding pairs of principal vectors. We also define \(\sin\{L, M\} := \prod_{i=1}^{l} \sin\theta_i\)...
Let \(A \in \mathbb{R}^{m \times n}_r\) with nonzero singular values \(\sigma_1, \sigma_2, \dots, \sigma_r\). The volume of A, vol A, is defined as zero if r = 0, and otherwise \(\operatorname{vol} A = \prod_{i=1}^{r} \sigma_i\), or equivalently \(\operatorname{vol} A = \sqrt{\sum \det{}^2 A_{IJ}}\), summing over all r × r nonsingular submatrices \(A_{IJ}\). The matrix volume vol A generalizes the "absolut...
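The minor formula can be checked numerically in a special case. The sketch below handles only 2 × n matrices of rank 2 (a restriction chosen to keep the code dependency-free), where the product of the two singular values equals the square root of det(AAᵀ); the function names and the sample matrix are hypothetical.

```python
import math
from itertools import combinations

def volume_from_minors(A):
    """sqrt of the sum of squared 2x2 minors of a 2 x n matrix."""
    a, b = A
    # Singular (zero) minors contribute nothing, so summing over all column
    # pairs equals summing over the nonsingular 2x2 submatrices.
    return math.sqrt(sum((a[i] * b[j] - a[j] * b[i]) ** 2
                         for i, j in combinations(range(len(a)), 2)))

def volume_from_gram(A):
    """sqrt(det(A A^T)), the product of the singular values when rank A = 2."""
    a, b = A
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    ab = sum(x * y for x, y in zip(a, b))
    return math.sqrt(aa * bb - ab * ab)

A = [[1.0, 2.0, 2.0],
     [0.0, 1.0, 3.0]]
vol_minors = volume_from_minors(A)
vol_gram = volume_from_gram(A)
```

Here the three minors are 1, 3 and 4, so both routes give √26; the agreement is an instance of the Cauchy–Binet formula.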
The algorithm of [14] for linear programming is adapted here for solving a semi-infinite linear program (SILP) \((P_n)\) of dimension n. This active-set algorithm solves a sequence of SILPs \((P_k)\) of dimension k, where \((P_{k-1})\) is obtained from \((P_k)\) by adding appropriate constraints restricting the solution to a hyperplane that intersects the feasible set....
In 1955, R. Penrose introduced generalized inverses (GIs) [62]. That remarkable paper gives an algebraic theory of GIs, a spectral theory of rectangular matrices, singular value decomposition (SVD) and its use in the computation of GIs. Although known since the 1900s, GIs became a distinguished area in mathematics with [62] and its sequel [63], gi...