Publications (101) · 84.7 Total impact
ABSTRACT: The Julia programming language is gaining enormous popularity. Julia was designed to be easy and fast. Most importantly, Julia shatters deeply established notions widely held in the applied community: 1. High-level, dynamic code has to be slow by some sort of law of nature 2. It is sensible to prototype in one language and then recode in another language 3. There are parts of a system for the programmer, and other parts best left untouched as they are built by the experts. Julia began with a deep understanding of the needs of the scientific programmer and the needs of the computer in mind. Bridging cultures that have often been distant, Julia combines expertise from computer science and computational science creating a new approach to scientific computing. This note introduces the programmer to the language and the underlying design theory. It invites the reader to rethink the fundamental foundations of numerical computing systems. In particular, there is the fascinating dance between specialization and abstraction. Specialization allows for custom treatment. We can pick just the right algorithm for the right circumstance and this can happen at runtime based on argument types (code selection via multiple dispatch). Abstraction recognizes what remains the same after differences are stripped away and ignored as irrelevant. The recognition of abstraction allows for code reuse (generic programming). A simple idea that yields incredible power. The Julia design facilitates this interplay in many explicit and subtle ways for machine performance and, most importantly, human convenience. 11/2014;
ABSTRACT: Some properties that nominally involve the eigenvalues of the Gaussian Unitary Ensemble (GUE) can instead be phrased in terms of singular values. By discarding the signs of the eigenvalues, we gain access to a surprising decomposition: the singular values of the GUE are distributed as the union of the singular values of two independent ensembles of Laguerre type. This independence is remarkable given the well-known phenomenon of eigenvalue repulsion. The structure of this decomposition reveals that several existing observations about large $n$ limits of the GUE are in fact manifestations of phenomena that are already present for finite random matrices. We relate the semicircle law to the quarter-circle law by connecting Hermite polynomials to generalized Laguerre polynomials with parameter $\pm 1/2$. Similarly, we write the absolute value of the determinant of the $n\times n$ GUE as a product of $n$ independent random variables to gain new insight into its asymptotic log-normality. The decomposition also provides a description of the distribution of the smallest singular value of the GUE, which in turn permits the study of the leading order behavior of the condition number of GUE matrices. The study is motivated by questions involving the enumeration of orientable maps, and is related to questions involving powers of complex Ginibre matrices. The inescapable conclusion of this work is that the singular values of the GUE play an unpredictably important role that had gone unnoticed for decades even though, in hindsight, so many clues had been around. 10/2014;
ABSTRACT: Polymorphism in programming languages enables code reuse. Here, we show that polymorphism has broad applicability far beyond computations for technical computing: parallelism in distributed computing, presentation of visualizations of runtime data flow, and proofs for formal verification of correctness. The ability to reuse a single codebase for all these purposes provides new ways to understand and verify parallel programs. 10/2014;
ABSTRACT: Arrays are such a rich and fundamental data type that they tend to be built into a language, either in the compiler or in a large low-level library. Defining this functionality at the user level instead provides greater flexibility for application domains not envisioned by the language designer. Only a few languages, such as C++ and Haskell, provide the necessary power to define $n$-dimensional arrays, but these systems rely on compile-time abstraction, sacrificing some flexibility. In contrast, dynamic languages make it straightforward for the user to define any behavior they might want, but at the possible expense of performance. As part of the Julia language project, we have developed an approach that yields a novel tradeoff between flexibility and compile-time analysis. The core abstraction we use is multiple dispatch. We have come to believe that while multiple dispatch has not been especially popular in most kinds of programming, technical computing is its killer application. By expressing key functions such as array indexing using multimethod signatures, a surprising range of behaviors can be obtained, in a way that is both relatively easy to write and amenable to compiler analysis. The compact factoring of concerns provided by these methods makes it easier for user-defined types to behave consistently with types in the standard library. 07/2014;
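As a rough illustration of the idea in the abstract above (array indexing expressed through multimethod signatures), the following Python sketch selects an indexing method from the runtime types of all arguments. It is a toy under stated assumptions, not Julia's dispatch mechanism or the paper's code; the names `method`, `call`, and `getindex` are hypothetical helpers introduced only for this example.

```python
# Toy multiple-dispatch registry (hypothetical helper names, not a real library).
_METHODS = {}


def method(name, *types):
    """Register fn as the method of generic function `name` for argument `types`."""
    def decorator(fn):
        _METHODS.setdefault(name, []).append((types, fn))
        return fn
    return decorator


def call(name, *args):
    """Select a method by the runtime types of ALL arguments, then call it."""
    matches = [
        (types, fn) for types, fn in _METHODS.get(name, [])
        if len(types) == len(args)
        and all(isinstance(a, t) for a, t in zip(args, types))
    ]
    # Prefer the match whose signature is at least as specific as every other match.
    for types, fn in matches:
        if all(all(issubclass(t, u) for t, u in zip(types, other))
               for other, _ in matches):
            return fn(*args)
    raise TypeError(f"no unique method for {name} with these argument types")


@method("getindex", list, int)
def _scalar(a, i):
    # Scalar index: return one element.
    return a[i]


@method("getindex", list, slice)
def _range(a, r):
    # Range index: return a contiguous sub-list.
    return a[r]


@method("getindex", list, list)
def _gather(a, idx):
    # Index by a list of positions: gather the selected elements.
    return [a[k] for k in idx]


@method("getindex", object, object)
def _fallback(a, i):
    # Least specific method: used only when nothing more specific matches.
    raise TypeError("unsupported index type")


data = [10, 20, 30]
picked = call("getindex", data, 1)            # scalar method
window = call("getindex", data, slice(0, 2))  # range method
gathered = call("getindex", data, [2, 0])     # gather method
```

Each indexing behavior lives in its own small method, and adding a new index type means registering one more signature rather than editing a central `if`/`elif` chain.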
ABSTRACT: “Low temperature” random matrix theory is the study of random eigenvalues as energy is removed. In standard notation, β is identified with inverse temperature, and low temperatures are achieved through the limit β→∞. In this paper, we derive statistics for low-temperature random matrices at the “soft edge,” which describes the extreme eigenvalues for many random matrix distributions. Specifically, new asymptotics are found for the expected value and standard deviation of the general-β Tracy-Widom distribution. The new techniques utilize beta ensembles, stochastic differential operators, and Riccati diffusions. The asymptotics fit known high-temperature statistics curiously well and contribute to the larger program of general-β random matrix theory. ©2014 American Institute of Physics. Journal of Mathematical Physics 06/2014; 55(6). · 1.30 Impact Factor
ABSTRACT: We derive explicit expressions for the distributions of the extreme eigenvalues of the beta-Wishart random matrices in terms of the hypergeometric function of a matrix argument. These results generalize the classical results for the real (β = 1), complex (β = 2), and quaternion (β = 4) Wishart matrices to any β > 0. Random Matrices: Theory and Applications. 05/2014; 03(02).
ABSTRACT: We find the joint generalized singular value distribution and largest generalized singular value distributions of the $\beta$-MANOVA ensemble with positive diagonal covariance, which is general. This has been done for the continuous $\beta > 0$ case for identity covariance (in eigenvalue form), and by setting the covariance to $I$ in our model we get another version. For the diagonal covariance case, it has only been done for the $\beta = 1,2,4$ cases (real, complex, and quaternion matrix entries). This is in a way the first second-order $\beta$-ensemble, since the sampler for the generalized singular values of the $\beta$-MANOVA with diagonal covariance calls the sampler for the eigenvalues of the $\beta$-Wishart with diagonal covariance of Forrester and Dubbs-Edelman-Koev-Venkataramana. We use a conjecture of Macdonald proven by Baker and Forrester concerning an integral of a hypergeometric function and a theorem of Kaneko concerning an integral of Jack polynomials to derive our generalized singular value distributions. In addition we use many identities from Forrester's {\it Log-Gases and Random Matrices}. We supply numerical evidence that our theorems are correct. Random Matrices: Theory and Applications. 09/2013; 03(01).
Article: The Beta-Wishart Ensemble
ABSTRACT: This paper proves a matrix model for the Wishart Ensemble with general covariance and general dimension parameter beta. In so doing, we introduce a new and elegant definition of Jack polynomials. Journal of Mathematical Physics 05/2013; 54(8). · 1.30 Impact Factor
Conference Paper: Novel algebras for advanced analytics in Julia
ABSTRACT: A linear algebraic approach to graph algorithms that exploits the sparse adjacency matrix representation of graphs can provide a variety of benefits. These benefits include syntactic simplicity, easier implementation, and higher performance. One way to employ linear algebra techniques for graph algorithms is to use a broader definition of matrix and vector multiplication. We demonstrate through the use of the Julia language system how easy it is to explore semirings using linear algebraic methodologies. High Performance Extreme Computing Conference (HPEC), 2013 IEEE; 01/2013
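The "broader definition of matrix multiplication" mentioned in the abstract above can be sketched by passing the semiring's addition, multiplication, and additive identity as parameters. The sketch below is written in Python rather than Julia and is not the paper's code; the function name `semiring_matmul` and the 3-vertex graph are assumptions made for illustration. Under the min-plus semiring (addition is `min`, multiplication is `+`, zero is infinity), multiplying a weighted adjacency matrix by itself gives shortest path lengths over at most two edges.

```python
import math
import operator


def semiring_matmul(A, B, add, mul, zero):
    """Multiply dense matrices A and B over the semiring (add, mul, zero)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[zero] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            acc = zero
            for k in range(m):
                # Generalized inner product: "sum" of "products".
                acc = add(acc, mul(A[i][k], B[k][j]))
            C[i][j] = acc
    return C


INF = math.inf

# Hypothetical weighted digraph on 3 vertices: INF means no edge, 0 on the diagonal.
D = [
    [0, 4, INF],
    [INF, 0, 1],
    [2, INF, 0],
]

# Min-plus product: entry (i, j) is the shortest i -> j path using at most 2 edges.
D2 = semiring_matmul(D, D, min, operator.add, INF)

# The ordinary (plus, times) semiring recovers standard matrix multiplication.
P = semiring_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], operator.add, operator.mul, 0)
```

The loop structure never changes; only the three semiring parameters do, which is what makes it cheap to try other algebras (for example, boolean or-and for reachability).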
ABSTRACT: Dynamic languages have become popular for scientific computing. They are generally considered highly productive, but lacking in performance. This paper presents Julia, a new dynamic language for technical computing, designed for performance from the beginning by adapting and extending modern programming language techniques. A design based on generic functions and a rich type system simultaneously enables an expressive programming model and successful type inference, leading to good performance for a wide range of programs. This makes it possible for much of the Julia library to be written in Julia itself, while also incorporating best-of-breed C and Fortran libraries. 09/2012;
Article: Error analysis of free probability approximations to the density of states of disordered systems.
ABSTRACT: Theoretical studies of localization, anomalous diffusion and ergodicity breaking require solving the electronic structure of disordered systems. We use free probability to approximate the ensemble-averaged density of states without exact diagonalization. We present an error analysis that quantifies the accuracy using a generalized moment expansion, allowing us to distinguish between different approximations. We identify an approximation that is accurate to the eighth moment across all noise strengths, and contrast this with perturbation theory and isotropic entanglement theory. Physical Review Letters 07/2012; 109(3):036403. · 7.73 Impact Factor
ABSTRACT: We define an indefinite Wishart matrix as a matrix of the form $A = W^{T} W \Sigma$, where $\Sigma$ is an indefinite diagonal matrix and $W$ is a matrix of independent standard normals. We focus on the case where $W$ is $L$ by 2, which has engineering applications. We obtain the distribution of the ratio of the eigenvalues of $A$. This distribution can be "folded" to give the distribution of the condition number. We calculate formulas for $W$ real ($\beta=1$), complex ($\beta=2$), quaternionic ($\beta=4$) or any ghost $0<\beta<\infty$. We then corroborate our work by comparing the formulas against numerical experiments. 07/2012;
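A minimal sketch of the kind of numerical experiment the abstract above alludes to, restricted to the real case with $W$ of size $L \times 2$: it draws one sample of $A = W^{T} W \Sigma$ and computes its two eigenvalues in closed form from the trace and determinant. The function name, the choice $\Sigma = \mathrm{diag}(1, -1)$, and the seed are assumptions made for illustration; the paper's own code and its convention for the eigenvalue ratio are not reproduced here.

```python
import math
import random


def indefinite_wishart_eigs(L, sigma, rng):
    """Eigenvalues of A = W^T W Sigma for real Gaussian W (L x 2), Sigma = diag(sigma)."""
    W = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(L)]
    # Gram matrix G = W^T W (2 x 2, symmetric positive semidefinite).
    g11 = sum(row[0] * row[0] for row in W)
    g12 = sum(row[0] * row[1] for row in W)
    g22 = sum(row[1] * row[1] for row in W)
    s1, s2 = sigma
    # Right-multiplying by diag(s1, s2) scales the columns of G.
    a11, a12 = g11 * s1, g12 * s2
    a21, a22 = g12 * s1, g22 * s2
    tr = a11 + a22
    det = a11 * a22 - a12 * a21  # equals s1 * s2 * det(G)
    # For indefinite Sigma (s1 * s2 < 0) the determinant is negative,
    # so the discriminant is positive and both eigenvalues are real.
    root = math.sqrt(tr * tr - 4.0 * det)
    return (tr - root) / 2.0, (tr + root) / 2.0


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
lam_minus, lam_plus = indefinite_wishart_eigs(10, (1.0, -1.0), rng)
```

Because $\det A = \det(W^{T}W)\,\det\Sigma < 0$ here, the two eigenvalues have opposite signs; repeating the draw many times and histogramming a chosen ratio would give an empirical distribution to compare against a formula.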
Article: Partial freeness of random matrices
ABSTRACT: We investigate the implications of free probability for random matrices. From rules for calculating all possible joint moments of two free random matrices, we develop a notion of partial freeness which is quantified by the breakdown of these rules. We provide a combinatorial interpretation for partial freeness as the presence of closed paths in Hilbert space defined by particular joint moments. We also discuss how asymptotic moment expansions provide an error term on the density of states. We present MATLAB code for the calculation of moments and free cumulants of arbitrary random matrices. 04/2012;
ABSTRACT: We approximate the density of states in disordered systems by decomposing the Hamiltonian into two random matrices and constructing their free convolution. The error in this approximation is determined using asymptotic moment expansions. Each moment can be decomposed into contributions from specific joint moments of the random matrices, each of which has a combinatorial interpretation as the weighted sum of returning trajectories. We show how the error, like the free convolution itself, can be calculated without explicit diagonalization of the Hamiltonian. We apply our theory to Hamiltonians for one-dimensional tight binding models with Gaussian and semicircular site disorder. We find that the particular choice of decomposition crucially determines the accuracy of the resultant density of states. From a partitioning of the Hamiltonian into diagonal and off-diagonal components, free convolution produces an approximate density of states which is correct to the eighth moment. This allows us to explain the accuracy of mean field theories such as the coherent potential approximation, as well as the results of isotropic entanglement theory. 02/2012;
ABSTRACT: We propose a method that we call isotropic entanglement (IE), which predicts the eigenvalue distribution of quantum many-body (spin) systems with generic interactions. We interpolate between two known approximations by matching fourth moments. Though such problems can be QMA-complete, our examples show that isotropic entanglement provides an accurate picture of the spectra well beyond what one expects from the first four moments alone. We further show that the interpolation is universal, i.e., independent of the choice of local terms. Physical Review Letters 08/2011; 107(9):097205. · 7.73 Impact Factor
ABSTRACT: Partitioning oracles were introduced by Hassidim et al. (FOCS 2009) as a generic tool for constant-time algorithms. For any epsilon > 0, a partitioning oracle provides query access to a fixed partition of the input bounded-degree minor-free graph, in which every component has size poly(1/epsilon), and the number of edges removed is at most epsilon*n, where n is the number of vertices in the graph. However, the oracle of Hassidim et al. makes an exponential number of queries to the input graph to answer every query about the partition. In this paper, we construct an efficient partitioning oracle for graphs with constant treewidth. The oracle makes only O(poly(1/epsilon)) queries to the input graph to answer each query about the partition. Examples of bounded-treewidth graph classes include k-outerplanar graphs for fixed k, series-parallel graphs, cactus graphs, and pseudoforests. Our oracle yields poly(1/epsilon)-time property testing algorithms for membership in these classes of graphs. Another application of the oracle is a poly(1/epsilon)-time algorithm that approximates the maximum matching size, the minimum vertex cover size, and the minimum dominating set size up to an additive epsilon*n in graphs with bounded treewidth. Finally, the oracle can be used to test in poly(1/epsilon) time whether the input bounded-treewidth graph is k-colorable or perfect. 06/2011;
Conference Paper: Performance of sample covariance based capon bearing only tracker
ABSTRACT: Bearing estimates input to a tracking algorithm require a concomitant measurement error to convey confidence. When Capon algorithm based bearing estimates are derived from low signal-to-noise ratio (SNR) data, the method of interval errors (MIE) provides a representation of measurement error improved over high SNR metrics like the Cramér-Rao bound or Taylor series. A corresponding improvement in overall tracker performance is had. These results have been demonstrated [4] assuming MIE has perfect knowledge of the true data covariance. Herein this assumption is weakened to explore the potential performance of a practical implementation that must address the challenges of non-stationarity and finite sample effects. Comparisons with known non-linear smoothing techniques designed to reject outlier measurements are also explored. Signals, Systems and Computers (ASILOMAR), 2011 Conference Record of the Forty Fifth Asilomar Conference on; 01/2011
ABSTRACT: Approximating ideal program outputs is a common technique for solving computationally difficult problems, for adhering to processing or timing constraints, and for performance optimization in situations where perfect precision is not necessary. To this end, programmers often use approximation algorithms, iterative methods, data resampling, and other heuristics. However, programming such variable accuracy algorithms presents difficult challenges since the optimal algorithms and parameters may change with different accuracy requirements and usage environments. This problem is further compounded when multiple variable accuracy algorithms are nested together due to the complex way that accuracy requirements can propagate across algorithms and because of the size of the set of allowable compositions. As a result, programmers often deal with this issue in an ad hoc manner that can sometimes violate sound programming practices such as maintaining library abstractions. In this paper, we propose language extensions that expose tradeoffs between time and accuracy to the compiler. The compiler performs fully automatic compile-time and install-time autotuning and analyses in order to construct optimized algorithms to achieve any given target accuracy. We present novel compiler techniques and a structured genetic tuning algorithm to search the space of candidate algorithms and accuracies in the presence of recursion and sub-calls to other variable accuracy code. These techniques benefit both the library writer, by providing an easy way to describe and search the parameter and algorithmic choice space, and the library user, by allowing high level specification of accuracy requirements which are then met automatically without the need for the user to understand any algorithm-specific parameters. Additionally, we present a new suite of benchmarks, written in our language, to examine the efficacy of our techniques.
Our experimental results show that by relaxing accuracy requirements, we can easily obtain performance improvements ranging from 1.1× to orders of magnitude of speedup. Proceedings of the CGO 2011, The 9th International Symposium on Code Generation and Optimization, Chamonix, France, April 26, 2011; 01/2011
ABSTRACT: Approximating ideal program outputs is a common technique for solving computationally difficult problems, for adhering to processing or timing constraints, and for performance optimization in situations where perfect precision is not necessary. To this end, programmers often use approximation algorithms, iterative methods, data resampling, and other heuristics. However, programming such variable accuracy algorithms presents difficult challenges since the optimal algorithms and parameters may change with different accuracy requirements and usage environments. This problem is further compounded when multiple variable accuracy algorithms are nested together due to the complex way that accuracy requirements can propagate across algorithms and because of the resulting size of the set of allowable compositions. As a result, programmers often deal with this issue in an ad hoc manner that can sometimes violate sound programming practices such as maintaining library abstractions. In this paper, we propose language extensions that expose tradeoffs between time and accuracy to the compiler. The compiler performs fully automatic compile-time and install-time autotuning and analyses in order to construct optimized algorithms to achieve any given target accuracy. We present novel compiler techniques and a structured genetic tuning algorithm to search the space of candidate algorithms and accuracies in the presence of recursion and sub-calls to other variable accuracy code. These techniques benefit both the library writer, by providing an easy way to describe and search the parameter and algorithmic choice space, and the library user, by allowing high level specification of accuracy requirements which are then met automatically without the need for the user to understand any algorithm-specific parameters. Additionally, we present a new suite of benchmarks, written in our language, to examine the efficacy of our techniques.
Our experimental results show that by relaxing accuracy requirements, we can easily obtain performance improvements ranging from 1.1x to orders of magnitude of speedup. 08/2010;
ABSTRACT: The method of interval estimation (MIE) provides a strategy for mean squared error (MSE) prediction of algorithm performance at low signal-to-noise ratios (SNR) below estimation threshold where asymptotic predictions fail. MIE interval error probabilities for the Capon algorithm are known and depend on the true data covariance and assumed signal array response. Herein estimation of these error probabilities is considered to improve representative measurement errors for parameter estimates obtained in low SNR scenarios, as this may improve overall target tracking performance. A statistical analysis of Capon error probability estimation based on the data sample covariance matrix is explored herein. Signals, Systems and Computers (ASILOMAR), 2010 Conference Record of the Forty Fourth Asilomar Conference on; 01/2010
Publication Stats
3k  Citations  
84.70  Total Impact Points  
Top Journals
Institutions

2014

Distributed Artificial Intelligence Laboratory
Berlin, Berlin, Germany


1994–2014

Massachusetts Institute of Technology
 • Department of Chemistry
 • Department of Mathematics
 • Laboratory for Computer Science
Cambridge, Massachusetts, United States


1995–2005

University of California, Berkeley
Berkeley, California, United States
