Reza Bosagh Zadeh’s research while affiliated with Stanford Medicine and other places


Publications (23)


Matrix Computations and Optimization in Apache Spark
  • Conference Paper

August 2016 · 520 Reads · 86 Citations

Reza Bosagh Zadeh · Xiangrui Meng · [...] · Matei Zaharia

We describe matrix computations available in the cluster programming framework, Apache Spark. Out of the box, Spark provides abstractions and implementations for distributed matrices and optimization routines using these matrices. When translating single-node algorithms to run on a distributed cluster, we observe that often a simple idea is enough: separating matrix operations from vector operations and shipping the matrix operations to be run on the cluster, while keeping vector operations local to the driver. In the case of the Singular Value Decomposition, by taking this idea to an extreme, we are able to exploit the computational power of a cluster, while running code written decades ago for a single core. Another example is our Spark port of the popular TFOCS optimization package, originally built for MATLAB, which allows for solving linear programs as well as a variety of other convex programs. We conclude with a comprehensive set of benchmarks for hardware accelerated matrix computations from the JVM, which is interesting in its own right, as many cluster programming frameworks use the JVM. The contributions described in this paper are already merged into Apache Spark and available on Spark installations by default, and commercially supported by a slew of companies that provide further services.
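The matrix/vector separation the abstract describes can be made concrete with a toy power iteration. The sketch below is not the paper's code: it assumes a hypothetical RDD of dense rows and estimates the top right singular vector of A by shipping the A^T(Av) products to the cluster while keeping the small n-dimensional normalization on the driver.

```scala
import org.apache.spark.sql.SparkSession

object SeparationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("power-iteration-sketch").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical tall-and-skinny matrix A, stored as an RDD of dense rows.
    val rows = sc.parallelize(Seq(
      Array(1.0, 2.0), Array(3.0, 4.0), Array(5.0, 6.0)
    ))
    val n = 2
    var v = Array.fill(n)(1.0 / math.sqrt(n)) // small local vector on the driver

    for (_ <- 1 to 20) {
      val bv = sc.broadcast(v)
      // Matrix work ships to the cluster: each row contributes a_i * (a_i . v),
      // and the reduce sums those contributions into A^T (A v).
      val atav = rows.map { a =>
        val av = a.zip(bv.value).map { case (x, y) => x * y }.sum
        a.map(_ * av)
      }.reduce((x, y) => x.zip(y).map { case (p, q) => p + q })
      // Vector work stays on the driver: normalize the n-dimensional iterate.
      val norm = math.sqrt(atav.map(x => x * x).sum)
      v = atav.map(_ / norm)
      bv.destroy()
    }
    println(v.mkString("top right singular vector ~ [", ", ", "]"))
    spark.stop()
  }
}
```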


Generalized Low Rank Models

January 2016 · 38 Reads · 63 Citations

Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well-known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.
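As a rough illustration of the fitting problem, the sketch below alternates gradient steps on the two factors of a quadratic-loss, L2-regularized low rank model, the simplest instance of the framework. It is plain Scala on toy dense data; the paper's contribution is swapping in other losses and regularizers per column type, which this sketch does not attempt.

```scala
object GlrmSketch {
  type Mat = Array[Array[Double]]

  // Naive dense matrix multiply, adequate for a toy example.
  def matmul(a: Mat, b: Mat): Mat = {
    val bt = b.transpose
    a.map(row => bt.map(col => row.zip(col).map { case (x, y) => x * y }.sum))
  }

  def sub(a: Mat, b: Mat): Mat =
    a.zip(b).map { case (r, s) => r.zip(s).map { case (x, y) => x - y } }

  def main(args: Array[String]): Unit = {
    val m = 4; val n = 3; val k = 2
    val rng = new scala.util.Random(0)
    val data: Mat = Array.fill(m, n)(rng.nextGaussian())
    var x: Mat = Array.fill(m, k)(0.1 * rng.nextGaussian())
    var y: Mat = Array.fill(k, n)(0.1 * rng.nextGaussian())
    val step = 0.05; val reg = 0.1

    for (_ <- 1 to 500) {
      // Gradient step in X with Y fixed: dX = (XY - M) Y^T + reg * X
      val gx = matmul(sub(matmul(x, y), data), y.transpose)
      x = x.zip(gx).map { case (xr, gr) =>
        xr.zip(gr).map { case (xi, gi) => xi - step * (gi + reg * xi) } }
      // Gradient step in Y with X fixed: dY = X^T (XY - M) + reg * Y
      val gy = matmul(x.transpose, sub(matmul(x, y), data))
      y = y.zip(gy).map { case (yr, gr) =>
        yr.zip(gr).map { case (yi, gi) => yi - step * (gi + reg * yi) } }
    }
    val err = sub(matmul(x, y), data).map(_.map(d => d * d).sum).sum
    println(f"squared reconstruction error: $err%.4f")
  }
}
```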


linalg: Matrix Computations in Apache Spark

September 2015 · 321 Reads · 7 Citations

We describe matrix computations available in the cluster programming framework, Apache Spark. Out of the box, Spark comes with the mllib.linalg library, which provides abstractions and implementations for distributed matrices. Using these abstractions, we highlight the computations that were more challenging to distribute. When translating single-node algorithms to run on a distributed cluster, we observe that often a simple idea is enough: separating matrix operations from vector operations and shipping the matrix operations to be run on the cluster, while keeping vector operations local to the driver. In the case of the Singular Value Decomposition, by taking this idea to an extreme, we are able to exploit the computational power of a cluster, while running code written decades ago for a single core. We conclude with a comprehensive set of benchmarks for hardware accelerated matrix computations from the JVM, which is interesting in its own right, as many cluster programming frameworks use the JVM.
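The distributed SVD described here is exposed through mllib.linalg's RowMatrix. A minimal usage sketch, assuming a local Spark session and toy data:

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.sql.SparkSession

object LinalgSvd {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("linalg-svd").master("local[*]").getOrCreate()
    val rows = spark.sparkContext.parallelize(Seq(
      Vectors.dense(1.0, 0.0, 7.0),
      Vectors.dense(2.0, 5.0, 1.0),
      Vectors.dense(4.0, 3.0, 8.0)
    ))
    val mat = new RowMatrix(rows)
    // Gram-matrix-vector products run on the cluster; the small projected
    // eigenproblem is solved locally on the driver.
    val svd = mat.computeSVD(k = 2, computeU = true)
    println(s"singular values: ${svd.s}") // local vector on the driver
    println(s"V:\n${svd.V}")              // local n x k matrix
    svd.U.rows.foreach(println)           // U stays distributed
    spark.stop()
  }
}
```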


MLlib: Machine Learning in Apache Spark

May 2015 · 1,535 Reads · 1,621 Citations

Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shipped with Spark, MLlib supports several languages and provides a high-level API that leverages Spark's rich ecosystem to simplify the development of end-to-end machine learning pipelines. MLlib has experienced rapid growth due to its vibrant open-source community of over 140 contributors, and includes extensive documentation to support further growth and to let users quickly get up to speed.
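A minimal sketch of the high-level pipeline API the abstract mentions, assuming a local Spark session and a toy text-classification DataFrame (the column names and data are illustrative only):

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.sql.SparkSession

object PipelineSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("mllib-pipeline").master("local[*]").getOrCreate()

    // Toy labeled text for a binary classifier.
    val training = spark.createDataFrame(Seq(
      (0L, "spark is great", 1.0),
      (1L, "hadoop mapreduce", 0.0),
      (2L, "spark mllib pipelines", 1.0),
      (3L, "grep sed awk", 0.0)
    )).toDF("id", "text", "label")

    // Chain featurization and the model into one estimator, then fit end to end.
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10).setRegParam(0.01)
    val model = new Pipeline().setStages(Array(tokenizer, hashingTF, lr)).fit(training)

    val test = spark.createDataFrame(Seq((4L, "spark streaming"))).toDF("id", "text")
    model.transform(test).select("id", "prediction").show()
    spark.stop()
  }
}
```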


Factorbird - a Parameter Server Approach to Distributed Matrix Factorization

November 2014 · 881 Reads · 35 Citations

We present 'Factorbird', a prototype of a parameter server approach for factorizing large matrices with Stochastic Gradient Descent-based algorithms. We designed Factorbird to meet the following desiderata: (a) scalability to tall and wide matrices with dozens of billions of non-zeros, (b) extensibility to different kinds of models and loss functions as long as they can be optimized using Stochastic Gradient Descent (SGD), and (c) adaptability to both batch and streaming scenarios. Factorbird uses a parameter server in order to scale to models that exceed the memory of an individual machine, and employs lock-free Hogwild!-style learning with a special partitioning scheme to drastically reduce conflicting updates. We also discuss other aspects of the design of our system such as how to efficiently grid search for hyperparameters at scale. We present experiments of Factorbird on a matrix built from a subset of Twitter's interaction graph, consisting of more than 38 billion non-zeros and about 200 million rows and columns, which is to the best of our knowledge the largest matrix on which factorization results have been reported in the literature.
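Factorbird itself was not released as open source, so the sketch below only illustrates the core update it parallelizes: plain single-threaded SGD on a quadratic matrix-factorization loss over a toy set of observed entries. In Factorbird, many workers run this update concurrently, Hogwild!-style, against factors held in a parameter server.

```scala
object SgdMfSketch {
  def main(args: Array[String]): Unit = {
    // Observed entries (i, j, value) of a sparse matrix; toy data.
    val entries = Seq((0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0))
    val m = 3; val n = 2; val k = 2
    val lr = 0.05; val reg = 0.01
    val rng = new scala.util.Random(42)
    val p = Array.fill(m, k)(0.1 * rng.nextGaussian()) // row factors
    val q = Array.fill(n, k)(0.1 * rng.nextGaussian()) // column factors

    for (_ <- 1 to 200; (i, j, r) <- rng.shuffle(entries)) {
      val pred = p(i).zip(q(j)).map { case (a, b) => a * b }.sum
      val err = r - pred
      // SGD step on the quadratic loss with L2 regularization.
      for (f <- 0 until k) {
        val pf = p(i)(f); val qf = q(j)(f)
        p(i)(f) += lr * (err * qf - reg * pf)
        q(j)(f) += lr * (err * pf - reg * qf)
      }
    }
    entries.foreach { case (i, j, r) =>
      val pred = p(i).zip(q(j)).map { case (a, b) => a * b }.sum
      println(f"($i,$j): observed $r%.1f, predicted $pred%.2f")
    }
  }
}
```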


Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares

October 2014 · 555 Reads · 506 Citations

Journal of Machine Learning Research

The matrix-completion problem has attracted a lot of attention, largely as a result of the celebrated Netflix competition. Two popular approaches for solving the problem are nuclear-norm-regularized matrix approximation (Candes and Tao, 2009; Mazumder, Hastie and Tibshirani, 2010), and maximum-margin matrix factorization (Srebro, Rennie and Jaakkola, 2005). These two procedures are in some cases solving equivalent problems, but with quite different algorithms. In this article we bring the two approaches together, leading to an efficient algorithm for large matrix factorization and completion that outperforms both of these. We develop a software package "softImpute" in R for implementing our approaches, and a distributed version for very large matrices using the "Spark" cluster programming environment.
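A minimal dense sketch of the soft-impute iteration underlying this line of work, using Apache Commons Math for the SVD on toy data. The softImpute package itself uses far more efficient alternating least squares, which this sketch does not reproduce.

```scala
import org.apache.commons.math3.linear.{Array2DRowRealMatrix, RealMatrix, SingularValueDecomposition}

object SoftImputeSketch {
  // Replace each singular value s_i by max(s_i - lambda, 0) and rebuild the matrix.
  def softThresholdSVD(x: RealMatrix, lambda: Double): RealMatrix = {
    val svd = new SingularValueDecomposition(x)
    val s = svd.getSingularValues.map(v => math.max(v - lambda, 0.0))
    val d = new Array2DRowRealMatrix(s.length, s.length)
    s.indices.foreach(i => d.setEntry(i, i, s(i)))
    svd.getU.multiply(d).multiply(svd.getVT)
  }

  def main(args: Array[String]): Unit = {
    // Toy 3x3 matrix with two missing entries; the mask marks observed cells.
    val obs = Array(Array(5.0, 3.0, 0.0), Array(4.0, 0.0, 1.0), Array(1.0, 1.0, 2.0))
    val mask = Array(Array(true, true, false), Array(true, false, true), Array(true, true, true))
    var x: RealMatrix = new Array2DRowRealMatrix(3, 3) // start from the zero matrix

    for (_ <- 1 to 100) {
      // Fill observed entries from the data, keep the current estimate elsewhere,
      // then shrink: X <- S_lambda(P_obs(M) + P_miss(X)).
      val filled = new Array2DRowRealMatrix(3, 3)
      for (i <- 0 until 3; j <- 0 until 3)
        filled.setEntry(i, j, if (mask(i)(j)) obs(i)(j) else x.getEntry(i, j))
      x = softThresholdSVD(filled, lambda = 0.5)
    }
    println(f"imputed (0,2) = ${x.getEntry(0, 2)}%.3f, imputed (1,1) = ${x.getEntry(1, 1)}%.3f")
  }
}
```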


Generalized Low Rank Models

October 2014 · 213 Reads · 282 Citations

Foundations and Trends® in Machine Learning

Principal components analysis (PCA) is a well-known technique for approximating a data set represented by a matrix by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well-known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose a number of large-scale and parallel algorithms for fitting generalized low rank models which allow us to find low rank approximations to large heterogeneous datasets, and provide two codes that implement these algorithms.


On the precision of social and information networks

October 2013 · 44 Reads · 32 Citations

The diffusion of information on online social and information networks has been a popular topic of study in recent years, but attention has typically focused on speed of dissemination and recall (i.e., the fraction of users receiving a piece of information). In this paper, we study the complementary notion of the precision of information diffusion. Our model of information dissemination is "broadcast-based", i.e., one where every message (original or forwarded) from a user goes to a fixed set of recipients, often called the user's "friends" or "followers", as in Facebook and Twitter. The precision of the diffusion process is then defined as the fraction of received messages that a user finds interesting. At first glance, broadcast-based information diffusion seems to be a "blunt" targeting mechanism that must necessarily suffer from low precision. Somewhat surprisingly, we present preliminary experimental and analytical evidence to the contrary: it is possible to simultaneously have high precision (i.e., precision bounded below by a constant), high recall, and low diameter. We start by presenting a set of conditions on the structure of user interests, and analytically show the necessity of each of these conditions for obtaining high precision. We also present preliminary experimental evidence from Twitter verifying that these conditions are satisfied. We then prove that the Kronecker-graph based generative model of Leskovec et al. satisfies these conditions given an appropriate and natural definition of user interests. Further, we show that this model also has high precision, high recall, and low diameter. We finally present preliminary experimental evidence showing that Twitter has high precision, validating our conclusion. This is perhaps a first step towards a formal understanding of the immense popularity of online social networks as an information dissemination mechanism.
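The precision notion is easy to state in code. The toy sketch below assumes a hypothetical follower map and per-user interest sets, broadcasts one message per author interest, and reports the fraction of received messages the recipient finds interesting:

```scala
object DiffusionPrecision {
  def main(args: Array[String]): Unit = {
    // Hypothetical follower graph (user -> followers) and per-user interest sets.
    val followers = Map("a" -> Set("b", "c"), "b" -> Set("c"), "c" -> Set.empty[String])
    val interests = Map("a" -> Set("ml", "golf"), "b" -> Set("ml"), "c" -> Set("golf"))

    // Broadcast model: each user sends one message per interest to every follower.
    val received = for {
      (author, fs) <- followers.toSeq
      topic <- interests(author)
      follower <- fs
    } yield (follower, topic)

    // Precision: fraction of received messages the recipient finds interesting.
    val interesting = received.count { case (u, t) => interests(u).contains(t) }
    println(f"precision = ${interesting.toDouble / received.size}%.2f")
  }
}
```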


WTF: the who to follow service at Twitter

May 2013 · 183 Reads · 440 Citations

Wtf (" Who to Follow ") is Twitter's user recommendation service, which is responsible for creating millions of connections daily between users based on shared interests, common connections, and other related factors. This paper provides an architectural overview and shares lessons we learned in building and running the service over the past few years. Particularly noteworthy was our design decision to process the entire Twitter graph in memory on a single server, which significantly reduced architectural complexity and allowed us to develop and deploy the service in only a few months. At the core of our architecture is Cassovary, an open-source in-memory graph processing engine we built from scratch for Wtf. Besides powering Twitter's user recommendations , Cassovary is also used for search, discovery, promoted products, and other services as well. We describe and evaluate a few graph recommendation algorithms implemented in Cassovary, including a novel approach based on a combination of random walks and SALSA. Looking into the future, we revisit the design of our architecture and comment on its limitations, which are presently being addressed in a second-generation system under development.


WTF: the who to follow service at Twitter

May 2013 · 621 Reads · 281 Citations

WTF ("Who to Follow") is Twitter's user recommendation service, which is responsible for creating millions of connections daily between users based on shared interests, common connections, and other related factors. This paper provides an architectural overview and shares lessons we learned in building and running the service over the past few years. Particularly noteworthy was our design decision to process the entire Twitter graph in memory on a single server, which significantly reduced architectural complexity and allowed us to develop and deploy the service in only a few months. At the core of our architecture is Cassovary, an open-source in-memory graph processing engine we built from scratch for WTF. Besides powering Twitter's user recommendations, Cassovary is also used for search, discovery, promoted products, and other services as well. We describe and evaluate a few graph recommendation algorithms implemented in Cassovary, including a novel approach based on a combination of random walks and SALSA. Looking into the future, we revisit the design of our architecture and comment on its limitations, which are presently being addressed in a second-generation system under development.


Citations (19)


... On the other hand, low-rank factorization decomposes large matrices into smaller ones, reducing the number of parameters and, thus, the computational load. [25][26] ...

Reference:

Leveraging Multimodal AI in Edge Computing for Real-Time Decision-Making
Generalized Low Rank Models
  • Citing Book
  • January 2016

... Data profiling is a crucial step in our DW and generation procedure, which involves examining datasets to gain insights into important characteristics and the significance of relationships and correlations among their attributes [5,41]. Data profiling encompasses several techniques, such as data preview, summarising the primary properties of the data, data description, data distribution analysis, range analysis, identifying trends, pattern analysis, detecting anomalies, and validating assumptions using summary statistics. ...

Matrix Computations and Optimization in Apache Spark
  • Citing Conference Paper
  • August 2016

... Vector embeddings are used by the well-known music streaming service Spotify to generate customized playlists and suggest songs that align with users' musical preferences [50]. Twitter is one of the social media platforms that uses vector representations to recommend tweets and accounts to follow based on user interactions and interests [51]. These personalized recommendations keep users interested and satisfied, which increases user retention and loyalty. ...

WTF: the who to follow service at Twitter
  • Citing Conference Paper
  • May 2013

... This heuristic works very well for data-intensive ML algorithms because it generally avoids unnecessary overheads of distributed operations. Other systems such as Spark MLlib [368] and Mahout Samsara [301] also provide local and distributed operations but require the user to perform execution type selection manually. For the decision between CPU and GPU operations, SystemML again relies on memory estimates, heuristics, and manual configurations. ...

linalg: Matrix Computations in Apache Spark
  • Citing Article
  • September 2015

... Python libraries such as Scikit-learn [45], MLPy [46], and MLxtend [47] support DL without specific hardware backing, while libraries/frameworks like Tensorflow [48], PyTorch [49], Keras [50], Theano [51], MxNet [52], and JAX [53] facilitate Graphics Processing Unit (GPU) support, ensuring efficient parallelism for large-scale data problems. MapReduce frameworks like Apache Hadoop [54] and Apache Spark [55] offer scalable and fault-tolerant workload distribution for big data but lack native GPU integration. Python lacks a native MapReduce library, but alternatives like Deeplearning4J (DL4J) [56] and H2O [57] provide functionalities that integrate MapReduce capabilities into Python environments. ...

MLlib: Machine Learning in Apache Spark
  • Citing Article
  • May 2015

... In recent years, decentralized matrix factorization (DMF) has found application in various large-scale commercial domains by corporations, including Twitter [22], Kuaishou [27], and Meituan [29]. It has been employed for multiple matrix completion tasks such as recommender systems [16], social network analysis [15], and medical data mining [21]. ...

Factorbird - a Parameter Server Approach to Distributed Matrix Factorization
  • Citing Conference Paper
  • November 2014

... Although easy to apply, these methods may result in information loss or introduce significant bias [29]. In contrast, algorithms such as k-Nearest Neighbor, Expectation-Maximization, Matrix Factorization, and Multiple Imputation using Chained Equations [20,[30][31][32] consider multiple influencing parameters of the real system and their interrelationships as comprehensively as possible, thereby reducing imputation bias. Recently, interpolation models based on Generative Adversarial Networks (GANs) [33] have achieved higher accuracy; however, their training process may encounter issues such as mode collapse and difficulty in convergence. ...

Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares
  • Citing Article
  • October 2014

Journal of Machine Learning Research

... A deeper understanding of the intricate relationship between low-rankness and sparsity can be achieved by extending beyond conventional linear algebraic decomposition methods [47,53]. In a significant contribution to this field, Saul [41] investigated the intrinsic low-rank characteristics of non-negative sparse matrices through an innovative approach. ...

Generalized Low Rank Models
  • Citing Article
  • October 2014

Foundations and Trends® in Machine Learning

... GNNs have gained widespread popularity in numerous domains (Ying et al., 2018; Reiser et al., 2022; Buterez et al., 2024; Zhou et al., 2024). Many real-world graphs consist of billions of nodes and edges (Hu et al., 2021; Leskovec & Krevl, 2014; Gupta et al., 2013). Training GNNs on ...

WTF: the who to follow service at Twitter
  • Citing Conference Paper
  • May 2013