Ran Gilad-Bachrach

Tel Aviv University | TAU · Department of Biomedical Engineering

PhD

About

74
Publications
21,386
Reads
3,768
Citations
Citations since 2016
29 Research Items
2893 Citations
[Chart: citations per year, 2016 to 2022; vertical axis 0 to 600]
Publications (74)
Article
Full-text available
Remote assessment of the gait of older adults (OAs) during daily living using wrist-worn sensors has the potential to augment clinical care and mobility research. However, hand movements can degrade gait detection from wrist-sensor recordings. To address this challenge, we developed an anomaly detection algorithm and compared its performance to fou...
Preprint
Full-text available
When dealing with tabular data, models based on regression and decision trees are a popular choice due to the high accuracy they provide on such tasks and their ease of application as compared to other model classes. Yet, when it comes to graph-structure data, current tree learning algorithms do not provide tools to manage the structure of the data...
Preprint
Full-text available
The black-box nature of modern machine learning techniques invokes a practical and ethical need for explainability. Feature importance aims to meet this need by assigning scores to features, so humans can understand their influence on predictions. Feature importance can be used to explain predictions under different settings: of the entire sample s...
Preprint
Full-text available
We continuously interact with computerized systems to achieve goals and perform tasks in our personal and professional lives. Therefore, the ability to program such systems is a skill needed by everyone. Consequently, computational thinking skills are essential for everyone, which creates a challenge for the educational system to teach these skills...
Preprint
Full-text available
Motivated by the fact that humans like some level of unpredictability or novelty, and might therefore get quickly bored when interacting with a stationary policy, we introduce a novel non-stationary bandit problem, where the expected reward of an arm is fully determined by the time elapsed since the arm last took part in a switch of actions. Our mo...
Article
Full-text available
In recent years, methods were proposed for assigning feature importance scores to measure the contribution of individual features. While in some cases the goal is to understand a specific model, in many cases the goal is to understand the contribution of certain properties (features) to a real-world phenomenon. Thus, a distinction has been made bet...
Article
Full-text available
Search advertising, a popular method for online marketing, has been employed to improve health by eliciting positive behavioral change. However, writing effective advertisements requires expertise and experimentation, which may not be available to health authorities wishing to elicit such changes, especially when dealing with public health crises s...
Article
Introduction: People with T1DM and their healthcare providers perceive that mood and sleep quality influence blood glucose (BG). Artificial pancreas algorithms use only BG estimates to predict future BG and determine insulin requirements. We aimed to improve the understanding of the effect of emotional states and personality on temporal BG changes...
Article
Full-text available
Machine Learning models should ideally be compact and robust. Compactness provides efficiency and comprehensibility whereas robustness provides stability. Both topics have been studied in recent years but in isolation. Here we present a robust model compression scheme which is independent of model types: it can compress ensembles, neural networks a...
Article
Full-text available
Cerebrovascular disease is a leading cause of mortality and disability and an immense global burden. It is partly related to aging in a metabolic syndrome-promoting environment. Prevention strategies are insufficient: they rely on intermittent screening in predominantly high-risk individuals, while most cases occur in the intermediate risk populati...
Preprint
Full-text available
Machine Learning models should ideally be compact and robust. Compactness provides efficiency and comprehensibility whereas robustness provides resilience. Both topics have been studied in recent years but in isolation. Here we present a robust model compression scheme which is independent of model types: it can compress ensembles, neural networks...
Preprint
Background: Cerebrovascular disease is a leading cause of mortality and disability. Common risk assessment tools for stroke are based on the Framingham equation, which relies on traditional cardiovascular risk factors to predict an acute event within the next decade. However, no tools are currently available to predict a near/impending stroke, which mig...
Preprint
Full-text available
When training a predictive model over medical data, the goal is sometimes to gain insights about a certain disease. In such cases, it is common to use feature importance as a tool to highlight significant factors contributing to that disease. As there are many existing methods for computing feature importance scores, understanding their relative me...
Conference Paper
A vast amount of data belonging to companies and individuals is currently stored in the cloud in encrypted form by trustworthy service providers such as Microsoft, Amazon, and Google. Unfortunately, the only way for the cloud to use the data in computations is to first decrypt it, then compute on it, and finally re-encrypt it, resulting in a proble...
Preprint
Full-text available
Search advertising is one of the most commonly used methods of advertising. Past work has shown that search advertising can be employed to improve health by eliciting positive behavioral change. However, writing effective advertisements requires expertise and (possibly expensive) experimentation, both of which may not be available to public health...
Preprint
Full-text available
Vygotsky's notions of Zone of Proximal Development and Dynamic Assessment emphasize the importance of personalized learning that adapts to the needs and abilities of the learners and enables more efficient learning. In this work we introduce a novel adaptive learning engine called E-gostky that builds on these concepts to personalize the learning p...
Preprint
Full-text available
When applying machine learning to sensitive data, one has to balance accuracy, information leakage, and computational complexity. Recent studies have shown that Homomorphic Encryption (HE) can be used for protecting against information leakage while applying neural networks. However, this comes at the cost of limiting the width and depth...
Article
Full-text available
Background: One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on specific mutations, the idea was for the data holder to encrypt...
Preprint
Full-text available
In many cases, assessing the quality of goods is hard. For example, when purchasing a car, it is hard to measure how polluting the car is since there are infinitely many driving conditions to be tested. Typically, these situations are considered under the umbrella of information asymmetry and, as Akerlof showed, may lead to a market of lemons. Howeve...
Article
Full-text available
In real world systems, the predictions of deployed Machine Learned models affect the training data available to build subsequent models. This introduces a bias in the training data that needs to be addressed. Existing solutions to this problem attempt to resolve the problem by either casting this in the reinforcement learning framework or by quanti...
Article
Full-text available
Currently known methods for this task either employ the computationally intensive exponential mechanism or require access to the covariance matrix, and therefore fail to utilize potential sparsity of the data. The problem of designing simpler and more efficient methods for this task has been raised as an open problem in \cite{kapralov2013...
Technical Report
Full-text available
This document presents a list of potential applications for homomorphic encryption. The list of applications is not comprehensive; instead, it tries to demonstrate the breadth of potential applications in various domains and therefore to demonstrate the importance of this technology. The list was curated during the Crypto Standardization Workshop th...
Article
Biological data science is an emerging field facing multiple challenges for hosting, sharing, computing on, and interacting with large data sets. Privacy regulations and concerns about the risks of leaking sensitive personal health and genomic data add another layer of complexity to the problem. Recent advances in cryptography over the last five ye...
Conference Paper
Affective computing systems blend technical and social elements and present challenges for designing notifications. As the number of applications that utilize notifications expands, we need to consider that competing for attention is a concern. With respect to promoting positive health behaviors, notifications need to be adapted to times when a user...
Conference Paper
Various wearable sensors capturing body vibration, jaw movement, hand gesture, etc., have shown promise in detecting when one is currently eating. However, based on existing literature and user surveys conducted in this study, we argue that a Just-in-Time eating intervention, triggered upon detecting a current eating event, is sub-optimal. An eatin...
Article
Applying machine learning to a problem which involves medical, financial, or other types of sensitive data, not only requires accurate predictions but also careful attention to maintaining data privacy and security. Legal and ethical requirements may prevent the use of cloud-based machine learning solutions for such tasks. In this work, we will pre...
Conference Paper
Full-text available
The development of strong social and emotional skills is central to personal wellbeing. Increasingly, these skills are being taught in schools through well researched curricula. Such social-emotional learning (SEL) curricula are most effective if reinforced by parents, thus transferring the skills into everyday contexts. Traditional SEL programs ha...
Technical Report
Applying machine learning to a problem which involves medical, financial, or other types of sensitive data, not only requires accurate predictions but also careful attention to maintaining data privacy and security. Legal and ethical requirements may prevent the use of cloud-based machine learning solutions for such tasks. In this work, we will pre...
Article
Full-text available
Multiple Additive Regression Trees (MART), an ensemble model of boosted regression trees, is known to deliver high prediction accuracy for diverse tasks, and it is widely used in practice. However, it suffers from an issue which we call over-specialization, wherein trees added at later iterations tend to impact the prediction of only a few instances, an...
Patent
Search model updates are described. In one or more implementations, a search service uses a model to rank items in a search result, the model formed using an initial set of data. An update is generated using a subsequent set of data, which is formed after the initial set of data, that provides feedback describing user interaction with one or more i...
Article
Full-text available
The problem we address is the following: how can a user employ a predictive model that is held by a third party, without compromising private information. For example, a hospital may wish to use a cloud service to predict the readmission risk of a patient. However, due to regulations, the patient's medical files cannot be revealed. The goal is to m...
Conference Paper
A prominent approach in collaborative filtering based recommender systems is using dimensionality reduction (matrix factorization) techniques to map users and items into low-dimensional vectors. In such systems, a higher inner product between a user vector and an item vector indicates that the item better suits the user's preference. Traditionally,...
Conference Paper
Full-text available
Stress is considered to be a modern-day "global epidemic"; so given the widespread nature of this problem, it would be beneficial if solutions that help people to learn how to cope better with stress were scalable beyond what individual or group therapies can provide today. Therefore, in this work, we study the potential of smart-phones as a perva...
Patent
Full-text available
Technologies are described herein for placing search results on a search engine results page (SERP). A query may be received. The query may be transmitted to a plurality of search result providers. A first set of search results and a second set of search results may be received from the search result providers. Intent features may be extracted from...
Article
Typically, one approaches a supervised machine learning problem by writing down an objective function and finding a hypothesis that minimizes it. This is equivalent to finding the Maximum A Posteriori (MAP) hypothesis for a Boltzmann distribution. However, MAP is not a robust statistic. We present an alternative approach by defining a median of the...
Article
Full-text available
In the mixture models problem it is assumed that there are $K$ distributions $\theta_{1},\ldots,\theta_{K}$ and one gets to observe a sample from a mixture of these distributions with unknown coefficients. The goal is to associate instances with their generating distributions, or to identify the parameters of the hidden distributions. In this work...
Patent
Full-text available
This patent application pertains to answer model comparison. One implementation can determine a first frequency at which an individual answer category appears in an individual slot on a query results page when utilizing a first model. The method can ascertain a second frequency at which the individual answer category appears in the individual slot...
Conference Paper
Full-text available
Many learning algorithms generate complex models that are difficult for a human to interpret, debug, and extend. In this paper, we address this challenge by proposing a new learning paradigm called correctable learning, where the learning algorithm receives external feedback about which data examples are incorrectly learned. We define a set of metr...
Conference Paper
Personal area networks are enablers for many new medical applications. In this work, we present an implementation of such a network through a guided wave on body communication channel. This method allows the creation of high bandwidth communication channels which are confined to the body and improve on previous technologies in terms of privacy and...
Article
Full-text available
Human gait is an important indicator of health, with applications ranging from diagnosis to monitoring and rehabilitation. In practice, the use of gait analysis has been limited. Existing gait analysis systems are either expensive, intrusive, or require well-controlled environments such as a clinic or a laboratory. We present an accurate gait analys...
Conference Paper
Modern web search engines are federated: a user query is sent to numerous specialized search engines, called verticals, such as Web (text documents), News, Image, and Video, and the results returned by these engines are then aggregated and composed into a search result page (SERP) and presented to the user. For a specific query, multiple vertica...
Article
Full-text available
The classification task uses observations and prior knowledge to select a hypothesis that will predict class assignments well. In this work we ask the question: what is the best hypothesis to select from a given hypothesis class? To address this question we adopt a PAC-Bayesian approach. According to this viewpoint, the observations and prior knowl...
Article
Full-text available
Online prediction methods are typically presented as serial algorithms running on a single processor. However, in the age of web-scale prediction problems, it is increasingly common to encounter situations where a single processor cannot keep up with the high rate at which inputs arrive. In this work, we present the distributed mini-batch al...
Article
Full-text available
The standard model of online prediction deals with serial processing of inputs by a single processor. However, in large-scale online prediction problems, where inputs arrive at a high rate, an increasingly common necessity is to distribute the computation across several processors. A non-trivial challenge is to design distributed algorithms for onl...
Article
Full-text available
Collecting large labeled data sets is a laborious and expensive task, whose scaling up requires division of the labeling workload between many teachers. When the number of classes is large, miscorrespondences between the labels given by the different teachers are likely to occur, which, in the extreme case, may reach total inconsistency. In this pa...
Conference Paper
Full-text available
Collecting large labeled data sets is a laborious and expensive task, whose scaling up requires division of the labeling workload between many teachers. When the number of classes is large, miscorrespondences between the labels given by the different teachers are likely to occur, which, in the extreme case, may reach total inconsistency. In this st...
Chapter
In this paper we introduce a margin based feature selection criterion and apply it to measure the quality of sets of features. Using margins we devise novel selection algorithms for multi-class categorization problems and provide a theoretical generalization bound. We also study the well known Relief algorithm and show that it resembles a gradient as...
Article
Coping with high dimensional data is a challenge for machine learning. Modern methods, such as kernel machines, show that it is possible to learn concepts even in surprisingly high dimensional spaces. However, working in high dimensions takes its toll, both in computational complexity and in accuracy. A common way to overcome these deficiencies is...
Conference Paper
Full-text available
Computer grids are complex, heterogeneous, and dynamic systems, whose behavior is governed by hundreds of manually-tuned parameters. As the complexity of these systems grows, automating the procedure of parameter tuning becomes indispensable. In this paper, we consider the problem of auto-tuning server capacity, i.e. the number of jobs a server...
Conference Paper
Full-text available
Feature selection is usually motivated by improved computational complexity, economy and problem understanding, but it can also improve classification accuracy in many cases. In this paper we investigate the relationship between the optimal number of features and the training set size. We present a new and simple analysis of the well-studied two-Ga...
Conference Paper
Full-text available
Training a learning algorithm is a costly task. A major goal of active learning is to reduce this cost. In this paper we introduce a new algorithm, KQBC, which is capable of actively learning large scale problems by using selective sampling. The algorithm overcomes the costly sampling step of the well known Query By Committee (QBC) algorithm by...
Article
Full-text available
Feature selection is the task of choosing a small set out of a given set of features that capture the relevant properties of the data. In the context of supervised classification problems the relevance is determined by the given labels on the training data. A good choice of features is a key for building compact and accurate classifiers. In this pa...
Conference Paper
Full-text available
The Bayes classifier achieves the minimal error rate by constructing a weighted majority over all concepts in the concept class. The Bayes Point [1] uses the single concept in the class which has the minimal error. This way, the Bayes Point avoids some of the deficiencies of the Bayes classifier. We prove a bound on the generalization error for Bay...
Article
Full-text available
The Query By Committee (QBC) algorithm is among the few algorithms in the active learning framework that has some theoretical justification.
Article
Full-text available
A fundamental question in learning theory is the quantification of the basic tradeoff between the complexity of a model and its predictive accuracy. One valid way of quantifying this tradeoff, known as the "Information Bottleneck", is to measure both the complexity of the model and its prediction accuracy by using Shannon's mutual information. In t...
Article
Full-text available
Prototype-based algorithms are commonly used to reduce the computational complexity of Nearest-Neighbour (NN) classifiers. In this paper we discuss theoretical and algorithmic aspects of such algorithms. On the theory side, we present margin based generalization bounds that suggest that these kinds of classifiers can be more accurate than the 1-...
Article
A long-standing goal in the realm of Machine Learning is to minimize sample-complexity, i.e. to reduce as much as possible the number of examples used in the course of learning. The Active Learning paradigm is one such method aimed at achieving this goal by transforming the learner from a passive participant in the information gathering process to...
Article
Full-text available
This paper concerns the online list accessing problem. In the first part of the paper we present two new families of list accessing algorithms. The first family is of optimal, 2-competitive, deterministic online algorithms. This family, called the MRI (MOVE-TO-RECENT-ITEM) family, includes as members the well-known MOVE-TO-FRONT (MTF) algorithm and...
Article
Full-text available
A fundamental problem in learning theory is bounding the information gained by an example about the unknown target concept. This problem is most critical in the context of active learning, when the learner has to select the most informative examples to be labeled in order to minimize the number of labels required for good generalization. The Mutual...
Article
Recent studies of supervised learning algorithms for text classifiers employ (very) large data sets for training and evaluation. Nevertheless, there are many important situations that cannot be faithfully represented by such large data sets and are characterized mainly by inherently small training sets. In this paper we focus on three such "small" bin...
Article
Full-text available
Much of the research in machine learning and neural computation assumes that the teacher gives the correct answers to the learning algorithm. However, in many cases this assumption is doubtful, since different sorts of noise may prevent the learner from getting the correct answers all the time. The noise could be caused by noisy communication, huma...
Article
Generalization in most PAC learning analysis starts around O(d) examples, where d = VCdim of the class. Nevertheless, analysis of learning curves using statistical mechanics shows much earlier generalization [7]. Here we introduce a gadget called Early Predictor, which exists if somewhat better than random prediction of the label of an arbitrary...
Article
Full-text available
We present a new scheme for cleaning a sample corrupted by labeling noise. Our scheme is universal in the sense that we only make general assumptions on the dual learning problem and therefore it is completely detached from the specifics of the primal problem itself. In a nutshell, we turn to the dual learning problem to exploit valuable informatio...
Conference Paper
Full-text available
Recent works have shown the advantage of using Active Learning methods, such as the Query by Committee (QBC) algorithm, to various learning problems. This class of Algorithms requires an oracle with the ability to randomly select a consistent hypothesis according to some predefined distribution. When trying to implement such an oracle, for the line...
Article
Full-text available
Generalization in most PAC learning analysis starts around O(d) examples, where d = VCdim of the class. Nevertheless, analysis of learning curves using statistical mechanics shows much earlier generalization [7]. Here we introduce a gadget called Early Predictor, which exists if somewhat better than random prediction of the label of an arbitrary...