About
Publications: 60
Reads: 3,411
Citations: 553
Publications (60)
Understanding engagement patterns of users in online platforms, be it games, online social networks, or academic websites, is a widely studied topic with many real-world applications and economic consequences. A holy grail in this area of research is to develop an automatic prediction algorithm for when a user is going to leave the platform and...
Modeling human personality is important for several AI challenges, from the engineering of artificial psychotherapists to the design of persona bots. However, the field of computational personality analysis heavily relies on labeled data, which may be expensive, difficult or impossible to get. This problem is amplified when dealing with rare person...
It's by now folklore that to understand the activity pattern of a user in an online social network (OSN) platform, one needs to look at his friends or the ones he follows. The common perception is that these friends exert influence on the user, affecting his decision whether to re-share content or not. Hinging upon this intuition, a variety of mode...
Stance detection is an important task, supporting many downstream tasks such as discourse parsing and modeling the propagation of fake news, rumors, and science denial. In this paper, we propose a novel framework for stance detection. Our framework is unsupervised and domain-independent. Given a claim and a multi-participant discussion -- we constr...
E-commerce is the fastest-growing segment of the economy. Online reviews play a crucial role in helping consumers evaluate and compare products and services. As a result, fake reviews (opinion spam) are becoming more prevalent and negatively impacting customers and service providers. There are many reasons why it is hard to identify opinion spammer...
The principle of compositionality, an important postulation in language and cognition research, posits that the meaning of a complex expression is determined by the meaning of its constituting parts and the operation performed on those parts. Here, we provide strong evidence that this principle plays a significant role also in interpreting facial e...
Stance detection is an important task, supporting many downstream tasks such as discourse parsing and modeling the propagation of fake news, rumors, and science denial. In this paper, we propose a novel framework for stance detection. Our framework is unsupervised and domain-independent. Given a claim and a multi-participant discussion - we constru...
DNA–protein interactions play essential roles in all living cells. Understanding of how features embedded in the DNA sequence affect specific interactions with proteins is both challenging and important, since it may contribute to finding the means to regulate metabolic pathways involving DNA–protein interactions. Using a massive experimental bench...
DNA–protein interactions are essential in all aspects of every living cell. Understanding of how features embedded in the DNA sequence affect specific interactions with proteins is challenging but important, since it may contribute to finding the means to regulate metabolic pathways involving DNA–protein interactions. Using a massive experimental b...
Complex social systems at various scales of analysis (e.g. dyads, families, tribes, etc.) are formed and maintained through verbal interactions. Therefore, the ability to (1) model these interactions and (2) to use models of interaction for identifying significant relations may be of interest to the social sciences. Adopting the perspective of soci...
The taxing computational effort that is involved in solving some high-dimensional statistical problems, in particular problems involving non-convex optimization, has popularized the development and analysis of algorithms that run efficiently (polynomial-time) but with no general guarantee on statistical consistency. In light of the ever-increasing...
M. tuberculosis (Mtb) is a pathogenic bacterium that causes tuberculosis, which kills more than 1.5 million people worldwide every year. Resistant strains to available antibiotics pose a significant healthcare problem. The enormous complexity of the ribosome poses a barrier for drug discovery. We have overcome this in a tractable way by using an RN...
This paper studies students' engagement in e-learning environments in which students work independently and solve problems without external supervision. We propose a new method to infer engagement patterns of users in such self-directed environments. We view engagement as a continuous process in time, measured along carefully chosen axes that are d...
We have developed new lead compounds that target the ribosomal peptidyl transferase center (PTC) of Mycobacterium tuberculosis. For this purpose, we used a fragment-based virtual screening (FBVS) methodology, in which the first step was the novel exploitation of NMR T2 relaxation to identify fragment molecules that bind specifically to RNA hairpin...
Studies of emotional facial expressions reveal agreement among observers about the meaning of six to fifteen basic static expressions. Other studies focused on the temporal evolution within single facial expressions. Here, we argue that people infer a larger set of emotion states than previously assumed, by taking into account sequences of differe...
In this work we ask to what extent simple statistics are useful for making sense of social media data. By simple statistics we mean counting and bookkeeping type features such as the number of likes given to a user's post, a user's number of friends, etc. We find that relying solely on simple statistics is not always a good approach. Specifically, we...
There are certain contexts where we would like to analyze the behavior of small interacting systems, such as sports teams. While large interacting systems have drawn much attention in the past years, be it physical systems of interacting particles or social networks, small systems are short of appropriate quantitative modeling and measurement...
To optimize its performance, a competitive team, such as a soccer team, must maintain a delicate balance between organization and disorganization. On the one hand, the team should maintain organized patterns of behavior to maximize the cooperation between its members. On the other hand, the team’s behavior should be disordered enough to mislead its...
Anomaly detection in a communication network is a powerful tool for predicting faults, detecting network sabotage attempts and learning user profiles for marketing purposes and quality of services improvements. In this article, we convert the unsupervised data mining learning problem into a supervised classification problem. We will propose three m...
Lagging strand DNA synthesis by DNA polymerase requires RNA primers produced by DNA primase. The N-terminal primase domain of the gene 4 protein of phage T7 comprises a zinc-binding domain that recognizes a specific DNA sequence and an RNA polymerase domain that catalyzes RNA polymerization. Based on its crystal structure, the RNA polymerase domain...
We present a new method to construct a family of co-spectral graphs. Our method is based on a new type of graph product that we define, the bipartite graph product, which may be of independent interest. Our method is different from existing techniques in the sense that it is not based on a sequence of local graph operations (e.g. Godsil–McKay switching)....
Based on a non-rigorous formalism called the “cavity method”, physicists have put forward intriguing predictions on phase transitions in diluted mean-field models, in which the geometry of interactions is induced by a sparse random graph or hypergraph. One example of such a model is the graph coloring problem on the Erdős–Rényi random graph G(n, d/...
For a fixed number $d>0$ and $n$ large, let $G(n,d/n)$ be the random graph on $n$ vertices in which any two vertices are connected with probability $d/n$ independently. The problem of determining the chromatic number of $G(n,d/n)$ goes back to the famous 1960 article of Erdős and Rényi that started the theory of random graphs [Magyar Tud. Akad. Ma...
Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is under what conditions can such al...
Based on a non-rigorous formalism called the "cavity method", physicists have put forward intriguing predictions on phase transitions in discrete structures. One of the most remarkable ones is that in problems such as random $k$-SAT or random graph $k$-coloring, very shortly before the threshold for the existence of solutions there occurs another p...
In the Graph Realization Problem (GRP), one is given a graph G, a set of non-negative edge-weights, and an integer d. The goal is to find, if possible, a realization of G in the Euclidean space R^d, such that the distance between any two vertices is the assigned edge weight. The problem has many applications in mathematics and computer science, bu...
Traditional studies of multi-source, multi-terminal interference channels typically allow a vanishing probability of error in communication. Motivated by the study of network coding, this work addresses the task of quantifying the loss in rate when insisting on zero error communication in the context of interference channels.
Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were suggested for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question asks under what conditions can such a...
Over the past decade, physicists have developed deep but non-rigorous techniques for studying phase transitions in discrete structures. Recently, their ideas have been harnessed to obtain improved rigorous results on the phase transitions in binary problems such as random $k$-SAT or $k$-NAESAT (e.g., Coja-Oghlan and Panagiotou: STOC 2013). However,...
The problem of (approximately) counting the number of triangles in a graph is one of the basic problems in graph theory. In this paper we study the problem in the streaming model. We study the amount of memory required by a randomized algorithm to solve this problem. In case the algorithm is allowed one pass over the stream, we present a best possi...
Given a graph $G=(V,E)$, an integer $k$, and a function $f_G:V^k \times V^k \to \{0,1\}$, the $k^{th}$ graph product of $G$ w.r.t. $f_G$ is the graph with vertex set $V^k$, and an edge between two vertices $x=(x_1,...,x_k)$ and $y=(y_1,...,y_k)$ iff $f_G(x,y)=1$. Graph products are a basic combinatorial object, widely studied and used in different are...
Consider the random graph process where we start with an empty graph on n vertices and, at time t, are given an edge e_t chosen uniformly at random among the edges which have not appeared so far. A classical result in random graph theory asserts that w.h.p. the graph becomes Hamiltonian at time (1/2+o(1))n log n. On the contrary, if all the edges...
In a load balancing network each processor has an initial collection of unit-size jobs, tokens, and in each round, pairs of processors connected by balancers split their load as evenly as possible. An excess token (if any) is placed according to some predefined rule. As it turns out, this rule crucially affects the performance of the network. In th...
Consider the random graph process where we start with an empty graph on n vertices, and at time t, are given an edge e_t chosen uniformly at random among the edges which have not appeared so far. A classical result in random graph theory asserts that w.h.p. the graph becomes Hamiltonian at time (1/2+o(1))n log n. On the contrary, if all the edges w...
Let \((C_1,C'_1),(C_2,C'_2),\ldots,(C_m,C'_m)\) be a sequence of ordered pairs of 2CNF clauses chosen uniformly at random (with replacement) from the set of all \(4\binom{n}{2}\) clauses on n variables. Choosing exactly one clause from each pair defines a probability distribution over 2CNF formulas. The choice at each step must be made on-line, without bac...
Belief propagation (BP) is a message-passing algorithm that computes the exact marginal distributions at every vertex of a graphical model without cycles. While BP is designed to work correctly on trees, it is routinely applied to general graphical models that may contain cycles, in which case neither convergence, nor correctness in the case of con...
It is known that random k-CNF formulas have a so-called satisfiability threshold at a density (namely, clause-variable ratio) of roughly $2^k \ln 2$: at densities slightly below this threshold almost all k-CNF formulas are satisfiable whereas slightly above this threshold almost no k-CNF formula is satisfiable. In the current work we consider satisfia...
Recently, Hazan and Krauthgamer showed [12] that if, for a fixed small ε, an ε-best ε-approximate Nash equilibrium can be found in polynomial time in two-player games, then it is also possible to find a planted clique in $G_{n,1/2}$ of size C log n, where C is a large fixed constant independent of ε. In this paper, we extend their result to show that i...
Experimental results show that certain message passing algorithms, namely, Survey Propagation, are very effective in finding satisfying assignments for random satisfiable 3CNF formulas which are considered hard for other SAT heuristics. Unfortunately, rigorous understanding of this phenomenon is still lacking. In this paper we make a modest step tow...
In this paper we study the model of ε-smoothed k-CNF formulas. Starting from an arbitrary instance F with n variables and m = dn clauses, apply the ε-smoothing operation of flipping the polarity of every literal in every clause independently at random with probability ε. Keeping ε and k fixed, and letting the density d = m/n grow, it is rather easy...
As part of the efforts put in understanding the intricacies of the k-colorability problem, different distributions over k-colorable graphs were analyzed. While the problem is notoriously hard (not even reasonably approximable) in the worst case, the average case (with respect to such distributions) often turns out to be "easy". Semi-random models m...
In this work we suggest a new model for generating random satisfiable k-CNF formulas. To generate such formulas, randomly permute all $2^k\binom{n}{k}$ possible clauses over the variables $x_1, \ldots, x_n$, and starting from the empty formula, go over the clauses one by one, including each new clause as you go along if, after its addition, the form...
Contributing to the rigorous understanding of BP, in this paper we relate the convergence of BP to spectral properties of the graph. This encompasses a result for random graphs with a "planted" solution; thus, we obtain the first rigorous result on BP for graph coloring in the case of a complex graphical structure (as opposed to trees). In partic...
Message passing algorithms are popular in many combinatorial optimization problems. For example, experimental results show that survey propagation (a certain message passing algorithm) is effective in finding proper k-colorings of random graphs in the near-threshold regime. In 1962 Gallager introduced the concept of Low...
We study a new approach to the satisfiability problem, which we call the Support Paradigm. Given a CNF formula F and an assignment to its variables, we say that a literal x supports a clause C in F w.r.t. the assignment if x is the only literal that evaluates to true in C. Our focus in this work will be on heuristics that obey the following general template: start...
Coloring a k-colorable graph using k colors (k≥3) is a notoriously hard problem. Considering average case analysis allows for better results. In this work we consider the uniform distribution over k-colorable graphs with n vertices and exactly cn edges, c greater than some sufficiently large constant. We rigorously show that all proper k-colorings...
Finding a satisfying assignment for a $k$-CNF formula $(k \geq 3)$, assuming such exists, is a notoriously hard problem. In this work we consider the uniform distribution over satisfiable $k$-CNF formulas with a linear number of clauses (clause-variable ratio greater than some constant). We rigorously analyze the structure of...
Belief Propagation (BP) is a message-passing algorithm that computes the exact marginal distributions at every vertex of a graphical model without cycles. While BP is designed to work correctly on trees, it is routinely applied to general graphical models that may contain cycles, in which case neither convergence, nor correctness in the case of co...
Semirandom models generate problem instances by blending random and adversarial decisions, thus intermediating between the worst-case assumptions that may be overly pessimistic in many situations, and the easy pure random case. In the $G_{n,p,k}$ random graph model, the n vertices are partitioned into k color classes each of size n/k. Then, ever...
We present an algorithm for solving 3SAT instances. Several algorithms have been proved to work whp (with high probability) for various SAT distributions. However, an algorithm that works whp has a drawback: for typical instances it works well, but for some rare inputs it does not provide a solution at all. Alternatively, one could requi...
In this work we suggest a new model for generating random satisfiable 3CNF formulas. To generate such formulas, randomly permute all $8\binom{n}{3}$ possible clauses over the variables $x_1,\ldots,x_n$, and starting from the empty formula, go over the clauses one by one, including each new clause as you go along if after its addition the formula remains satisfi...