Preprint

On the MCMC Performance in Bernoulli Group Testing and the Random Max-Set Problem


Abstract

The group testing problem is a canonical inference task where one seeks to identify $k$ infected individuals out of a population of $n$ people, based on the outcomes of $m$ group tests. Of particular interest is the case of Bernoulli group testing (BGT), where each individual participates in each test independently with a fixed probability. BGT is known to be an ``information-theoretically'' optimal design, as there exists a decoder that identifies the infected individuals with high probability as $n$ grows using $m^* = \log_2 \binom{n}{k}$ BGT tests, which is the minimum required number of tests among \emph{all} group testing designs. An important open question in the field is whether a polynomial-time decoder exists for BGT that also succeeds with $m^*$ samples. In a recent paper (Iliopoulos, Zadik COLT '21), some evidence was presented (but no proof) that a simple low-temperature MCMC method could succeed. The evidence was based on a first-moment (or ``annealed'') analysis of the landscape, as well as simulations showing the success of MCMC for $n \approx 1000$s. In this work, we prove that, despite the intriguing success in simulations for small $n$, the class of MCMC methods proposed in previous work for BGT with $m^*$ samples takes super-polynomial-in-$n$ time to identify the infected individuals, when $k = n^{\alpha}$ for $\alpha \in (0,1)$ small enough. Towards obtaining our results, we establish the tight max-satisfiability thresholds of the random $k$-set cover problem, a result of potentially independent interest in the study of random constraint satisfaction problems.
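For concreteness, the following is a minimal simulation sketch of the BGT model described in the abstract, at the information-theoretic number of tests $m^*$. The participation probability $p = \ln 2 / k$ is an assumption (a standard near-optimal choice in the BGT literature), not something fixed by the abstract itself.

```python
import math
import numpy as np

def m_star(n, k):
    """Information-theoretic optimum m* = log2 C(n, k) from the abstract."""
    return math.log2(math.comb(n, k))

def bernoulli_group_testing(n=1000, k=10, nu=math.log(2), seed=0):
    """One BGT instance with m = ceil(m*) tests: each individual joins
    each test independently with probability nu/k (nu = ln 2 is an
    assumed, standard choice); a test is positive iff it contains at
    least one infected individual."""
    rng = np.random.default_rng(seed)
    m = math.ceil(m_star(n, k))
    infected = rng.choice(n, size=k, replace=False)
    design = rng.random((m, n)) < nu / k        # m x n Bernoulli test matrix
    outcomes = design[:, infected].any(axis=1)  # OR of the infected columns
    return design, infected, outcomes

design, infected, outcomes = bernoulli_group_testing()
print(f"m* tests: {design.shape[0]}, positive tests: {int(outcomes.sum())}")
```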


References
Article
In this review article we discuss connections between the physics of disordered systems, phase transitions in inference problems, and computational hardness. We introduce two models representing the behavior of glassy systems, the spiked tensor model and the generalized linear model. We discuss the random (non-planted) versions of these problems as prototypical optimization problems, as well as the planted versions (with a hidden solution) as prototypical problems in statistical inference and learning. Based on ideas from physics, many of these problems have transitions where they are believed to jump from easy (solvable in polynomial time) to hard (requiring exponential time). We discuss several emerging ideas in theoretical computer science and statistics that provide rigorous evidence for hardness by proving that large classes of algorithms fail in the conjectured hard regime. This includes the overlap gap property, a particular mathematization of clustering or dynamical symmetry-breaking, which can be used to show that many algorithms that are local or robust to changes in their input fail. We also discuss the sum-of-squares hierarchy, which places bounds on proofs or algorithms that use low-degree polynomials such as standard spectral methods and semidefinite relaxations, including the Sherrington–Kirkpatrick model. Throughout the manuscript we present connections to the physics of disordered systems and associated replica symmetry breaking properties.
Article
Significance Frequent mass testing can slow a rapidly spreading infectious disease by quickly identifying and isolating infected individuals from the population. One proposed method to reduce the extremely high costs of this testing strategy is to employ pooled testing, in which samples are combined and tested together using one test, and the entire pool is cleared given a negative test result. This paper demonstrates that frequent pooled testing of individuals with correlated risk—even given large uncertainty about infection rates—is particularly efficient. We conclude that frequent pooled testing using natural groupings is a cost-effective way to suppress infection risk in a pandemic.
Article
We study support recovery for a $k \times k$ principal submatrix with elevated mean $\lambda/N$, hidden in an $N \times N$ symmetric mean-zero Gaussian matrix. Here $\lambda > 0$ is a universal constant, and we assume $k = N\rho$ for some constant $\rho \in (0,1)$. We establish that there exists a constant $C > 0$ such that the MLE recovers a constant proportion of the hidden submatrix if $\lambda \ge C \sqrt{\frac{1}{\rho} \log \frac{1}{\rho}}$, while such recovery is information theoretically impossible if $\lambda = o\big(\sqrt{\frac{1}{\rho} \log \frac{1}{\rho}}\big)$. The MLE is computationally intractable in general, and in fact, for $\rho > 0$ sufficiently small, this problem is conjectured to exhibit a statistical-computational gap. To provide rigorous evidence for this, we study the likelihood landscape for this problem, and establish that for some $\varepsilon > 0$ and $\sqrt{\frac{1}{\rho} \log \frac{1}{\rho}} \ll \lambda \ll \frac{1}{\rho^{1/2 + \varepsilon}}$, the problem exhibits a variant of the Overlap-Gap-Property (OGP). As a direct consequence, we establish that a family of local MCMC based algorithms do not achieve optimal recovery. Finally, we establish that for $\lambda > 1/\rho$, a simple spectral method recovers a constant proportion of the hidden submatrix.
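To illustrate the regime $\lambda > 1/\rho$ where the simple spectral method succeeds, here is a hedged simulation sketch; the GOE normalization (entries of variance $1/N$) and all parameter values are assumptions chosen for illustration and may differ from the paper's exact scaling.

```python
import numpy as np

def planted_submatrix(N=1000, rho=0.1, lam=40.0, seed=1):
    """Sample A = (lam/N) * 1_S 1_S^T + W, with W symmetric Gaussian of
    off-diagonal variance 1/N (assumed normalization, spectral edge ~2).
    With lam*rho = 4 > 2 the planted rank-one part sticks out."""
    rng = np.random.default_rng(seed)
    k = int(rho * N)
    S = rng.choice(N, size=k, replace=False)
    G = rng.normal(scale=1 / np.sqrt(N), size=(N, N))
    A = (G + G.T) / np.sqrt(2)
    A[np.ix_(S, S)] += lam / N  # elevated mean on the hidden block
    return A, set(S), k

def spectral_support(A, k):
    """Simple spectral method: take the k largest-magnitude coordinates
    of the leading eigenvector."""
    vals, vecs = np.linalg.eigh(A)
    v = vecs[:, np.argmax(vals)]
    return set(np.argsort(-np.abs(v))[:k])

A, S, k = planted_submatrix()
S_hat = spectral_support(A, k)
print(f"recovered fraction: {len(S & S_hat) / k:.2f}")
```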
Article
The group testing problem consists of determining a small set of defective items from a larger set of items based on a number of tests, and is relevant in applications such as medical testing, communication protocols, pattern matching, and more. In this paper, we revisit an efficient algorithm for noisy group testing in which each item is decoded separately (Malyutov and Mateev, 1980), and develop novel performance guarantees via an information-theoretic framework for general noise models. For the special cases of no noise and symmetric noise, we find that the asymptotic number of tests required for vanishing error probability is within a factor $\log 2 \approx 0.7$ of the information-theoretic optimum at low sparsity levels, and that with a small fraction of allowed incorrectly decoded items, this guarantee extends to all sublinear sparsity levels. In addition, we provide a converse bound showing that if one tries to move slightly beyond our low-sparsity achievability threshold using separate decoding of items and i.i.d. randomized testing, the average number of items decoded incorrectly approaches that of a trivial decoder.
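In the noiseless special case, decoding each item separately reduces to a particularly simple per-item rule (essentially the COMP rule). The sketch below illustrates that special case only, not the general noisy threshold decoder analyzed in the paper.

```python
import numpy as np

def separate_decode_noiseless(design, outcomes):
    """Noiseless separate decoding (COMP rule): declare an item defective
    iff every test containing it is positive. Each item is decoded from
    its own column of the design, independently of all other items."""
    appears_in_negative = (design & ~outcomes[:, None]).any(axis=0)
    return ~appears_in_negative  # True = declared defective

# toy noiseless instance: 3 defectives among 50 items, 40 Bernoulli tests
rng = np.random.default_rng(0)
truth = np.zeros(50, dtype=bool); truth[[3, 17, 42]] = True
design = rng.random((40, 50)) < 0.2
outcomes = (design & truth).any(axis=1)
estimate = separate_decode_noiseless(design, outcomes)
print("missed:", int((truth & ~estimate).sum()),
      "false alarms:", int((~truth & estimate).sum()))
```

By construction this rule never misses a defective (all of its tests are positive); its errors are false alarms, which vanish as the number of tests grows.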
Conference Paper
This paper proposes a new fingerprinting decoder based on the Markov Chain Monte Carlo (MCMC) method. A Gibbs sampler generates groups of users according to the posterior probability that these users could have forged the sequence extracted from the pirated content. The marginal probability that a given user belongs to the collusion is then estimated by a Monte Carlo method. The users with the largest empirical marginal probabilities are accused. This MCMC method can decode any type of fingerprinting code. This paper is in the spirit of the 'Learn and Match' decoding strategy: it assumes that the collusion attack belongs to a family of models. The Expectation-Maximization algorithm estimates the parameters of the collusion model from the extracted sequence. This part of the algorithm is described for the binary Tardos code and with the exploitation of the soft outputs of the watermarking decoder. The experimental evaluation considers some extreme setups where the fingerprinting code lengths are very small. It reveals that the weak link of our approach is the estimation part. This is a clear warning to the 'Learn and Match' decoding strategy.
Conference Paper
A large class of computational problems involve the determination of properties of graphs, digraphs, integers, arrays of integers, finite families of finite sets, boolean formulas and elements of other countable domains. Through simple encodings from such domains into the set of words over a finite alphabet these problems can be converted into language recognition problems, and we can inquire into their computational complexity. It is reasonable to consider such a problem satisfactorily solved when an algorithm for its solution is found which terminates within a number of steps bounded by a polynomial in the length of the input. We show that a large number of classic unsolved problems of covering, matching, packing, routing, assignment and sequencing are equivalent, in the sense that either each of them possesses a polynomial-bounded algorithm or none of them does.
Article
The subset sum problem is to decide whether or not the 0-1 integer programming problem $\sum_{i=1}^{n} a_i x_i = M$, $x_i \in \{0,1\}$ for all $i$, has a solution, where the $a_i$ and $M$ are given positive integers. This problem is NP-complete, and the difficulty of solving it is the basis of public-key cryptosystems of knapsack type. An algorithm is proposed that searches for a solution when given an instance of the subset sum problem. This algorithm always halts in polynomial time but does not always find a solution when one exists. It converts the problem to one of finding a particular short vector $v$ in a lattice, and then uses a lattice basis reduction algorithm due to A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász to attempt to find $v$. The performance of the proposed algorithm is analyzed. Let the density $d$ of a subset sum problem be defined by $d = n/\log_2(\max_i a_i)$. Then for ``almost all'' problems of density $d < c/n$, for a suitable constant $c$, it is proved that the lattice basis reduction algorithm locates $v$. Extensive computational tests of the algorithm suggest that it works for densities $d < d_c(n)$, where $d_c(n)$ is a cutoff value that is substantially larger than $1/n$. This method gives a polynomial time attack on knapsack public-key cryptosystems that can be expected to break them if they transmit information at rates below $d_c(n)$, as $n \to \infty$.
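The density and the lattice from this abstract are easy to write down explicitly. The sketch below constructs the (unreduced) basis on a toy instance; the scaling weight N is an arbitrary illustrative choice, and one would hand the resulting basis to an LLL implementation (e.g., fpylll), which is not done here.

```python
import numpy as np

def subset_sum_density(a):
    """Density d = n / log2(max_i a_i) of a subset sum instance."""
    a = np.asarray(a, dtype=float)
    return len(a) / np.log2(a.max())

def lagarias_odlyzko_basis(a, M, N=None):
    """Lattice basis whose short vectors encode subset sum solutions:
    rows (e_i | N*a_i) for each item plus (0,...,0 | N*M). If
    sum_i a_i x_i = M, then sum_i x_i b_i - b_{n+1} = (x_1,...,x_n, 0)
    is a short lattice vector that LLL may find."""
    a = list(a)
    n = len(a)
    if N is None:
        N = n  # weight forcing the last coordinate of short vectors to 0
    B = np.zeros((n + 1, n + 1), dtype=np.int64)
    B[:n, :n] = np.eye(n, dtype=np.int64)
    B[:n, n] = [N * ai for ai in a]
    B[n, n] = N * M
    return B

a = [8, 13, 22, 47, 61]           # toy (high-density) instance
M = 8 + 22 + 61                   # solution x = (1, 0, 1, 0, 1)
print(f"density d = {subset_sum_density(a):.3f}")
print(lagarias_odlyzko_basis(a, M))
```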
Article
In binary high-throughput screening projects where the goal is the identification of low-frequency events, beyond the obvious issue of efficiency, false positives and false negatives are a major concern. Pooling constitutes a natural solution: it reduces the number of tests, while providing critical duplication of the individual experiments, thereby correcting for experimental noise. The main difficulty consists in designing the pools in a manner that is both efficient and robust: few pools should be necessary to correct the errors and identify the positives, yet the experiment should not be too vulnerable to biological shakiness. For example, some information should still be obtained even if there are slightly more positives or errors than expected. This is known as the group testing problem, or pooling problem. In this paper, we present a new non-adaptive combinatorial pooling design: the "shifted transversal design" (STD). It relies on arithmetic, and rests on two intuitive ideas: minimizing the co-occurrence of objects, and constructing pools of constant-sized intersections. We prove that it allows unambiguous decoding of noisy experimental observations. This design is highly flexible, and can be tailored to function robustly in a wide range of experimental settings (i.e., numbers of objects, fractions of positives, and expected error rates). Furthermore, we show that our design compares favorably, in terms of efficiency, to the previously described non-adaptive combinatorial pooling designs. This method is currently being validated by field-testing in the context of yeast two-hybrid interactome mapping, in collaboration with Marc Vidal's lab at the Dana Farber Cancer Institute. Many similar projects could benefit from using the Shifted Transversal Design.
Article
In this paper we present a detailed study of the hitting set (HS) problem. This problem is a generalization of the standard vertex cover to hypergraphs: one seeks a configuration of particles with minimal density such that every hyperedge of the hypergraph contains at least one particle. It can also be used in important practical tasks, such as the group testing procedures where one wants to detect defective items in a large group by pool testing. Using a statistical mechanics approach based on the cavity method, we study the phase diagram of the HS problem, in the case of random regular hypergraphs. Depending on the values of the variable and test degrees, different situations can occur: the HS problem can be either in a replica symmetric phase, or in a one-step replica symmetry breaking one. In these two cases, we give explicit results on the minimal density of particles, and the structure of the phase space. These problems are thus in some sense simpler than the original vertex cover problem, where the need for a full replica symmetry breaking has prevented the derivation of exact results so far. Finally, we show that decimation procedures based on the belief propagation and the survey propagation algorithms provide very efficient strategies to solve large individual instances of the hitting set problem.
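For readers unfamiliar with the objective, a plain greedy baseline (not the cavity-method or survey-propagation decimation studied in the paper) makes the hitting set problem concrete:

```python
def greedy_hitting_set(hyperedges):
    """Greedy heuristic for hitting set: repeatedly pick the element that
    hits the most still-uncovered hyperedges. This is the classical
    O(log m)-approximation baseline, shown purely for illustration."""
    uncovered = [set(e) for e in hyperedges]
    solution = set()
    while uncovered:
        counts = {}
        for e in uncovered:          # count potential hits per element
            for v in e:
                counts[v] = counts.get(v, 0) + 1
        best = max(counts, key=counts.get)
        solution.add(best)
        uncovered = [e for e in uncovered if best not in e]
    return solution

edges = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}, {1, 3, 5}, {0, 3, 4}]
print(greedy_hitting_set(edges))  # e.g. {0, 3}
```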
Article
Maximum satisfiability is a canonical NP-hard optimization problem that appears empirically hard for random instances. Let us say that a conjunctive normal form (CNF) formula consisting of $k$-clauses is $p$-satisfiable if there exists a truth assignment satisfying a fraction $1 - 2^{-k} + p\,2^{-k}$ of all clauses (observe that every $k$-CNF is 0-satisfiable). Also, let $F_k(n,m)$ denote a random $k$-CNF on $n$ variables formed by selecting uniformly and independently $m$ out of all possible $k$-clauses. It is easy to prove that for every $k > 1$ and every $p \in (0,1]$, there is $R_k(p)$ such that if $r > R_k(p)$, then the probability that $F_k(n, rn)$ is $p$-satisfiable tends to 0 as $n$ tends to infinity. We prove that there exists a sequence $\delta_k \to 0$ such that if $r < (1-\delta_k) R_k(p)$ then the probability that $F_k(n, rn)$ is $p$-satisfiable tends to 1 as $n$ tends to infinity. The sequence $\delta_k$ tends to 0 exponentially fast in $k$.
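The parenthetical claim that every $k$-CNF is 0-satisfiable follows from a one-line expectation argument:

```latex
% A uniformly random truth assignment falsifies a k-clause only when all
% k of its literals are set against it, which happens with probability 2^{-k}:
\mathbb{E}\!\left[\frac{\#\,\text{satisfied clauses}}{m}\right]
  = \frac{1}{m}\sum_{j=1}^{m}\Pr[\text{clause } j \text{ satisfied}]
  = 1 - 2^{-k}.
```

Hence some assignment satisfies at least a $(1 - 2^{-k})$-fraction of the clauses, i.e., the formula is 0-satisfiable.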
Article
We establish a phase transition known as the “all-or-nothing” phenomenon for noiseless discrete channels. This class of models includes the Bernoulli group testing model and the planted Gaussian perceptron model. Previously, the existence of the all-or-nothing phenomenon for such models was only known in a limited range of parameters. Our work extends the results to all signals with arbitrary sublinear sparsity. Over the past several years, the all-or-nothing phenomenon has been established in various models as an outcome of two seemingly disjoint results: one positive result establishing the “all” half of all-or-nothing, and one impossibility result establishing the “nothing” half. Our main technique in the present work is to show that for noiseless discrete channels, the “all” half implies the “nothing” half, that is, a proof of “all” can be turned into a proof of “nothing.” Since the “all” half can often be proven by straightforward means—for instance, by the first-moment method—our equivalence gives a powerful and general approach towards establishing the existence of this phenomenon in other contexts.
Article
We study a variant of the sparse PCA (principal component analysis) problem in the "hard" regime, where the inference task is possible yet no polynomial-time algorithm is known to exist. Prior work, based on the low-degree likelihood ratio, has conjectured a precise expression for the best possible (subexponential) runtime throughout the hard regime. Following instead a statistical physics-inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low-degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst-case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the overlap gap property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime.
Article
In this paper, we study the problem of non-adaptive group testing, in which one seeks to identify which items are defective given a set of suitably-designed tests whose outcomes indicate whether or not at least one defective item was included in the test. The most widespread recovery criterion seeks to exactly recover the entire defective set, and relaxed criteria such as approximate recovery and list decoding have also been considered. In this paper, we study the fundamental limits of group testing under the significantly relaxed weak recovery criterion, which only seeks to identify a small fraction (e.g., 0.01) of the defective items. Given the near-optimality of i.i.d. Bernoulli testing for exact recovery in sufficiently sparse scaling regimes, it is natural to ask whether this design additionally succeeds with much fewer tests under weak recovery. Our main negative result shows that this is not the case, and in fact, under i.i.d. Bernoulli random testing in the sufficiently sparse regime, an all-or-nothing phenomenon occurs: When the number of tests is slightly below a threshold, weak recovery is impossible, whereas when the number of tests is slightly above the same threshold, high-probability exact recovery is possible. In establishing this result, we additionally prove similar negative results under Bernoulli designs for the weak detection problem (distinguishing between the group testing model vs. completely random outcomes) and the problem of identifying a single item that is definitely defective. On the positive side, we show that all three relaxed recovery criteria can be attained using considerably fewer tests under suitably-chosen non-Bernoulli designs. Thus, our results collectively indicate that when too few tests are available, naively applying i.i.d. Bernoulli testing can lead to catastrophic failure, whereas “cutting one’s losses” and adopting a more carefully-chosen design can still succeed in attaining these less stringent criteria.
Article
The group testing problem concerns discovering a small number of defective items within a large population by performing tests on pools of items. A test is positive if the pool contains at least one defective, and negative if it contains no defectives. This is a sparse inference problem with a combinatorial flavour, with applications in medical testing, biology, telecommunications, information technology, data science, and more. In this monograph, we survey recent developments in the group testing problem from an information-theoretic perspective. We cover several related developments: achievability bounds for optimal decoding methods, efficient algorithms with practical storage and computation requirements, and algorithm-independent converse bounds. We assess the theoretical guarantees not only in terms of scaling laws, but also in terms of the constant factors, leading to the notion of the rate and capacity of group testing, indicating the amount of information learned per test. Considering both noiseless and noisy settings, we identify several regimes where existing algorithms are provably optimal or near-optimal, as well as regimes where there remains greater potential for improvement. In addition, we survey results concerning a number of variations on the standard group testing problem, including partial recovery criteria, adaptive algorithms with a limited number of stages, constrained test designs, and sublinear-time algorithms.
Article
For a constant $\gamma \in [0,1]$ and a graph $G$, let $\omega_{\gamma}(G)$ be the largest integer $k$ for which there exists a $k$-vertex subgraph of $G$ with at least $\gamma\binom{k}{2}$ edges. We show that if $0 < p < \gamma < 1$ then $\omega_{\gamma}(G_{n,p})$ is concentrated on a set of two integers. More precisely, with $\alpha(\gamma,p) = \gamma\log\frac{\gamma}{p} + (1-\gamma)\log\frac{1-\gamma}{1-p}$, we show that $\omega_{\gamma}(G_{n,p})$ is one of the two integers closest to $\frac{2}{\alpha(\gamma,p)}\big(\log n - \log\log n + \log\frac{e\,\alpha(\gamma,p)}{2}\big) + \frac{1}{2}$, with high probability. While this situation parallels that of cliques in random graphs, a new technique is required to handle the more complicated ways in which these "quasi-cliques" may overlap.
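The two-point concentration formula can be evaluated directly; the small helper below (function names are illustrative) just transcribes the expression above:

```python
import numpy as np

def predicted_quasi_clique_size(n, gamma, p):
    """Two-point concentration for omega_gamma(G_{n,p}): the two integers
    closest to (2/alpha)(log n - log log n + log(e*alpha/2)) + 1/2,
    where alpha = gamma*log(gamma/p) + (1-gamma)*log((1-gamma)/(1-p))."""
    alpha = gamma * np.log(gamma / p) + (1 - gamma) * np.log((1 - gamma) / (1 - p))
    x = (2 / alpha) * (np.log(n) - np.log(np.log(n)) + np.log(np.e * alpha / 2)) + 0.5
    return int(np.floor(x)), int(np.ceil(x))

# e.g. densest 0.75-quasi-clique in G(10^6, 1/2)
print(predicted_quasi_clique_size(n=10**6, gamma=0.75, p=0.5))
```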
Article
We show that in the $K$-sat model with $N$ variables and $\alpha N$ clauses, the expected ratio of the smallest number of unsatisfied clauses to the number of variables is $\alpha/2^K - \sqrt{\alpha}\,c_*(N)/2^K$ up to smaller order terms $o(\sqrt{\alpha})$ as $\alpha \to \infty$, uniformly in $N$, where $c_*(N)$ is the expected normalized maximum energy of some specific mixed $p$-spin spin glass model. The formula for the limit of $c_*(N)$ is well known from the theory of spin glasses.
Article
We establish that in the large degree limit, the value of certain optimization problems on sparse random hypergraphs is determined by an appropriate Gaussian optimization problem. This approach was initiated in Dembo et al. (2016) for extremal cuts of graphs. The usefulness of this technique is further illustrated by deriving the optimal value for Max $q$-cut on Erdős–Rényi and random regular graphs, Max XORSAT on Erdős–Rényi hypergraphs, and the min-bisection for the Stochastic Block Model.
Article
Many questions of fundamental interest in today's science can be formulated as inference problems: some partial, or noisy, observations are performed over a set of variables and the goal is to recover, or infer, the values of the variables based on the indirect information contained in the measurements. For such problems, the central scientific questions are: under what conditions is the information contained in the measurements sufficient for a satisfactory inference to be possible? What are the most efficient algorithms for this task? A growing body of work has shown that often we can understand and locate these fundamental barriers by thinking of them as phase transitions in the sense of statistical physics. Moreover, it turned out that we can use the gained physical insight to develop new promising algorithms. The connection between inference and statistical physics is currently witnessing an impressive renaissance and we review here the current state-of-the-art, with a pedagogical focus on the Ising model which, formulated as an inference problem, we call the planted spin glass. In terms of applications we review two classes of problems: (i) inference of clusters on graphs and networks, with community detection as a special case, and (ii) estimating a signal from its noisy linear measurements, with compressed sensing as a case of sparse estimation. Our goal is to provide a pedagogical review for researchers in physics and other fields interested in this fascinating topic.
Article
Group testing, also known as pooling, is a common technique used in high-throughput experiments to reduce the number of tests required to identify rare biological interactions while correcting for experimental noise. Central to the group testing problem are 1) a pooling design that lays out how items are grouped together into pools for testing and 2) a decoder that interprets the results of the tested pools, identifying the active compounds. In this work, we take advantage of decoder guarantees from the field of compressed sensing (CS) to address the problem of efficient and reliable detection of biological interaction in noisy high-throughput experiments. First, we formulate the group testing problem in terms of a Boolean CS framework. We then propose a low-complexity l_1-norm decoder to interpret pooling test results and identify active compounds. We test the proposed decoder using simulated experiments and real data sets. When benchmarked against the current state-of-the-art, the proposed decoder provides superior error-correction for the majority of the cases considered while being notably faster computationally. Lastly, we study the impact of different sparse pooling design matrices on decoder performance and show that the shifted transversal design (STD) is the most suitable among the pooling designs surveyed for biological applications of CS.
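A common way to instantiate such an l_1-norm decoder in the noiseless Boolean setting is a linear-programming relaxation. The sketch below shows one standard relaxation of this kind; it is an illustration of the idea, not the cited paper's exact decoder, which additionally handles noisy tests via slack variables.

```python
import numpy as np
from scipy.optimize import linprog

def l1_pool_decoder(design, outcomes):
    """LP relaxation for noiseless Boolean pooled tests: minimize sum(x)
    subject to x_j = 0 for items in any negative pool and
    sum_{j in pool} x_j >= 1 for every positive pool, 0 <= x <= 1."""
    m, n = design.shape
    c = np.ones(n)
    # items appearing in a negative pool are clamped to 0 via their bounds
    in_negative = (design & ~outcomes[:, None]).any(axis=0)
    bounds = [(0, 0) if in_negative[j] else (0, 1) for j in range(n)]
    # positive pools: -sum_{j in pool} x_j <= -1
    A_ub = -design[outcomes].astype(float)
    b_ub = -np.ones(A_ub.shape[0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x > 0.5  # round the relaxed solution

rng = np.random.default_rng(0)
truth = np.zeros(60, dtype=bool); truth[[5, 23, 48]] = True
design = rng.random((35, 60)) < 0.15
outcomes = (design & truth).any(axis=1)
print("decoding errors:", int((l1_pool_decoder(design, outcomes) != truth).sum()))
```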
Article
In this paper, we give an overview of combinatorial group testing algorithms which are applicable to DNA library screening. Our survey focuses on several classes of constructions not discussed in previous surveys, provides a general view on pooling design constructions, and poses several open questions arising from this view.
Article
This work concerns average case analysis of simple solutions for random set covering (SC) instances. Simple solutions are constructed via an O(nm) algorithm. At first an analytical upper bound on the expected solution size is provided. The bound in combination with previous results yields an absolute asymptotic approximation result of o(log m) order. An upper bound on the variance of simple solution values is calculated. Sensitivity analysis performed on simple solutions for random SC instances shows that they are highly robust, in the sense of maintaining their feasibility against augmentation of the input data with additional random constraints.
Article
Since the early 1940s, group testing (pooled testing) has been used to reduce costs in a variety of applications, including infectious disease screening, drug discovery, and genetics. In such applications, the goal is often to classify individuals as positive or negative using initial group testing results and the subsequent process of decoding of positive pools. Many decoding algorithms have been proposed, but most fail to acknowledge, and to further exploit, the heterogeneous nature of the individuals being screened. In this article, we use individuals' risk probabilities to formulate new informative decoding algorithms that implement Dorfman retesting in a heterogeneous population. We introduce the concept of "thresholding" to classify individuals as "high" or "low risk," so that separate, risk-specific algorithms may be used, while simultaneously identifying pool sizes that minimize the expected number of tests. When compared to competing algorithms which treat the population as homogeneous, we show that significant gains in testing efficiency can be realized with virtually no loss in screening accuracy. An important additional benefit is that our new procedures are easy to implement. We apply our methods to chlamydia and gonorrhea data collected recently in Nebraska as part of the Infertility Prevention Project.
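The classical homogeneous-risk Dorfman calculation that these informative algorithms generalize is short enough to state in code. This is a baseline sketch under the assumption of a single prevalence p for everyone; the paper's point is precisely to refine it with individual risk probabilities.

```python
import numpy as np

def dorfman_tests_per_person(p, s):
    """Expected tests per person under two-stage Dorfman testing with
    pool size s and homogeneous prevalence p: one pooled test shared by
    s people, plus s individual retests whenever the pool is positive."""
    return 1 / s + (1 - (1 - p) ** s)

def optimal_pool_size(p, s_max=100):
    """Pool size minimizing the expected number of tests per person."""
    sizes = np.arange(2, s_max + 1)
    costs = dorfman_tests_per_person(p, sizes)
    return int(sizes[np.argmin(costs)]), float(costs.min())

for p in (0.001, 0.01, 0.05):
    s, c = optimal_pool_size(p)
    print(f"prevalence {p:.3f}: best pool size {s}, {c:.3f} tests/person")
```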
Article
This paper describes an effective method for extracting as much information as possible from pooling experiments for library screening. Pools are collections of clones, and screening a pool with a probe determines whether any of these clones are positive for the probe. The results of the pool screenings are interpreted, or decoded, to infer which clones are candidates to be positive. These candidate positives are subjected to confirmatory testing. Decoding the pool screening results is complicated by the presence of errors, which typically lead to ambiguities in the inference of positive clones. However, in many applications there are reasonable models for the prior distributions for positives and for errors, and Bayes inference is the preferred method for ranking candidate positives. Because of the combinatorial complexity of the Bayes formulation, we implemented a decoding algorithm using a Markov chain Monte Carlo method. The algorithm was used in screening a library with 1298 clones using 47 pools. We corroborated the posterior probabilities for positives with results from confirmatory screening. We also simulated the screening of a 10-fold coverage library of 33,000 clones using 253 pools. The use of our algorithm, effective under conditions where combinatorial decoding techniques are imprudent, allows the use of fewer pools and also introduces needed robustness.
Gabriel Arpino, Daniil Dmitriev, and Nicolo Grometto. Greedy heuristics and linear relaxations for the random hitting set problem. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024), volume 317 of Leibniz International Proceedings in Informatics (LIPIcs), pages 30:1-30:22, 2024.
Robert Ash. Information Theory. Interscience Tracts in Pure and Applied Mathematics, no. 19. Interscience Publishers, New York, 1965.
Béla Bollobás and Paul Erdős. Cliques in random graphs. In Mathematical Proceedings of the Cambridge Philosophical Society, volume 80, pages 419-427. Cambridge University Press, 1976.
Zongchen Chen, Elchanan Mossel, and Ilias Zadik. Almost-linear planted cliques elude the Metropolis process. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4504-4539. SIAM, 2023.
Zongchen Chen, Conor Sheehan, and Ilias Zadik. On the low-temperature MCMC threshold: the cases of sparse tensor PCA, sparse regression, and a geometric rule. arXiv preprint arXiv:2408.00746, 2024.
A. Emad, K. Varshney, and D. Malioutov. A semiquantitative group testing approach for learning interpretable clinical prediction rules. In Signal Processing with Adaptive Sparse Structured Representations (SPARS'15), 2015.
David Gamarnik and Ilias Zadik. The landscape of the planted clique problem: Dense subgraphs and the overlap gap property. The Annals of Applied Probability, 34(4):3375-3434, 2024.
S. Hopkins. Statistical Inference and the Sum of Squares Method. PhD thesis, Cornell University, 2018.
Fotis Iliopoulos and Ilias Zadik. Group testing and local search: is there a computational-statistical gap? Proceedings of Machine Learning Research (COLT), 138:1-53, 2021.
Chris Jones, Kunal Marwaha, Juspreet Singh Sandhu, and Jonathan Shi. Random Max-CSPs inherit algorithmic hardness from spin glasses. In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023), volume 251 of Leibniz International Proceedings in Informatics (LIPIcs), pages 77:1-77:26, 2023.
Dmitriy Kunisky, Alexander S. Wein, and Afonso S. Bandeira. Notes on computational hardness of hypothesis testing: Predictions using the low-degree likelihood ratio. In ISAAC Congress (International Society for Analysis, its Applications and Computation), pages 1-50. Springer, 2019.
Jonathan Scarlett and Volkan Cevher. Phase transitions in group testing. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 40-53. SIAM, 2016.
Alexander Schliep, David C. Torney, and Sven Rahmann. Group testing with DNA chips: generating designs and decoding experiments. In Computational Systems Bioinformatics (CSB2003), Proceedings of the 2003 IEEE Bioinformatics Conference, pages 84-91. IEEE, 2003.