Max Planck Institute for Informatics
Recent publications
We present the stochastic quadratic-polynomial-based method, a novel solution method for Itô stochastic differential equations (SDEs). The idea is to numerically compute the unknown function at the next two time points and to continue this process iteratively until the final time is reached. To achieve this, the time interval is subdivided into smaller sub-intervals, and quadratic polynomials are used to approximate the solution over each pair of successive sub-intervals. The main properties of stochastic numerical methods, e.g. convergence, consistency, and stability, are analyzed. We test the proposed method on an SDE problem, demonstrating promising results. We also compare our method with classic stochastic schemes, such as the Euler-Maruyama (EM) and Milstein schemes, and demonstrate that the proposed method achieves higher accuracy.
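For context on the classical baselines mentioned above, here is a minimal sketch of the Euler-Maruyama scheme (the EM comparison method, not the proposed quadratic-polynomial method) applied to a geometric Brownian motion test SDE; the drift, diffusion, and parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def euler_maruyama(f, g, x0, t0, t1, n_steps, rng=None):
    """Euler-Maruyama scheme for dX_t = f(X_t, t) dt + g(X_t, t) dW_t."""
    rng = rng or np.random.default_rng()
    dt = (t1 - t0) / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    t = t0
    for i in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x[i + 1] = x[i] + f(x[i], t) * dt + g(x[i], t) * dw
        t += dt
    return x

# Geometric Brownian motion dX = mu * X dt + sigma * X dW as an illustrative test SDE.
mu, sigma = 0.5, 0.2
path = euler_maruyama(lambda x, t: mu * x, lambda x, t: sigma * x,
                      x0=1.0, t0=0.0, t1=1.0, n_steps=1000)
```

Halving `dt` and observing how the error against the exact GBM solution shrinks is the usual way to verify the scheme's order of convergence.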
We present the second part of the rigorous evaluation of modern machine learning force fields (MLFFs) within the TEA Challenge 2023. This study provides an in-depth analysis of the performance of MACE, SO3krates, sGDML, SOAP/GAP, and FCHL19* in modeling molecules, molecule-surface interfaces, and periodic materials. We compare observables obtained from molecular dynamics (MD) simulations using different MLFFs under identical conditions. Where applicable, density-functional theory (DFT) or experiment serves as a reference to reliably assess the performance of the ML models. In the absence of DFT benchmarks, we conduct a comparative analysis based on results from various MLFF architectures. Our findings indicate that, at the current stage of MLFF development, the choice of ML model is in the hands of the practitioner. When a problem falls within the scope of a given MLFF architecture, the resulting simulations exhibit weak dependency on the specific architecture used. Instead, emphasis should be placed on developing complete, reliable, and representative training datasets. Nonetheless, long-range noncovalent interactions remain challenging for all MLFF models, necessitating special caution in simulations of physical systems where such interactions are prominent, such as molecule-surface interfaces. The findings presented here reflect the state of MLFF models as of October 2023.
For a well-studied family of domination-type problems in bounded-treewidth graphs, we investigate whether it is possible to find faster algorithms. For sets $\sigma, \rho$ of non-negative integers, a $(\sigma,\rho)$-set of a graph $G$ is a set $S$ of vertices such that $|N(u)\cap S| \in \sigma$ for every $u \in S$, and $|N(v)\cap S| \in \rho$ for every $v \notin S$. The problem of finding a $(\sigma,\rho)$-set (of a certain size) unifies common problems like Independent Set, Dominating Set, Independent Dominating Set, and many others. In an accompanying paper, it is proven that, for all pairs of finite or cofinite sets $(\sigma,\rho)$, there is an algorithm that counts $(\sigma,\rho)$-sets in time $(c_{\sigma,\rho})^{\mathsf{tw}} \cdot n^{O(1)}$ (if a tree decomposition of width $\mathsf{tw}$ is given in the input). Here, $c_{\sigma,\rho}$ is a constant with an intricate dependency on $\sigma$ and $\rho$. Despite this intricacy, we show that the algorithms in the accompanying paper are most likely optimal, i.e., for any pair $(\sigma,\rho)$ of finite or cofinite sets where the problem is non-trivial, and any $\varepsilon > 0$, a $(c_{\sigma,\rho}-\varepsilon)^{\mathsf{tw}} \cdot n^{O(1)}$-time algorithm counting the number of $(\sigma,\rho)$-sets would violate the Counting Strong Exponential-Time Hypothesis (#SETH). For finite sets $\sigma$ and $\rho$, our lower bounds also extend to the decision version, showing that those algorithms are optimal in this setting as well.
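The $(\sigma,\rho)$-set definition is easy to check directly; the sketch below (a plain verifier, not the paper's counting algorithm) illustrates how $\sigma$ and $\rho$ constrain vertices inside and outside $S$, with Independent Set as an example instantiation.

```python
def is_sigma_rho_set(adj, S, sigma, rho):
    """Check whether S is a (sigma, rho)-set: every u in S has |N(u) ∩ S| in sigma,
    and every v not in S has |N(v) ∩ S| in rho. adj maps vertices to neighbor sets."""
    S = set(S)
    for v, nbrs in adj.items():
        count = len(nbrs & S)
        if v in S:
            if count not in sigma:
                return False
        elif count not in rho:
            return False
    return True

# Independent Set corresponds to sigma = {0} and rho = all non-negative integers;
# Dominating Set to sigma = all non-negative integers and rho = {1, 2, ...}.
# Example on the path 1 - 2 - 3 (rho truncated to a finite range for this toy graph):
adj = {1: {2}, 2: {1, 3}, 3: {2}}
print(is_sigma_rho_set(adj, {1, 3}, sigma={0}, rho=range(0, 4)))  # True: {1, 3} is independent
```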
Let $\varphi$ be a sentence of $\mathsf{CMSO}_2$ (monadic second-order logic with quantification over edge subsets and counting modular predicates) over the signature of graphs. We present a dynamic data structure that, for a given graph $G$ updated by edge insertions and edge deletions, maintains whether $\varphi$ is satisfied in $G$. The data structure is required to correctly report the outcome only when the feedback vertex number of $G$ does not exceed a fixed constant $k$; otherwise it reports that the feedback vertex number is too large. With this assumption, we guarantee amortized update time $\mathcal{O}_{\varphi,k}(\log n)$. If we additionally assume that the feedback vertex number of $G$ never exceeds $k$, this update time guarantee is worst-case. By combining this result with a classic theorem of Erdős and Pósa, we give a fully dynamic data structure that maintains whether a graph contains a packing of $k$ vertex-disjoint cycles, with amortized update time $\mathcal{O}_k(\log n)$. Our data structure also works in the larger generality of relational structures over binary signatures.
In the single-source shortest paths problem, the goal is to compute the shortest path tree from a designated source vertex in a weighted, directed graph. We present the first near-linear time algorithm for the problem that can also handle negative edge-weights; the runtime is $O(m \log^8(n) \log W)$. In contrast to all recent developments that rely on sophisticated continuous optimization methods and dynamic algorithms, our algorithm is simple: it requires only a simple graph decomposition and elementary combinatorial tools. In fact, ours is the first combinatorial algorithm for negative-weight single-source shortest paths to break through the classic $\tilde{O}(m\sqrt{n}\log W)$ bound from over three decades ago (Gabow and Tarjan, SICOMP '89).
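The near-linear algorithm itself is beyond the scope of a short sketch, but the textbook Bellman-Ford procedure below shows the classical combinatorial approach to negative edge weights (with worst-case cost O(mn)) that results like the above improve upon.

```python
def bellman_ford(n, edges, source):
    """Textbook Bellman-Ford: shortest distances from `source` in a digraph with
    negative edge weights allowed; detects negative cycles. edges = [(u, v, w), ...],
    vertices numbered 0..n-1."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):            # n-1 relaxation rounds suffice without negative cycles
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    for u, v, w in edges:             # one more pass: any improvement implies a negative cycle
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist
```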
In the Directed Feedback Vertex Set (DFVS) problem, given a digraph D and a positive integer k, the goal is to check whether there exists a set of at most k vertices whose deletion from D results in a directed acyclic graph. The existence of a polynomial kernel for DFVS, parameterized by the solution size k, is a central open problem in kernelization. In this paper, we give a polynomial kernel for DFVS parameterized by k plus the size of a treewidth-η modulator (of the underlying undirected graph), where η is any fixed positive integer. Since the existence of a polynomial kernel for DFVS (parameterized by the solution size) has been open for a very long time, and the problem is known to not admit a polynomial kernel when the parameter is the size of a treewidth-2 modulator, solution size plus the size of a treewidth-η modulator makes for an interesting choice of parameter to study. In fact, the polynomial kernelization complexity of DFVS parameterized by the size of the undirected feedback vertex set (a treewidth-1 modulator) of the underlying undirected graph has already been studied in the literature. Our choice of parameter strictly encompasses previous positive kernelization results on DFVS. Our result is based on a novel application of the tool of important separators embedded in state-of-the-art machinery such as protrusion decompositions.
While machine learning (ML) models have been able to achieve unprecedented accuracies across various prediction tasks in quantum chemistry, it is now apparent that accuracy on a test set alone is no guarantee of robust chemical modeling, such as stable molecular dynamics (MD). To go beyond accuracy, we use explainable artificial intelligence (XAI) techniques to develop a general analysis framework for atomic interactions and apply it to the SchNet and PaiNN neural network models. We compare these interactions with a set of fundamental chemical principles to understand how well the models have learned the underlying physicochemical concepts from the data. We focus on the strength of the interactions for different atomic species and on how predictions for intensive and extensive quantum molecular properties are made, and we analyze how the interactions decay with interatomic distance, as well as their many-body nature. Models that deviate too far from known physical principles produce unstable MD trajectories, even when they have very high energy and force prediction accuracy. We also suggest further improvements to the ML architectures to better account for the polynomial decay of atomic interactions.
Antiretroviral therapy is the standard treatment for HIV, but it requires daily use and can cause side effects. Despite being available for decades, there are still 1.5 million new infections and 700,000 deaths each year, highlighting the need for better therapies. Broadly neutralizing antibodies (bNAbs), which are highly active against HIV-1, represent a promising new approach, and clinical trials have demonstrated the potential of bNAbs in the treatment and prevention of HIV-1 infection. However, HIV-1 antibody resistance (HIVAR) due to variants in the HIV-1 envelope glycoproteins (HIV-1 Env) is not yet well understood and poses a critical problem for the clinical use of bNAbs in treatment. HIVAR also plays an important role in the future development of an HIV-1 vaccine, which will require elicitation of bNAbs to which the circulating strains are sensitive. In recent years, a variety of methods have been developed to detect, characterize and predict HIVAR. Structural analysis of antibody-HIV-1 Env complexes has provided insight into viral residues critical for neutralization, while testing of viruses for antibody susceptibility has verified the impact of some of these residues. In addition, in vitro viral neutralization and adaptation assays have shaped our understanding of bNAb susceptibility based on the envelope sequence. Furthermore, in vivo studies in animal models have revealed the rapid emergence of escape variants under mono- or combined bNAb treatments. Finally, similar variants were found in the first clinical trials testing bNAbs for the treatment of HIV-1-infected patients. These structural, in vitro, in vivo and clinical studies have led to the identification and validation of HIVAR for almost all available bNAbs. However, defined assays for the detection of HIVAR in patients are still lacking, and for some novel, highly potent and broad-spectrum bNAbs, HIVAR has not been clearly defined. Here, we review currently available approaches for the detection, characterization and prediction of HIVAR.
We prove new upper and lower bounds on the number of iterations the k-dimensional Weisfeiler-Leman algorithm (k-WL) requires until stabilization. For $k \geq 3$, we show that k-WL stabilizes after at most $O(kn^{k-1}\log n)$ iterations (where n denotes the number of vertices of the input structures), obtaining the first improvement over the trivial upper bound of $n^k - 1$ and extending a previous upper bound of $O(n \log n)$ for $k = 2$ [Lichter et al., LICS 2019]. We complement our upper bounds by constructing k-ary relational structures on which k-WL requires at least $n^{\Omega(k)}$ iterations to stabilize. This improves over a previous lower bound of $n^{\Omega(k/\log k)}$ [Berkholz, Nordström, LICS 2016]. We also investigate tradeoffs between the dimension and the iteration number of WL, and show that d-WL, where $d = \lceil 3(k+1)/2 \rceil$, can simulate the k-WL algorithm using only $O(k^2 \cdot n^{\lfloor k/2 \rfloor + 1} \log n)$ many iterations, but still requires at least $n^{\Omega(k)}$ iterations for any d (that is sufficiently smaller than n). The number of iterations required by k-WL to distinguish two structures corresponds to the quantifier rank of a sentence distinguishing them in the (k+1)-variable fragment $\mathsf{C}_{k+1}$ of first-order logic with counting quantifiers. Hence, our results also imply new upper and lower bounds on the quantifier rank required in the logic $\mathsf{C}_{k+1}$, as well as tradeoffs between variable number and quantifier rank.
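As a concrete illustration of "iterations until stabilization", here is a sketch of the 1-dimensional variant (color refinement); k-WL generalizes this refinement loop from vertex colorings to colorings of k-tuples.

```python
def color_refinement(adj):
    """1-dimensional WL (color refinement): repeatedly refine vertex colors by the
    multiset of neighbor colors; return the number of iterations until the coloring
    stabilizes. adj maps vertices to iterables of neighbors."""
    colors = {v: 0 for v in adj}          # start from the uniform coloring
    iterations = 0
    while True:
        signatures = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                      for v in adj}
        # Canonically renumber the distinct signatures to obtain the next coloring.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        if new_colors == colors:          # partition no longer refines: stable
            return iterations
        colors = new_colors
        iterations += 1

# Path on 4 vertices: refinement separates endpoints from inner vertices, then stops.
print(color_refinement({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}))
```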
When physical properties of molecules are being modeled with machine learning, it is desirable to incorporate SO(3)-covariance. While such models based on low-body-order features are not complete, we formulate and prove general completeness properties for higher-order methods and show that 6k – 5 of these features are enough for up to k atoms. We also find that the Clebsch–Gordan operations commonly used in these methods can be replaced by matrix multiplications without sacrificing completeness, lowering the scaling from O(l⁶) to O(l³) in the degree of the features. We apply this to quantum chemistry, but the proposed methods are generally applicable for problems involving three-dimensional point configurations.
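The following sketch illustrates the underlying principle rather than the paper's exact construction: inner products of SO(3)-covariant vectors are rotation invariant, so a single matrix multiplication producing a Gram matrix yields invariant features without explicit Clebsch–Gordan contractions.

```python
import numpy as np

rng = np.random.default_rng(0)
positions = rng.normal(size=(5, 3))       # toy 3D point configuration (5 points)

def random_rotation(rng):
    """Sample a proper rotation (det = +1) via QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.linalg.det(q))  # flip sign if the factor was a reflection

R = random_rotation(rng)
# Gram matrix of the point coordinates: (XR)(XR)^T = X R R^T X^T = X X^T,
# so one matrix multiplication already gives rotation-invariant features.
gram = positions @ positions.T
gram_rotated = (positions @ R) @ (positions @ R).T
assert np.allclose(gram, gram_rotated)
```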
Virtual reality (VR) has the potential to become a revolutionary technology with a significant impact on our daily lives. The immersive experience provided by VR equipment, where the user's body and senses are used to interact with the surrounding content, together with the feeling of presence, elicits realistic behavioral responses. In this work, we leverage the full control of audiovisual cues provided by VR to study an audiovisual suppression effect (ASE) in which auditory stimuli degrade visual performance. In particular, we explore whether barely audible sounds (near the limits of the audible frequency range) generated following a specific spatiotemporal setup can still trigger the ASE while participants are experiencing high cognitive loads. A first study is carried out to find out how sound volume and frequency impact this suppression effect, while the second study includes higher-cognitive-load scenarios closer to real applications. Our results show that the ASE is robust to variations in frequency, volume and cognitive load, achieving a reduction of visual perception with the proposed barely audible sounds. Using such auditory cues means that this effect could be employed in real applications, from entertainment to VR techniques such as redirected walking.
Zero-cost proxies are nowadays frequently studied and used to search for neural architectures. They show an impressive ability to predict the performance of architectures by making use of their untrained weights, which allows for immense search speed-ups. So far, however, the joint search for well-performing and robust architectures has received much less attention in the field of NAS: the main focus of zero-cost proxies has been the clean accuracy of architectures, whereas model robustness should play an equally important part. In this paper, we analyze the ability of common zero-cost proxies to serve as performance predictors for robustness in the popular NAS-Bench-201 search space. We are interested in both the single prediction task for robustness and the joint multi-objective of clean and robust accuracy. We further analyze the feature importance of the proxies and show that predicting robustness makes the prediction task for existing zero-cost proxies more challenging. As a result, the joint consideration of several proxies becomes necessary to predict a model's robustness, while the clean accuracy can be regressed from a single such feature. Our code is available at https://github.com/jovitalukasik/zcp_eval.
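An evaluation of this kind typically rank-correlates proxy scores with the target metric across architectures. The sketch below shows that setup with clearly made-up placeholder numbers standing in for real proxy scores and robust accuracies; nothing here reproduces the paper's data.

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder values for illustration only: one proxy score and one robust
# accuracy per candidate architecture.
proxy_scores = np.array([0.12, 0.45, 0.30, 0.80, 0.55])
robust_acc   = np.array([21.5, 38.2, 30.1, 45.7, 40.3])   # robust accuracy (%)

# Spearman rank correlation: how well does the proxy order architectures
# by their robustness?
rho, pval = spearmanr(proxy_scores, robust_acc)
print(f"Spearman rho = {rho:.2f} (p = {pval:.3f})")
```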
We consider the task of lexicographic direct access to query answers. That is, we want to simulate an array containing the answers of a join query sorted in a lexicographic order chosen by the user. A recent dichotomy showed for which queries and orders this task can be done in polylogarithmic access time after quasilinear preprocessing, but this dichotomy does not tell us how much time is required in the cases classified as hard. We determine the preprocessing time needed to achieve polylogarithmic access time for all join queries and all lexicographic orders. To this end, we propose a decomposition-based general algorithm for direct access on join queries. We then explore its optimality by proving lower bounds for the preprocessing time based on the hardness of a certain online Set-Disjointness problem, which shows that our algorithm's bounds are tight for all lexicographic orders on join queries. Then, we prove the hardness of Set-Disjointness based on the Zero-Clique Conjecture, an established conjecture from fine-grained complexity theory. Interestingly, while proving our lower bound, we show that self-joins do not affect the complexity of direct access (up to logarithmic factors). Our algorithm can also be used to solve queries with projections and relaxed order requirements, though in these cases, its running time is not necessarily optimal. We also show that techniques similar to those used in our lower bounds can be used to prove that, for enumerating answers to Loomis-Whitney joins, it is not possible to significantly improve upon trivially computing all answers at preprocessing. This, in turn, gives further evidence (based on the Zero-Clique Conjecture) for the enumeration hardness of self-join-free cyclic joins with respect to linear preprocessing and constant delay.
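As a baseline for what "direct access" asks of a data structure, the sketch below simply materializes and sorts all answers at preprocessing time and then serves each access in constant time; the decomposition-based algorithm described above avoids this exhaustive materialization. The toy "join" is a Cartesian product to keep the example self-contained.

```python
from itertools import product

class DirectAccess:
    """Baseline sketch (not the decomposition-based algorithm): materialize and
    sort all answers up front; access(i) then returns the i-th answer in
    lexicographic order in O(1)."""

    def __init__(self, relations):
        # relations: a list of lists of tuples; here the join is a plain
        # Cartesian product, with each answer flattened into one tuple.
        self.answers = sorted(tuple(x for t in combo for x in t)
                              for combo in product(*relations))

    def access(self, i):
        return self.answers[i]

da = DirectAccess([[(1,), (3,)], [(2,), (0,)]])
print(da.access(0), da.access(3))  # (1, 0) (3, 2)
```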
154 members
Rodrigo Benenson
  • Department 2: Computer Vision and Multimodal Computing
Michael Xuelin Huang
  • Department 2: Computer Vision and Multimodal Computing
Hyeonseung Yu
  • Department 4: Computer Graphics
Simon Razniewski
  • Department 5: Databases and Information Systems
Information
Address
Saarbrücken, Germany