Robust Perron cluster analysis in conformation dynamics

Konrad-Zuse-Zentrum fuer Informationstechnik, Berlin D-14195, Germany
Linear Algebra and its Applications (Impact Factor: 0.98). 01/2003; 398:161-184. DOI: 10.1016/j.laa.2004.10.026

ABSTRACT The key to molecular conformation dynamics is the direct identification of metastable conformations, which are almost invariant sets of molecular dynamical systems. Once some reversible Markov operator has been discretized, a generalized symmetric stochastic matrix arises. This matrix can be treated by Perron cluster analysis, a rather recent method involving a Perron cluster eigenproblem. The paper presents an improved Perron cluster analysis algorithm, which is more robust than earlier suggestions. Numerical examples are included.
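
The idea of a Perron cluster can be illustrated with a small numerical sketch. It is not the improved algorithm of the paper, only a minimal illustration under stated assumptions: the reversible stochastic matrix is built by row-normalizing a symmetric weight matrix W, the number of clusters is guessed from the largest spectral gap, and states are assigned with a simple simplex-vertex heuristic. The function names (block_weights, leading_eigenpairs, simplex_vertices, perron_clusters) are hypothetical and chosen for this example.

```python
import numpy as np

def block_weights(sizes, weak=0.01):
    """Symmetric weights: strong (1.0) inside each block, weak between blocks."""
    n = sum(sizes)
    W = np.full((n, n), weak)
    start = 0
    for s in sizes:
        W[start:start + s, start:start + s] = 1.0
        start += s
    return W

def leading_eigenpairs(W):
    """Eigenpairs of T = D^{-1} W (D = diag of row sums), largest eigenvalue first.

    T is reversible, so it is similar to the symmetric matrix
    S = D^{-1/2} W D^{-1/2}; eigh on S gives real eigenvalues.
    """
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    lam, U = np.linalg.eigh(S)          # ascending order
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]
    X = U / np.sqrt(d)[:, None]         # right eigenvectors of T
    return lam, X

def simplex_vertices(X):
    """Pick k rows of X (n x k) that are far apart, by successive projection."""
    Y = X.copy()
    idx = [int(np.argmax(np.linalg.norm(Y, axis=1)))]
    Y = Y - Y[idx[0]]                   # shift first vertex to the origin
    for _ in range(1, X.shape[1]):
        norms = np.linalg.norm(Y, axis=1)
        j = int(np.argmax(norms))
        idx.append(j)
        v = Y[j] / norms[j]
        Y = Y - np.outer(Y @ v, v)      # remove the direction just used
    return idx

def perron_clusters(W, k=None):
    """Return (eigenvalues, k, labels) for the chain defined by W."""
    lam, X = leading_eigenpairs(W)
    if k is None:
        # Assumption: the Perron cluster ends at the largest spectral gap.
        k = int(np.argmax(lam[:-1] - lam[1:])) + 1
    Xk = X[:, :k]
    idx = simplex_vertices(Xk)
    chi = Xk @ np.linalg.inv(Xk[idx])   # membership-like coordinates
    return lam, k, chi.argmax(axis=1)

lam, k, labels = perron_clusters(block_weights((3, 4, 2)))
print(k, labels)
```

For the nearly uncoupled example above, the eigenvalues close to 1 form the Perron cluster, and states with the same label belong to the same almost invariant set. The label values themselves are arbitrary; only the grouping is meaningful.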

  • Source
    ABSTRACT: Markov state models (MSMs) have been successful in computing metastable states, slow relaxation timescales and associated structural changes, and stationary or kinetic experimental observables of complex molecules from large amounts of molecular dynamics simulation data. However, MSMs approximate the true dynamics by assuming a Markov chain on a cluster discretization of the state space. This approximation is difficult to make for high-dimensional biomolecular systems, and the quality and reproducibility of MSMs have, therefore, been limited. Here, we discard the assumption that dynamics are Markovian on the discrete clusters. Instead, we only assume that the full phase-space molecular dynamics is Markovian, and a projection of this full dynamics is observed on the discrete states, leading to the concept of Projected Markov Models (PMMs). Robust estimation methods for PMMs are not yet available, but we derive a practically feasible approximation via Hidden Markov Models (HMMs). It is shown how various molecular observables of interest that are often computed from MSMs can be computed from HMMs/PMMs. The new framework is applicable to both simulation and single-molecule experimental data. We demonstrate its versatility by applications to educative model systems, a 1 ms Anton MD simulation of the bovine pancreatic trypsin inhibitor protein, and an optical tweezer force probe trajectory of an RNA hairpin.
    The Journal of Chemical Physics 11/2013; 139(18):184114. DOI:10.1063/1.4828816 · 3.12 Impact Factor
  • Source
    ABSTRACT: Dispersal on the landscape/seascape scale may lead to complex spatial population structure with non-synchronous demography and genetic divergence. In this study we present a novel approach to identify subpopulations and dispersal barriers based on estimates of dispersal probabilities on the landscape scale. A theoretical framework is presented where the landscape connectivity matrix is analyzed for clusters as a signature of partially isolated subpopulations. Identification of subpopulations is formulated as a minimization problem with a tuneable penalty term that makes it possible to generate population subdivisions with varying degrees of dispersal restriction. We show that this approach produces superior results compared to alternative standard methods. We apply this theory to a dataset of modeled dispersal probabilities for a sessile marine invertebrate with free-swimming larvae in the Baltic Sea. For a range of critical connectivities we produce a hierarchical partitioning into subpopulations spanning dispersal probabilities that are typical for both genetic divergence and demographic independence. The mapping of subpopulations suggests that the Baltic Sea includes a fine-scale (100–600 km) mosaic of invisible dispersal barriers. An analysis of the present network of marine protected areas reveals that protection is very unevenly distributed among the suggested subpopulations. Our approach can be used to assess the location and strength of dispersal barriers in the landscape, and to identify conservation units when extensive genotyping is prohibitively costly to cover the necessary spatial and temporal scales, e.g. in spatial management of marine populations.
    Understanding spatial population structure is essential for the analysis of many ecological and evolutionary processes (Waples and Gaggiotti 2006). Efficient conservation and management of intra-specific genetic diversity also require the identification of relevant population subdivisions, e.g. evolutionary significant units, ESU (Ryder 1986), and demographically independent management units, MU (Moritz 1994, Palsbøll et al. 2007). Failure to identify routes of dispersal and gene flow (movement of genes among populations) may lead to low population growth, loss of genetic variation and erosion of local adaptations, with risk of extinction of isolated and genetically distinct populations (reviewed by Crooks and Sanjayan 2006). Identification of subpopulations is approached through a diversity of methods including trace elements, stable isotopes, parasite load and tagging (Cadrin et al. 2005). Most common, however, is the use of neutral genetic markers in samples from different geographic locations (Frazer and Bernatchez 2001, Allendorf and Luikart 2007). Analysis of genetic markers can give valuable insight into reduced gene flow and breaks in population connectivity, but there are several limitations (Lowe and Allendorf 2010). Equilibrium-based methods may not, for example, reflect current gene flow and often have low power to detect gene flows in the range where subdivision of populations becomes relevant
