Jan T. Kim

The Pirbright Institute · Centre for Integrative Biology

Dr


65
Publications
13,424
Reads
2,811
Citations


Publications (65)
Article
Full-text available
Family Cortinariaceae currently includes only one genus, Cortinarius, which is the largest Agaricales genus, with thousands of species worldwide. The species are important ectomycorrhizal fungi and form associations with many vascular plant genera from tropical to arctic regions. Genus Cortinarius contains a lot of morphological variation, and its...
Article
Full-text available
The tree of life is the fundamental biological roadmap for navigating the evolution and properties of life on Earth, and yet remains largely unknown. Even angiosperms (flowering plants) are fraught with data gaps, despite their critical role in sustaining terrestrial life. Today, high-throughput sequencing promises to significantly deepen our under...
Preprint
Full-text available
The tree of life is the fundamental biological roadmap for navigating the evolution and properties of life on Earth, and yet remains largely unknown. Even angiosperms (flowering plants) are fraught with data gaps, despite their critical role in sustaining terrestrial life. Today, high-throughput sequencing promises to significantly deepen our under...
Article
Full-text available
In phylogenetic studies across angiosperms, at various taxonomic levels, polytomies have persisted despite efforts to resolve them by increasing sampling of taxa and loci. The large amount of genomic data now available and statistical tools to analyze them provide unprecedented power for phylogenetic inference. Targeted sequencing has emerged as a...
Article
High-throughput DNA sequencing (HTS) presents great opportunities for plant systematics, yet genomic complexity needs to be reduced for HTS to be effectively applied. We highlight Hyb-Seq as a promising approach, especially in light of the recent development of probes enriching 353 low-copy nuclear genes from any flowering plant taxon.
Article
Full-text available
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe...
Preprint
Full-text available
Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost associated with developing targeted sequencing approaches is preliminary data needed for identifying orthologous loci for probe design. In plants, identifying...
Article
Full-text available
Providing science and society with an integrated, up‐to‐date, high quality, open, reproducible and sustainable plant tree of life would be a huge service that is now coming within reach. However, synthesizing the growing body of DNA sequence data in the public domain and disseminating the trees to a diverse audience are often not straightforward du...
Article
Full-text available
The appropriate timing of developmental transitions is critical for adapting many crops to their local climatic conditions. Therefore, understanding the genetic basis of different aspects of phenology could be useful in highlighting mechanisms underpinning adaptation, with implications in breeding for climate change. For bread wheat (Triticum aesti...
Article
Full-text available
Dosage compensation is the fundamental process by which gene expression from the male monosomic X chromosome and from the diploid set of autosomes is equalized. Various molecular mechanisms have evolved in different organisms to achieve this task. In Drosophila, genes on the male X chromosome are upregulated to the levels of expression from the tw...
Article
Full-text available
The process by which eukaryotic viruses with segmented genomes select a complete set of genome segments for packaging into progeny virus particles is not understood. In this study a model based on the association of genome segments through specific RNA-RNA interactions driven by base pairing is formalized and tested in the Orbivirus genus of the Re...
Article
Full-text available
Full-genome sequences have been used to monitor the fine-scale dynamics of epidemics caused by RNA viruses. However, the ability of this approach to confidently reconstruct transmission trees is limited by the knowledge of the genetic diversity of viruses that exist within different epidemiological units. In order to address this question, this stu...
Article
Plants are frequently wounded by mechanical impact or by insects, and their ability to adequately respond to wounding is essential for their survival and reproductive success. The wound response is mediated by a signal transduction and regulatory network. Molecular studies in Arabidopsis have identified the COI1 gene as a central component...
Article
Full-text available
Crop disorders are a serious threat to the food security of inhabitants of remote areas in developing countries. While farmers in developed countries frequently have access to various expert resources that help them to identify the onset of a disease, farmers in developing countries usually do not have such support. However, their access to the Interne...
Article
Full-text available
Gene regulatory networks (GRNs) determine the dynamics of gene expression. Interest often focuses on the topological structure of a GRN while numerical parameters (e.g. decay rates) are unknown and less important. For larger GRNs, inference of structure from gene expression data is prohibitively difficult. Models are often proposed based on integra...
Article
The information contained in a gene can conceptually be separated into a regulatory and a structural component. Structural information determines the structure of the gene product and typically takes the form of a coding sequence of nucleotides, which is then (according to the “central dogma of molecular biology”) translated to an amino acid sequen...
Conference Paper
Full-text available
Gene expression levels within a cell are determined by the network of regulatory interactions among genes. In spatially extended systems of multiple cells, gene expression levels are also affected by activity in neighbouring cells. This interplay of a genetic regulatory network and interactions among neighbouring cells may qualitatively alter the d...
Article
Full-text available
Cardiovascular disease is the second most prevalent cause of morbidity and mortality in women of developed countries. Although it is well established that gender is a risk factor for cardiovascular disease, most gene expression analysis studies favour the identification of disease bio-markers and potential drug targets over combined populations. Th...
Conference Paper
Full-text available
Empowerment quantifies the choice available to an agent as the actuation channel capacity. However, not all such choices are sustainable: After some choices, the agent may not be able to return to its original state, or returning there may be costly. In this paper we explore whether empowerment can be adapted to obtain a measure of sustainability....
Data
Raw result files. Raw SBM results files, alignments of miRNA targets used in the analysis and a list of overlapping targets predicted by both SBM and miRanda.
Data
Full-text available
tableS1 – SBM comparison with miRanda. Full results of the SBM comparison with miRanda for each of the four miRNAs tested.
Data
Full-text available
tableS2 – Number of overlapping predictions. Summary table showing the number of target regions predicted by SBM and miRanda using default parameters and their overlap.
Article
Full-text available
Experimental identification of microRNA (miRNA) targets is a difficult and time-consuming process. As a consequence, several computational prediction methods have been devised in order to predict targets for follow-up experimental validation. Current computational target prediction methods use only the miRNA sequence as input. With an increasing num...
Article
Full-text available
Identifying and utilizing information is central to reproductive success. We study a scenario where a multicellular colony has to trade off between utility of strategies for investment in persistence or progeny and the (Shannon-type) relevant information necessary to realize these strategies. We develop a general approach to treat such problems t...
Conference Paper
Morphogenesis and the spatial structure of an organism have repercussions on gene expression. These effects can influence the results of regulatory network reconstruction. An integrated, flexible and extensible computational framework for modelling gene expression dynamics within spatially growing structures is developed and used as a test system...
Conference Paper
Full-text available
This paper examines how methods inspired by biological processes can be applied to the design of large-scale environment-aware sensor networks. Our ultimate goal is systems containing thousands of sensors spanning large building complexes or even cities that can cooperate to detect and analyse complex situations. We discuss the problems involved i...
Article
Full-text available
Recognition of protein-DNA binding sites in genomic sequences is a crucial step for discovering biological functions of genomic sequences. Explosive growth in availability of sequence information has resulted in a demand for binding site detection methods with high specificity. The motivation of the work presented here is to address this demand by...
Article
Transcription factors and their binding sites have become a focal point of bioinformatics research since it became clear that regulatory networks are a centerpiece of genetic information processing. Understanding how genetic information controls and organizes complex biological processes, such as metabolic dynamics, development and morphogenesis [1...
Conference Paper
Full-text available
Microarray technology has resulted in large sets of gene expression data. Using these data to derive knowledge about the underlying mechanisms that control gene expression dynamics has become an important challenge. Adequate models of the fundamental principles of gene regulation, such as Artificial Life models of regulatory networks, are pivotal f...
Conference Paper
Full-text available
If a simple and fast solution for one-class classification is required, the most common approach is to assume a Gaussian distribution for the patterns of the single class. Bayesian classification then leads to a simple template matching. In this paper we show for two very different applications that the classification performance can be improved s...
Conference Paper
Detecting the sites on genomic DNA at which DNA binding proteins bind is a highly relevant task in bioinformatics. For example, the binding sites of transcription factors are key elements of regulatory networks and determine the location of genes on a genome. Usually, for a given DNA binding protein, only a few DNA-subsequences at which the protein...
Article
Empirically, it has been observed in several cases that the information content of transcription factor binding site sequences (R(sequence)) approximately equals the information content of binding site positions (R(frequency)). A general framework for formal models of transcription factors and binding sites is developed to address this is...
Article
Empirically, it has been observed in several cases that the information content of transcription factor binding site sequences (R(sequence)) approximately equals the information content of binding site positions (R(frequency)). A general framework for formal models of transcription factors and binding sites is developed to address this issue. Measu...
Conference Paper
Evolutionary algorithms can be used to solve complex optimization tasks. However, adequate parameterization is crucial for efficient optimization. Evolutionary adaptation of mutation rates provides a solution to the problem of finding suitable mutation rate settings. However, evolution of low mutation rates may lead to premature convergence. In n...
Article
Full-text available
MADS-box genes encode a family of transcription factors which control diverse developmental processes in flowering plants ranging from root to flower and fruit development. A large screening for MIKC-type MADS-box gene cDNAs in maize yielded sequences for 31 different genes, 29 of which are of the MIKC-type. 15 of these MIKC-type genes were novel....
Conference Paper
Full-text available
We propose a concept for a Shannon-type quantification of information relevant to a decision unit or agent. The proposed measure is operational, can - at least in principle - be calculated for a given system and has an immediate interpretation as an information quantity. Its use as a natural framework for the study of sensor evolution is discussed.
Conference Paper
The formal language transsys is introduced as a tool for comprehensively representing regulatory gene networks in a way that makes them accessible to ALife modelling. As a first application, Lindenmayer systems are enhanced by integration with transsys. The resulting formalism, called L-transsys, is used to implement the ABC model of flower mo...
Article
nt. What we are interested in is to quantify this amount of information. A couple of properties would be desirable for this quantity. What one would like to have is a measure that quantifies the relevant, the "valuable" information in the environment. One would like to be able to interpret it in a sense similar to the classical Shannon information; l...
Article
We propose a concept for a Shannon-type quantification of information relevant to a decision unit or agent. The proposed measure is operational, can - at least in principle - be calculated for a given system and has an immediate interpretation as an information quantity. Its use as a natural framework for the study of sensor evolution is discussed.
Conference Paper
Networks of genes which encode transcription factors (regulatory networks) play a central role in the realization of phenotypic traits based on genetic information. Sequence-specific recognition of DNA subsequences by proteins is a key mechanism in constituting regulatory networks. Understanding the information theoretic principles underlying the...
Article
The aim of our work is to contribute to a scientific definition of the term "biodiversity". To this end, we compare methods for measuring biodiversity on the basis of molecular genetic data from plants with respect to their suitability for characterizing biodiversity quantitatively. We restrict ourselves to those methods that, at all...
Article
Full-text available
Evolutionary developmental genetics (evodevotics) is a novel scientific endeavor which assumes that changes in developmental control genes are a major aspect of evolutionary changes in morphology. Understanding the phylogeny of developmental control genes may thus help us to understand the evolution of plant and animal form. The principles of evode...
Article
LindEvol: Artificial Models for Natural Plant Evolution
Article
The specification of floral organ identity during development depends on the function of a limited number of homeotic genes grouped into three classes: A, B, and C. Pairs of paralogous B class genes, such as DEF and GLO in Antirrhinum, and AP3 and PI in Arabidopsis, are required for establishing petal and stamen identity. To gain a better understan...
Conference Paper
The concept of biodiversity has received rapidly increasing interest in the biosciences during the last decade. Yet, it is unclear and disputed how biodiversity should be characterised and measured. We compared several biodiversity measures by applying them to data retrieved from the LindEvol-GA model of evolution. A series of LindEvol-GA runs with...
Article
Full-text available
The evolutionary origin of the angiosperms (flowering plants sensu stricto) is still enigmatic. Answers to the question of angiosperm origins are intimately connected to the identification of their sister group among extinct and extant taxa. Most phylogenetic analyses based on morphological data agree that among the groups of extant seed plants, th...
Article
Flowers sensu lato are short, specialized axes bearing closely aggregated sporophylls. They are typical for seed plants (spermatophytes) and are prominent in flowering plants sensu stricto (angiosperms), where they often comprise an attractive perianth. There is evidence that spermatophytes evolved from gymnosperm-like plants with a fern-like mode...
Article
The MADS-box encodes a novel type of DNA-binding domain found so far in a diverse group of transcription factors from yeast, animals, and seed plants. Here, our first aim was to evaluate the primary structure of the MADS-box. Compilation of the 107 currently available MADS-domain sequences resulted in a signature which can strictly discriminate bet...
Article
Full-text available
This contribution gives an initial report of a new project exploring the perspectives and limits of reverse engineering regulatory gene networks from gene expression data. The availability of such data is currently increasing dramatically due to microarray technology. However, inferring the underlying network from expression data is difficult...
