John W Pinney's research while affiliated with Imperial College London and other places
Publications (101)
In times when herpesvirus genomic data were scarce, the cospeciation between these viruses and their hosts was considered to be common knowledge. However, as more herpesviral sequences were made available, tree reconciliation analyses started to reveal topological incongruences between host and viral phylogenies, indicating that other cophylogeneti...
Herpesviruses (HVs, Family: Herpesviridae) have large genomes that encode hundreds of proteins. Apart from amino acid mutations, protein domain acquisitions, duplications and losses are also common modes of evolution. HV domain repertoires differ across species, and only a core set is shared among all species, an aspect that raises a question: How hav...
The evolution of protein-protein interactions (PPIs) is directly influenced by the evolutionary histories of the genes and the species encoding the interacting proteins. When it comes to PPIs of host-pathogen systems, the complexity of their evolution is much higher, as two independent, but biologically associated entities, are involved. In this wo...
Herpesviruses (HVs) have large genomes that can encode thousands of proteins. Apart from amino acid mutations, protein domain acquisitions, duplications and losses are also common modes of evolution. HV domain repertoires differ across species, and only a core set is shared among all viruses, an aspect that raises a question: How have HV domain repert...
Cospeciation has been suggested to be the main force driving the evolution of herpesviruses, with viral species co-diverging with their hosts along more than 400 million years of evolutionary history. Recent studies, however, have been challenging this assumption, showing that other co-phylogenetic events, such as intrahost speciations and host swit...
To study virus–host protein interactions, knowledge about viral and host protein architectures and repertoires, their particular evolutionary mechanisms, and information on relevant sources of biological data is essential. The purpose of this review article is to provide a thorough overview about these aspects. Protein domains are basic units defin...
Despite several recent advances in the automated generation of draft metabolic reconstructions, the manual curation of these networks to produce high quality genome-scale metabolic models remains a labour-intensive and challenging task.
We present PathwayBooster, an open-source software tool to support the manual comparison and curation of metaboli...
The molecular reaction networks that coordinate the response of an organism to changing environmental conditions are central for survival and reproduction. Escherichia coli employs an accurate and flexible signalling system that is capable of processing ambient nitrogen availability rapidly and with high accuracy. Carefully orchestrated post-transla...
One of the challenging questions in modelling biological systems is to characterise the functional forms of the processes that control and orchestrate molecular and cellular phenotypes. Recently proposed methods for the analysis of metabolic pathways, for example dynamic flux estimation, can only provide estimates of the underlying fluxes at discre...
MOTIVATION: One of the challenging questions in modelling biological systems is to characterize the functional forms of the processes that control and orchestrate molecular and cellular phenotypes. Recently proposed methods for the analysis of metabolic pathways, for example, dynamic flux estimation, can only provide estimates of the underlying flu...
Recent advances in the automation of metabolic model reconstruction have led to the availability of draft-quality metabolic models (predicted reaction complements) for multiple bacterial species. These reaction complements can be considered as trait representations and can be used for ancestral state reconstruction to infer the most likely...
The ability to adapt to environments with fluctuating nutrient availability is vital for bacterial survival. Although essential for growth, few nitrogen metabolism genes have been identified or fully characterised in mycobacteria and nitrogen stress survival mechanisms are unknown.
A global transcriptional analysis of the mycobacterial response to...
Misannotation in sequence databases is an important obstacle for automated tools for gene function annotation, which rely extensively on comparison with sequences with known function. To improve current annotations and prevent future propagation of errors, sequence-independent tools are, therefore, needed to assist in the identification of misannot...
GO enrichment among network subgraphs. Two types of graph are shown: (i) bar plots A, B, C and D show the relative level of enrichment of GO terms pertaining to more specialist, or general functions, measured by the number of genes represented, from each network. Here, a positive value represents relative enrichment of GO terms of the given size, w...
Large-scale molecular interaction data sets have the potential to provide a comprehensive, system-wide understanding of biological function. Although individual molecules can be promiscuous in terms of their contribution to function, molecular functions emerge from the specific interactions of molecules giving rise to modular organisation. As funct...
Network partitioning methodology. Interaction networks were partitioned by using k-way partitioning. k represents the number of partitions for the algorithm to produce. Given the number of nodes in the network and the number of partitions we can estimate the average size, s, of the subgraphs produced by partitioning. Many different values for k wer...
The genetic interaction network listed as pairwise interactions between nodes. The first line of the file reports the number of edges in the network.
(ZIP)
The combined interaction network listed as pairwise interactions between nodes. The first line of the file reports the number of edges in the network.
(ZIP)
The 20 most accurately represented GO terms from each network and ontology.
(XLS)
Summary of network subgraphs showing plots of (i) subgraph size against subgraph frequency (panels A, C, E and G), and (ii) subgraph size against the top 95th percentile of clusters ordered by edge density (panels B, D, F and H) for PPI, genetic, co-regulation and combined interaction networks, respectively.
(EPS)
The PPI interaction network listed as pairwise interactions between nodes. The first line of the file reports the number of edges in the network. This file also contains a lookup between node identifiers and the yeast systematic name reported in the SGD.
(ZIP)
The coregulation interaction network listed as pairwise interactions between nodes. The first line of the file reports the number of edges in the network.
(ZIP)
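The network files above are described as pairwise interaction lists whose first line reports the number of edges. A minimal Python sketch of reading such a file follows. The exact layout is not given in the text, so the whitespace-separated two-column format and the function name `read_edge_list` are assumptions for illustration only.

```python
from io import StringIO


def read_edge_list(handle):
    """Read an edge list whose first line states the number of edges.

    Assumed layout (not confirmed by the source): one edge per line,
    with two whitespace-separated node identifiers.
    """
    declared = int(handle.readline().strip())
    edges = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) < 2:
            raise ValueError(f"expected two node identifiers, got: {line!r}")
        edges.append((fields[0], fields[1]))
    if len(edges) != declared:
        raise ValueError(
            f"header declares {declared} edges but {len(edges)} were read"
        )
    return edges


# Hypothetical three-edge network used only to exercise the parser.
example = StringIO("3\nA B\nB C\nA C\n")
edges = read_edge_list(example)
```

Checking the edge count against the header is a cheap way to detect a truncated download before any analysis is run.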
Background
With the continued proliferation of high-throughput biological experiments, there is a pressing need for tools to integrate the data produced in ways that produce biologically meaningful conclusions. Many microarray studies have analysed transcriptomic data from a pathway perspective, for instance by testing for KEGG pathway enrichment i...
The evolution of biological systems is influenced by a number of factors and forces that have acted in different combinations at different times to give rise to extant organisms. Here we illustrate some of the issues surrounding the data-driven evolutionary analysis of biological systems in the context of bacterial two-component systems (TCSs). TCS...
Introduction; Pathogen genomics; Metabolic models; Protein–protein interactions; Response to environment; Immune system interactions; Manipulation of other host systems; Evolution of the host–pathogen system; Towards systems medicine for infectious diseases; Concluding remarks; Acknowledgements; References
Sensing the environment and responding appropriately to it are key capabilities for the survival of an organism. All extant organisms must have evolved suitable sensors, signaling systems, and response mechanisms allowing them to survive under the conditions they are likely to encounter. Here, we investigate in detail the evolutionary history of on...
Table of biological cohesiveness measures for significant biclusters. A table of biological cohesiveness measures. Each row represents a significant bicluster. Sequence similarity, semantic similarity and network clustering are measures pertaining to the proteins of a given bicluster. From right to left, the columns show: bicluster id; p-value for...
Human immunodeficiency virus type 1 (HIV-1) exploits a diverse array of host cell functions in order to replicate. This is mediated through a network of virus-host interactions. A variety of recent studies have catalogued this information. In particular the HIV-1, Human Protein Interaction Database (HHPID) has provided a unique depth of protein int...
Table of significant biclusters and their HIV-host interactions. A table of significant biclusters. Each row represents a single HIV-host interaction within a significant bicluster. The biclusters are divided into higher-level groups, known as sub-systems, based on shared interactions and labeled according to the biological role of the included ho...
Table of host subsystem details. A table of host subsystem details. Each row represents a host subsystem. From right to left the columns show: the name of the subsystem; the number of biclusters included in the subsystem; the number of human genes in the subsystem; the intersection between the subsystem and the Brass et al. (2008) siRNA screen; p-v...
Hierarchy of protein interaction types. A hierarchy that incorporates all of the interaction types found in the NCBI HIV-1, host protein interaction database (HHPID) with the addition of parent terms for these types. HHPID interaction types have a unique id, and polarity, direction and control attributes. These attributes are explained in detail in...
In order to replicate, HIV, like all viruses, needs to invade a host cell and hijack it for its own use, a process that involves multiple protein interactions between virus and host. The HIV-1, Human Protein Interaction Database available at NCBI's website captures this information from the primary literature, containing over 2,500 unique interacti...
Figure S4: LCR centre positions distribution. Distributions of LCR centre positions and randomly replaced LCR centre positions. The extremities of the random distribution show the expected decrease in frequency, while the original distribution (top) appears to be enriched with LCRs at the extremities.
Figure S2: Mean and standard deviation from UniProt entropy distributions. The entropy distributions mean grows asymptotically towards the Hmax value as the window regions increase and sequences within them approach random states. The entropy distributions standard deviation decreases as longer sequences become more homogeneous.
Figure S3: Computing random LCR positions. Method to compute random LCR positions. The same process is repeated for each LCR in S. cerevisiae: LCRs (shown in red) are extracted from their corresponding protein sequence and re-inserted randomly 1000 times. Each time, the normalised centre position is included into the random distribution.
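The re-insertion procedure in the Figure S3 caption can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' implementation: the insertion point is assumed to be drawn uniformly, the normalised centre is assumed to be the LCR midpoint divided by the protein length, and the function name and the example lengths are hypothetical.

```python
import random


def random_centre_positions(protein_len, lcr_len, n_iter=1000, rng=None):
    """Normalised centre positions of an LCR re-inserted at random.

    Assumptions (not stated in the source): the insertion point is
    uniform over the sequence left after extracting the LCR, and the
    normalised centre is (start + lcr_len / 2) / protein_len.
    """
    if rng is None:
        rng = random.Random()
    remainder = protein_len - lcr_len  # residues left after extraction
    if lcr_len <= 0 or remainder < 0:
        raise ValueError("LCR length must be positive and fit in the protein")
    centres = []
    for _ in range(n_iter):
        start = rng.randint(0, remainder)  # both endpoints are inclusive
        centres.append((start + lcr_len / 2) / protein_len)
    return centres


# Hypothetical 200-residue protein with a 20-residue LCR.
centres = random_centre_positions(200, 20, rng=random.Random(0))
```

Pooling these values over all LCRs would give a random baseline of the kind compared with the observed centre positions in Figure S4.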
Figure S1: LCR distributions in PPI datasets. PPI datasets overlap between the HC, DIPv, FYI and BioGrid datasets, and the distribution of LCRs among them.
Table S1: LCR distributions in PPI datasets. LCRs are approximately equally distributed across the high-confidence datasets (HC, FYI and DIPv). Enrichment is defined as (Observed - Expected)/Expected.
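The enrichment definition given for Table S1 translates directly into a small Python function. The counts in the example are hypothetical and are not taken from the table.

```python
def enrichment(observed, expected):
    """Relative enrichment: (Observed - Expected) / Expected."""
    if expected == 0:
        raise ValueError("expected count must be non-zero")
    return (observed - expected) / expected


# Hypothetical counts: 30 LCRs observed where 20 were expected.
example_enrichment = enrichment(30, 20)
```

A value of 0 means the observed count matches expectation, positive values indicate enrichment, and negative values indicate depletion.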
Regions of protein sequences with biased amino acid composition (so-called Low-Complexity Regions (LCRs)) are abundant in the protein universe. A number of studies have revealed that i) these regions show significant divergence across protein families; ii) the genetic mechanisms from which they arise lend them remarkable degrees of compositional p...
Our understanding of how evolution acts on biological networks remains patchy, as is our knowledge of how that action is best identified, modelled and understood. Starting with network structure and the evolution of protein-protein interaction networks, we briefly survey the ways in which network evolution is being addressed in the fields of system...
The evolutionary mechanisms by which protein interaction networks grow and change are beginning to be appreciated as a major factor shaping their present-day structures and properties. Starting with a consideration of the biases and errors inherent in our current views of these networks, we discuss the dangers of constructing evolutionary arguments...
The binding of integrin adhesion receptors to their extracellular matrix ligands controls cell morphology, movement, survival, and differentiation in various developmental, homeostatic, and disease processes. Here, we report a methodology to isolate complexes associated with integrin adhesion receptors, which, like other receptor-associated signali...
A report of the Biochemical Society/Wellcome Trust meeting 'Protein Evolution - Sequences, Structures and Systems', Hinxton, UK, 26-27 January 2009.
The extracellular matrix (ECM) is a complex substrate that is involved in and influences a spectrum of behaviours such as growth and differentiation and is the basis for the structure of tissues. Although a characteristic of all metazoans, the ECM has elaborated into a variety of tissues unique to vertebrates, such as bone, tendon and cartilage. He...
A common method for presenting and studying biological interaction networks is visualization. Software tools can enhance our ability to explore network visualizations and improve our understanding of biological systems, particularly when these tools offer analysis capabilities. However, most published network visualizations are static representatio...
Supplemental Data S1. Stage specific mRNA expression of TgAaaH 1&2 during the tachyzoite to bradyzoite switch against actin as a housekeeping gene. Quantitative RT-PCR showing stage specific mRNA expression of TgAaaH 1&2 (black bars) in NED type III strain (A) and RH type I strain (B) parasites. Expression of SAG1 tachyzoite specific marker and BAG...
The genome of the protozoan parasite Toxoplasma gondii was found to contain two genes encoding tyrosine hydroxylase, the enzyme that produces L-DOPA. The encoded enzymes metabolize phenylalanine as well as tyrosine with substrate preference for tyrosine. Thus the enzymes catabolize phenylalanine to tyrosine and tyrosine to L-DOPA. The catalytic domain descrip...
Although many interactions between HIV-1 and human proteins have been reported in the scientific literature, no publicly accessible source for efficiently reviewing this information was available. Therefore, a project was initiated in an attempt to catalogue all published interactions between HIV-1 and human proteins. HIV-related articles in PubMed...
The methodologies we use both enable and help define our research. However, as experimental complexity has increased, the choice of appropriate methodologies has become an increasingly difficult task. This makes it difficult to keep track of available bioinformatics software, let alone the most suitable protocols in a specific research area. To reme...
Recombinant HIV-1 genomes contribute significantly to the diversity of variants within the HIV/AIDS pandemic. It is assumed that some of these mosaic genomes may have novel properties that have led to their prevalence, particularly in the case of the circulating recombinant forms (CRFs). In regions of the HIV-1 genome where recombination has a tend...
Position and composition of polyA tails detected in orf294 (Mito3) and orf199 (Mito4) transcripts isolated at 40°C (underlined) or 24°C, respectively. Nucleotides are numbered according to the ORF ATG at position 1.
(0.19 MB TIF)
ACT clique clustering analysis that identifies genes showing co-expression with the heat shock transcription factor At2g26150.
(4.10 MB TIF)
Position and composition of polyA tails detected in rpl2 transcripts isolated at 40°C (underlined) or 24°C, respectively. Nucleotides are numbered according to the transcriptional start site at position 1.
(0.22 MB TIF)
The neprilysin (M13) family of endopeptidases are zinc-metalloenzymes, the majority of which are type II integral membrane proteins. The best characterised of this family is neprilysin, which has important roles in inactivating signalling peptides involved in modulating neuronal activity, blood pressure and the immune system. Other family members i...
Analysis of protein-protein interaction networks is an increasingly popular means to infer biological insight, but is close enough attention being paid to data handling protocols and the degree of bias in the data?
Polyadenylation of RNA has a decisive influence on RNA stability. Depending on the organisms or subcellular compartment, it either enhances transcript stability or targets RNAs for degradation. In plant mitochondria, polyadenylation promotes RNA degradation, and polyadenylated mitochondrial transcripts are therefore widely considered to be rare and...
Consensus tree of phylogenetic analyses of M13 peptidases. A majority consensus cladogram of three methods of phylogenetic reconstruction of M13 proteins, including percentage bootstrap values in the order: neighbour-joining/maximum parsimony/maximum likelihood.
Maximum parsimony analysis of M13 peptidases. Cladogram of maximum parsimony reconstruction of M13 proteins, including percentage bootstrap values.
Proteins used in this study. A comprehensive list of all proteins used in this study including accession numbers, source of the sequence and abbreviations used in the manuscript. Underlined sequences were used to generate the SHARKhunt gene model.
Neighbour-joining analysis of M13 peptidases. Cladogram of neighbour-joining reconstruction of M13 proteins, including percentage bootstrap values.
Maximum likelihood analysis of M13 peptidases. Cladogram of maximum likelihood reconstruction of M13 proteins, including percentage bootstrap values.
Multiple sequence alignment of M13 protein sequences. A multiple sequence alignment of M13 proteins generated using MUSCLE and edited to remove uninformative regions.
As whole-genome protein–protein interaction datasets become available for a wide range of species, evolutionary biologists have the opportunity to address some of the unanswered questions surrounding the evolution of these complex systems. Protein interaction networks from divergent organisms may be compared to investigate how gene duplication, del...
With the completion of sequencing projects for several parasite genomes, efforts are ongoing to make sense of this mass of information in terms of the gene products encoded and their interactions in the growth, development and survival of parasites. The emerging science of systems biology aims to explain the complex relationship between genotype an...
WGDs are illustrated in blue and SSDs are illustrated in red (found at 40% sequence identity), yellow (30% ID), and green (20% ID). Mean shared interaction ratio r is plotted against sequence divergence measured by Ka. The rightmost bin indicates the mean shared interaction ratio for WGDs, three sets of SSDs and pairs of proteins selected at random...