
Katherine James
Newcastle University | NCL · School of Computing Science
BSc, MRes, PhD
About
Publications: 109
Reads: 10,420
Citations: 782
Introduction
My research focuses on the analysis and systematic integration of large-scale omics data, using advanced bioinformatic techniques, in order to characterise highly complex cellular systems and generate novel testable hypotheses. My current project involves the development of approaches for the analysis of next-generation sequencing data with a focus on evolutionary genetics.
Additional affiliations
November 2019 - present
November 2017 - November 2019
March 2015 - November 2017
Publications
We previously showed that the germ cell specific nuclear protein RBMXL2 represses cryptic splicing patterns during meiosis and is required for male fertility. It has remained unknown whether RBMXL2 evolved its role in splicing repression to deal with the transcriptionally permissive environment of meiosis or might fulfil a function required in all...
The identification of gastrointestinal helminth infections of humans and livestock almost exclusively relies on the detection of eggs or larvae in faeces, followed by manual counting and morphological characterisation to differentiate species using microscopy-based techniques. However, molecular approaches based on the detection and quantification...
Background
Probabilistic functional integrated networks (PFINs) are designed to aid our understanding of cellular biology and can be used to generate testable hypotheses about protein function. PFINs are generally created by scoring the quality of interaction datasets against a Gold Standard dataset, usually chosen from a separate high-quality data...
Interactome analyses have traditionally been applied to yeast, human and other model organisms due to the availability of protein-protein interaction data for these species. Recently, these techniques have been applied to more diverse species using computational interaction prediction from genome sequence and other data types. This review describes...
Since the large-scale experimental characterization of protein-protein interactions (PPIs) is not possible for all species, several computational PPI prediction methods have been developed that harness existing data from other species. While PPI network prediction has been extensively used in eukaryotes, microbial network inference has lagged behin...
Interactome analyses have traditionally been applied to yeast, human and other model organisms due to the availability of protein-protein interaction data for these species. Recently, these techniques have been applied to more diverse species using computational interaction prediction from genome sequence and other data types. This review describes...
Engineering genetic regulatory circuits is key to the creation of biological applications that are responsive to environmental changes. Computational models can assist in understanding especially large and complex circuits where manual analysis is infeasible, permitting a model-driven design process. However, there are still few tools that offer th...
Background: Chromosome-level assemblies are indispensable for accurate gene prediction, synteny assessment, and understanding higher-order genome architecture. Reference and draft genomes of key helminth species have been published, but little is yet known about the biology of their chromosomes. Here, we present the complete genome of the tapeworm...
Previously we showed that the germline-specific RNA binding protein RBMXL2 is essential for male meiosis where it represses cryptic splicing patterns. Here we find that its ubiquitously expressed paralog RBMX helps underpin human genome stability by preventing non-productive splicing. In particular, RBMX blocks selection of aberrant splice and poly...
Synthetic biology aims to develop novel biological systems and increase their reproducibility using engineering principles such as standardisation and modularisation. It is important that these systems can be represented and shared in a standard way to ensure they can be easily understood, reproduced, and utilised by other researchers. The Syntheti...
Severe acute respiratory syndrome coronavirus two (SARS-CoV-2), the virus responsible for the coronavirus disease 2019 (COVID-19) pandemic, represents an unprecedented global health challenge. Consequently, a large amount of research into the disease pathogenesis and potential treatments has been carried out in a short time frame. However, developi...
Background:
Reference genome and transcriptome assemblies of helminths have reached a level of completion whereby secondary analyses that rely on accurate gene estimation or syntenic relationships can now be conducted with a high level of confidence. Recent public release of the v.3 assembly of the mouse bile-duct tapeworm, Hymenolepis microstoma,...
Background
The king scallop, Pecten maximus, is distributed in shallow waters along the Atlantic coast of Europe. It forms the basis of a valuable commercial fishery and plays a key role in coastal ecosystems and food webs. Like other filter feeding bivalves it can accumulate potent phytotoxins, to which it has evolved some immunity. The molecular...
Background: Chromosome-level assemblies are indispensable for accurate gene prediction, synteny assessment and understanding higher-order genome architecture. Reference and draft genomes of key helminth species have been published but little is yet known about the biology of their chromosomes. Here we present the complete genome of the tapeworm Hym...
Background
The King Scallop, Pecten maximus, is distributed in shallow waters along the Atlantic coast of Europe. It forms the basis of a valuable commercial fishery and its ubiquity means that it plays a key role in coastal ecosystems and food webs. Like other filter feeding bivalves it can accumulate potent phytotoxins, to which it has evolved s...
The vast majority of organisms possess transcription elongation factors, the functionally similar bacterial Gre and eukaryotic/archaeal TFIIS/TFS. Their main cellular functions are to proofread errors of transcription and to restart elongation via stimulation of RNA hydrolysis by the active centre of RNA polymerase (RNAP). However, a number of taxo...
Background: Heterogeneity is a major obstacle to developing effective treatments for patients with primary Sjögren's syndrome. We aimed to develop a robust method for stratification, exploiting heterogeneity in patient-reported symptoms, and to relate these differences to pathobiology and therapeutic response.
Background: Heterogeneity is a major obstacle to developing effective treatments for patients with primary Sjögren's syndrome. We aimed to develop a robust method for stratification, exploiting heterogeneity in patient-reported symptoms, and to relate these differences to pathobiology and therapeutic response. Methods: We did hierarchical cluster ana...
Background: Heterogeneity is a major obstacle to developing effective treatments for patients with primary Sjögren's syndrome. We aimed to develop a robust method for stratification, exploiting heterogeneity in patient-reported symptoms, and to relate these differences to pathobiology and therapeutic response. / Methods: We did hierarchical cluster...
Background
Heterogeneity is a major obstacle to developing effective treatments for patients with primary Sjögren’s syndrome. We aimed to develop a robust method for stratification, exploiting heterogeneity in patient-reported symptoms, and to relate these differences to pathobiology and therapeutic response.
Methods
We did hierarchical cluster...
The vast majority of organisms possess transcription elongation factors, the functionally similar bacterial Gre and eukaryotic TFIIS/TFS. Their main cellular functions are to proofread errors of transcription and to restart elongation via stimulation of RNA hydrolysis by the active centre of RNA polymerase (RNAP). Very few taxons lack these factors...
Reference genome and transcriptome assemblies of helminths have reached a level of completion whereby secondary analyses that rely on accurate gene estimation or syntenic relationships can now be conducted with a high level of confidence. Recent public release of the v.3 assembly of the mouse bile-duct tapeworm, Hymenolepis microstoma, provides ch...
Evolutionary variation in anteroposterior patterning of the axial skeleton is a major contributor to the evolution of the vertebrate body plan, with five canonical vertebral types in tetrapods (cervical, thoracic, lumbar, sacral, caudal). However, less is known about the evolutionary origin and variation in vertebral regionalization patterns outsid...
Male germ cells of all placental mammals express an ancient nuclear RNA binding protein of unknown function called RBMXL2. Here we find that deletion of the retrogene encoding RBMXL2 blocks spermatogenesis. Transcriptome analyses of age-matched deletion mice show that RBMXL2 controls splicing patterns during meiosis. In particular, RBMXL2 represses...
Background
Tapeworms are agents of neglected tropical diseases responsible for significant health problems and economic loss. They also exhibit adaptations to a parasitic lifestyle that confound comparisons of their development with other animals. Identifying the genetic factors regulating their complex ontogeny is essential to understanding unique...
There has been a significant increase in the number of diagnostic and prognostic models published in the last decade. Testing such models in an independent, external validation cohort gives some assurance the model will transfer to a naturalistic, healthcare setting. Of 2,147 published models in the PubMed database, we found just 120 included some...
Background: Androgen steroid hormones are key drivers of prostate cancer. Previous work has shown that androgens can drive the expression of alternative mRNA isoforms as well as transcriptional changes in prostate cancer cells. Yet to what extent androgens control alternative mRNA isoforms and how these are expressed and differentially regulated in...
Objectives: B-cell activating factor (BAFF), β-2 microglobulin (β2M) and serum free light chains (FLCs) are elevated in primary SS (pSS) and associated with disease activity. We aimed to investigate their association with the individual disease activity domains of the EULAR Sjögren’s Syndrome Disease Activity Index (ESSDAI) in a large well-characte...
Objectives:
B-cell activating factor (BAFF), β-2 microglobulin (β2M) and serum free light chains (FLCs) are elevated in primary SS (pSS) and associated with disease activity. We aimed to investigate their association with the individual disease activity domains of the EULAR Sjögren's Syndrome Disease Activity Index (ESSDAI) in a large well-charact...
Objectives:
To assess the use of the Clinical EULAR Sjögren's Syndrome Disease Activity Index (ClinESSDAI), a version of the ESSDAI without the biological domain, for assessing potential eligibility and outcomes for clinical trials in patients with primary Sjögren's syndrome (pSS), according to the new ACR-EULAR classification criteria, from the U...
This is a correction to: Rheumatology, Volume 55, Issue 3, 1 March 2016, Pages 544–552, https://doi.org/10.1093/rheumatology/kev373
Transcription in all living organisms is accomplished by multi-subunit RNA polymerases (msRNAPs). msRNAPs are highly conserved in evolution and invariably share a ∼400 kDa five-subunit catalytic core. Here we characterize a hypothetical ∼100 kDa single-chain protein, YonO, encoded by the SPβ prophage of Bacillus subtilis. YonO shares very distant h...
Supplementary Figures and Supplementary Tables
Background
Lymphoma development is a serious complication of Primary Sjögren's syndrome (pSS). To date, the biological processes that may be involved in pSS-associated lymphoma are not fully understood.
Objectives
The aim of our study is to use microarray gene expression data from a well-defined cohort of pSS patients to identify biological proces...
The identification of the protein-coding regions of a genome is straightforward due to the universality of start and stop codons. However, the boundaries of the transcribed regions, conditional operon structures, non-coding RNAs and the dynamics of transcription, such as pausing of elongation, are non-trivial to identify, even in the comparatively...
Pausing by RNA polymerase is a major mechanism that regulates transcription elongation but can cause conflicts with fellow RNA polymerases and other cellular machineries. Here, we summarize our recent finding that misincorporation could be a major source of transcription pausing in vivo, and discuss the role of misincorporation-induced pausing.
The aim of the study was to evaluate the levels of physical activity in individuals with primary Sjögren’s syndrome (PSS) and its relationship to the clinical features of PSS. In this cross-sectional study, self-reported levels of physical activity from 273 PSS patients were measured using the International Physical Activity Questionnaire-short for...
The transcription error rate estimated from mistakes in end product RNAs is 10⁻³–10⁻⁵. We analyzed the fidelity of nascent RNAs from all actively transcribing elongation complexes (ECs) in Escherichia coli and Saccharomyces cerevisiae and found that 1–3% of all ECs in wild-type cells, and 5–7% of all ECs in cells lacking proofreading factors are, i...
BacillOndex is an extension of the Ondex data integration system, providing a semantically annotated, integrated knowledge base for the model Gram-positive bacterium Bacillus subtilis. This application allows a user to mine a variety of B. subtilis data sources, and analyse the resulting integrated dataset, which contains data about genes, gene pro...
Supplementary figures.
Supplementary tables.
Steroid androgen hormones play a key role in the progression and treatment of prostate cancer, with androgen deprivation therapy being the first-line treatment used to control cancer growth. Here we apply a novel search strategy to identify androgen-regulated cellular pathways that may be clinically important in prostate cancer. Using RNASeq data,...
Background
Fatigue is a debilitating condition with a significant impact on patients’ quality of life. Fatigue is frequently reported by patients suffering from primary Sjögren’s Syndrome (pSS), a chronic autoimmune condition characterised by dryness of the eyes and the mouth. However, although fatigue is common in pSS, it does not manifest in all...
Batch correction.
Principal component plots of the data pre- (A) and post-batch correction (B). Points are coloured and shaped by experimental batch.
(PNG)
Correction for clinical factors.
The top five genes for the linear fits of the three fatigue scores corrected for the other clinical factors. Factors were included in the regression fits individually and in combination. No significantly differentially expressed genes were found. Disease activity was measured using the EULAR Sjögren’s Syndrome Disea...
Interferon type I signature.
The clinical scores in the IFN type I positive and negative groups. ESSDAI scores were significantly higher in the IFN positive group. However, there was no significant relationship between IFN signature and ESSPRI, SSDDI or the three fatigue scores.
(PNG)
Enriched pathways in pSS.
Gene sets were considered to be enriched at an FDR cut-off of 25%.
(DOCX)
ESSPRI correction for clinical factors.
Volcano plots for the ESSPRI physical fatigue groups corrected for clinical factors. High fatigue >7 (n = 36) and low fatigue ≤3 (n = 34). A. Age at UKPSSR cohort recruitment. B. Disease activity measured using the EULAR Sjögren’s Syndrome Disease Activity Index. C. Disease damage measured using the Sjögren’s...
Correlations between fatigue and clinical factors.
The correlations between the three fatigue scores and the other clinical factors included in the analyses.
(DOCX)