About
Publications: 165
Reads: 20,866
Citations: 982
Introduction
Stephen J. Bush is a postdoctoral bioinformatician at the Weatherall Institute of Molecular Medicine, University of Oxford. He is currently working on methodologies for exploring 'selfish spermatogonial selection', a mechanism underpinning the association of paternal age with neurodevelopmental disorders.
He previously worked at the Nuffield Department of Medicine, Oxford (microbial genomics; 2018-2021) and at the Roslin Institute, University of Edinburgh (livestock transcriptomics; 2014-2017).
Additional affiliations
May 2021 - present
January 2018 - April 2021
August 2014 - December 2017
Publications (165)
The laboratory rat is an important model for biomedical research. To generate a comprehensive rat transcriptomic atlas, we curated and downloaded 7700 rat RNA-seq datasets from public repositories, downsampled them to a common depth and quantified expression. Data from 585 rat tissues and cells, averaged from each BioProject, can be visualized and...
Mosaic loss of the Y chromosome (LOY) is the most frequent chromosomal aberration in aging men and is strongly correlated with mortality and disease. To date, studies of LOY have only been performed in humans, and so it is unclear whether LOY is a natural consequence of our relatively long lifespan or due to exposure to human-specific external stre...
Selection leaves signatures in the DNA sequence of genes, with many test statistics devised to detect its action. While these statistics are frequently used to support hypotheses about the adaptive significance of particular genes, the effect these genes have on reproductive fitness is rarely quantified experimentally. Consequently, it is unclear h...
The laboratory rat is an important model for biomedical research. To generate a comprehensive rat transcriptomic atlas, we curated and downloaded 7700 rat RNA-seq datasets from public repositories, downsampled them to a common depth and quantified expression. Data from 590 rat tissues and cells, averaged from each BioProject, can be visualised an...
Minimizing false positives is a critical issue when variant calling as no method is without error. It is common practice to post-process a variant-call file (VCF) using hard filter criteria intended to discriminate true-positive (TP) from false-positive (FP) calls. These are applied on the simple principle that certain characteristics are dispropor...
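The hard-filtering idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the two criteria (QUAL and the INFO `DP` depth tag), the threshold values, and the function names `parse_info`, `passes_hard_filters` and `hard_filter` are all assumptions chosen for illustration, and the records are invented.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=35;MQ=60' into a dict of strings."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            info[key] = value
    return info


def passes_hard_filters(vcf_line, min_qual=30.0, min_depth=10):
    """Return True if a VCF data line meets both thresholds.

    The thresholds are hypothetical defaults, not recommended values.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]               # column 6 of a VCF record is QUAL
    info = parse_info(fields[7])   # column 8 of a VCF record is INFO
    if qual == "." or "DP" not in info:
        return False  # assumption: missing evidence counts as a failure
    return float(qual) >= min_qual and int(info["DP"]) >= min_depth


def hard_filter(vcf_lines, **thresholds):
    """Keep header lines plus every data line that passes the filters."""
    return [
        line for line in vcf_lines
        if line.startswith("#") or passes_hard_filters(line, **thresholds)
    ]


# Invented example records: one good call, one low-quality, one low-depth.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50.0\t.\tDP=35;MQ=60",
    "chr1\t200\t.\tC\tT\t12.5\t.\tDP=40;MQ=60",
    "chr1\t300\t.\tG\tA\t60.0\t.\tDP=4;MQ=60",
]
kept = hard_filter(example)
```

A fixed cut-off of this kind will discard some true positives that happen to fall below the threshold and retain some false positives that fall above it, which is why the choice of criteria matters.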
Homozygous mutation of the Csf1r locus (Csf1rko) in mice, rats and humans leads to multiple postnatal developmental abnormalities. To enable analysis of the mechanisms underlying the phenotypic impacts of Csf1r mutation, we bred a rat Csf1rko allele to the inbred dark agouti (DA) genetic background and to a Csf1r-mApple reporter transgene. The C...
The laboratory rat continues to be the model of choice for many studies of physiology, behavior, and complex human diseases. Cells of the mononuclear phagocyte system (MPS; monocytes, macrophages, and dendritic cells) are abundant residents in every tissue in the body and regulate postnatal development, homeostasis, and innate and acquired immunity...
USP16 is a histone deubiquitinase which facilitates G2/M transition during the cell cycle, regulates DNA damage repair and contributes to inducible gene expression. We mutated the USP16 gene in a high differentiation clone of the acute monocytic leukemia cell line THP-1 using the CRISPR-Cas9 system and generated four homozygous knockout clones. All...
The laboratory rat is widely used as a model for human diseases. Many of these diseases involve monocytes and tissue macrophages in different states of activation. Whilst methods for in vitro differentiation of mouse macrophages from embryonic stem cells (ESC) and bone marrow (BM) are well established, these are lacking for the rat. The gene expres...
Campylobacter is the leading cause of bacterial foodborne gastroenteritis worldwide. Handling or consumption of contaminated poultry meat is a key risk factor for human campylobacteriosis. One potential control strategy is to select poultry with increased resistance to Campylobacter. We associated high-density genome-wide genotypes (600K single nuc...
Mutations in the human CSF1R gene have been associated with dominant and recessive forms of neurodegenerative disease. Here we describe the impacts of Csf1r mutation in the rat on development of the brain. Diffusion imaging indicated small reductions in major fiber tracts that may be associated in part with ventricular enlargement. RNA-seq profilin...
Read alignment is the central step of many analytic pipelines that perform variant calling. To reduce error, it is common practice to pre-process raw sequencing reads to remove low-quality bases and residual adapter contamination, a procedure collectively known as ‘trimming’. Trimming is widely assumed to increase the accuracy of variant calling, a...
The development of the mononuclear phagocyte system (MPS) is controlled by signals from the CSF1 receptor (CSF1R). Homozygous mutation of the Csf1r locus (Csf1rko) in inbred rats led to the loss of non-classical monocytes and tissue macrophage populations, reduced postnatal somatic growth, severe developmental delay impacting all major organ syst...
Tooth resorption (TR) in domestic cats is a common and painful disease characterised by the loss of mineralised tissues from the tooth. Due to its progressive nature and unclear aetiology the only treatment currently available is to extract affected teeth. To gain insight into TR pathogenesis, we characterised the transcriptomic changes involved in...
Alternative splicing is widespread throughout eukaryotic genomes and greatly increases transcriptomic diversity. Many alternative isoforms have functional roles in developmental processes and are precisely temporally regulated. To facilitate the study of alternative splicing in a developmental context, we created MeDAS, a Metazoan Developmental Alt...
The mononuclear phagocyte system (MPS) is a family of cells including progenitors, circulating blood monocytes, resident tissue macrophages, and dendritic cells (DCs) present in every tissue in the body. To test the relationships between markers and transcriptomic diversity in the MPS, we collected from National Center for Biotechnology Information...
The maintenance of a healthy cardiovascular system requires expression of genes that contribute to essential biological activities and repression of those that are associated with functions likely to be detrimental to cardiovascular homeostasis. Vascular calcification is a major disruption to cardiovascular homeostasis, where tissues of the cardiov...
Read alignment is the central step of many analytic pipelines that perform SNP calling. To reduce error, it is common practice to pre-process raw sequencing reads to remove low-quality bases and residual adapter contamination, a procedure collectively known as 'trimming'. Trimming is widely assumed to increase the accuracy of SNP calling although t...
Mammalian macrophages differ in their basal gene expression profiles and response to the toll-like receptor 4 (TLR4) agonist, lipopolysaccharide (LPS). In human macrophages, LPS elicits a temporal cascade of transient gene expression including feed forward activators and feedback regulators that limit the response. Here we present a transcriptional...
The response of the human acute myeloid leukemia cell line THP-1 to phorbol esters has been widely studied to test candidate leukemia therapies and as a model of cell cycle arrest and monocyte-macrophage differentiation. Here we have employed Cap Analysis of Gene Expression (CAGE) to analyze a dense time course of transcriptional regulation in THP-...
In several countries, one of the most pronounced trends in contemporary baby naming is selecting a comparatively uncommon name. Nevertheless, although a well-documented phenomenon, studies of uncommon name use are often limited to forenames. This study analyses approximately 22 million full names from England and 1 million from Wales, given between...
Sequencing data from host-associated microbes can often be contaminated by the body of the investigator or research subject. Human DNA is typically removed from microbial reads either by subtractive alignment (dropping all reads that map to the human genome) or by using a read classification tool to predict those of human origin, and then discardin...
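The subtractive approach mentioned in this abstract (dropping every read that maps to the human genome) can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the alignment step has already been run with some external aligner and that its result is available as a set of read IDs, and the names `read_fastq`, `read_id` and `subtract_human`, like the example reads, are hypothetical.

```python
def read_fastq(lines):
    """Yield (header, sequence, separator, quality) tuples.

    Assumes plain 4-line FASTQ records; an incomplete trailing record
    is silently ignored.
    """
    it = iter(lines)
    for record in zip(it, it, it, it):
        yield record


def read_id(header):
    """Extract the read ID from a FASTQ header such as '@read1 extra text'."""
    return header[1:].split()[0]


def subtract_human(fastq_lines, human_ids):
    """Return FASTQ lines for reads whose ID is not in human_ids.

    human_ids is assumed to be the set of read IDs that an aligner
    mapped to the human genome.
    """
    kept = []
    for record in read_fastq(fastq_lines):
        if read_id(record[0]) not in human_ids:
            kept.extend(record)
    return kept


# Invented example: read1 and read3 are assumed to have mapped to human.
fastq = [
    "@read1 lane1", "ACGTACGT", "+", "IIIIIIII",
    "@read2 lane1", "TTGACCAA", "+", "IIIIIIII",
    "@read3 lane1", "GGGCCCAA", "+", "IIIIIIII",
]
microbial = subtract_human(fastq, human_ids={"read1", "read3"})
```

The sketch keeps any read the aligner failed to map, so a human read that goes unaligned would remain in the output.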
Large animal models are of increasing importance in cardiovascular disease research as they demonstrate more similar cardiovascular features (in terms of anatomy, physiology and size) to humans than do rodent species. The maintenance of a healthy cardiovascular system requires expression of genes that contribute to essential biological activities a...
Campylobacter is the leading cause of bacterial foodborne gastroenteritis in many countries. Source attribution studies unequivocally identify the handling or consumption of contaminated poultry meat as the primary risk factor. One potential strategy to control Campylobacter is to select poultry with increased resistance to colonisation. We conduct...
The mononuclear phagocyte system (MPS) is a family of cells including progenitors, circulating blood monocytes, resident tissue macrophages and dendritic cells (DC) present in every tissue in the body. To test the relationships between markers and transcriptomic diversity in the MPS, we collected from NCBI-GEO >500 quality RNA-seq datasets generate...
The domestic pig (Sus scrofa) is both an economically important livestock species and a model for biomedical research. Two highly contiguous pig reference genomes have recently been released. To support functional annotation of the pig genomes and comparative analysis with large human transcriptomic data sets, we aimed to create a pig gene expressi...
Background: Accurately identifying single-nucleotide polymorphisms (SNPs) from bacterial sequencing data is an essential requirement for using genomics to track transmission and predict important phenotypes such as antimicrobial resistance. However, most previous performance evaluations of SNP calling have been restricted to eukaryotic (human) dat...
Sequencing data from host-associated microbes can often be contaminated by the body of the investigator or research subject. Human DNA is typically removed from microbial reads either by subtractive alignment (dropping all reads that map to the human genome) or using a read classification tool to predict those of human origin, and then discarding t...
Milk yield is the most important dairy sheep trait and constitutes the key genetic improvement goal via selective breeding. Mastitis is one of the most prevalent diseases, significantly impacting on animal welfare, milk yield and quality, while incurring substantial costs. Our objectives were to determine the feasibility of a concomitant genetic im...
Goats (Capra hircus) are an economically important livestock species providing meat and milk across the globe. They are of particular importance in tropical agri-systems contributing to sustainable agriculture, alleviation of poverty, social cohesion, and utilisation of marginal grazing. There are excellent genetic and genomic resources available f...
There is increasing recognition that the underlying genetic variation contributing to complex traits influences transcriptional regulation and can be detected at a population level as expression quantitative trait loci. At the level of an individual, allelic variation in transcriptional regulation of individual genes can be detected by measuring al...
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translate...
Pervasive allelic variation at both gene and single nucleotide level (SNV) between individuals is commonly associated with complex traits in humans and animals. Allele-specific expression (ASE) analysis, using RNA-Seq, can provide a detailed annotation of allelic imbalance and infer the existence of cis-acting transcriptional regulation. However, v...
Goats (Capra hircus) are an economically important livestock species providing meat and milk across the globe. They are of particular importance in tropical agri-systems contributing to sustainable agriculture, alleviation of poverty, social cohesion and utilisation of marginal grazing. There are excellent genetic and genomic resources available fo...
The domestic water buffalo (Bubalus bubalis) makes a major contribution to the global agricultural economy in the form of milk, meat, hides, and draught power. The global water buffalo population is predominantly found in Asia, and per head of population more people depend upon the buffalo than on any other livestock species. Despite its agricultur...
Background: Accurately identifying SNPs from bacterial sequencing data is an essential requirement for using genomics to track transmission and predict important phenotypes such as antimicrobial resistance. However, most previous performance evaluations of SNP calling have been restricted to eukaryotic (human) data. Additionally, bacterial SNP calli...
Goniodysgenesis is a developmental abnormality of the anterior chamber of the eye. It is generally considered to be congenital in dogs (Canis lupus familiaris), and has been associated with glaucoma and blindness. Goniodysgenesis and early-onset glaucoma initially emerged in Border Collies in Australia in the late 1990s and have subsequently been f...
The naming of a newborn for a deceased relative is a means by which a meaningful connection can be maintained with the dead. This study analyses the birth, marriage and death records of England and Wales to highlight a historic naming custom: should a child die shortly after birth, their name could often be re-used for a later sibling.
This re-...
The phosphatidylserine receptor TIM4, encoded by TIMD4, mediates the phagocytic uptake of apoptotic cells. We applied anti-chicken TIM4 mAbs in combination with CSF1R reporter transgenes to dissect the function of TIM4 in the chick (Gallus gallus). During development in ovo, TIM4 was present on the large majority of macrophages, but expression beca...
One of the most significant physiological challenges to neonatal and juvenile ruminants is the development and establishment of the rumen. Using a subset of RNA-Seq data from our high-resolution atlas of gene expression in sheep (Ovis aries) we have provided the first comprehensive characterization of transcription of the entire gastrointestinal (G...
Background: mRNA-like long non-coding RNAs (lncRNAs) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. Thus, in many cases lncRNA detection by RNA-sequencing (RNA-seq) is compromised by stochastic sampling. To account for th...
The European honey bee (Apis mellifera) plays a major role in pollination and food production. Honey bee health is a complex product of the environment, host genetics and associated microbes (commensal, opportunistic and pathogenic). Improved understanding of these factors will help manage modern challenges to bee health. Here we used DNA sequencin...
Chosen names reflect changes in societal values, personal tastes and cultural diversity. Vogues in name usage can be easily shown on a case by case basis, by plotting the rise and fall in their popularity over time. However, individual name choices are not made in isolation and trends in naming are better understood as group-level phenomena. Here w...
Usage of forenames in the BMD dataset, as the absolute number of registered forenames per year. (XLSX)
Number of unique forenames, and forename diversity, in the Office for National Statistics dataset. (XLSX)
General features of the BMD and ONS corpora of names. (DOCX)
Typographical changes made to the BMD corpus of names. (XLSX)
Records excluded from the BMD corpus as they are unrecognisable as a complete name. (XLSX)
Rank order of middle names in the BMD corpus, by number of registered births per year. (XLSX)
Rank order of names in the BMD corpus, by total number of registered births (across 177 years). (XLSX)
Summary of the BMD corpus: Number of usable birth records per year, forename diversity, proportion of records with a middle name and the most popular fore/middle name per year. (XLSX)
Usage of forenames in the BMD dataset, as a percentage of total registered forenames per year. (XLSX)