Andrew C Parnell
Maynooth University · Hamilton Institute

PhD Statistics, University of Sheffield

About

164
Publications
55,269
Reads
13,744
Citations
Additional affiliations
November 2016 - March 2017
University College Dublin
Position
  • Head of Department
September 2008 - present
University College Dublin
Position
  • Lecturer in Statistics

Publications (164)
Article
We propose a Bayesian, noisy-input, spatial–temporal generalized additive model to examine regional relative sea-level (RSL) changes over time. The model provides probabilistic estimates of component drivers of regional RSL change via the combination of a univariate spline capturing a common regional signal over time, random slopes and intercepts c...
Preprint
Full-text available
Modelling growth in student achievement is a significant challenge in the field of education. Understanding how interventions or experiences such as part-time work can influence this growth is also important. Traditional methods like difference-in-differences are effective for estimating causal effects from longitudinal data. Meanwhile, Bayesian no...
Article
Full-text available
Background: Despite its important role in education, significant gaps remain in the literature on homework. Notably, there is a dearth of understanding regarding how homework effects vary across different subjects, how student backgrounds may moderate its effectiveness, what the optimal amount and distribution of homework is, and how the causal imp...
Article
Full-text available
Bayesian Causal Forests (BCF) is a causal inference machine learning model based on the flexible non-parametric regression and classification tool, Bayesian Additive Regression Trees (BART). Motivated by data from the Trends in International Mathematics and Science Study (TIMSS), which includes data on student achievement in both mathematics and sc...
Article
Turbidity is commonly monitored as an important water quality index. Human activities, such as dredging and dumping operations, can disrupt turbidity levels and should be monitored and analysed for possible effects. In this paper, we model the variations of turbidity in Dublin Bay over space and time to investigate the effects of dumping and dredgi...
Preprint
Full-text available
Background: Despite its important role in education, significant gaps remain in the literature on homework. Notably, there is a dearth of understanding regarding how homework effects vary across different subjects, how student backgrounds may moderate its effectiveness, what the optimal amount and distribution of homework is, and how the causal imp...
Conference Paper
Full-text available
In this study we investigate the impact of regularly assigning creative mathematical reasoning tasks on student achievement. Using a causal inference machine learning approach applied to Irish eighth grade data from TIMSS 2019, we find that assigning challenging questions requiring students to go beyond the instruction has a clear positive effect o...
Preprint
In this study we investigate the impact of regularly assigning creative mathematical reasoning tasks on student achievement. Using a causal inference machine learning approach applied to Irish eighth grade data from TIMSS 2019, we find that assigning challenging questions requiring students to go beyond the instruction has a clear positive effect o...
Preprint
Full-text available
We present reslr, an R package to perform Bayesian modelling of relative sea level data. We include a variety of different statistical models previously proposed in the literature, with a unifying framework for loading data, fitting models, and summarising the results. Relative sea-level data often contain measurement error in multiple dimensions a...
Preprint
Full-text available
Environmental monitoring is crucial to our understanding of climate change, biodiversity loss and pollution. The availability of large-scale spatio-temporal data from sources such as sensors and satellites allows us to develop sophisticated models for forecasting and understanding key drivers. However, the data collected from sensors often contain...
Article
Full-text available
Teacher shortages and attrition are problems of international concern. One of the most frequent reasons for teachers leaving the profession is a lack of job satisfaction. Accordingly, in this study we have adopted a causal inference machine learning approach to identify practical interventions for improving overall levels of job satisfaction. We ap...
Preprint
Full-text available
We propose a Bayesian, noisy-input, spatial-temporal generalised additive model to examine regional relative sea-level (RSL) changes over time. The model provides probabilistic estimates of component drivers of regional RSL change via the combination of a univariate spline capturing a common regional signal over time, random slopes and intercepts c...
Article
Full-text available
We propose a Bayesian model which produces probabilistic reconstructions of hydroclimatic variability in Queensland, Australia. The model provides a standardized approach to hydroclimate reconstruction using multiple palaeoclimate proxy records derived from natural archives such as speleothems, ice cores and tree rings. The method combines time‐seri...
Preprint
Full-text available
Lockdowns were widely used to reduce transmission of COVID-19 and prevent health care services from being overwhelmed. While these mitigation measures helped to reduce loss of life, they also disrupted the everyday lives of billions of people. We use data from a survey of Singaporean citizens and permanent residents during the peak of the lockdown...
Preprint
We present vivid, an R package for visualizing variable importance and variable interactions in machine learning models. The package provides a range of displays including heatmap and graph-based displays for viewing variable importance and interaction jointly and partial dependence plots in both a matrix layout and an alternative layout emphasizin...
Preprint
Tree-based regression and classification has become a standard tool in modern data science. Bayesian Additive Regression Trees (BART) has in particular gained wide popularity due to its flexibility in dealing with interactions and non-linear effects. BART is a Bayesian tree-based machine learning method that can be applied to both regression and class...
Article
Full-text available
The immense advances in computer power achieved in the last decades have had a significant impact in Earth science, providing valuable research outputs that allow the simulation of complex natural processes and systems, and generating improved forecasts. The development and implementation of innovative geoscientific software is currently evolving t...
Preprint
Full-text available
We propose a simple yet powerful extension of Bayesian Additive Regression Trees which we name Hierarchical Embedded BART (HE-BART). The model allows for random effects to be included at the terminal node level of a set of regression trees, making HE-BART a non-parametric alternative to mixed effects models which avoids the need for the user to spe...
Preprint
Full-text available
We propose a Bayesian model which produces probabilistic reconstructions of hydroclimatic variability in Queensland, Australia. The approach uses instrumental records of hydroclimate indices such as rain and evaporation, as well as palaeoclimate proxy records derived from natural archives such as sediment cores, speleothems, ice cores and tree rings...
Article
Full-text available
Stable isotope ratios are used to reconstruct animal diet in trophic ecology via mixing models. Several assumptions of stable isotope mixing models are critical, i.e., constant trophic discrimination factor and isotopic equilibrium between the consumer and its diet. The isotopic turnover rate (λ and its counterpart the half-life) affects the dynami...
Preprint
Teacher shortages and attrition are problems of international concern. Studies investigating this problem often identify important correlates of these two outcomes, but fail to produce easily implementable recommendations. Accordingly, in this study we have adopted a causal inference machine learning approach to identify practical interventions for...
Article
Full-text available
Process and material uncertainty, particularly real-time process-structure-property checks, are a key obstacle to improved uptake of metal powder bed fusion in industry. Efforts are underway for live process monitoring such as thermal and image-based data gathering for every layer printed. Current crystal plasticity finite element (CPFE) modelling...
Article
Full-text available
Conservation paleobiology seeks to leverage proxy reconstructions of ecological communities and environmental conditions to predict future changes and inform management decisions. Populations of East African megafauna likely changed during the Holocene in response to trends and events in the regional hydroclimate, but reconstructing these populatio...
Article
Full-text available
Palaeoclimate data relating to hydroclimate variability over the past millennia have a vital contribution to make to the water sector globally. The water industry faces considerable challenges accessing climate data sets that extend beyond that of historical gauging stations. Without this, variability around the extremes of floods and droughts is u...
Article
Full-text available
The relative contributions of both copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) to the additive genetic variance of carcass traits in cattle are not well understood. A detailed understanding of the relative importance of CNVs in cattle may have implications for study design of both genomic predictions and genome-wide associ...
Article
Full-text available
We consider the analysis of count data in which the observed frequency of zero counts is unusually large, typically with respect to the Poisson distribution. We focus on two alternative modelling approaches: over‐dispersion (OD) models and zero‐inflation (ZI) models, both of which can be seen as generalisations of the Poisson distribution; we refer...
Article
Full-text available
Background The carcass value of cattle is a function of carcass weight and quality. Given the economic importance of carcass merit to producers, it is routinely included in beef breeding objectives. A detailed understanding of the genetic variants that contribute to carcass merit is useful to maximize the efficiency of breeding for improved carcass...
Article
Full-text available
Earthquake hazard assessments for the Tokyo Region are complicated by the trench–trench triple junction where the oceanic Philippine Sea Plate not only underthrusts a continental plate but is also being subducted by the Pacific Plate. Great thrust earthquakes and associated tsunamis are historically recognized hazards from the Continental/Philippin...
Article
Full-text available
The epidemic increase in the incidence of Human Papilloma Virus (HPV) related Oropharyngeal Squamous Cell Carcinomas (OPSCCs) in several countries worldwide represents a significant public health concern. Although gender neutral HPV vaccination programmes are expected to cause a reduction in the incidence rates of OPSCCs, these effects will not be...
Preprint
Full-text available
We propose a new semi-parametric model based on Bayesian Additive Regression Trees (BART). In our approach, the response variable is approximated by a linear predictor and a BART model, where the first component is responsible for estimating the main effects and BART accounts for the non-specified interactions and non-linearities. The novelty in ou...
Preprint
Variable importance, interaction measures, and partial dependence plots are important summaries in the interpretation of statistical and machine learning models. In this paper we describe new visualization techniques for exploring these model summaries. We construct heatmap and graph-based displays showing variable importance and interaction jointl...
Article
This study investigates the use of a statistical anomaly detection method to analyse in-situ process monitoring data obtained during the Laser-Powder Bed Fusion of Ti-6Al-4V parts. The printing study was carried out on a Renishaw 500M Laser-Powder Bed Fusion system. A photodiode-based system called InfiniAM was used to monitor the melt-pool emissio...
Article
Full-text available
Bayesian additive regression trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART assumes regularisation priors on a set of trees that work as weak learners and is very flexible for predicting in the presence of nonlinearity and high-order interactions. In this paper...
Article
Full-text available
Building robust age–depth models to understand climatic and geologic histories from coastal sedimentary archives often requires composite chronologies consisting of multi-proxy age markers. Pollen chronohorizons derived from a known change in vegetation are important for age–depth models, especially those with other sparse or imprecise age markers....
Article
Full-text available
We develop a new approach for feature selection via gain penalization in tree-based models. First, we show that previous methods do not perform sufficient regularization and often exhibit sub-optimal out-of-sample performance, especially when correlated features are present. Instead, we develop a new gain penalization idea that exhibits a general l...
Article
Full-text available
The North Atlantic Oscillation (NAO) is the major atmospheric mode that controls winter European climate variability because its strength and phase determine regional temperature, precipitation and storm tracks. The NAO spatial structure and associated climatic impacts over Europe are not stationary making it crucial to understanding its past evolu...
Article
While the ecosystem of the Great Barrier Reef (GBR), north-eastern Australia, is being threatened by the elevated levels of sediments and nutrients discharged from adjacent coastal river systems, the sources of these detrimental pollutants are not well understood. Here we used a combined isotopic (δ¹³C, δ¹⁵N) and geochemical (Zn, Pt and S) signature...
Preprint
Full-text available
We consider models underlying regression analysis of count data in which the observed frequency of zero counts is unusually large, typically with respect to the Poisson distribution. We focus on two alternative modelling approaches: Over-Dispersion (OD) models, and Zero-Inflation (ZI) models, both of which can be seen as generalisations of the Pois...
Article
Full-text available
Modes of climate variability affect global and regional climates on different spatio-temporal scales, and they have important impacts on human activities and ecosystems. As these modes are a useful tool for simplifying the understanding of the climate system, it is crucial that we gain improved knowledge of their long-term past evolution and intera...
Preprint
Bayesian Additive Regression Trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART assumes regularisation priors on a set of trees that work as weak learners and is very flexible for predicting in the presence of non-linearity and high-order interactions. In this pape...
Preprint
Full-text available
We develop a new approach for feature selection via gain penalization in tree-based models. First, we show that previous methods do not perform sufficient regularization and often exhibit sub-optimal out-of-sample performance, especially when correlated features are present. Instead, we develop a new gain penalization idea that exhibits a general l...
Article
Full-text available
Background: The trading of individual animal genotype information often involves only the exchange of the called genotypes and not necessarily the additional information required to effectively call structural variants. The main aim here was to determine if it is possible to impute copy number variants (CNVs) using the flanking single nucleotide p...
Article
In this study, we begin a comprehensive characterisation of temperature extremes in Ireland for the period 1981‐2010. We produce return levels of anomalies of daily maximum temperature extremes for an area over Ireland, for the 30‐year period 1981‐2010. We employ extreme value theory (EVT) to model the data using the generalised Pareto distribution...
Article
Full-text available
Intrinsic water use efficiency (iWUE), defined as the ratio of photosynthesis to stomatal conductance, is a key variable in plant physiology and ecology. Yet, how rising atmospheric CO2 concentration affects iWUE at broad species and ecosystem scales is poorly understood. In a field-based study of 244 woody angiosperm species across eight biomes ov...
Article
Full-text available
We review published literature and historical texts to propose that three periods of official Chinese maritime bans impacted the composition and circulation of trade ceramics along Asian trade routes: Ming Ban 1 (1371 – 1509), Ming Ban 2 (1521 – 1529), and Qing Ban (1654 – 1684). We use ceramics collected during a landscape archaeology survey along...
Preprint
Full-text available
The North Atlantic Oscillation (NAO) is the major atmospheric mode ruling European climate variability during winter and its significance is underpinned by the number of recent studies aimed at reconstructing past NAO variability across different time scales and temporal resolutions. We present a new 2000-year multi-annual, proxy-based, local NAO i...
Preprint
Full-text available
The detection of anomalies in real time is paramount to maintain performance and efficiency across a wide range of applications including web services and smart manufacturing. This paper presents a novel algorithm to detect anomalies in streaming time series data via statistical learning. We adapt the generalised extreme studentised deviate test [1...
Article
Full-text available
Despite strong selection for athletic traits in Thoroughbred horses, there is marked variation in speed and aptitude for racing performance within the breed. Using global positioning system monitoring during exercise training, we measured speed variables and temporal changes in speed with age to derive phenotypes for GWAS. The aim of the study was...
Conference Paper
Direct metal laser sintering (DMLS) is a powder bed fusion (PBF) additive manufacturing process commonly used within the medical device and aerospace industries where regulations drive the requirement for stringent quality control. Using in-situ monitoring, the identification of defects, as well as the geometric and dimensional measurement of the l...
Article
Full-text available
Durability traits in Thoroughbred horses are heritable, economically valuable and may affect horse welfare. The aims of this study were to test the hypotheses that (i) durability traits are heritable and (ii) genetic data may be used to predict a horse's potential to have a racecourse start. Heritability for the phenotype ‘number of 2‐ and 3‐year‐o...
Preprint
Full-text available
In this study, we begin a comprehensive characterisation of temperature extremes in Ireland for the period 1981-2010. We produce return levels of anomalies of daily maximum temperature extremes for an area over Ireland, for the 30-year period 1981-2010. We employ extreme value theory (EVT) to model the data using the generalised Pareto distribution...
Article
Full-text available
Archaeological evidence shows that a predecessor of the 2004 Indian Ocean tsunami devastated nine distinct communities along a 40-km section of the northern coast of Sumatra in about 1394 CE. Our evidence is the spatial and temporal distribution of tens of thousands of medieval ceramic sherds and over 5,000 carved gravestones, collected and recorde...
Article
Full-text available
Stomatal conductance (gs) in terrestrial vegetation regulates the uptake of atmospheric carbon dioxide for photosynthesis and water loss through transpiration, closely linking the biosphere and atmosphere and influencing climate. Yet, the range and pattern of gs in plants from natural ecosystems across broad geographic, climatic, and taxonomic rang...
Article
Full-text available
We present a novel approach to estimating the effect of control parameters on tool wear rates and related changes in the three force components in turning of medical grade Co-Cr-Mo (ASTM F75) alloy. Co-Cr-Mo is known to be a difficult to cut material which, due to a combination of mechanical and physical properties, is used for the critical structu...
Article
Characterizing the spatio-temporal variability of relative sea level (RSL) and estimating local, regional, and global RSL trends requires statistical analysis of RSL data. Formal statistical treatments, needed to account for the spatially and temporally sparse distribution of data and for geochronological and elevational uncertainties, have advance...
Article
Full-text available
Background Race distance aptitude in Thoroughbred horses is highly heritable and is influenced largely by variation at the myostatin gene (MSTN). Objectives In addition to MSTN, we hypothesised that other modifying loci contribute to best race distance. Study design Using 3006 Thoroughbreds, including 835 ‘elite’ horses, which were >3 years old,...
Article
In Maths for Business, a large first-year mathematics module, the continuous assessment component comprises 10 weekly quizzes which combine to contribute 40% of the final module mark. If students did not receive the full five marks on their weekly quiz, they were provided with the opportunity to resubmit their corrected weekly quiz with an explanat...
Conference Paper
Previous research has shown that students’ use of module resources strongly relates to the timing of the module’s continuous assessment. In our case study of a large first-year mathematics module for Business students, Maths for Business, we examine this relationship and the resources relied on by students for completing their continuous assessment...
Article
Full-text available
We present archaeological evidence for a trading settlement dating from the 13th to the mid-16th century CE on an elevated headland in Lamreh village about 30 km east of Banda Aceh, on the northern coast of Sumatra, Indonesia. We propose this site was part of historic Lamri, known from documentary sources as an important node in the maritime "silk...
Article
Full-text available
We present the archaeological evidence for a trading settlement dating from the 13th to the mid-16th century CE on an elevated headland in the village of Lamreh, about 30 km east of Banda Aceh, on the northern coast of Sumatra, Indonesia. We identify it with the historic site of Lamri, described in textual sources as an important node...