Henry Mwambi
University of KwaZulu-Natal | ukzn · School of Mathematics, Statistics and Computer Science

PhD

About

177
Publications
43,408
Reads
1,747
Citations
Citations since 2017
105 Research Items
1318 Citations
Introduction
Henry Mwambi currently works at the School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal. Henry does research in infectious diseases and biostatistics methodologies covering a wide range of applications. His current project is 'Analysis of high dimensional correlated data in biomedical research problems'. Specific topics include modelling and analysis of longitudinal data, missing and incomplete data methodologies, analysis of correlated survey data, survival analysis, compartmental models for disease transmission dynamics, combining statistical and machine learning methods to deal with high dimensional data, non-linear modelling for growth data, spatial and spatio-temporal models for disease mapping, and univariate and multivariate joint modelling.

Publications (177)
Article
Full-text available
Breast cancer (BC) is the most incident cancer type among women. BC is also ranked as the second leading cause of death among all cancer types. Therefore, early detection and prediction of BC are significant for prognosis and in determining the suitable targeted therapy. Early detection using morphological features poses a significant challenge for...
Article
Full-text available
Background Estimating forest productivity is critical for effective management and site assessment. The dominant height is used to calculate the Site Index (SI), which is commonly used to assess forest productivity. In this study, an algebraic difference approach was used to develop a dominant height model incorporating the rainfall effect for Euca...
Chapter
Meta-analysis methods for univariate effect sizes are well-known and developed. However, multiple outcomes are increasingly being measured and reported in medical research studies, which may lead to multiple effect sizes being estimated. The estimated effect sizes could be correlated because they are measured from the same studies. Additionally, th...
Article
Full-text available
The paper focuses on the development of a strategy to integrate forecasting using artificial neural networks (ANN), simulation and optimisation techniques for ambulance deployment to predefined locations with heterogeneous demand patterns under stochastic environments. The metropolitan city of Bulawayo was used as a case study with high variability...
Article
Full-text available
Background The association structure linking the longitudinal and survival sub-models is of fundamental importance in the joint modeling framework and the choice of this structure should be made based on the clinical background of the study. However, this information may not always be accessible and rationale for selecting this association structur...
Article
Full-text available
Background/M&M A vital aspect of disease management and policy making lies in the understanding of the universal distribution of diseases. Nevertheless, due to differences across host groups and space–time outbreak activities, data are subject to intricacies. Herein, Bayesian spatio-temporal models were proposed to model and map malaria and anaem...
Article
Full-text available
Background Understanding the relationship between tuberculosis and the risk factors of tuberculosis is vital to be able to address them. Even though tuberculosis is curable and preventable, it remains a public threat, especially in low- and middle-income countries. There are more cases of men infected with tuberculosis compared to women. Methods T...
Article
Full-text available
Despite the rapid growth of developing markets, aided by globalization, comparative studies of cryptocurrency and stock market volatility have focused on the developed markets and neglected developing ones. In this regard, this study compares cryptocurrency volatility with that of the Johannesburg Stock Exchange (JSE), a developing market. GARCH-ty...
Article
Full-text available
Background While the benefits of exclusive breastfeeding are widely acknowledged, it continues to be a rare practice. Determinants of exclusive breastfeeding in Tanzania have been studied; however, the existence and contribution of regional variability to the practice have not been explored. Methods Tanzania demographic and health survey data for...
Article
Full-text available
TB is preventable and treatable but remains the leading cause of death in South Africa. The deaths due to TB have declined, but in 2017, around 322 000 new cases were reported in the country. The need to eradicate the disease through research is increasing. This study used population-based National Income Dynamics Survey data (Wave 1 to Wave 5) fro...
Article
Full-text available
Diseases have been studied separately, but when two diseases have inherent dependencies on each other, modelling them separately negates practical reality. The authors’ modelling processes are based on univariate separate regressions, which connect each illness to covariates separately. Therefore, the focus of this article is to estimate the spatial cor...
Article
Full-text available
Malaria and anaemia are common diseases that affect children, particularly in Africa. Studies on the risk associated with these diseases and their synergy are scanty. This work aims to study the spatial pattern of malaria and anaemia in Nigeria and adjust for their risk factors using separate models for malaria and anaemia. This study used Bayesian...
Article
Full-text available
Evidence-based knowledge of the relationship between foods and nutrients is needed to inform dietary-based guidelines and policy. Proper and tailored statistical methods to analyse food composition databases (FCDBs) could assist in this regard. This review aims to collate the existing literature that used any statistical method to analyse FCDBs, to...
Article
Full-text available
Background One of the public health problems all over the world is tuberculosis. An important factor for human well-being is good health. Worldwide, there are more cases of men with tuberculosis than women. Therefore, identifying risk factors associated with tuberculosis among men is essential. This study uses a survey logistic regression model to...
Preprint
Full-text available
Background The aim of this study was to identify the risk factors associated with multidrug-resistant tuberculosis (MDR-TB) disease. The Weibull model has been shown to perform better than Cox proportional hazards models with respect to the accuracy and efficiency of the estimates. Therefore, a Weibull parametric model was employed to identify predictors of death i...
Article
Longitudinal studies of correlated cognitive and disability outcomes among older adults are characterized by missing data due to death or loss to follow-up from deteriorating health conditions. The Mini-Mental State Examination (MMSE) score for assessing cognitive function ranges from a minimum of 0 (floor) to a maximum of 30 (ceiling). To study th...
Article
Full-text available
In December 2019, a new coronavirus pandemic began ravaging the world. By May 2020, the pandemic had caused great loss of life and disrupted ways of life in more ways than one. The nature of the disease saw several strategies to curb its spread rolled out. These strategies included closing of businesses and borders, restriction of m...
Article
Full-text available
Objectives We used machine learning algorithms to track how the ranks of importance and the survival outcome of four socioeconomic determinants (place of residence, mother’s level of education, wealth index and sex of the child) of under-5 mortality rate (U5MR) in sub-Saharan Africa have evolved. Settings This work consists of multiple cross-secti...
Preprint
Full-text available
Background: Joint modeling is an active area of research which has seen an increasing interest in medical research for various dynamic medical scenarios for studying possible relationships between longitudinal biomarkers and survival outcomes. The association structure that links the survival and the longitudinal sub-models is of great importance w...
Article
Full-text available
Background The CD4 cell count signifies the health of an individual’s immune system. The use of data-driven models enables clinicians to accurately interpret potential information, examine the progression of CD4 count, and deal with patient heterogeneity due to patient-specific effects. Quantile-based regression models can be used to illus...
Article
Full-text available
Understanding and identifying the markers and clinical information that are associated with colorectal cancer (CRC) patient survival is needed for early detection and diagnosis. In this work, we aimed to build a simple model using Cox proportional hazards (PH) and random survival forest (RSF) and find a robust signature for predicting CRC overall s...
Article
Full-text available
Understanding independent and joint predictors of adverse pregnancy outcomes is essential to inform interventions toward achieving sustainable development goals. We aimed to determine the joint predictors of preterm birth and perinatal death among singleton births in northern Tanzania based on cohort data from the Kilimanjaro Christian Medical Cent...
Article
A two-stage joint survival model is used to analyze time-to-event outcomes that could be associated with biomarkers that are repeatedly collected over time. A two-stage joint survival model has limited model checking tools and is usually assessed using standard diagnostic tools for survival models. The diagnostic tools can be improved and implemente...
Article
Full-text available
There is a vast amount of geo-referenced data in many fields of study including ecological studies. Geo-referencing is usually by point referencing; that is, latitudes and longitudes or by areal referencing, which includes districts, counties, states, provinces and other administrative units. The availability of large geo-referenced datasets for mo...
Article
Full-text available
Food composition databases (FCDBs) provide the nutritional content of foods and are essential for developing nutrition guidance and effective intervention programs to improve nutrition of a population. In public and nutritional health research studies, FCDBs are used in the estimation of nutrient intake profiles at the population levels. However, s...
Article
Full-text available
Quantile regression offers an invaluable tool to discern effects that would be missed by other conventional regression models, which are solely based on modeling conditional mean. Quantile regression for mixed-effects models has become practical for longitudinal data analysis due to the recent computational advances and the ready availability of ef...
Article
Full-text available
Background Colorectal cancer (CRC) is the third most common cancer among women and men in the USA, and recent studies have shown an increasing incidence in less developed regions, including Sub-Saharan Africa (SSA). We developed a hybrid (DNA mutation and RNA expression) signature and assessed its predictive properties for the mutation status and s...
Article
Full-text available
Cancer tumor classification based on morphological characteristics alone has been shown to have serious limitations. Breast, lung, colorectal, thyroid, and ovarian are the most commonly diagnosed cancers among women. Precise classification of cancers into their types is considered a vital problem for cancer diagnosis and therapy. In this p...
Article
Full-text available
Many surveys are often complex cross-sectional studies that involve clustered data. Such surveys can have the additional complexity of the measurement error problem. Ignoring the measurement error problem and the clustering aspect may lead to incorrect inferences and conclusions. The purpose of this study was to demonstrate the application of regre...
Article
Full-text available
Anemia is a major public health problem in Africa, affecting an increasing number of children under five years. Guinea is one of the most affected countries. In 2018, the prevalence rate in Guinea was 75% for children under five years. This study sought to identify the factors associated with anemia and to map spatial variation of anemia across the...
Article
Full-text available
Background Preterm birth is a significant contributor to under-five and newborn deaths globally. Recent estimates indicated that Tanzania ranks tenth among the countries with the highest preterm birth rates in the world and accounts for 2.2% of all preterm births globally. Previous studies applied binary regression models to determine predictor...
Article
Full-text available
Background The rising burden of the ongoing COVID-19 epidemic in South Africa has motivated the application of modeling strategies to predict the COVID-19 cases and deaths. Reliable and accurate short and long-term forecasts of COVID-19 cases and deaths, both at the national and provincial level, are a key aspect of the strategy to handle the COVID...
Article
Full-text available
The increase in health research in sub-Saharan Africa (SSA) has led to a high demand for biostatisticians to develop study designs, contribute and apply statistical methods in data analyses. Initiatives exist to address the dearth in statistical capacity and lack of local biostatisticians in SSA health projects. The Sub-Saharan African Consortium f...
Article
Difficulty in obtaining the correct measurement for an individual’s long-term exposure is a major challenge in epidemiological studies that investigate the association between exposures and health outcomes. Measurement error in an exposure biases the association between the exposure and a disease outcome. Usually, an internal validation study is req...
Preprint
Full-text available
Background: Anemia is a major public health problem in Africa, with an increasing number of children under 5 years affected. Guinea is one of the most affected countries. In 2018, the prevalence rate was 75% in children under 5 years. This study sought to identify the factors associated with anemia and to map spatial variation of anemia across t...
Article
Full-text available
Background: This study aims to make use of a longitudinal data modelling approach to analyze data on the number of CD4+ cell counts measured repeatedly in HIV-1 Subtype C infected women enrolled in the Acute Infection Study of the Centre for the AIDS Programme of Research in South Africa. Methodology: This study uses data from the CAPRISA 002 Acu...
Article
Full-text available
Objective: We aimed to determine the key predictors of perinatal deaths using machine learning models compared with the logistic regression model. Design: A secondary data analysis using the Kilimanjaro Christian Medical Centre (KCMC) Medical Birth Registry cohort from 2000 to 2015. We assessed the discriminative ability of models using the area u...
Article
Full-text available
It is of great interest for a biomedical analyst or an investigator to correctly model the CD4 cell count or disease biomarkers of a patient in the presence of covariates or factors determining the disease progression over time. The Poisson mixed-effects models (PMM) can be an appropriate choice for repeated count data. However, this model is not r...
Article
Full-text available
The increase in health research in sub-Saharan Africa (SSA) has generated large amounts of data and led to a high demand for biostatisticians to analyse these data locally and quickly. Donor-funded initiatives exist to address the dearth in statistical capacity, but few initiatives have been led by African institutions. The Sub-Saharan African Cons...
Article
Full-text available
The simultaneous spatiotemporal modeling of multiple related diseases strengthens inferences by borrowing information between related diseases. Numerous research contributions to spatiotemporal modeling approaches exhibit their strengths differently with increasing complexity. However, contributions that combine spatiotemporal approaches to modelin...
Article
Background Although tuberculosis (TB) is curable if the treatment is adhered to and completed, it is still a major cause of death globally, including in South Africa. The success rate for TB treatment was 77.2% in 2014, and more than 37 000 lives were lost to the disease in South Africa. Several studies have been carried out on this subject, but th...
Article
Full-text available
Background: HIV infected patients may experience many intermediate events including between-event transition throughout their follow up. Through modelling these transitions, we can gain a deeper understanding of HIV disease process and progression and of factors that influence the disease process and progression pathway. In this work, we present t...
Article
Full-text available
Background: Ordinal health longitudinal response variables have distributions that make them unsuitable for many popular statistical models that assume normality. We present a multilevel growth model that may be more suitable for medical ordinal longitudinal outcomes than are statistical models that assume normality and continuous measurements. M...
Article
Full-text available
Background Lesotho is a country in sub-Saharan Africa where under-five mortality (U5M) is still a major issue due to significant social and demographic risk factors. Hence, the investigation of social and demographic factors associated with U5M is a critical problem that needs due consideratio...
Article
Full-text available
This study examined the applicability of artificial neural network models in modelling univariate time series ambulance demand for short-term forecasting horizons in Zimbabwe. Bulawayo City Council’s ambulance services department was used as a case study. Two models, feed-forward neural network (FFNN) and seasonal autoregressive integrated moving a...
Article
Full-text available
Introduction: Combination antiretroviral therapy has become the standard care of human immunodeficiency virus (HIV) infected patients and has further led to a dramatically decreased progression probability to acquired immune deficiency syndrome (AIDS) for patients under such a therapy. However, responses of the patients to this therapy have recorde...
Article
Full-text available
Background: More than five million perinatal deaths occur each year globally. Despite efforts put forward during the millennium development goals era, perinatal deaths continue to increase relative to under-five deaths, especially in low- and middle-income countries. This study aimed to determine predictors of perinatal death in the presence of mi...
Article
Full-text available
Background: CD4 cell count and viral load are highly correlated surrogate markers of human immunodeficiency virus (HIV) disease progression. In modelling the progression of HIV, previous studies mostly dealt with either CD4 cell counts or viral load alone. In this work, both biomarkers are included in one model in order to study possible factors...
Article
Full-text available
Background: Modelling of longitudinal biomarkers and time-to-event data is important for monitoring disease progression. However, these two variables are traditionally analyzed separately or time-varying Cox models are used. The former strategy fails to recognize the shared random-effects from the two processes while the latter assumes that longitudi...
Article
Full-text available
Background: Patients infected with HIV may experience a succession of clinical stages before the disease diagnosis and their health status may be followed-up by tracking disease biomarkers. In this study, we present a joint multistate model for predicting the clinical progression of HIV infection which takes into account the viral load and CD4 cou...