# Statistical Methods for Assessing Agreement Between Two Methods of Clinical Measurement

## Abstract

In clinical measurement, comparison of a new measurement technique with an established one is often needed to see whether they agree sufficiently for the new to replace the old. Such investigations are often analysed inappropriately, notably by using correlation coefficients. The use of correlation is misleading. An alternative approach, based on graphical techniques and simple calculations, is described, together with the relation between this analysis and the assessment of repeatability.

... To assess interobserver agreement for the CBF index, we used Pearson's correlation of individual index measurements for an average of two measurements by each investigator. In addition, interobserver agreement for the CBF index was determined as the limit of agreement determined with the Bland-Altman analysis [26]. To evaluate the difference in quantitative CBF values between the ischemic side and the healthy contralateral side, Wilcoxon matched pair signed rank tests were used to compare the CBF values using ¹²³I-IMP SPECT and each CTP method between the ischemic side and the healthy contralateral side. ...
... To determine CBF index differences between two CTP techniques, mean differences in CBF index between each CTP method and SPECT imaging were compared by means of Wilcoxon matched pair signed rank tests. Finally, the limits of agreement between any two measures were assessed by Bland-Altman analysis [26]. ...
... We are aware of no other study that has compared the ability of BEANR directly with that of rSVD relative to an ¹²³I-IMP SPECT reference. Moreover, our two investigators produced significant and excellent interobserver agreement for each method with sufficiently small limits of agreement for clinical purposes, contrasting with previous research in which the reproducibility of measurements was assessed by Bland-Altman analysis to determine whether the mean difference and limits of agreement were small enough for clinical purposes [26][27][28][29][30]. Therefore, we consider our results reproducible both in academic studies and in routine clinical practice. ...
Purpose Bayesian estimation with advanced noise reduction (BEANR) in CT perfusion (CTP) could deliver more reliable cerebral blood flow (CBF) measurements than the commonly used reformulated singular value decomposition (rSVD). We compared the efficacy of CBF measurement by CTP using BEANR and rSVD, evaluating both relative to N-isopropyl-p-[¹²³I]-iodoamphetamine (¹²³I-IMP) single-photon emission computed tomography (SPECT) as a reference standard, in patients with cerebrovascular disease. Methods Thirty-one patients with suspected cerebrovascular disease underwent both CTP on a 320 detector-row CT system and SPECT. We applied rSVD and BEANR in the ischemic and contralateral regions to create CBF maps and calculate CBF ratios from the ischemic side to the healthy contralateral side (CBF index). The analysis involved comparing the CBF index between CTP methods and SPECT using Pearson’s correlation and limits of agreement determined with Bland–Altman analyses, before comparing the mean difference in the CBF index between each CTP method and SPECT using the Wilcoxon matched pairs signed-rank test. Results The CBF indices of BEANR and ¹²³I-IMP SPECT were significantly and positively correlated (r = 0.55, p < 0.0001), but there was no significant correlation between the rSVD method and SPECT (r = 0.15, p > 0.05). BEANR produced smaller limits of agreement for CBF than rSVD. The mean difference in the CBF index between BEANR and SPECT differed significantly from that between rSVD and SPECT (p < 0.001). Conclusions BEANR has a better potential utility for CBF measurement in CTP than rSVD compared to SPECT in patients with cerebrovascular disease.
... The level of agreement between the measure at time 1 and at time 2 on the DHI and its subscales was illustrated using Bland-Altman plots in Figure 7.1 (Bland and Altman, 1986). ...
... A Bland-Altman plot has been proposed as a graphical technique to illustrate the repeatability of a measure (Bland and Altman, 1986). The plot, shown in Figure 6.1 illustrates the agreement between two measurements. ...
The aims were to characterize dizziness in terms of its severity and nature, to describe the limitations experienced by quantifying and establishing dimensions of quality of life, to model the processes and factors involved in the quality of life of dizzy individuals and to develop and assess a dizziness-specific quality of life questionnaire. Questionnaire surveys were carried out in clinic (N=405) and general population (N=55) samples of dizzy individuals. In addition two comparison groups were surveyed: clinic population of facial pain patients and individuals without dizziness in the general population. Characteristics of dizziness in clinic and general population samples are described and compared. Psychometric properties were established for two applied questionnaires: the commonly used Dizziness Handicap Inventory (DHI) and the newly applied quality of life questionnaire, the Functional Limitations Profile (FLP). Both were found to be reliable measures for groups of dizzy individuals. Although there is some support for validity of the DHI, a revised subscale structure is proposed reflecting the intrinsic properties of the items more accurately than the original. The FLP appears to be a valid measure of the quality of life of dizzy individuals. Quality of life was quantified in the four survey groups and comparisons made. A significant reduction in quality of life was found for dizzy individuals, the greatest reduction being for the psychosocial aspects. The limitations reported by dizzy individuals are shown to be specific and different from comparison groups. Factor analysis of FLP responses suggests a three-dimensional model of quality of life consisting of psychological, physical and social well-being, supplemented by a contingent factor representing other health problems. This model underpins the newly developed questionnaire, the Dizziness Impact Profile (DIP), constructed from analysis of item responses on the original FLP.
The DIP appears to be valid and reliable based on analysis of the item scores from the FLP, but requires further validation. Increased understanding of dizziness and the limitations in lifestyle experienced by its sufferers and the development of the Dizziness Impact Profile to quantify these in a convenient way is important to meet the needs of dizzy individuals in terms of service provision and planning.
... of continuous variables comprised boxplots. Interrater agreement was assessed with Bland-Altman Limits of Agreement (BA LoA) [30,31], adjusted for longitudinal correlated data as the mice were measured up to four times [32,33]. The estimated bias (i.e. the mean difference), as well as the BA LoA, were supplemented by respective 95% confidence intervals (95% CI). ...
Lymphedema affects 20% of women diagnosed with breast cancer. It is a pathology with no known cure. Animal models are essential to explore possible treatments to understand and potentially cure lymphedema. The rodent hindlimb lymphedema model is one of the most widely used. Different modalities have been used to measure lymphedema in the hindlimb of mice, and these are generally poorly assessed in terms of the interrater agreement; thus, there could be a risk of measuring bias and poor reproducibility. We examined the interrater agreement of µCT-scans, electronic caliper thickness of the paw and plethysmometer in the measurement of lymphedema in the hindlimb of mice. Three independent raters assessed 24 C57BL6 mice using these three modalities four times (week 1, 2, 4 and 8) with a total of 96 samples. The mean interrater differences were then calculated. The interrater agreement was highest in the µCT-scans, with an extremely low risk of measurement bias. The interrater agreement in the plethysmometer and electronic caliper was comparable with a low to moderate risk of measurement bias. The µCT-scanner should be used whenever possible. The electronic caliper should only be used if there is no µCT-scanner available. The plethysmometer should not be used in rodents of this size.
... Standard error of measurement (SEM) was derived from regression line interpretation, with coefficients of variation (CV) calculated to assess absolute agreement. Limits of agreement from Cr DBS/Cr S and Crt DBS/Crt S means were plotted using Bland-Altman [10][11]. ...
... For intra-observer reliability, we also plotted the differences between the values of the two measurements as a function of the means, for each observer. We estimated the systematic bias (which is the average of the differences among patient's data) and the proportional bias (which is the slope of the regression line of a Bland-Altman plot) [29]. ...
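The two quantities named in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, assuming an ordinary least-squares fit of the differences on the pair means; the function name `proportional_bias` and the sample values are hypothetical and are not taken from the cited study.

```python
import statistics


def proportional_bias(first, second):
    """Return (systematic_bias, slope, intercept) for paired measurements.

    systematic_bias: mean of the differences (first - second).
    slope: least-squares slope of the differences regressed on the pair
    means, i.e. the regression line drawn through a Bland-Altman plot.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    means = [(a + b) / 2 for a, b in zip(first, second)]
    diffs = [a - b for a, b in zip(first, second)]
    mean_x = statistics.mean(means)
    mean_d = statistics.mean(diffs)
    sxx = sum((x - mean_x) ** 2 for x in means)
    if sxx == 0:
        raise ValueError("all pair means are identical; slope is undefined")
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(means, diffs))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_x
    return mean_d, slope, intercept


# Hypothetical repeated measurements by one observer (illustrative only).
trial_1 = [5.0, 8.0, 11.0, 14.0]
trial_2 = [4.0, 6.0, 8.0, 10.0]
bias, slope, intercept = proportional_bias(trial_1, trial_2)
print(f"systematic bias = {bias:.2f}, proportional bias (slope) = {slope:.2f}")
```

A slope near zero suggests the disagreement does not depend on the size of the measurement; whether a non-zero slope matters is a judgment for the analyst.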
Purpose An inclinometer smartphone application has been developed to enable the measurement of the angle of trunk inclination (ATI) to detect trunk surface asymmetry. The objective was to determine the reliability and validity of the smartphone app in the hands of non-professionals. Methods Three non-professional observers and one expert surgeon measured maximum ATI twice in a study involving 69 patients seen in the spine clinics to rule out scoliosis or for regular follow-up (10-18 y.o., Cobb [0°-58°]). Observers were parents not familiar with scoliosis screening nor use of an inclinometer. They received training from a 4-minute video. Intra and inter-observer reliability was determined using the generalizability theory and validity was assessed from intraclass correlation coefficients (ICC), agreement with the expert on ATI measurements using Bland-Altman analysis, and correct identification of the threshold for consultation (set to ≥6° ATI). Results Intra-observer and inter-observer reliability coefficients were excellent ϕ = 0.92. The standard error of measurement was 1.5° (intra-observer, 2 measurements) meaning that a parent may detect a change of 4° between examinations 95% of the time. Comparison of measurements between non-professionals and the expert resulted in ICC varying from 0.82 [0.71-0.88] to 0.84 [0.74-0.90] and agreement on the decision to consult occurred in 83 to 90% of cases. Conclusion The use of a smartphone app resulted in excellent reliability, sufficiently low standard error of measurement (SEM) and good validity in the hands of non-professionals. The device and the instructional video are adequate means to allow detection and regular examination of trunk asymmetries by non-professionals.
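The step from a standard error of measurement of 1.5° to a detectable change of about 4° is consistent with the conventional minimal detectable change formula at the 95% level. The abstract does not state which formula was used, so the following is an assumption offered only as a worked check:

$$
\mathrm{MDC}_{95} = 1.96 \times \sqrt{2} \times \mathrm{SEM} = 1.96 \times 1.414 \times 1.5^{\circ} \approx 4.2^{\circ}
$$

The factor $\sqrt{2}$ reflects that a change is the difference between two measurements, each carrying its own measurement error.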
... There are three aspects to consider when assessing agreement between predicted and measured REE. First, Bland-Altman analysis is the reference method for that purpose [48]; however, such analysis has been inconsistently applied. Instead, some studies used alternative but insufficient approaches based on the mean difference between predicted and measured REE (bias predicted REE was compared with the reported mean measured REE by Bland-Altman analysis. ...
Humans acquire energy from the environment for survival. A central question for nutritional sciences is how much energy is required to sustain cellular work while maintaining an adequate body mass. Because human energy balance is not exempt from thermodynamic principles, the energy requirement can be approached from the energy expenditure. Conceptual and technological advances have allowed understanding of the physiological determinants of energy expenditure. Body mass, sex, and age are the main factors determining energy expenditure. These factors constitute the basis for predictive equations for resting (REE) and total (TEE) energy expenditure in healthy adults. These equations yield predictions that differ up to ~400 kcal/d for REE and ~550 kcal/d for TEE. Identifying additional factors accounting for such variability and the most valid equations appears relevant. This review used novel approaches based on mathematical modeling of REE and analyses of the data from which REE predictive equations were generated. As for TEE, R2 and SE were considered because only a few predictive equations are available. From these analyses, Oxford's and Plucker's equations appear valid for predicting REE and TEE in adults, respectively.
... In addition, the standard error of measurement (SEM) was calculated by multiplying the standard deviation by the square root of 1 minus the ICC [25], allowing analysis between the different trials. Finally, the Bland-Altman plot was designed, analyzing the different trials of each test [25,29]. All statistical analyses were carried out using SPSS Version 27.0 (SPSS Inc., Chicago, IL, USA) and p-values of < 0.05 were considered statistically significant. ...
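The SEM calculation described in this excerpt (standard deviation multiplied by the square root of 1 minus the ICC) can be written as a short Python sketch. The function name and the example inputs are hypothetical; the cited study used SPSS, not this code.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), as described in the excerpt above."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if icc > 1:
        raise ValueError("ICC cannot exceed 1")
    return sd * math.sqrt(1.0 - icc)


# Illustrative values only: SD of 10 units and an ICC of 0.91.
sem = standard_error_of_measurement(sd=10.0, icc=0.91)
print(f"SEM = {sem:.2f}")
```

A higher ICC shrinks the SEM for the same spread of scores, which is why the two statistics are usually reported together.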
This study aimed to analyze the reliability of the tests included in the motor competence assessment (MCA) battery and compare the effects of the number of trials per test. Thirty female volleyball players (14.6 ± 1.3 years of age) were tested. The participants performed two or three trials of each test. Intra-class correlation (ICC) was calculated, and a paired sample t-test analyzed the variations between trials (1st vs. 2nd vs. 3rd). Results revealed a significant difference between the first and the second trials for jumping sideways [t(29) = -4.108, p < 0.01], standing long jump [t(29) = -3.643, p < 0.01], and shuttle run [t(29) = -3.139, p < 0.01]. No significant result was registered in the shifting platforms, ball throwing and kicking between the first and second trials. No difference was recorded between the second and third trials. High ICC values were registered in lateral jumps, among the three repetitions of ball kicking and ball throwing, and between the last two repetitions of shuttle run. Almost perfect values were recorded for the shifting platforms and standing long jump. Nevertheless, there seems to be a learning effect between the first and the second repetition; no differences were registered only considering the two manipulative tests. In conclusion, except for jumping sideways, the MCA tests are reliable and only need to be performed two times instead of three.
... Due to the interval nature of the audiogram measurements, non-parametric representations of the Bland-Altman plots were presented to investigate the agreement between both methods (23,24). Distribution-free tests and confidence intervals (CI) were computed for the median to examine the bias, and also for the 2.5th and 97.5th percentiles to discover outliers (25). ...
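A non-parametric variant of the Bland-Altman summary, as outlined in this excerpt, replaces the mean with the median and the ±1.96 SD limits with the 2.5th and 97.5th percentiles of the differences. The sketch below assumes linear interpolation between order statistics, which is one of several percentile conventions; the cited study may have used another, and its confidence intervals are not reproduced here. All names and data are hypothetical.

```python
import math
import statistics


def percentile(sorted_values, p):
    """Percentile by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * p / 100
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def nonparametric_limits(method_a, method_b):
    """Return (median difference, 2.5th percentile, 97.5th percentile)."""
    if len(method_a) != len(method_b) or not method_a:
        raise ValueError("need two equally long, non-empty series")
    diffs = sorted(a - b for a, b in zip(method_a, method_b))
    return statistics.median(diffs), percentile(diffs, 2.5), percentile(diffs, 97.5)


# Made-up thresholds in dB: the app readings scatter around a fixed reference.
app = [30 + d for d in range(-20, 21)]
audiometer = [30] * 41
median_bias, low, high = nonparametric_limits(app, audiometer)
print(f"median bias = {median_bias}, limits = [{low}, {high}]")
```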
Studies involving soundscape perception often exclude participants with hearing loss to prevent impaired perception from affecting experimental results. Participants are typically screened with pure tone audiometry, the "gold standard" for identifying and quantifying hearing loss at specific frequencies, and excluded if a study-dependent threshold is not met. However, procuring professional audiometric equipment for soundscape studies may be cost-ineffective, and manually performing audiometric tests is labour-intensive. Moreover, testing requirements for soundscape studies may not require sensitivities and specificities as high as that in a medical diagnosis setting. Hence, in this study, we investigate the effectiveness of the uHear app, an iOS application, as an affordable and automatic alternative to a conventional audiometer in screening participants for hearing loss for the purpose of soundscape studies or listening tests in general. Based on audiometric comparisons with the audiometer of 163 participants, the uHear app was found to have high precision (98.04%) when using the World Health Organization (WHO) grading scheme for assessing normal hearing. Precision is further improved (98.69%) when all frequencies assessed with the uHear app are considered in the grading, which lends further support to this cost-effective, automated alternative to screen for normal hearing.
... Furthermore, Bland-Altman plots, the calculation of systematic bias (with standard deviation [SD]), and 95% limits of agreement (limits of agreement = bias ± 1.96 × SD) were used to interpret the direction and variability of the compared-with-MRI and estimation errors [22]. Heteroscedasticity was examined by linear regression and confirmed when R² ≥ .10 [23]. The standard error of measurement (SEM) was calculated from the square root of the mean square error term in a repeated-measures analysis of variance [23] and expressed in absolute (in square centimeters) and relative terms (in percentage) as a coefficient of variation (CV = [100 × SEM]/mean). Because the size of the QUAD ACSA varies along the thigh length [24], the CV allowed us to compare the errors generated at the different regions of the muscle group. ...
... Systematic bias was expressed as the mean difference between two methods by Bland-Altman plots. The 95% limits of agreement (LOA) were defined as bias ± 1.96 SD, and SD was the standard deviation of the difference [13]. p < 0.05 indicated statistical significance. ...
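The definitions in this excerpt (bias as the mean difference, 95% limits of agreement as bias ± 1.96 SD of the differences) translate directly into code. The following is a minimal sketch using only the Python standard library; the function name `bland_altman` and the grip-strength values are hypothetical and are not data from the cited study.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of the differences (a - b); the 95% limits of
    agreement are bias +/- 1.96 * SD of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired readings in kg from a reference and a test device.
reference = [25.0, 30.5, 22.0, 28.0, 35.5]
test_device = [24.0, 30.0, 22.5, 27.0, 34.0]
bias, lower, upper = bland_altman(reference, test_device)
print(f"bias = {bias:.2f} kg, 95% LoA = [{lower:.2f}, {upper:.2f}] kg")
```

Whether limits of a given width are acceptable is a clinical decision made in advance, not something the calculation itself can settle.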
Background: The Jamar hydraulic dynamometer is a widely recognized tool for measuring grip strength. Nevertheless, the devices used most often in Asian countries are spring-type dynamometers, represented by the CAMRY dynamometer or Smedley dynamometer. We aimed to evaluate the reliability and validity of the CAMRY dynamometer compared with the Jamar dynamometer. Methods: This was a cross-sectional study using a random crossover design in the grip strength test with two dynamometers. A total of 1064 healthy community-dwelling older adults aged 50-90 years, which included 686 minorities and 378 Han Chinese, were recruited into the study from July to September 2021. We assessed the reliability and validity of the CAMRY EH101 dynamometer, and the Jamar dynamometer was regarded as the reference device. The order of testing with two dynamometers was randomized in a 1:1 ratio, with a 10-min gap between the two devices. Intraclass correlation coefficients (ICCs) and Bland-Altman analysis were calculated to assess reliability and validity between the two devices. Results: The average handgrip strength (HGS) values at six times by the Jamar and CAMRY devices were 25.0 ± 7.9 kg and 24.6 ± 7.5 kg, respectively. The ICC values between the two devices were 0.815-0.854, and the systematic bias underestimated by the CAMRY dynamometer was 0.5 kg in men and 0.6 kg in women. We fitted a linear regression equation by sex, and the relationship was found as follows: male Jamar HGS (kg) = 8.001 + 0.765 × CAMRY HGS (kg); female Jamar HGS (kg) = 3.681 + 0.840 × CAMRY HGS (kg). Conclusions: The CAMRY EH101 dynamometer provides excellent reliability and validity. This device can serve as a reliable, inexpensive, and practical device to assess grip strength in geriatric clinical practice. Clinical trial registration: Chinese Clinical Trial Registry: ChiCTR2100046367; Date of clinical trial registration: 15/05/2021.
... The limit of agreement comprises the differences between two measurements lying in the reference interval defined as mean difference ± 1.96 × SD. If the two methods are comparable, the differences should be small and close to 0 [24]. Precision and accuracy of both methods were evaluated by Lin's concordance correlation coefficient (CCC) [25]. ...
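Lin's concordance correlation coefficient, mentioned at the end of this excerpt, penalises both scatter and systematic shift between two methods. The sketch below uses the standard textbook form, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/n) moments; the function name and the numbers are hypothetical, and the cited study's software may differ in detail.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("both series are constant and equal; CCC is undefined")
    return 2 * cov_xy / denominator


# Illustrative data: the second method reads exactly 1 unit higher throughout.
method_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
method_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
print(f"CCC = {concordance_correlation(method_1, method_2):.2f}")
```

In this example the Pearson correlation would be exactly 1, yet the CCC falls below 1 because of the constant offset, which is the distinction the coefficient is designed to capture.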
Background After initiating cardioprotective agents, a fall of estimated glomerular filtration rate (eGFR) has been reported in several studies. Our goal was to evaluate the accuracy of change of Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) eGFR in patients with type 2 diabetes (T2D) after short-term pharmacological intervention with angiotensin-converting enzyme inhibitor, angiotensin-receptor blocker, gliptin or sodium-glucose cotransporter-2 inhibitor. Methods We analyzed 190 patients with T2D in the early stage of the disease, having no overt renal impairment by CKD-EPI equation. In each patient, we measured GFR (mGFR) by applying the constant infusion input clearance technique with sinistrin (Inutest; Fresenius, Linz, Austria) at baseline and after short-term (4–12 weeks) pharmacological intervention with cardioprotective agents (ramipril, telmisartan, linagliptin, metformin, empagliflozin) that potentially lead to an alteration of renal function. Simultaneously, a standardized analysis of serum creatinine was performed and eGFR was estimated by the CKD-EPI equation. Results Average mGFR was 111 ± 20 ml/min/1.73 m², whereas eGFR was lower with 93 ± 13 ml/min/1.73 m². The ratio eGFR/mGFR in relation to mGFR was almost curvilinear, showing an underestimation of renal function by eGFR in the upper normal range. At baseline only 80 patients (42%) lay within ± 10% of mGFR and the concordance correlation coefficient (CCC) was extremely low (−0.07). After short-term pharmacological intervention changes in eGFR and mGFR correlated with each other (r = 0.286, p < 0.001). For example, for a given mGFR of 111 ml/min/1.73 m², a change of mGFR by ± 10% corresponded to ± 11 ml/min/1.73 m², but the confidence interval of eGFR was 25 ml/min/1.73 m². The CCC was low (0.22).
Conclusion The agreement between eGFR by CKD-EPI and mGFR is modest and the change of renal function after short-term pharmacological intervention is not accurately and precisely reflected by the change of eGFR in patients with T2D in the early stage of their disease.
... Pearson correlation coefficient was calculated to determine the association between the samples of the two collection methods. The Bland-Altman plot (Bland & Altman, 1986) was used to detect relative and proportional biases and measurement agreement. A p-value < .05 was considered statistically significant. ...
Saliva collection and handling procedures for salivary C‐reactive protein (CRP) can be challenging due to a lack of standardized protocols. This study compared two collection methods used to quantify salivary CRP. Twenty‐two Chinese adults provided two unstimulated whole saliva samples using passive drool and cotton‐based collection devices in two consecutive mornings at baseline and 1 month later. The effects of various factors on CRP levels were analyzed using linear mixed models. Salivary CRP levels were significantly affected by collection time and method, but not day or wave. The CRP peaked upon awakening and declined 45 min later. CRP levels were significantly higher in the passive drool than in the cotton‐based method. The Bland–Altman plot revealed relative and proportional biases. The difference in the CRP levels between the methods decreased as the CRP levels increased. Results suggest that passive drool and cotton‐based collection methods should not be used interchangeably for measuring low levels of salivary CRP.
... Bland-Altman analysis of the Doppler and BSI techniques, compared to ground-truth pressure drop are presented in Fig. 8 [15]. Figure 8A shows a statistically significant bias of pressure drop estimations made using Doppler, when compared to ground-truth pressure drop of 3.91 mmHg (p < 0.05). ...
Background: Transvalvular pressure drops are assessed using Doppler echocardiography for the diagnosis of heart valve disease. However, this method is highly user-dependent and may overestimate transvalvular pressure drops by up to 54%. This work aimed to assess transvalvular pressure drops using velocity fields derived from blood speckle imaging (BSI), as a potential alternative to Doppler. Methods: A silicone 3D-printed aortic valve model, segmented from a healthy CT scan, was placed within a silicone tube. A CardioFlow 5000MR flow pump was used to circulate blood mimicking fluid to create eight different stenotic conditions. Eight PendoTech pressure sensors were embedded along the tube wall to record ground-truth pressures (10 kHz). The simplified Bernoulli equation with measured probe angle correction was used to estimate pressure drop from maximum velocity values acquired across the valve using Doppler and BSI with a GE Vivid E95 ultrasound machine and 6S-D cardiac phased array transducer. Results: There were no significant differences between pressure drops estimated by Doppler, BSI and ground-truth at the lowest stenotic condition (10.4 ± 1.76, 10.3 ± 1.63 vs. 10.5 ± 1.00 mmHg, respectively; p > 0.05). Significant differences were observed between the pressure drops estimated by the three methods at the greatest stenotic condition (26.4 ± 1.52, 14.5 ± 2.14 vs. 20.9 ± 1.92 mmHg for Doppler, BSI and ground-truth, respectively; p < 0.05). Across all conditions, Doppler overestimated pressure drop (Bias = 3.92 mmHg), while BSI underestimated pressure drop (Bias = -3.31 mmHg). Conclusions: BSI accurately estimated pressure drops only up to 10.5 mmHg in controlled phantom conditions of low stenotic burden. Doppler overestimated pressure drops of 20.9 mmHg.
Although BSI offers a number of theoretical advantages to conventional Doppler echocardiography, further refinements and clinical studies are required with BSI before it can be used to improve transvalvular pressure drop estimation in the clinical evaluation of aortic stenosis.
... The measurement results per method were additionally compared with a two-sample Student's t-test for paired samples. The Bland-Altman plot was used to visualize the agreement of the results of the two methods [20]. The repeatability of each instrument was evaluated by determining the coefficient of variance (CV (Fig. 2), which corresponds to a nearly complete concordance of the measurement results of ANDROVISION eFlow and the NUCLEOCOUNTER as a reference method [21]. ...
Exact analysis of sperm concentration in raw and diluted semen is of major importance in swine artificial insemination, as sperm concentration is one of the most important characteristics of an ejaculate determining the value of the ejaculate and the productive life of the boar. The study compares different methods for sperm concentration analysis in raw and diluted boar semen: NUCLEOCOUNTER SP-100, the ANDROVISION with Leja chambers and the new ANDROVISION eFlow system. The Concordance Correlation Coefficient (CCC) between NUCLEOCOUNTER and ANDROVISION eFlow was 0.955 for raw (n = 185 ejaculates) and 0.94 for diluted semen (n = 109 ejaculates). The CCC between NUCLEOCOUNTER and ANDROVISION with Leja chambers was 0.66. A Bland–Altman plot of split-sample measurements of sperm concentration with NUCLEOCOUNTER and ANDROVISION eFlow showed that 95.1% of all measurements lay within ± 1.96 standard deviation. The coefficients of variance were 1.6 ± 1.3%, 3.6 ± 3.6% and 7.3 ± 6.3% for NUCLEOCOUNTER, ANDROVISION eFlow and ANDROVISION with Leja chambers in diluted semen, respectively. NUCLEOCOUNTER and ANDROVISION eFlow are comparable tools to measure the concentration of raw and diluted boar semen. In comparison to ANDROVISION with Leja chambers, concentration analyses of diluted semen using NUCLEOCOUNTER or ANDROVISION eFlow show a higher repeatability within and a higher concordance between the methods.
... 3D Dice similarity coefficient was used to quantify the performance of the automatic ocular segmentation results compared to manual segmentations. Bland-Altman [27] bias (Eq. 3) and agreement (CI95, Eq. 4) and mean difference (L1, Eq. 5) metrics were used to quantify differences between the automatic and manual ocular measurements: ...
Objectives: To differentiate hypo-/hypertelorism (abnormal) from normal fetuses using automatic biometric measurements and machine learning (ML) classification based on MRI. Methods: MRI data of normal (n = 244) and abnormal (n = 52) fetuses of 22-40 weeks' gestational age (GA), scanned between March 2008 and June 2020 on 1.5/3T systems with various T2-weighted sequences and image resolutions, were included. A fully automatic method including deep learning and geometric algorithms was developed to measure the binocular (BOD), inter-ocular (IOD), ocular (OD) diameters, and ocular volume (OV). Two new parameters, BOD-ratio and IOD-ratio, were defined as the ratio between BOD/IOD relative to the sum of both globes' OD, respectively. Eight ML classifiers were evaluated to detect abnormalities using measured and computed parameters. Results: The automatic method yielded a mean difference of BOD = 0.70 mm, IOD = 0.81 mm, OD = 1.00 mm, and a 3D-Dice score of OV = 93.7%. In normal fetuses, all four measurements increased with GA. Constant values were detected for BOD-ratio = 1.56 ± 0.05 and IOD-ratio = 0.60 ± 0.05 across all GA and when calculated from previously published reference data of both MRI and ultrasound. A random forest classifier yielded the best results on an independent test set (n = 58): AUC-ROC = 0.941 and F1-Score = 0.711 in comparison to AUC-ROC = 0.650 and F1-Score = 0.385 achieved based on the accepted criteria that define hypo/hypertelorism based on IOD (< 5th or > 95th percentiles). Using the explainable ML method, the two computed ratios were found as the most contributing parameters. Conclusions: The developed fully automatic method demonstrates high performance on varied clinical imaging data. The new BOD and IOD ratios and ML multi-parametric classifier are suggested to improve the differentiation of hypo-/hypertelorism from normal fetuses. 
Key points: • A fully automatic method for computing fetal ocular biometry from MRI is proposed, achieving high performance, comparable to that of an expert fetal neuro-radiologist. • Two new parameters, IOD-ratio and BOD-ratio, are proposed for routine clinical use in ultrasound and MRI. These two ratios are constant across gestational age in normal fetuses, consistent across studies, and differentiate between fetuses with and without hypo/hypertelorism. • Multi-parametric machine learning classification based on automatic measurements and the two new ratios improves the identification of fetal ocular anomalies beyond the accepted criteria (<5th or >95th IOD percentiles).
... We evaluated bias between mean differences (agreement) for each creatinine or cystatin C-based eGFR equation, or both, and the corresponding mGFR value using Bland-Altman plots [28]. Using mGFR as the reference, we compared performance of each equation and compared the proportion of participants correctly classified by mGFR stage. The parameters for performance included bias, measured as median (eGFR-mGFR) and expressed as mL/min per 1·73 m²; relative bias, measured as median (eGFR/mGFR) and reported as a percentage; precision, measured as log Root Mean Square Error (RMSE) and reported as standard deviation of log (eGFR-mGFR); and accuracy, measured as proportion of eGFR results within 30% of mGFR (P30) and reported as a percentage. ...
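Two of the performance parameters defined in this excerpt, bias as the median of (eGFR − mGFR) and accuracy as P30, are simple enough to sketch in Python. The function name and the values below are hypothetical, and treating a result exactly at the 30% boundary as "within" is an assumption the excerpt does not settle.

```python
import statistics


def egfr_performance(egfr, mgfr):
    """Return (median bias, P30 in percent) with mGFR as the reference."""
    if len(egfr) != len(mgfr) or not egfr:
        raise ValueError("need two equally long, non-empty series")
    # Bias: median of the paired differences eGFR - mGFR.
    bias = statistics.median(e - m for e, m in zip(egfr, mgfr))
    # P30: share of estimates within 30% of the measured value
    # (boundary counted as within; an assumption).
    within_30 = sum(1 for e, m in zip(egfr, mgfr) if abs(e - m) <= 0.30 * m)
    p30 = 100.0 * within_30 / len(mgfr)
    return bias, p30


# Made-up values in mL/min per 1.73 m2, for illustration only.
estimated = [100.0, 90.0, 60.0, 120.0]
measured = [90.0, 80.0, 40.0, 110.0]
bias, p30 = egfr_performance(estimated, measured)
print(f"median bias = {bias:.1f}, P30 = {p30:.0f}%")
```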
Background The burden of kidney disease in many African countries is unknown. Equations used to estimate kidney function from serum creatinine have limited regional validation. We sought to determine the most accurate way to measure kidney function and thus estimate the prevalence of impaired kidney function in African populations. Methods We measured serum creatinine, cystatin C, and glomerular filtration rate (GFR) using the slope-intercept method for iohexol plasma clearance (mGFR) in population cohorts from Malawi, Uganda, and South Africa. We compared performance of creatinine and cystatin C-based estimating equations to mGFR, modelled and validated a new creatinine-based equation, and developed a multiple imputation model trained on the mGFR sample using age, sex, and creatinine as the variables to predict the population prevalence of impaired kidney function in west, east, and southern Africa. Findings Of 3025 people who underwent measured GFR testing (Malawi n=1020, South Africa n=986, and Uganda n=1019), we analysed data for 2578 participants who had complete data and adequate quality measurements. Among 2578 included participants, creatinine-based equations overestimated kidney function compared with mGFR, worsened by use of ethnicity coefficients. The greatest bias occurred at low kidney function, such that the proportion with GFR of less than 60 mL/min per 1·73 m² either directly measured or estimated by cystatin C was more than double that estimated from creatinine. A new creatinine-based equation did not outperform existing equations, and no equation, including the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) 2021 race-neutral equation, estimated GFR within plus or minus 30% of mGFR for 75% or more of the participants. Using a model to impute kidney function based on mGFR, the estimated prevalence of impaired kidney function was more than two-times higher than creatinine-based estimates in populations across six countries in Africa. 
Interpretation Estimating GFR using serum creatinine substantially underestimates the individual and population-level burden of impaired kidney function in Africa with implications for understanding disease progression and complications, clinical care, and service provision. Scalable and affordable ways to accurately identify impaired kidney function in Africa are urgently needed. Funding The GSK Africa Non-Communicable Disease Open Lab. Translations For the Luganda, Chichewa and Xitsonga translations of the abstract see Supplementary Materials section.
... Spearman's rank correlation coefficient was calculated to examine the correlation between symptom scores and cell counts. Using the method of Bland [Bland et al. 1986], a coefficient of repeatability was calculated for the recounted slides (twice the standard deviation of the log transformed differences in repeated counts). ...
Evidence is presented of conjunctival mast cell accumulation in SAC in the absence of increased numbers of eosinophils or neutrophils. By examining thin sections of conjunctiva using immunohistochemistry and in-situ hybridisation, immunoreactivity for a range of cytokines known to play key roles in the pathophysiology of allergic diseases is demonstrated. Furthermore, these cytokines are well recognized to selectively upregulate cell adhesion molecules critical to the recruitment of leucocytes to areas of allergic inflammation. These findings support the hypothesis that mast cell degranulation can drive the conjunctival late phase response (LPR) in the absence of the activation of other inflammatory cells and thus provide a link between the type I hypersensitivity reaction to pollen and the clinical disease of SAC. This hypothesis was tested using conjunctival allergen challenge to generate a LPR in human subjects followed by the recording of symptom scores and study of the tissue cell and cytokine changes. These data provide evidence that the conjunctival mast cell is well positioned to orchestrate the early immune response to allergen prior to leukocyte recruitment and activation. Mast cells, however, are heterogeneous and in man, are typically described according to their content of serine proteases. To determine whether this distinction extended to a functional heterogeneity based on cytokine content, the distribution of the TH2-like cytokines, IL-4, IL-13, IL-5 and IL-6 between mast cell subtypes was investigated. The tissue regulation of mast cell growth, development, function and survival is poorly understood, but stem cell factor (SCF) is known to play a major role. This thesis reports that human mast cells are themselves a source of this cytokine. This key finding provides novel evidence that a mechanism exists to regulate the biology of this important and ubiquitous cell in an autocrine manner.
... Mean of wave two measure and repeatability measure of fruit and/or vegetable consumption against difference between wave two measure and repeatability measure of fruit and/or vegetable consumption using a robust sample wave two and repeatability measures. Figures are produced using the Bland-Altman method (Bland and Altman 1986) ...
This is the first study to examine changes in food intake through increased physical access in an area of low fruit and vegetable consumption. The changes in fruit and vegetables were explored in a framework of consumption comprised of changes in physical access, availability, affordability, attitudes and other factors impinging on the buying and consuming of fruits and vegetables. The food habits, shopping patterns and socio-demographic characteristics were collected from 1009 respondents before the opening of the new superstore and 615 of the same respondents after the opening of the new superstore using a self-completed prospective seven-day food checklist, as well as interviewer-administered and self completed questionnaires. Overall, fruit and vegetable consumption increased by 0.04 portions per day to 2.92 portions per day (p=0.555) following the opening of the new superstore. However, those respondents with lower intakes of fruits and vegetables before the opening of the superstore had significant increases in consumption level irrespective of changes in physical access (p<0.001). Two hundred and eighteen respondents used the new store as their main source of fruit and vegetable shopping, and increased the consumption levels by 0.15 portions per day to 2.75 portions per day (p=0.229). Analysis showed that distance to the new store was a major factor in its use - people using the store lived significantly closer to it than those who did not (p=0.005). Positive changes in the factors in the framework of consumption for those using the new superstore did not affect the level of fruit and vegetable consumption. From the results it may be concluded that physical access to fruits and vegetables through the opening of a locally accessible superstore is not a rate-limiting step in their increased consumption for this population.
Improvement of physical access to fruit and vegetables on its own may not be an effective strategy to improve fruit and vegetable consumption and thus health status.
... The average within-person difference and the standard deviation of that difference are reported. Agreement was further investigated by examining Bland-Altman plots [19] of school differences versus home measurement with 95% limits of agreement (LOA). Intraclass correlation coefficients were calculated between the school and home measurements of height and weight to enable us to determine how much of the variance in schools was due to measurement. ...
Background The purpose of this study is to describe and assess a remote height and weight protocol that was developed for an ongoing trial conducted during the SARS-CoV-2 pandemic. Methods Thirty-eight rural families (children 8.3 ± 0.7 years; 68% female; and caregivers 38.2 ± 6.1 years) were provided detailed instructions on how to measure height and weight. Families obtained measures via remote data collection (caregiver weight, child height and weight) and also by trained staff. Differences between data collection methods were examined. Results Per absolute mean difference analyses, slightly larger differences were found for child weight (0.21 ± 0.21 kg), child height (1.53 ± 1.29 cm), and caregiver weight (0.48 ± 0.42 kg) between school and home measurements. Both analyses indicate differences had only minor impact on child BMI percentile (−0.12, 0.68) and parent BMI (0.05, 0.13). Intraclass coefficients ranged from 0.98 to 1.00, indicating that almost all of the variance was due to between-person differences and not measurement differences within a person. Conclusion Results suggest that remote height and weight collection is feasible for caregivers and children and that there are minimal differences in the various measurement methods studied here when assessing group differences. These differences did not have clinically meaningful impacts on BMI. This is promising for the use of remote height and weight measurement in clinical trials, especially for hard-to-reach populations. Trial registration Registered at ClinicalTrials.gov (NCT03304249) on 06/10/2017.
... Therefore, we analyzed the correlation between body measurement methods using Lin's concordance on individuals studied only once by the same operator to determine the agreement between methods (Lin, 1989; Barnhart et al., 2007). Furthermore, the graphical method of Bland and Altman was used because it is based on the definition of the concordance between two series of measurements (Bland and Altman, 1986). The two series are concordant if one does not overestimate or underestimate the other significantly. ...
The present study was designed to verify the effectiveness of the image analysis method for body measurement in dromedary camel compared to manual measurements as a reference method. To achieve this aim, twenty-one linear body measurements were estimated on 59 adult Sahraoui dromedary camels (22 males and 37 females) with a normal clinical condition by using a measuring stick or vernier caliper (standard method). On the other hand, image analysis on profile, front, or behind photographs was processed using Axiovision Software. Overall mean comparison, relative error, variance, Pearson’s correlation coefficient, and coefficient of variance showed that the image analysis method was accurate in relation to the manual measurement. Furthermore, image analysis results indicated relevant accuracy (bias correction factor, Cb ≈1) and precision (Pearson ρ ≈1) which were significantly correlated with the results of the reference method (Lin’s concordance correlation coefficients rccc ≈ 1). According to Bland–Altman upper and lower limits of agreement, the concordance was estimated between 93.22 and 98.3%. Passing-Bablok regression showed a good relationship between the results of the two methods displaying no significant systematic and proportional bias. The image analysis method for linear body measurements in dromedary camel showed results that are in agreement with the manual measuring method. Therefore, the image analysis could be considered a valid tool for camel conformation trait studies.
... For the evaluation of the Acceleration and Deep Learning algorithms, we used leave-one-subject-out cross-validation to prevent overfitted results. We also show Bland-Altman plots [183] to visualize the results. ...
Body-worn sensors, so-called wearables, are becoming increasingly popular in the sports domain. Wearables offer real-time feedback to athletes on technique and performance, while researchers can generate insights into the biomechanics and sports physiology of the athletes in real-world sports environments outside of laboratories. One of the first sports disciplines in which many athletes have used wearable devices is endurance running. With the rising popularity of smartphones, smartwatches and inertial measurement units (IMUs), many runners started to track their performance and keep a digital training diary. Because a large number of runners worldwide have transferred their wearable data to online fitness platforms, large databases have been created that enable Big Data analysis of running data. This kind of analysis offers the potential to conduct longitudinal sports science studies on a larger number of participants than ever before. In this dissertation, both studies showing how to extract endurance running-related parameters from raw data of foot-mounted IMUs as well as a Big Data study with running data from a fitness platform are presented.
... substantial, and > 0.8 almost perfect (Landis and Koch, 1977). Bland and Altman figures were plotted for ICC scores > 0.75 to allow for a visual interpretation of measurement agreement focusing on a reference range within which 95% of all differences between measurements could lie (Bland and Altman, 1986; Mansournia et al., 2021). ...
In order for electroencephalography (EEG) with sensory stimuli measures to be used in research and neurological clinical practice, demonstration of reliability is needed. However, this is rarely examined. Here we studied the test-retest reliability of the EEG latency and amplitude of evoked potentials and spectra as well as identifying the sources during pin-prick stimulation. We recorded EEG in 23 healthy older adults who underwent a protocol of pin-prick stimulation on the dominant and non-dominant hand. EEG was recorded in a second session with rest intervals of 1 week. For EEG electrodes Fz, Cz, and Pz, peak amplitude, latency and frequency spectra for pin-prick evoked potentials were determined and test-retest reliability was assessed. Substantial reliability ICC scores (0.76–0.79) were identified for evoked potential negative-positive amplitude from the left hand at C4 channel and positive peak latency when stimulating the right hand at Cz channel. Frequency spectra showed consistent increase of low-frequency band activity (< 5 Hz) and also in theta and alpha bands in first 0.25 s. Almost perfect reliability scores were found for activity at both low-frequency and theta bands (ICC scores: 0.81–0.98). Sources were identified in the primary somatosensory and motor cortices in relation to the positive peak using s-LORETA analysis. Measuring the frequency response from the pin-prick evoked potentials may allow the reliable assessment of central somatosensory impairment in the clinical setting.
... Since the t test showed no significant difference (n.s.) between the raters, the collapsed ultrasound data (rater 1 + rater 2) were used. To assess the absolute agreement between the 3D ultrasound approach and MRI for the tissue lengths of both legs, Bland-Altman plots were utilized [8]. ...
Purpose Human muscle–tendon units (MTUs) are highly plastic and undergo changes in response to specific diseases and disorders. To investigate the pathological changes and the effects of therapeutic treatments, the use of valid and reliable examination methods is of crucial importance. Therefore, in this study, a simple 3D ultrasound approach was developed and evaluated with regard to: (1) its validity in comparison to magnetic resonance imaging (MRI) for the assessment of the gastrocnemius medialis (GM) MTU, muscle belly, and Achilles tendon lengths; and (2) its reliability for static and dynamic length measurements. Methods Sixteen participants were included in the study. To evaluate the validity and reliability of the novel 3D ultrasound approach, two ultrasound measurement sessions and one MRI assessment were performed. By combining 2D ultrasound and 3D motion capture, the tissue lengths were assessed at a fixed ankle joint position and compared to the MRI measurements using Bland–Altman plots. The intra-rater and inter-rater reliability for the static and dynamic length assessments was determined using the coefficient of variation, standard error of measurement (SEM), minimal detectable change (MDC95), and intraclass correlation coefficient (ICC). Results The 3D ultrasound approach slightly underestimated the length when compared with MRI by 0.7%, 1.5%, and 1.1% for the GM muscle belly, Achilles tendon, and MTU, respectively. The approach showed excellent intra-rater as well as inter-rater reliability, with high ICC (≥ 0.94), small SEM (≤ 1.3 mm), and good MDC95 (≤ 3.6 mm) values, with even better reliability found for the static length measurements.
Conclusion The proposed 3D ultrasound approach was found to be valid and reliable for the assessment of the GM MTU, muscle belly, and Achilles tendon lengths, as well as the tissue lengthening behavior, confirming its potential as a useful tool for investigating the effects of training interventions or therapeutic treatments (e.g., surgery or conservative treatments such as stretching and orthotics). Level of evidence Level II.
... Analyses were conducted for the whole-body, right arm and right leg test outcomes, respectively. The bias and 95% limits of agreement (95% LOA) were calculated for each outcome measure according to sex and graphically depicted using Bland-Altman plots [22], with the x-axis, representing the mean of both body composition assessment methods, and the y-axis, representing the difference between the two methods, compared. Finally, one sample t-tests were performed to determine whether or not the mean difference between both methods was significantly different from zero. ...
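The plot construction and the one-sample t-test described in this excerpt can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the inputs are assumed to be paired measurements from the two methods on the same participants, and only the t statistic and its degrees of freedom are computed (a p-value would need a t distribution, for example from a statistics package).

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman_points(method_a, method_b):
    """Return (x, y) plot coordinates: pair means on x, pair differences on y."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired samples with at least two observations")
    xs = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    ys = [a - b for a, b in zip(method_a, method_b)]
    return xs, ys


def one_sample_t(differences):
    """t statistic for H0: mean difference == 0, with n - 1 degrees of freedom.

    Raises ZeroDivisionError if all differences are identical (zero spread).
    """
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1
```

A significant t-test indicates a systematic bias between methods, but it says nothing about how widely individual differences scatter, which is why the excerpt reports limits of agreement alongside it.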
Bio-electrical impedance analysis (BIA) and dual-energy X-ray absorptiometry (DXA) are methods to estimate human body composition. This study aimed to compare sex-specific outcomes for estimating segmental and whole-body composition in 83 healthy participants (21.9 ± 1.5 years, 56% men) using Inbody S10 BIA and Norland Elite DXA devices. One-way repeated measures ANOVAs showed significantly lower whole-body fat% and whole-body fat mass values alongside higher whole-body lean mass values resulting from BIA when compared to DXA (both sexes: p < 0.001). In men, whole-body bone mineral content was significantly higher using BIA against DXA (p < 0.001). Regardless of sex, no significant BIA versus DXA difference was found in arm fat mass (men: p = 0.180, women: p = 0.233), whereas significantly lower leg fat mass values were found with BIA versus DXA (both sexes: p < 0.001). Additionally, significantly higher arm lean mass (both sexes: p < 0.001) and leg lean mass (only women: p < 0.001) were found in BIA versus DXA. Moderate to very strong positive associations (p < 0.05) between BIA and DXA outcome measures were found, except for arm fat mass (men: p = 0.904, women: p = 0.130) and leg fat mass (only men: p = 0.845). This study highlights (sex-dependent) differences in corresponding test outcomes between BIA and DXA both at the segmental and whole-body level.
... Spatiotemporal and coefficient of variation (CoV) parameters, peak values in joint angles, moment and GRFs were compared using the repeated-measures paired t-tests or Wilcoxon signed-rank test, based on the normality of data tested with the Kolmogorov-Smirnov test. Mean differences (bias) and 95% CI of the difference were calculated (Bland & Altman, 1986). Joint angles, moments and GRFs waveforms of both approaches were compared using statistical parametric mapping (SPM), using a repeated-measures paired t-test. ...
Background Instrumented treadmills have become more mainstream in clinical assessment of gait disorders in children, and are increasingly being applied as an alternative to overground gait analysis. Both approaches differ in multiple elements of set-up (e.g., overground versus treadmill, Plug-In Gait versus Human Body Model-II), workflow (e.g., limited amount of steps versus many successive steps) and post-processing of data (e.g., different filter techniques). These individual elements have been shown to affect gait. Since the approaches are used in parallel in clinical practice, insight into the compound effect of the multiple different elements on gait is essential. This study investigates whether the outcomes of two approaches for 3D gait analysis are interchangeable in typically developing children. Methods Spatiotemporal parameters, sagittal joint angles and moments, and ground reaction forces were measured in typically developing children aged 3–17 years using the overground (overground walking, conventional lab environment, Plug-In Gait) and treadmill (treadmill walking in virtual environment, Human Body Model-II) approach. Spatiotemporal and coefficient of variation parameters, and peak values in kinematics and kinetics of both approaches were compared using repeated measures tests. Kinematic and kinetic waveforms from both approaches were compared using statistical parametric mapping (SPM). Differences were quantified by mean differences and root mean square differences. Results Children walked slower, with lower stride and stance time and shorter and wider steps with the treadmill approach than with the overground approach. Mean differences ranged from 0.02 s for stride time to 3.3 cm for step width. The patterns of sagittal kinematic and kinetic waveforms were equivalent for both approaches, but significant differences were found in amplitude.
Overall, the peak joint angles were larger during the treadmill approach, showing mean differences ranging from 0.84° (pelvic tilt) to 6.42° (peak knee flexion during swing). Mean difference in peak moments ranged from 0.02 Nm/kg (peak knee extension moment) to 0.32 Nm/kg (peak hip extension moment), showing overall decreased joint moments with the treadmill approach. Normalised ground reaction forces showed mean differences ranging from 0.001 to 0.024. Conclusion The overground and treadmill approach to 3D gait analysis yield different sagittal gait characteristics. The systematic differences can be due to important changes in the neuromechanics of gait and to methodological choices used in both approaches, such as the biomechanical model or the walkway versus treadmill. The overview of small differences presented in this study is essential to correctly interpret the results and needs to be taken into account when data is interchanged between approaches. Together with the research/clinical question and the context of the child, the insight gained can be used to determine the best approach.
... The results from the proposed method are compared to the ground truth using the Bland-Altman test [15] to find the limits of agreement (LOA). Fig. 3 and Table I show the LOA for various exercises. ...
Range of motion (ROM) is an important indicator of an individual's physical health, and its degradation impacts their ability to perform activities of daily living. The elderly are particularly susceptible to mobility loss due to muscular decline, neuromuscular disorders, sedentary lifestyle, etc. Thus, they must undergo periodic ROM assessments to track their physical well-being and consult doctors for any decline in ROM. An at-home ROM assessment device can assist the elderly to self-perform ROM assessment and facilitate remote monitoring of and compliance to therapy. The pervasive adoption of digital voice assistants (DVAs), that include a monocular camera, offers an opportunity for at-home ROM assessment. This paper proposes using a DVA for ROM measurement by utilizing 2D pose estimation techniques to estimate 3D limb pose for specific exercises. The system employs the MediaPipe library to perform 2D pose estimation and uses the joint coordinates to find the 3D pose of the limb using a 2D projection method. To validate the system, it is first compared with a 3D human model performing various shoulder and elbow exercises in a virtual environment. Next, for further validation, a neurologically intact individual performs the same exercises and the results of the proposed system are compared with the results from a markerless optical motion capture system (Kinect). The Bland-Altman limits of agreement (LOA) are computed and provided for the two sets of comparisons. The results demonstrate the feasibility of the proposed system in providing reliable ROM measurements using a DVA and suggest possible enhancements.
... Therefore, descriptive and explorative time-series analyses will be performed [76]. The overall physical activity as well as the amount of physical activity for different groups will be compared between measurements with the activity tracker and the questionnaire (the BSA-F [54] module) both in minutes per week at t2 (post-treatment) using the Bland-Altman method [77]. ...
Background One relevant strategy to prevent the onset and progression of type 2 diabetes mellitus (T2DM) focuses on increasing physical activity. The use of activity trackers by patients could enable objective measurement of their regular physical activity in daily life and promote physical activity through the use of a tracker-based intervention. This trial aims to answer three research questions: (1) Is the use of activity trackers suitable for longitudinal assessment of physical activity in everyday life? (2) Does the use of a tracker-based intervention lead to sustainable improvements in the physical activity of healthy individuals and in people with T2DM? (3) Does the accompanying digital motivational intervention lead to sustainable improvements in physical activity for participants using the tracker-based device? Methods The planned study is a randomized controlled trial focused on 1642 participants with and without T2DM for 9 months with regard to their physical activity behavior. Subjects allocated to an intervention group will wear an activity tracker. Half of the subjects in the intervention group will also receive an additional digital motivational intervention. Subjects allocated to the control group will not receive any intervention. The primary outcome is the amount of moderate and vigorous physical activity in minutes and the number of steps per week measured continuously with the activity tracker and assessed by questionnaires at four time points. Secondary endpoints are medical parameters measured at the same four time points. The collected data will be analyzed using inferential statistics and explorative data-mining techniques. Discussion The trial uses an interdisciplinary approach with a team including sports psychologists, sports scientists, health scientists, health care professionals, physicians, and computer scientists. It also involves the processing and analysis of large amounts of data collected with activity trackers. 
These factors represent particular strengths as well as challenges in the study. Trial Registration The trial is registered at the World Health Organization International Clinical Trials Registry Platform via the German Clinical Studies Trial Register (DRKS), DRKS00027064. Registered on 11 November 2021.
... The Spearman rank correlation test was used to calculate the correlation between manual counting, computer image analysis, and semiquantitative estimation, as well as the consistency between different observers. Bland-Altman analysis was used for comparison of manual counting, computer image analysis and semiquantitative estimation [18,19]. The statistical analysis was performed with MedCalc 20.009 software (MedCalc Software Ltd), and P < 0.05 was considered statistically significant. ...
P53 prognostic cut-off values differ between studies of mantle cell lymphoma (MCL), and its immunohistochemistry (IHC) interpretation is still based on semiquantitative estimation, which might be inaccurate. This study aimed to investigate the optimal cut-off value for p53 in predicting prognosis of patients with MCL and the possible use of computer image analysis to identify the positive rate of p53. We calculated p53 positive rate using QuPath software and compared it with the data obtained by manual counting and semiquantitative estimation. Survival curves were generated by using the Youden index and the Kaplan–Meier method. The chi-squared (χ2) test was used to compare MIPI, Ann Arbor stage, and cell morphology with p53. Spearman rank correlation test and Bland–Altman analysis were used to compare manual counting, computer image analysis and semiquantitative estimation, as well as the consistency between different observers. The optimal cut-off value of p53 for predicting prognosis was 20% in MCL patients. Patients with p53 ≥ 20% had a significantly worse overall survival (OS) than those with p53 < 20% (P < 0.0001). MCL patients with MIPI intermediate to high risk, Ann Arbor stage III–IV, and blastoid/pleomorphic variant cell morphology had more p53 ≥ 20%. There was a strong correlation between computer image analysis and manual counting of p53 from the same areas in MCL tissues (Spearman’s rho = 0.966, P < 0.0001). The results of computer analysis are completely consistent between observers, and computer image analysis of Ki-67 can predict the prognosis of MCL patients. MCL patients with p53 ≥ 20% had a shorter OS and a tendency for MIPI intermediate to high risk, Ann Arbor stage III–IV, and blastoid/pleomorphic variant. Computer image analysis could determine the actual positive rate of p53 and Ki-67 and is a more attractive alternative than semiquantitative estimation in MCL.
... The performance of the regression formula for ALest based on the training dataset was evaluated by Bland-Altman plots and the 95% limits of agreement (95% LoA = d ± 1.96 × SD, where LoA is the limits of agreement, d is the mean difference between methods and SD is the standard deviation of the differences) [18,19]. The ICC was calculated to assess the absolute agreement between the two measures. ...
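The limits-of-agreement formula quoted in this excerpt (mean difference ± 1.96 × SD of the differences) is simple enough to sketch directly. The function names below are hypothetical and the code is an illustration under the usual assumption that the differences are roughly normally distributed; it is not taken from any of the cited studies.

```python
from statistics import mean, stdev


def loa_from_summary(mean_difference, sd_difference, z=1.96):
    """95% limits of agreement from summary statistics: d +/- z * SD."""
    half_width = z * sd_difference
    return mean_difference - half_width, mean_difference + half_width


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are taken as method_a - method_b; SD is the sample standard
    deviation (n - 1 denominator) of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two paired samples with at least two observations")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    lower, upper = loa_from_summary(bias, stdev(differences), z)
    return bias, lower, upper
```

As a check against reported figures, the abstract for this study gives a mean difference of 0.002 mm² with SD 0.015 mm²; `loa_from_summary(0.002, 0.015)` gives roughly −0.027 and 0.031, matching the limits it reports.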
Background To generate and validate a method to estimate axial length (ALest) from spherical equivalent (SE) and corneal curvature [keratometry (K)], and to determine if this ALest can replace actual axial length (ALact) for correcting transverse magnification error in optical coherence tomography angiography (OCTA) images using the Littmann-Bennett formula. Methods Data from 1301 participants of the Raine Study Gen2-20 year follow-up were divided into two datasets to generate (n = 650) and validate (n = 651) a relationship between AL, SE, and K. The developed formula was then applied to a separate dataset of 46 participants with AL, SE, and K measurements and OCTA images to estimate and compare the performance of ALest against ALact in correcting transverse magnification error in OCTA images when measuring the foveal avascular zone area (FAZA). Results The formula for ALest yielded the equation: ALest = 2.102K − 0.4125SE + 7.268, R² = 0.794. There was good agreement between ALest and ALact for both study cohorts. The mean difference [standard deviation (SD)] between FAZA corrected with ALest and ALact was 0.002 (0.015) mm² with the 95% limits of agreement (LoA) of −0.027 to 0.031 mm². In comparison, mean difference (SD) between FAZA uncorrected and corrected with ALact was −0.005 (0.030) mm², with 95% LoA of −0.064 to 0.054 mm². Conclusions ALact is more accurate than ALest and hence should be used preferentially in magnification error correction in the clinical setting. FAZA corrected with ALest is comparable to FAZA corrected with ALact, while FAZA measurements using images corrected with ALest have a greater accuracy than measurements on uncorrected images. Hence, in the absence of ALact, clinicians should use ALest to correct for magnification error as this provides for more accurate measurements of fundus parameters than uncorrected images.
... EC and TTE data were extracted post-hoc while blinded from the results of the paired measurements. Interchangeability was evaluated using Bland-Altman analysis [12]. Bias (mean difference between methods), bias% (bias expressed as a percentage to the mean value), LoA (bias ±1.96 × SD of the differences), and %error (%error = ±1.96 ...
Introduction: The aim was to evaluate the agreement between cardiac output estimates obtained by electrical cardiometry (EC) and transthoracic echocardiography (TTE) in very preterm infants. Methods: This is a single-center prospective observational study in infants born<32 weeks gestational age within 48 h of birth. Continuous EC was recorded and simultaneous TTE obtained on day 1 and day 2 of life. Blinded TTE measurements were performed within a 10 s timeframe using beat-to-beat EC data. The primary outcome was %error of left ventricular (LV) output in milliliters per kilogram per minute (cardiac index (CI)) obtained by TTE compared to LV-CI from EC. Secondary outcome parameters were bias, %bias, limits of agreement and include measures of right ventricular (RV) output and LV systolic time intervals. Results: Analysis was performed for 34 infants (median (IQR) gestational age 29 + 0 (24 + 5 to 30 + 6) weeks + days, birthweight 960 (748 to 1,490) grams) including 44 pairwise LV output measurements on 24 participants (22 on day 1 and day 2). The %error was 54% for LV-CI (EC: 214 (38) mL/kg/min vs. TTE: 163 (47) mL/kg/min). The %error was 78% for RV-CI (EC: 213 (37) mL/kg/min vs. TTE: 241 (77) mL/kg/min). While only LV-CI values affected LV-CI bias, signal quality, heart rate, and RV-CI values affected RV-CI bias. Conclusion: EC is not interchangeable with TTE to estimate indices of LV or RV output in very preterm infants within the first 48 h postnatally. EC may not measure LV output distinctly in very preterm infants with intra- and extracardiac shunts.
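The Bland-Altman quantities named in the excerpt above (bias, bias%, and LoA) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the sample data are hypothetical, and "the mean value" is assumed here to be the grand mean of both methods. The %error definition is truncated in the excerpt, so it is deliberately not implemented.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return bias, bias%, and 95% limits of agreement for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # mean difference between methods
    sd = stdev(diffs)    # sample SD of the differences
    # Assumption: bias% is expressed relative to the grand mean of both methods.
    grand_mean = mean([(a + b) / 2 for a, b in zip(method_a, method_b)])
    return {
        "bias": bias,
        "bias_pct": 100 * bias / grand_mean,
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
    }


if __name__ == "__main__":
    # Hypothetical paired readings from two methods (arbitrary units).
    result = bland_altman([11, 13, 15, 17], [9, 12, 13, 18])
    print(result)
```

The differences are taken as method A minus method B, so a positive bias means method A reads higher on average.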
... The acoustic and psychoacoustic indicators were computed with a commercial software package (ArtemiS SUITE, HEAD acoustics GmbH, Herzogenrath, Germany). Bland-Altman (BA) parametric statistics and plots were generated to examine the agreement between in situ, and OCV and HATS calibrations methods (29). Additionally, 95 % confidence intervals were computed for the mean to examine the level of bias, and also for both the upper and lower bounds of the 95 % limits of agreement (LoA) to discover outliers. ...
To increase the availability and adoption of the soundscape standard, a low-cost calibration procedure for reproduction of audio stimuli over headphones was proposed as part of the global "Soundscape Attributes Translation Project" (SATP) for validating ISO/TS 12913-2:2018 perceived affective quality (PAQ) attribute translations. A previous preliminary study revealed significant deviations from the intended equivalent continuous A-weighted sound pressure levels ($L_{\text{A,eq}}$) using the open-circuit voltage (OCV) calibration procedure. For a more holistic human-centric perspective, the OCV method is further investigated here in terms of psychoacoustic parameters, including relevant exceedance levels to account for temporal effects on the same 27 stimuli from the SATP. Moreover, a within-subjects experiment with 36 participants was conducted to examine the effects of OCV calibration on the PAQ attributes in ISO/TS 12913-2:2018. Bland-Altman analysis of the objective indicators revealed large biases in the OCV method across all weighted sound level and loudness indicators, and in roughness indicators at 5% and 10% exceedance levels. Significant perceptual differences due to the OCV method were observed in about 20% of the stimuli, which did not correspond clearly with the biased acoustic indicators. A cautioned interpretation of the objective and perceptual differences due to small and unpaired samples nevertheless provides grounds for further investigation.
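The confidence intervals mentioned in the excerpt above, for the bias and for each limit of agreement, can be approximated as in the sketch below. It uses the standard-error approximations from Bland and Altman's 1986 paper (s/√n for the bias and √(3s²/n) for each limit). Two points are assumptions on our part: the function name is hypothetical, and a normal quantile stands in for the t quantile with n − 1 degrees of freedom, which is only reasonable for larger samples. The cited study may have computed its intervals differently.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def bland_altman_intervals(diffs, level=0.95):
    """Approximate CIs for the bias and for the 95% limits of agreement."""
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired differences")
    bias = mean(diffs)
    sd = stdev(diffs)
    # Assumption: large-sample normal quantile instead of Student's t.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se_bias = sd / sqrt(n)           # standard error of the mean difference
    se_limit = sqrt(3 * sd**2 / n)   # approximate standard error of each limit
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    return {
        "loa": (lower, upper),
        "bias_ci": (bias - z * se_bias, bias + z * se_bias),
        "lower_loa_ci": (lower - z * se_limit, lower + z * se_limit),
        "upper_loa_ci": (upper - z * se_limit, upper + z * se_limit),
    }


if __name__ == "__main__":
    # Hypothetical paired differences between two calibration methods.
    print(bland_altman_intervals([2, 1, 2, -1]))
```

A bias interval that excludes zero suggests a systematic offset between methods. The intervals around the limits show how uncertain the limits themselves are, and with small samples they are wide.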
... Bland and Altman's Limits of Agreement is the most popular [14] and recommended statistical method for evaluation of agreement [15,16]. The standard error of measurement (SEM) is similarly regarded as a suitable parameter of agreement, but is sensitive to variability in the population [17]. ...
Purpose To evaluate intra- and inter-rater agreement and reliability of seven reported disc height index (DHI) measurement methods on standing lateral X-ray of lumbar spine. Methods The adult patients who had standing lateral X-ray of lumbar spine were recruited. Seven methods were used to measure DHI of each lumbar intervertebral disc level, including a ratio of sum of anterior and posterior disc height (DH) to disc diameter (Method 1), a ratio of middle DH to mid-vertebral body height (Method 2), a ratio of middle DH to disc diameter (Method 3), a ratio of the mean of anterior, middle, and posterior DH to the sagittal diameter of the proximal vertebral body (Method 4), a ratio of DH to vertebral height which cross the centre of adjacent vertebral bodies (Method 5), a ratio of the mean of anterior, middle, and posterior DH to the mean of proximal and distal vertebral body height (Method 6), and a ratio of the sum of anterior and posterior DH to the sum of superior and inferior disc depth (Method 7). Two raters conducted the measurements (one medical student (SS) and the other an experienced spine surgeon (XC)). Bland and Altman's Limits of Agreement (LOA) with standard difference were calculated to examine intra- and inter-rater agreements between two out of seven methods for DHI. Intra-class correlations (ICC) with 95% confidence intervals were calculated to assess intra- and inter-rater reliability. Results The intra-rater reliability in DHI measurements for 288 participants were ICCs from 0.807 (0.794, 0.812) to 0.922 (0.913, 0.946) by rater 1 (SS) and from 0.827 (0.802, 0.841) to 0.918 (0.806, 0.823) by rater 2 (XC). Method 2, 3, and 5 on all segmental levels had bias (95% CI does not include zero) or/and out of the acceptable cut-off proportion (>50%). A total of 609 outliers in 9174 segmental levels’ LOA range. Inter-rater reliability was good-to-excellent in all but method 2 (0.736 (0.712, 0.759)) and method 5 (0.634 (0.598, 0.667)).
ICCs of related lines to good-to-excellent reliability methods was excellent in all but only indirect lines in method 1 and 4 (ICCs lie in the range from 0.8 to 0.9). Conclusion Following a structured protocol, intra- and inter-rater reliability was good-to-excellent for most DHI measurement methods on X-ray. However, the complicated methods (more indirect lines) and IVD degeneration (nucleus pulposus degeneration and disc herniation) potentially affected the agreement on inter-rater measurements. Method 7 is the best reproducible method to measure disc height index for all intervertebral disc segmental levels with a good-to-excellent intra- and inter-rater reliability and agreement.
... However, measures of association are not indicative of agreement between prediction methods; more important is the absolute agreement and random errors between prediction methods. 34 In this study, although there was no fixed or proportional bias by means of the ordinary least-products regression analysis, there were substantial absolute differences (Multiple-point = 3.5 ± 2.4 kg, 6.1 ± 3.7%; Two-Point 45−75 = 4.9 ± 4.5 kg, 8.6 ± 6.2%; and Two-Point 45−90 = 3.1 ± 2.0 kg, 5.7 ± 4.0%) in the prediction of the 1RM. As an example, using the multiple point method, there might be an absolute error up to 9.8% in the estimation of the 1RM, which makes a large difference when selecting the loads for achieving the desired adaptations. ...
This study aimed to evaluate whether lifting velocity can be used to estimate the overhead press one repetition maximum (1RM) and to explore the differences in the accuracy of the 1RM between three velocity-based methods. Twenty-seven weightlifters (16 men and 11 women) participated. The first session was used to test the overhead press 1RM. The second session consisted of an incremental loading test during the overhead press. The mean velocity was registered using a transducer attached to the barbell. A 1-way repeated-measures analysis of variance (ANOVA) with Bonferroni post hoc corrections was applied to the absolute differences between the actual and predicted 1RMs. Raw differences with 95% limits of agreement and ordinary least-products regressions were used to test the concurrent validity of the 1RM prediction methods with respect to the actual 1RM. The ANOVA did not reveal significant differences for the absolute differences respect to the actual 1RM between the three 1RM prediction methods ( F = 3.2, p = .073). The absolute errors were moderate for the Multiple-Point (6.1 ± 3.7%), Two-Point 45−75 (8.6 ± 6.2%), and Two-Point 45−90 methods (5.7 ± 4.0%). The validity analysis showed that all the 1RM prediction methods underestimated the actual 1RM (1.0–2.2 kg), but ordinary least-products regressions failed to show fixed or proportional bias. These results suggest that the Multiple-Point and Two-Point 45−90 velocity-based methods might be viable tools to predict the overhead press 1RM in weightlifters, but practitioners are encouraged to use the direct 1RM for a more accurate prescription of the training loads.
... Time spent in moderate to vigorous intensity of physical activity was calculated using accelerometer wear time as denominator and expressed in minutes. Bland-Altman plots were conducted for describing the agreement between objective actigraph data and self-reported data (OPAQ) [24] using Stata 17.0. Due to the small sample, for the Bland-Altman LOA (limits of agreement), all observations were regarded as independent even though they come from the same student. ...
Objective To adapt and partly validate a Danish online version of the patient-reported outcome measure (PROM) Oxford Physical Activity Questionnaire (“OPAQ”) and evaluate mobile phones and tablets as data capturing tool to identify potential problems and deficiencies in the PROM prior to implementation in the full study. Methods The OPAQ was translated into Danish by a formalised forward-backward translation procedure. Face validity was examined by interviewing 12 school students aged 10–15, recruited from two Danish public schools. After modifications, the online version of the Danish OPAQ was pilot tested in a convenience sample of seven school students for 1 week. Simultaneous objective accelerometer data were captured during the registration period. Results No major challenges were identified when translating OPAQ. Based on the interviews, the Danish version of OPAQ was perceived to be easy to understand in general, and the questions were relevant for tracking activities during the week. Five of the 12 participants had difficulties with understanding the introductory question: “what is your cultural background” in the original OPAQ. The interviews revealed that the participants recalling 7 days forgot to record some of the physical activity they had done during the week, indicating issues with the weekly recall method. After transforming to the online version, this was reported to be easy and quick to fill in (taking 1–3 min per day), and participants reported the daily design was helpful to remember activities. There was good correspondence between the online version and objective actigraphs with a tendency to underreport. Six participants reported 10–60 min less moderate to vigorous physical activity compared to the actigraphs, while one participant reported 3 min more. Conclusion Participants found the online OPAQ quick and easy to complete during a 1-week period. Completing daily rather than weekly may help limit issues with recall. 
Overall, there was good agreement between the objective actigraphs and the OPAQ, though the OPAQ tended to slightly underreport moderate to vigorous physical activity. The Danish online version of OPAQ may be useful for capturing school students’ physical activity when objective measures are not feasible.
... Based on this assumption, 95% of the differences lie between the limits of agreement (LOA), representing twofold its standard deviation (Bland and Altman 1986). ...
[Fig. 1: SG model (a) and MRI-based model (b) of the right shoulder. Also shown is the parent coordinate system of the GHJ. Each motion is calculated from the Cardan sequence YXZ.]
Joint motion calculated using multi-body models and inverse kinematics presents many advantages over direct marker-based calculations. However, the sensitivity of the computed kinematics is known to be partly caused by the model and could also be influenced by the participants’ anthropometry and sex. This study aimed to compare kinematics computed from an anatomical shoulder model based on medical images against a scaled-generic model and quantify the effects of anatomical errors and participants’ anthropometry on the calculated joint angles. Twelve participants have had planar shoulder movements experimentally captured in a motion lab, and their shoulder anatomy imaged using an MRI scanner. A shoulder multi-body dynamics model was developed for each participant, using both an image-based approach and a scaled-generic approach. Inverse kinematics have been performed using the two different modelling procedures and the three different experimental motions. Results have been compared using Bland–Altman analysis of agreement and further analysed using multi-linear regressions. Kinematics computed via an anatomical and a scaled-generic shoulder models differed in average from 3.2 to 5.4 degrees depending on the task. The MRI-based model presented smaller limits of agreement to direct kinematics than the scaled-generic model. Finally, the regression model predictors, including anatomical errors, sex, and BMI of the participant, explained from 41 to 80% of the kinematic variability between model types with respect to the task. This study highlighted the consequences of modelling precision, quantified the effects of anatomical errors on the shoulder kinematics, and showed that participants' anthropometry and sex could indirectly affect kinematic outcomes.
... The 95% limits of agreement were arbitrarily set, in accordance with Bland and Altman, as the bias ± 1.96 standard deviations. (13) A p value < 0.05 was considered statistically significant (SPSS version 15.0, Chicago, IL). ...
Objectives The aim of this study was to develop and validate a questionnaire which assesses knowledge of signs and symptoms of relative energy deficiency in sport (REDS) among healthcare professionals and physically active individuals. Design Cross-Sectional Study. Methods The questionnaire was created in two phases: item development and item validation. Item development was established through a review of the literature, expert review (n = 4), and pre-testing among healthcare professionals, dietetic students, and the general population (n = 35). Validity (item analysis, construct validity) and internal reliability were assessed by administrating the developed questionnaire to Healthcare Professionals (HP) (n = 97) and Physically Active Individuals (PAI) who engaged in moderate to intense physical activity (n = 77). The questionnaire was re-administered in a subset of the same groups (n = 88) for test-retest reliability. Results The expert responses showed >80 % acceptability and pretesting through interviews indicated good content and face validity. Item response analysis resulted in removal of 6 items due to low discrimination ability. Construct validity was confirmed with significantly higher knowledge scores in HP compared with non-health professionals (mean difference (95 % CI) = 2.8 (1.9, 3.7)). Internal consistency, assessed using Cronbach's alpha (α =0.79), and test-retest reliability using intra-class correlation coefficients (ICC = 0.80; Spearman's correlation = 0.84, p < 0.001), were good. The final questionnaire had 18 items that assessed the knowledge of signs and symptoms of REDS. Conclusion The questionnaire provides a valid and reliable tool to assess knowledge of signs and symptoms of RED-S among HP and PAI. The questionnaire will guide future education requirements by assessing current knowledge.
Purpose: To determine the effects of monocular light deprivation on diurnal rhythms in retinal and choroidal thickness. Methods: Twenty participants, ages 22 to 45 years, underwent spectral domain optical coherence tomography imaging every three hours, from 8 AM to 8 PM, on two consecutive days. Participants wore an eye patch over the left eye starting at bedtime of day 1 until the end of the last measurement on day 2. Choroidal, total retinal, photoreceptor outer segment + retinal pigment epithelium (RPE), and photoreceptor inner segment thicknesses were determined. Results: For both eyes, significant diurnal variations were observed in choroidal, total retinal, outer segment + RPE, and inner segment thickness (P < 0.001). For light-deprived eyes, choroid diurnal variation persisted, although the choroid was significantly thinner at 8 AM and 11 AM (P < 0.01) on day 2 compared to day 1. On the other hand, diurnal variations in retinal thickness were eliminated in the light-deprived eye on day 2 when the eye was patched (P > 0.05). Total retinal and inner segment thicknesses significantly decreased (P < 0.001) and outer segment + RPE thickness significantly increased (P < 0.05) on day 2 compared to day 1. Conclusions: Blocking light exposure in one eye abolished the rhythms in retinal thickness, but not in choroidal thickness, of the deprived eye. Findings suggest that the rhythms in retinal thickness are, at least in part, driven by light exposure, whereas the rhythm in choroidal thickness is not impacted by short-term light deprivation.
This study was designed to test the hypothesis, that women, whose growth is impaired in early life (as evidenced by short stature, small head circumference and/or low birthweight) and who become 'fat' as adults are insulin resistant, become hyperglycaemic in pregnancy and give birth to fat, hyperinsulinaemic babies who are at increased risk of diabetes in adult life. Overall, full-term babies in Mysore (urban India) were heavier (mean birthweight, 2956g) than those born in Pune (rural India) (2665g), but lighter than Southampton babies (3441g). Neonatal body composition was similar to that in Pune, with relative fat-sparing and decreased muscle mass. Neonatal body composition; fatter mothers had fatter babies, taller mothers had longer babies and mothers with smaller head size had babies with smaller heads. Babies of mothers with gestational diabetes (GDM) were significantly larger in all measurements, especially body fat. These 'macrosomic' changes were present across the range of 'normal' maternal glucose concentrations. Cord blood glucose and insulin concentrations were strongly and positively related to maternal blood glucose concentrations and to neonatal anthropometry. In conclusion, GDM prevalence was high in this population. The highest glucose concentrations and insulin resistant indices were found in mothers with evidence of impaired growth in early life and who had become relatively fat as adults. 'Macrosomic' changes were seen across the range of 'normal' maternal glucose concentrations.
Currently, there is no way to assess mechanical loading variables such as peak ground reaction forces (pGRF) and peak loading rate (pLR) in clinical settings. The purpose of this study was to develop accelerometry-based equations to predict both pGRF and pLR during walking and running. One hundred and thirty-one subjects (79 females; 76.9 ± 19.6 kg) walked and ran at different speeds (2-14 km·h⁻¹) on a force plate-instrumented treadmill while wearing accelerometers at their ankle, lower back and hip. Regression equations were developed to predict pGRF and pLR from accelerometry data. Leave-one-out cross-validation was used to calculate prediction accuracy and Bland-Altman plots. Our pGRF prediction equation was compared with a reference equation previously published. Body mass and peak acceleration were included for pGRF prediction and body mass and peak acceleration rate for pLR prediction. All pGRF equation coefficients of determination were above 0.96, and a good agreement between actual and predicted pGRF was observed, with a mean absolute percent error (MAPE) below 7.3%. Accuracy indices from our equations were better than previously developed equations. All pLR prediction equations presented a lower accuracy compared to those developed to predict pGRF. Walking and running pGRF can be predicted with high accuracy by accelerometry-based equations, representing an easy way to determine mechanical loading in free-living conditions. The pLR prediction equations yielded a somewhat lower prediction accuracy compared with the pGRF equations.
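The combination of leave-one-out cross-validation and mean absolute percent error described in this abstract can be illustrated with a small sketch. It is a simplification under stated assumptions: the study's equations used two predictors (body mass plus peak acceleration or acceleration rate), whereas this example fits an ordinary least-squares line on a single hypothetical predictor. The function names and data are illustrative only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def loocv_mape(xs, ys):
    """Leave-one-out cross-validated mean absolute percent error (in %)."""
    errors = []
    for i in range(len(xs)):
        # Refit the model on every observation except the i-th one.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predicted = slope * xs[i] + intercept
        # Score the prediction on the held-out observation only.
        errors.append(abs(ys[i] - predicted) / abs(ys[i]))
    return 100 * mean(errors)


if __name__ == "__main__":
    # Hypothetical data: peak acceleration (g) vs. peak ground reaction force (N).
    accel = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
    force = [810.0, 1010.0, 1190.0, 1420.0, 1580.0, 1830.0]
    print(round(loocv_mape(accel, force), 2))
```

Because each prediction comes from a model that never saw the held-out observation, the resulting MAPE is a less optimistic accuracy estimate than one computed on the fitting data.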
Biomechanics plays a key role in occurrence, prevention and rehabilitation of the landing injuries. Some factors can affect the biomechanical performance of landing. To evaluate the effect of various factors, we measured kinematic, kinetic and EMG properties of 16 subjects while they land with changed conditions. We found that landing on level ground with two legs was stable, and the body would be injured before dynamic postural stability was impaired. Compared with the dominant lower limb, the non-dominant limb has a more effective protective mechanism in that the ankle motion is restrained by higher flexor activities. Women are prone to transform the landing energy to the joint motion, whereas men are more likely to transform it to friction. The semi-rigid stabilizer was helpful for men in increasing shank muscle activities. For women, high stabilizer rigidity had little influence on the muscle activities, and it could contribute to larger injury risk. Terrain stiffness did not appear to influence ankle biomechanics.
Introduction: Measurement of physical activity (PA) using commercial activity trackers such as Fitbit devices has become increasingly popular, also for people with haemophilia (PWH). The accuracy of the Fitbit model Charge 3 has not yet been examined. Aims: To compare the Fitbit Charge 3 against the research-grade accelerometer ActiGraph GT3X-BT in measuring average daily steps and minutes spent in different PA intensities. Methods: Twenty-four young PWH wore a wrist-worn Fitbit Charge 3 and hip-worn ActiGraph GT3X-BT simultaneously for seven consecutive days in free-living conditions. Correlation of and differences between the devices for daily averages of PA parameters were assessed using Pearson's correlation coefficient and paired t-test, respectively. Agreement between devices was assessed using Bland-Altman plots. Results: Twenty participants (mean age 21.8) were included in the analyses. We found moderate to high correlations between Fitbit and ActiGraph measured daily averages for all PA variables, but statistically significant differences between devices for all variables except daily minutes of moderate PA. Fitbit overestimated average daily steps, minutes of light, vigorous and moderate-to-vigorous PA. Bland-Altman plots showed a measurement bias between devices for all parameters with increasing overestimation by the Fitbit for higher volumes of PA. Conclusion: The Fitbit Charge 3 overestimated steps and minutes of light, moderate and moderate-to-vigorous PA as compared to the ActiGraph GT3X-BT, and this bias increased with PA volume. The Fitbit should therefore be used with caution in research, and we advise users of the device to be cognizant of this overestimation.
Objective: Cochlear implant (CI) candidacy and postoperative outcomes are assessed using sets of speech perception tests that vary from center to center, limiting comparisons across institutions and time periods. The objective of this study was to determine if scores on one speech perception test could be reliably predicted from scores on another test. Study design: Arizona Biomedical (AzBio) Sentence Test, Consonant-Nucleus-Consonant word (CNCw), and Hearing in Noise Test (HINT) scores in quiet for the implanted ear were collected for individuals who received a CI between 1985 and 2019. Scores collected during the same testing session were analyzed using Bland-Altman plots to assess agreement between testing methods. Simple linear regression with logit transformation was used to generate predictive functions and 95% confidence intervals for expected mean and individual scores. Setting: Single academic medical center. Patients: A total of 1,437 individuals with a median age of 59.9 years (range, 18-95 yr) and 46% (654 of 1,437) male. Interventions: N.A. Main outcome measures: Agreement as a function of test score, mean, variance, and correlation coefficients. Results: A total of 2,052 AzBio/CNCw, 525 AzBio/HINT, and 7,187 CNCw/HINT same-session score pairings were identified. Pairwise test comparisons demonstrated limited agreement between different tests performed in the same session, and a score correlation between different speech tests revealed large variances. Conclusion: Transformation functions between test batteries were predictive of mean scores but performed poorly for prediction of individual scores. Point-wise comparisons of scores across CI test batteries should be used with caution in clinical and research settings.
Purpose: To simultaneously acquire spectroscopic signals from two MRS voxels using a multi-banded 2 spin-echo, full-intensity acquired localized (2SPECIAL) sequence, and to decompose the signal to their respective regions by a novel voxel-GRAPPA (vGRAPPA) decomposition approach for in vivo brain applications at 7 T. Methods: A wideband, uniform rate, smooth truncation (WURST) multi-banded pulse was incorporated into SPECIAL to implement 2SPECIAL for simultaneous multi-voxel spectroscopy (sMVS). To decompose the acquired data, the voxel-GRAPPA decomposition algorithm is introduced, and its performance is compared to the SENSE-based decomposition. Furthermore, the limitations of two-voxel excitation concerning the multi-banded adiabatic inversion pulse, as well as of the combined B0 shim and B1+ adjustments, are evaluated. Results: It was successfully shown that the 2SPECIAL sequence enables sMVS without a significant loss in SNR while reducing the total scan time by 21.6% compared to two consecutive acquisitions. The proposed voxel-GRAPPA algorithm properly reassigns the signal components to their respective origin region and shows no significant differences to the well-established SENSE-based algorithm in terms of leakage (both <10%) or Cramér-Rao lower bounds (CRLB) for in vivo applications, while not requiring the acquisition of additional sensitivity maps and thus decreasing motion sensitivity. Conclusion: The use of 2SPECIAL in combination with the novel voxel-GRAPPA decomposition technique allows a substantial reduction of measurement time compared to the consecutive acquisition of two single voxels without a significant decrease in spectral quality or metabolite quantification accuracy and thus provides a new option for multiple-voxel applications.
Feasibility of automated volume-derived cardiac functional evaluation has successfully been demonstrated using cardiovascular magnetic resonance (CMR) imaging. Notwithstanding, strain assessment has proven incremental value for cardiovascular risk stratification. Since introduction of deformation imaging to clinical practice has been complicated by time-consuming post-processing, we sought to investigate automation respectively. CMR data (n = 1095 patients) from two prospectively recruited acute myocardial infarction (AMI) populations with ST-elevation (STEMI) (AIDA STEMI n = 759) and non-STEMI (TATORT-NSTEMI n = 336) were analysed fully automated and manually on conventional cine sequences. LV function assessment included global longitudinal, circumferential, and radial strains (GLS/GCS/GRS). Agreements were assessed between automated and manual strain assessments. The former were assessed for major adverse cardiac event (MACE) prediction within 12 months following AMI. Manually and automated derived GLS showed the best and excellent agreement with an intraclass correlation coefficient (ICC) of 0.81. Agreement was good for GCS and poor for GRS. Amongst automated analyses, GLS (HR 1.12, 95% CI 1.08–1.16, p < 0.001) and GCS (HR 1.07, 95% CI 1.05–1.10, p < 0.001) best predicted MACE with similar diagnostic accuracy compared to manual analyses; area under the curve (AUC) for GLS (auto 0.691 vs. manual 0.693, p = 0.801) and GCS (auto 0.668 vs. manual 0.686, p = 0.425). Amongst automated functional analyses, GLS was the only independent predictor of MACE in multivariate analyses (HR 1.10, 95% CI 1.04–1.15, p < 0.001). Considering high agreement of automated GLS and equally high accuracy for risk prediction compared to the reference standard of manual analyses, automation may improve efficiency and aid in clinical routine implementation. Trial registration: ClinicalTrials.gov, NCT00712101 and NCT01612312.
The continuous growth in the craft beer market, and the increased interest and demands of consumers, have directed the efforts of brewers towards the production of differential and innovative beers. In this sense, non-conventional yeasts - other than the traditional domesticated ale (Saccharomyces cerevisiae) and lager (Saccharomyces pastorianus) yeasts - have gained attention as tools for new product development. Of great interest is the cryotolerant species S. eubayanus, isolated for the first time in Patagonia (Argentina) and identified as one of the parents of the hybrid species S. pastorianus (lager yeast). This work sought to contribute to the knowledge about the fermentative behavior of S. eubayanus, develop starter cultures suitable for the brewing industry, and evaluate different strategies for its application on a productive scale. First, five strains of S. eubayanus (representative of the five geographically structured subpopulations) were tested in laboratory-scale fermentations, evaluating their fermentation performance (fermentation rate, attenuation, sugars consumption) and their organoleptic profile (esters, higher alcohols, phenols, sensory evaluation). The different fermentative characteristics observed and the differential production of volatile compounds between the strains demonstrated intraspecific phenotypic variability of the species. The strain CRUB 1568T was selected due to its good fermentation performance, moderate production of phenols (4-vinylguaiacol and 4-vinylphenol), and its contribution of the fruity ester’s ethyl hexanoate and ethyl octanoate. From the selected strain, it was sought to establish the propagation conditions for the development of starter cultures for brewing. Different techniques were evaluated to determine the viability and vitality of the biomass produced, selecting the alkaline methylene violet staining method and laboratory-scale fermentations, respectively. 
A culture medium based on commercial malt extract was selected and optimized by evaluating 9 factors, with two levels for each factor, through an experimental Plackett-Burman design. Two factors showed significant influence on biomass production (yeast extract and acid casein peptone), and were later optimized by a Central Composite Design, through Response Surface Methodology. With the optimized culture medium (malt extract medium with 0.31 % w/v of yeast extract and 0.12 % w/v of acid casein peptone) it was possible to obtain a biomass production equal to that of the non-optimized medium (12 g/L), but with a 40 % cost reduction. Finally, the propagation was scaled up to 20 L, yielding pure, good-quality S. eubayanus cultures with adequate cell density to inoculate fermenters of up to 700 L. Prior to transferring the starter cultures to the brewing industry, the interaction of S. eubayanus with hops was evaluated. For this, the fermentation performance of S. eubayanus CRUB 1568T in hopped and unhopped wort was studied, as well as the impact of the yeast on the measured (IBU) and sensory perception of bitterness. S. eubayanus showed tolerance to hop compounds (20 and 60 IBUs), presenting even an increased fermentation rate in hopped worts (34 % and 62 %, respectively). On the other hand, the IBUs obtained for beers fermented with S. eubayanus were 18 % higher than for the industrial strains, with this difference being noticeable sensorially and even perceived as a harsh bitterness. These results allow us to adopt criteria for the application of S. eubayanus in brewing, and adequately predict the final profile of the beers produced with this yeast. Subsequently, S. eubayanus CRUB 1568T was tested on brewing wort at a semi-pilot scale (20-40 L), observing a similar behavior to that obtained in the laboratory. Nevertheless, when this wild yeast was applied on a productive scale (1000-1500 L), it showed a poor fermentation performance.
Fermentations were completed in 20 days (almost twice as long as at laboratory and semi-pilot scales) with an average attenuation of 30 % (half of what was previously observed). It was decided to test whether an increase in oxygen supply during fermentation could help improve fermentation performance on a larger scale. An improvement in fermentation performance was found by applying different oxygenation regimens, achieving complete fermentations in 5-7 days, with attenuation levels of 60 %. This work encourages the use of non-conventional yeasts in brewing. The selection of native yeasts from the environment represents a valuable resource that not only supports the development of innovative beers with greater product differentiation, but also opens the possibility of conferring regional character to the product, with a consequent improvement in competitiveness. It was also found that the application of these yeasts in the industry requires different handling than traditional brewing yeasts, which provides a starting point for research and experimentation on the different strategies that can be pursued.
Ciprofloxacin, a fluoroquinolone antibacterial agent, is not recommended in the pediatric population on account of its possible adverse effect on growing cartilage. It is commonly used for the treatment of a variety of infections in children in our country, and very little information is available on the risks involved in its use. A questionnaire was sent to 750 pediatricians in the last week of November 1990, to retrospectively judge over the previous 2 month period the extent of its use and identify the adverse drug reactions (ADRs). One hundred and fifty-four pediatricians replied, of whom 147 had prescribed ciprofloxacin in a total of 3341 patients under 18 years of age, enteric fever being the commonest indication for its use. One hundred and fifty-nine ADRs were reported in 104 (3.1%) patients. They were: gastrointestinal in 50% of these 104 patients, CNS in 23%, skin and allergic in 19.1%, musculoskeletal in 8.6%, hematological in 3.8%, CVS in 2.9% and nephrological in 0.9% of cases. Of 159 ADRs, 8 (5%) were severe, 76 (47.8%) were moderate and 75 (47.2%) were mild. Therapy needed discontinuation in only 9 (0.3%) patients. Two new ADRs were identified, viz., sudden death after intravenous ciprofloxacin and sinus nodal arrest causing bradycardia.
Preproduction and current models of the miniature Wright peak flow meter have been compared with the standard Wright peak flow meter on normal and abnormal subjects. Early problems in production appear to have been overcome, and the current model agrees to within 3% with the standard peak flow meter, which is as close as the agreement between two standard instruments. The new mini-meter may be enclosed in a case, making direct comparisons with other instruments possible.
Methods of analysis used in the comparison of two methods of measurement are reviewed. The use of correlation, regression and the difference between means is criticized. A simple parametric approach is proposed based on analysis of variance and simple graphical methods.
The accuracy of the Nellcor N-101 pulse oximeter has been evaluated in adult patients receiving general anaesthesia or intensive care. Readings obtained noninvasively with this instrument were compared with measurements made on arterial blood using a Radiometer OSM2 oximeter. The pulse oximeter was easy to use and within the range tested (70–100 percent saturation of haemoglobin with oxygen) the readings were within 1 digit of the values obtained by in vitro measurement.
Seventy-three low birthweight babies were independently assessed for gestational age using the scoring system of Dubowitz et al. (1970) and 5 neurological reflexes described by Robinson (1966). The results obtained by the 5 reflexes were compared with those obtained by the scoring system and were found to be accurate estimations of gestational age. The 5 reflexes may be used for babies of gestational ages 29 to 37 weeks, but above 37 weeks the scoring system must be used.
The relation between pre-treatment blood-pressure and the fall in pressure after treatment was examined for most classes of antihypertensive drugs. Positive correlations were demonstrated for all drugs, for placebo, and for bed rest. This suggests that for all manoeuvres response is related to the height of the pretreatment pressure. Substitution of the pre-treatment and achieved pressures by random numbers reveals that positive correlations are mathematically inevitable and do not indicate any action on a basic mechanism of essential hypertension. After statistical correction for mathematical associations between the variables the apparent effects were generally lost. A correlation between the pre-treatment value of any variable and its change after a therapeutic intervention thus may not be valid.
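The random-number argument above can be reproduced with a short simulation. This is a minimal sketch, not the original authors' procedure: it assumes the pre-treatment and achieved values are independent standard normal draws, and the function names `pearson` and `simulate_pre_vs_fall` are made up for illustration. Under those assumptions the correlation between the pre-treatment value and the fall (pre minus post) is expected to be near 1/√2 ≈ 0.71 even though the two sets of values are unrelated.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_pre_vs_fall(n=5000, seed=0):
    """Replace pre-treatment and achieved pressures with random numbers.

    Assumption of this sketch: both are independent N(0, 1) draws.
    Returns (corr(pre, post), corr(pre, pre - post)).
    """
    rng = random.Random(seed)
    pre = [rng.gauss(0.0, 1.0) for _ in range(n)]
    post = [rng.gauss(0.0, 1.0) for _ in range(n)]
    fall = [a - b for a, b in zip(pre, post)]  # "fall in pressure"
    return pearson(pre, post), pearson(pre, fall)


if __name__ == "__main__":
    r_pre_post, r_pre_fall = simulate_pre_vs_fall()
    print(f"corr(pre, post) = {r_pre_post:.3f}")
    print(f"corr(pre, fall) = {r_pre_fall:.3f}")
```

The positive correlation arises because the pre-treatment value appears on both sides of the comparison, which is the mathematical coupling the abstract describes.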
M mode echocardiographic anteroposterior indexes of left ventricular function derived from long and short axis parasternal planes were compared in one hundred cases. In all the disease groups studied the paired values were within acceptable statistical limits of comparability and interchangeability; that is they were within two standard deviations of the mean difference in both directions. Values from either plane can usually be considered as being representative of the expected values for the individual.
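The criterion described above (paired differences lying within two standard deviations of the mean difference) can be sketched as follows. The helper names and the example numbers are hypothetical, and the multiplier `k=2.0` simply follows the wording of the abstract.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b, k=2.0):
    """Return (lower, mean_difference, upper) for paired measurements.

    The limits are mean difference -/+ k sample standard deviations
    of the paired differences a[i] - b[i].
    """
    diffs = [x - y for x, y in zip(a, b)]
    d_bar = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return d_bar - k * sd, d_bar, d_bar + k * sd


def fraction_within(a, b, k=2.0):
    """Fraction of paired differences that fall inside the limits."""
    lower, _, upper = limits_of_agreement(a, b, k)
    diffs = [x - y for x, y in zip(a, b)]
    return sum(lower <= d <= upper for d in diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical paired readings from two measurement planes.
    long_axis = [10.0, 12.0, 14.0, 16.0]
    short_axis = [9.0, 13.0, 13.0, 17.0]
    print(limits_of_agreement(long_axis, short_axis))
    print(fraction_within(long_axis, short_axis))
```

For roughly normally distributed differences, about 95% of pairs fall inside these limits by construction, so the fraction inside says little on its own. Whether the two methods are interchangeable depends on whether the width of the limits is acceptable for the clinical purpose.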
Normal pregnant women are resistant to the pressor effect of intravenously administered angiotensin II (AII), but women who are destined to develop hypertensive complications in pregnancy show an increased sensitivity to AII several weeks before the onset of the first clinical symptoms. In 231 normotensive nulliparous women (age 25 ± 5 years), an angiotensin sensitivity test (AST) was performed between weeks 28 and 32 of gestation. If an effective angiotensin pressor dose (APD) of less than 10 ng·kg⁻¹·min⁻¹ is considered to be a positive test result, 58 subjects had a positive AST and 173 had a negative AST. Twenty-six of 34 women who ultimately developed pregnancy-induced hypertension (PIH) or preeclampsia had a positive test, and the diagnosis was made early. Each of the eight pregnant subjects with a false negative test developed only a mild form of the hypertensive disorder. In this series, 11 women had a premature onset of labor; eight of them also had an APD of less than 10 ng·kg⁻¹·min⁻¹. The study confirms the high predictive value of negative test results. Therefore, the AST can be used as an appropriate method for identifying women who are destined to develop hypertensive complications in pregnancy. However, because of the low practicability of the test, it may not be recommended as a screening method in routine prenatal care.
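The counts quoted above are enough to rebuild a 2×2 table and check the predictive values. The sketch below is an illustration only: the table is reconstructed from the abstract's figures (231 tested, 58 test-positive, 34 affected, 26 of those test-positive) rather than taken from a reported table, and the function name `diagnostic_summary` is hypothetical.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts as stated in the abstract.
TESTED = 231          # normotensive nulliparous women tested
TEST_POSITIVE = 58    # positive angiotensin sensitivity test
AFFECTED = 34         # later developed PIH or preeclampsia
TRUE_POSITIVE = 26    # affected women with a positive test

# Remaining cells are derived (an assumption of this sketch).
tp = TRUE_POSITIVE
fn = AFFECTED - TRUE_POSITIVE        # 8 false negatives
fp = TEST_POSITIVE - TRUE_POSITIVE   # 32 false positives
tn = TESTED - tp - fn - fp           # 165 true negatives

if __name__ == "__main__":
    for name, value in diagnostic_summary(tp, fp, fn, tn).items():
        print(f"{name}: {value:.3f}")
```

With these reconstructed counts the negative predictive value is 165/173 (about 0.95) and the positive predictive value is 26/58 (about 0.45), which is consistent with the abstract's emphasis on negative results.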
Data have emerged that provide the scientific basis for therapeutic drug monitoring of mycophenolic acid (MPA) in transplant patients receiving mycophenolate mofetil (MMF), the parent drug, in combination with other immunosuppressive agents. There is a significant relationship between the dose-interval MPA AUC and risk for acute rejection based on retrospective investigations in renal and heart transplant patients and on prospective investigations in renal transplant patients. The MPA dose-interval AUC varies naturally by more than 10-fold in renal and heart transplant patients. Other significant sources of pharmacokinetic variability for MPA include the effects of concomitant medications, and the effects of disease states such as renal dysfunction and liver disease on the steady state MPA AUC. Individualized MMF dose evaluation, guided by MPA plasma concentrations, is becoming the standard of practice at a growing number of transplant centers worldwide because of these factors and because of the need to closely evaluate the immunosuppression afforded by MPA when a change in the immunosuppression regimen in stable transplant patients is planned. Investigations of therapeutic drug monitoring strategies with an emphasis on identifying an optimal abbreviated sampling strategy for MPA AUC estimation are ongoing. Based on the concentration-outcome studies and experience at the authors' institutions and other centers, the authors propose a set of therapeutic drug monitoring guidelines for MPA in stable renal and heart transplant patients for the immediate (first 3 months posttransplant) and maintenance (>3 months) periods. When MPA binding to human serum albumin is altered, as occurs in patients with significant renal dysfunction, liver disease, or a substantial reduction in human serum albumin concentration, the possibility of increased MPA free fraction and free concentration will need to be taken into account in the interpretation of MPA total concentrations.
Cyclosporin was introduced into clinical practice in the early 1980s and has since been shown to prolong survival for transplant recipients. Because cyclosporin is a narrow therapeutic index drug and there are significant consequences associated with ‘subtherapeutic’ and ‘supratherapeutic’ concentrations, cyclosporin therapy is monitored as part of routine patient follow-up. However, the optimal method for the therapeutic drug monitoring of cyclosporin has yet to be defined. Currently, the most common method involves monitoring pre-dose trough concentrations, but this method is less than ideal. Other methods of monitoring cyclosporin therapy include monitoring the area under the concentration-time curve, limited sampling strategies, monitoring of single concentrations other than troughs and pharmacodynamic monitoring. Bayesian forecasting has been used successfully in clinical practice with other drugs with narrow therapeutic indices. However, few studies are available regarding Bayesian forecasting and cyclosporin. Existing studies are preliminary in nature and involve the old Sandimmun® formulation rather than the Neoral® formulation. Although these methods show promise, they have not gained widespread acceptance. This is because of their impracticality and the lack of prospective studies comparing other monitoring methods with trough concentration monitoring. Further comparative studies evaluating the impact of the specific monitoring method on definite patient outcomes are warranted.
The purpose of this study was to characterize the pharmacokinetic parameters of mycophenolic acid (MPA) in Korean kidney transplant recipients. Plasma MPA concentrations of 10 Korean kidney transplant recipients administered a lower dose of mycophenolate mofetil (MMF; 750 mg twice a day) were measured at 2 weeks of MMF therapy by high-performance liquid chromatography (HPLC). The plasma MPA concentration-time curve pattern of patients taking lower doses of MPA was consistent with previously reported profiles of patients taking the fully recommended doses. The plasma MPA concentration-time curve was characterized by an early sharp peak within 1 hour and a small second peak in some patients at 4 to 12 hours postdose. The mean C(max) and AUC were 8.73 ± 4.65 µg/mL and 18.45 ± 4.25 µg·h/mL, respectively. The mean fraction of free MPA was 1.60% ± 0.23%. Patients' age, weight, body surface area, and renal function did not influence the AUC. The free fraction of MPA appeared not to be affected by serum albumin and renal function when creatinine clearance was above 40 mL/min. Regression analysis between each plasma concentration and AUC for the limited sampling strategy of MMF therapeutic drug monitoring demonstrated that the concentrations of predose and 1- and 8-hour postdose were positively correlated with AUC (r = 0.74545, p = 0.0133; r = 0.68485, p = 0.0289; and r = 0.63636, p = 0.0479, respectively). The pattern of the concentration-time profile of MPA in Korean kidney recipients was similar to the results of other studies performed in Caucasians, although there was interindividual variability of AUC, C(max), and t(max). MPA concentrations of predose and 1- and 8-hour postdose were positively correlated with AUC.
To investigate the pharmacokinetics of mycophenolic acid (MPA) in Chinese adult renal allograft recipients, and to generate the validated model equations for estimation of the MPA area under the plasma concentration-time curve from 0 to 12 hours (AUC(12)) with a limited sampling strategy. The pharmacokinetics in 75 Chinese renal allograft recipients treated with mycophenolate mofetil 2 g/day in combination with cyclosporin and corticosteroids were determined. The MPA concentration was assayed by high-performance liquid chromatography at pre-dose (C(0)) and at 0.5 (C(0.5)), 1 (C(1)), 1.5 (C(1.5)), 2 (C(2)), 4 (C(4)), 6 (C(6)), 8 (C(8)), 10 (C(10)) and 12 (C(12)) hours after dosing on day 14 post-transplant. Patients were randomly divided into: (i) a model group (n = 50) to generate the model equations by multiple stepwise regression analysis for estimation of the MPA AUC by a limited sampling strategy; and (ii) a validation group (n = 25) to evaluate the predictive performance of the model equations. The mean MPA AUC(12) was 52.97 ± 15.09 mg·h/L, ranging from 24.0 to 102.3 mg·h/L. The patient's age and serum albumin level had a significant impact on the MPA AUC(12). The correlation between the pre-dose MPA trough level (C(0)) and the MPA AUC(12) was poor (r² = 0.02, p = 0.33). Model equations 7 (MPA AUC(12) = 14.81 + 0.80·C(0.5) + 1.56·C(2) + 4.80·C(4), r² = 0.70) and 11 (MPA AUC(12) = 11.29 + 0.51·C(0.5) + 2.13·C(2) + 8.15·C(8), r² = 0.88) were selected for MPA AUC calculation in Chinese patients, resulting in good agreements between the estimated MPA AUC and the full MPA AUC(12), with a mean prediction error of ±10.1 and ±6.9 mg·h/L, respectively. In Chinese renal allograft recipients, MPA pharmacokinetics manifest substantial interindividual variability, and the MPA AUC(12) tends to be higher than that in Caucasian patients receiving the same dose of mycophenolate mofetil. Two validated model equations with three sampling timepoints are recommended for MPA AUC estimation in Chinese patients.
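The two limited-sampling equations quoted in this abstract are plain linear combinations of three timed concentrations, so they are easy to evaluate directly. The sketch below only transcribes the published coefficients; the function names are hypothetical, the concentration unit (mg/L) is an assumption inferred from the AUC unit of mg·h/L, and the example concentrations are invented for illustration, not patient data.

```python
def auc12_equation_7(c0_5, c2, c4):
    """Estimate MPA AUC(12) in mg·h/L from C(0.5), C(2) and C(4).

    Coefficients are those quoted for model equation 7.
    Concentrations are assumed to be in mg/L.
    """
    return 14.81 + 0.80 * c0_5 + 1.56 * c2 + 4.80 * c4


def auc12_equation_11(c0_5, c2, c8):
    """Estimate MPA AUC(12) in mg·h/L from C(0.5), C(2) and C(8).

    Coefficients are those quoted for model equation 11.
    Concentrations are assumed to be in mg/L.
    """
    return 11.29 + 0.51 * c0_5 + 2.13 * c2 + 8.15 * c8


if __name__ == "__main__":
    # Invented example concentrations (mg/L), for illustration only.
    print(f"Equation 7:  {auc12_equation_7(c0_5=10.0, c2=5.0, c4=3.0):.2f} mg·h/L")
    print(f"Equation 11: {auc12_equation_11(c0_5=10.0, c2=5.0, c8=2.0):.2f} mg·h/L")
```

The equations were fitted in one cohort on day 14 post-transplant with concomitant cyclosporin, so applying them outside those conditions would need separate validation.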
