Article

Rank-Biserial Correlation


Abstract

A formula is developed for the correlation between a ranking (possibly including ties) and a dichotomy, with limits which are always ±1. This formula is shown to be equivalent both to Kendall's τ and Spearman's ρ.
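
As an illustration of the kind of coefficient the abstract describes, the following is a minimal sketch of the commonly used pairwise form of the rank-biserial correlation: the number of cross-group pairs in which the first group's value is higher, minus the number in which it is lower, divided by the total number of pairs. The function name `rank_biserial_pairwise` and its arguments are hypothetical, and this is not necessarily the paper's own formula: when ties occur across the two groups, this simple form cannot reach ±1, so the tie handling in the paper presumably differs.

```python
from itertools import product


def rank_biserial_pairwise(group_hi, group_lo):
    """Return (P - Q) / (n1 * n0) over all cross-group pairs.

    P counts pairs where the group_hi value exceeds the group_lo value,
    Q counts the reverse; tied pairs count for neither (an assumption of
    this sketch, not a detail taken from the paper).
    """
    if not group_hi or not group_lo:
        raise ValueError("both groups must be non-empty")
    favorable = sum(1 for a, b in product(group_hi, group_lo) if a > b)
    unfavorable = sum(1 for a, b in product(group_hi, group_lo) if a < b)
    return (favorable - unfavorable) / (len(group_hi) * len(group_lo))
```

With completely separated groups, such as `rank_biserial_pairwise([4, 5, 6], [1, 2, 3])`, every pair is favorable and the coefficient reaches its upper limit of 1; swapping the groups gives -1.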


... The polygon tool within the software was applied to compute CSA. Given the known correlation between body mass and tendon morphology [28], CSA and thickness values were also adjusted to one-third of the participant's body mass for the statistical analysis [14]. The PT characterization was evaluated with runners in the supine position, with both knees 30° bent [24,27]. ...
... For variables that met the assumption of normality, a Student's t-test for independent samples was applied to assess differences in AT and PT morphology between road and trail runners. When the assumption of normality was violated, the Mann-Whitney U test was applied, with effect sizes expressed as rank-biserial correlations [28]. Effect sizes were ... During the evaluation of the AT, runners were in a prone position with both knees extended and their feet positioned outside of the bed, keeping the ankle in neutral position [25,27]. ...
Article
Full-text available
Background: Unlike road running, mountain and trail running typically cover longer distances and include uphill and downhill segments that impose unique physiological and mechanical demands on athletes. Objectives: This study aimed to identify morphological differences in the patellar and Achilles tendons between trail and road runners. Moreover, the potential influence of weekly mileage and accumulated positive elevation gain on the morphology of both tendons was obtained. Design: Cross-sectional comparative study. Methods: Thirty-three road runners (11 women, 22 men) and thirty-three trail runners (13 women, 20 men) were recruited and their weekly mileage and elevation gain collected. All participants had a weekly training volume exceeding 20 km. The thickness and cross-sectional area (CSA) of their patellar and Achilles tendons were evaluated using ultrasound. Results: Independent samples t-tests revealed significant differences between groups for the Achilles tendon (p < 0.003) but not for the patellar tendon (p > 0.330). Further, Spearman's correlation coefficients indicated moderate positive correlations for the thickness and CSA of the Achilles tendon with weekly running volume (0.256 and 0.291, respectively) and with elevation gain (0.332 and 0.334, respectively), suggesting a tendency for the tendon to adapt to greater training loads, enhancing its structural integrity and resilience. Conclusions: Trail runners exhibit larger and thicker Achilles tendons, likely due to increased weekly mileage and elevation gain, highlighting the adaptive response to mechanical overload from uphill running.
... Non-parametric Wilcoxon rank-sum tests were performed to test for significant differences (significance defined as P < 0.05) in each parameter's median value between years prior to and following the TMDL and Control Program enactment. Effect size tests were also conducted to estimate the magnitude of the effects, with r values varying from 0, indicating no effect, to 1, indicating a large effect (Cureton, 1956). ...
... The lake is considered flooded when gauge height is greater than 2.74 m above Zero Rumsey. Since the 1950s, this has occurred 12 times: 1956, 1958, 1965, 1970, 1974, 1986, 1995, 1998, 2011, 2017, and 2019. There does not appear to be a significant change in lake conditions during these times, except for specific conductance which typically was low during flood periods (Supplemental Figure 2C). ...
Article
Full-text available
Clear Lake is a large, natural lake in northern California, USA, with many beneficial uses but also substantive environmental issues. The lake has a long history of water quality problems including mercury contamination, pesticide usage, invasive species, and high rates of primary production. In recent years, an increase in cyanobacterial harmful algal blooms (cyanoHABs) has been documented in the lake, adding to the environmental issues faced by aquatic species present in the lake and the local community. Extensive observations of various physical, chemical, and biological parameters in Clear Lake began in the mid-1900s. The most pertinent of these data sets and findings have been reviewed and analyzed with the intent of improving our understanding of the causes and drivers of cyanoHABs, toxin production, and identifying data gaps. Several parameters including average annual water temperature have remained relatively constant over the past 70 years, although the seasonally averaged water temperatures have shifted in a manner that may now favor cyanobacterial dominance. Clear Lake has also witnessed recent changes in several environmental variables such as total phosphorus concentrations that might contribute to blooms. An analysis of lake conditions prior to and following the enactment of a total maximum daily load (TMDL) for phosphorus in 2007 indicates little measurable influence on total phosphorus concentrations in Clear Lake. The present trajectory of lake chemistry suggests that additional research and management efforts will be needed to address the recurrence of cyanoHABs in the future. Future lake management strategies should include consideration of the role of internal nutrient loads to lessen cyanoHABs. Furthermore, a better understanding of cyanobacterial community interactions and top-down effects on bloom formation within the lake can help guide future cyanoHAB management strategies.
... with location parameter = 0 (Wetzels & Wagenmakers, 2012). Rank-biserial correlation (R_x) was calculated as the effect size (Cureton, 1956). ...
... (R. Kirk, 1996). R_x thresholds were set at: trivial ≤ 0.09; small ≥ 0.1; moderate ≥ 0.3; large ≥ 0.5; very large ≥ 0.7 (Cureton, 1956; Hopkins, 2002). All analyses were completed using JASP 0.16 (JASP Team, Amsterdam). ...
Article
Full-text available
Athlete stature and armspan are anecdotally assumed to provide an advantage in mixed martial arts (MMA), despite an absence of supporting data. In contrast, winners of MMA bouts have been shown to be younger than bout losers. Whilst absolute measurements of stature, armspan and armspan:stature scale (A:S) have been shown to not distinguish between winners and losers of MMA bouts, relative differences between competitors have not been analysed. This study aimed to analyse 5 years of athlete age and morphological data to replicate and expand previous studies to determine whether absolute and/or relative age and morphological variables affect winning and losing in MMA. Bayes factor (BF>3) inferential analyses conducted on the cohort overall (n=2,229 professional bouts), each year sampled and each individual body mass division found that only absolute (winners = 29.8±4 years; losers = 30.7±4.2 years) and relative age (winners=0.82±5.3 years younger than losers) differentiate between winners and losers across the whole cohort, in 4 of the 5 years, and in 4 of the 13 divisions sampled. Armspan appears to provide an advantage in heavyweight only (winners = 198.4±6.6cm; losers = 196.1±7.7cm), with greater A:S being a disadvantage (winners = 1.003±0.022cm∙cm-1; losers = 1.010±0.023 cm∙cm-1) in women’s strawweight only. No variables had any effect on how bouts were won. These results confirm previous reports that the effect of athlete morphology is greatly overstated in MMA, appearing to be irrelevant in most divisions. Bout winners tend to be younger than losers, particularly in divisions displaying more diverse skill requirements.
... Next, the sample distribution of the data was checked with the Shapiro-Wilk test for fewer than 50 observations, which indicated that the data were not normally distributed. Accordingly, findings are reported as the median, first and third quartiles (Q1 and Q3), and interquartile range (IQR). For group differences, Z scores of the total internalized homophobia scale were first computed for the sample so that it could be split into symmetric groups and the attitude scores compared. The Brunner-Munzel test (Karch, 2023) was then applied with 10,000 random permutations and a 5-second time limit, with the rank-biserial correlation (r_rb) as a non-parametric effect size measure (Cureton, 1956) and the effect size (δ) for the power analysis. Additionally, an analysis of variance (ANOVA) was performed to determine whether there was any source of variation among the three sexual orientations across all variables obtained, considering the magnitude of variance (η2), with post hoc comparisons using Tukey's test (Ptukey). ...
Article
Full-text available
This article explores the relationship between structural attitudes and internalized homophobia in a sample of 32 gay men in Guadalajara, Jalisco, Mexico. Drawing on the sociological theory of emotions and the cultural influence on behavior, it examines how social norms shape the construction of sexual identity and the development of internalized homophobia. The results reveal significant associations between individual attitudes and several dimensions of internalized homophobia, such as concealment, drug use, adherence to the sexual norm, and self-rejection. Attitudes that promote traditional masculinity and the need to conceal sexual orientation are related to strategies such as substance use. Belief in rigid social norms, especially around masculinity, is linked to higher levels of internalized homophobia. Identifying these associations highlights the importance of addressing the cultural norms that perpetuate discrimination against sexual diversity. Educational and awareness-raising interventions are suggested to challenge these norms and foster acceptance of sexual diversity.
... The effect size of each indicator has been measured with the following statistics: AUC [20] values computed from the MWW test statistic, rank-biserial correlation, R_RB = 2·AUC − 1 [21], and γ0.5, a non-parametric version of Cohen's d statistic [22,23]. ...
Article
Full-text available
Viral diversity and disease progression in chronic infections, and particularly how quasispecies structure affects antiviral treatment, remain key unresolved issues. Previous studies show that advanced liver fibrosis in long-term viral infections is linked to higher rates of antiviral treatment failures. Additionally, treatment failure is associated with high quasispecies fitness, which indicates greater viral diversity and adaptability. As a result, resistant variants may emerge, reducing retreatment effectiveness and increasing the chances of viral relapse. Additionally, using a mutagenic agent in monotherapy can accelerate virus evolution towards a flat-like quasispecies structure. This study examines 19 chronic HCV patients who failed direct-acting antiviral (DAA) treatments, using NGS to analyze quasispecies structure in relation to fibrosis as a marker of infection duration. Results show that HCV evolves towards a flat-like quasispecies structure over time, leading also to advanced liver damage (fibrosis F3 and F4/cirrhosis). Based on our findings and previous research, we propose that the flat-like fitness quasispecies structure is the final stage of any quasispecies in chronic infections unless eradicated. The longer the infection persists, the lower the chances of achieving a cure. Interestingly, this finding may also be applicable to other chronic infection and drug resistance in cancer.
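
The relation between AUC and the rank-biserial correlation quoted in the excerpt above can be sketched as follows. This is a minimal illustration, not the cited study's code: the helper names `midranks` and `rank_biserial_from_auc` are hypothetical, AUC is taken here as the Mann-Whitney U statistic of the first sample divided by the number of cross-group pairs, and ties are handled with average ranks.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_biserial_from_auc(x, y):
    """Return (AUC, 2 * AUC - 1) for samples x and y."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2  # Mann-Whitney U for sample x
    auc = u_x / (n1 * n2)
    return auc, 2 * auc - 1
```

An AUC of 0.5 (no separation between the samples) maps to a coefficient of 0, while an AUC of 1 or 0 maps to +1 or -1 respectively.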
... Since the data for these metrics does not follow a normal distribution, as verified by the Shapiro-Wilk test [31], we use the non-parametric Wilcoxon signed-rank test [25] to determine whether there are statistically significant improvements or declines in code quality after the patches are applied. Additionally, we compute the Rank-Biserial Correlation [32] to quantify the magnitude of changes, interpreting effect sizes using Cohen's guidelines [33]. ...
Preprint
Full-text available
In recent years, AI-based software engineering has progressed from pre-trained models to advanced agentic workflows, with Software Development Agents representing the next major leap. These agents, capable of reasoning, planning, and interacting with external environments, offer promising solutions to complex software engineering tasks. However, while much research has evaluated code generated by large language models (LLMs), comprehensive studies on agent-generated patches, particularly in real-world settings, are lacking. This study addresses that gap by evaluating 4,892 patches from 10 top-ranked agents on 500 real-world GitHub issues from SWE-Bench Verified, focusing on their impact on code quality. Our analysis shows no single agent dominated, with 170 issues unresolved, indicating room for improvement. Even for patches that passed unit tests and resolved issues, agents made different file and function modifications compared to the gold patches from repository developers, revealing limitations in the benchmark's test case coverage. Most agents maintained code reliability and security, avoiding new bugs or vulnerabilities; while some agents increased code complexity, many reduced code duplication and minimized code smells. Finally, agents performed better on simpler codebases, suggesting that breaking complex tasks into smaller sub-tasks could improve effectiveness. This study provides the first comprehensive evaluation of agent-generated patches on real-world GitHub issues, offering insights to advance AI-driven software development.
... Since adoption is a binary categorical variable (non-users v. users) and trust levels are ordinal data, a point-biserial correlation analysis was deemed the most appropriate method. We use the function below (Cureton 1956), ...
Preprint
Full-text available
Language models (LMs) are revolutionizing knowledge retrieval and processing in academia. However, concerns regarding their misuse and erroneous outputs, such as hallucinations and fabrications, are reasons for distrust in LMs within academic communities. Consequently, there is a pressing need to deepen the understanding of how actual practitioners use and trust these models. There is a notable gap in quantitative evidence regarding the extent of LM usage, user trust in their outputs, and issues to prioritize for real-world development. This study addresses these gaps by providing data and analysis of LM usage and trust. Specifically, our study surveyed 125 individuals at a private school and secured 88 data points after pre-processing. Through both quantitative analysis and qualitative evidence, we found a significant variation in trust levels, which are strongly related to usage time and frequency. Additionally, we discover through a polling process that fact-checking is the most critical issue limiting usage. These findings inform several actionable insights: distrust can be overcome by providing exposure to the models, policies should be developed that prioritize fact-checking, and user trust can be enhanced by increasing engagement. By addressing these critical gaps, this research not only adds to the understanding of user experiences and trust in LMs but also informs the development of more effective LMs.
... The statistical methods used most often in the analysis, for comparing the values of two samples or groups ... (Cureton, 1956). Its value can range from -1 to 1, and the number reported by the test indicates the proportion of pairs (ranked positions) that are favorable versus unfavorable to the hypothesis (Kerby, 2014). ...
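
The interpretation in terms of favorable and unfavorable pairs in the excerpt above can be written out as a short worked equation. The symbols are assumptions introduced for illustration: f and u are the proportions of favorable and unfavorable pairs, P and Q the corresponding pair counts, and n_1 n_2 the total number of cross-group pairs; the numbers are made up.

```latex
r = f - u = \frac{P}{n_1 n_2} - \frac{Q}{n_1 n_2},
\qquad
\text{e.g. } n_1 n_2 = 10,\; P = 7,\; Q = 3
\;\Longrightarrow\;
r = 0.7 - 0.3 = 0.4 .
```

The limits follow directly: r = 1 when every pair is favorable and r = -1 when every pair is unfavorable.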
... We explored the weighted mean degree of different classes (controls and epilepsy types) using boxplots (see Fig 1), where the median (red line), 25th-75th percentiles (blue box), non-outlier extremes (black dashed lines) and outliers (red crosses) of the distributions are presented. Effect size was quantified using the rank-biserial correlation [43] (|r| ∈ [0, 1], where 0 means no rank correlation and 1 means perfect separation between groups), and significance was calculated using the Wilcoxon rank-sum and Kruskal-Wallis tests. To further quantify the differences between classes, receiver operating characteristic (ROC) curves were calculated for all frequency bands. ...
Article
Full-text available
Epilepsy is one of the most common neurological disorders in children. Diagnosing epilepsy in children can be very challenging, especially as it often coexists with neurodevelopmental conditions like autism and ADHD. Functional brain networks obtained from neuroimaging and electrophysiological data in wakefulness and sleep have been shown to contain signatures of neurological disorders, and can potentially support the diagnosis and management of co-occurring neurodevelopmental conditions. In this work, we use electroencephalography (EEG) recordings from children, in restful wakefulness and sleep, to extract functional connectivity networks in different frequency bands. We explore the relationship of these networks with epilepsy diagnosis and with measures of neurodevelopmental traits, obtained from questionnaires used as screening tools for autism and ADHD. We explore differences in network markers between children with and without epilepsy in wake and sleep, and quantify the correlation between such markers and measures of neurodevelopmental traits. Our findings highlight the importance of considering the interplay between epilepsy and neurodevelopmental traits when exploring network markers of epilepsy.
... If s_a > s_b, I = 1 and 0 otherwise. Model M exhibits no bias between groups when R_M = 0. Our approach is inspired by the rank-biserial correlation (Cureton, 1956). It measures the probability a candidate sampled from group (a R ← A) is higher-ranked than a candidate from the reference group (b ∈ B). ...
Preprint
Full-text available
Large language models (LLMs) are now being considered and even deployed for applications that support high-stakes decision-making, such as recruitment and clinical decisions. While several methods have been proposed for measuring bias, there remains a gap between predictions, which are what the proposed methods consider, and how they are used to make decisions. In this work, we introduce Rank-Allocational-Based Bias Index (RABBI), a model-agnostic bias measure that assesses potential allocational harms arising from biases in LLM predictions. We compare RABBI and current bias metrics on two allocation decision tasks. We evaluate their predictive validity across ten LLMs and utility for model selection. Our results reveal that commonly-used bias metrics based on average performance gap and distribution distance fail to reliably capture group disparities in allocation outcomes, whereas RABBI exhibits a strong correlation with allocation disparities. Our work highlights the need to account for how models are used in contexts with limited resource constraints.
... Kendall's W coefficient of concordance was reported as the effect size (W < 0.10 negligible, W < 0.25 small, W < 0.40 moderate, otherwise large effect). Rank-biserial correlation (r_rb) was used as the effect size for significant post-hoc tests (0 being the lowest and 1 being the highest effect size) (Cureton 1956). All statistical tests were performed at α = 0.05. ...
Article
Full-text available
Transspinal (or transcutaneous spinal cord) stimulation is a promising noninvasive method that may strengthen the intrinsic spinal neural connectivity in neurological disorders. In this study we assessed the effects of cervical transspinal stimulation on the amplitude of leg transspinal evoked potentials (TEPs), and the effects of lumbosacral transspinal stimulation on the amplitude of arm TEPs. Control TEPs were recorded following transspinal stimulation with one cathode electrode placed either on Cervical 3 (21.3 ± 1.7 mA) or Thoracic 10 (23.6 ± 16.5 mA) vertebrae levels. Associated anodes were placed bilaterally on clavicles or iliac crests. Cervical transspinal conditioning stimulation produced short latency inhibition of TEPs recorded from left soleus (ranging from -6.11 to -3.87% of control TEP at C-T intervals of -50, -25, -20, -15, -10, 15 ms), right semitendinosus (ranging from -11.1 to -4.55% of control TEP at C-T intervals of -20, -15, 15 ms), and right vastus lateralis (ranging from -13.3 to -8.44% of control TEP at C-T intervals of -20 and -15 ms) (p < 0.05). Lumbosacral transspinal conditioning stimulation produced no significant effects on arm TEPs. We conclude that in the resting state, cervical transspinal stimulation affects the net motor output of leg motoneurons under the experimental conditions used in this study. Further investigations are warranted to determine whether this protocol may reactivate local spinal circuitry after stroke or spinal cord injury and may have a significant effect in synchronization of upper and lower limb muscle synergies during rhythmic activities like locomotion or cycling.
... The results show that the alternative hypothesis (group female is less than group male) is accepted for most of the variables except for questions E_3 (Know the different dimensions and strategies for monitoring and evaluating student learning) and PP_2 (Know the standards of ethical conduct in science teaching, consistent with the interests of students and the educational community), where the null hypothesis is accepted, indicating that there is no difference between the self-perception of women and men. For most of the variables, except for questions E_3 and PP_2 as shown in Table 4, there are small to medium negative effect sizes, measured with the rank-biserial correlation [47], [48], meaning that the male group tends to be larger than the female group. For all tests, the alternative hypothesis specifies that group female is less than group male. ...
Article
Full-text available
This study analyses the self-perception of 274 teachers from public, urban, and rural schools in Manizales, Colombia, using a Likert scale instrument developed considering the scientific competencies determined by UNESCO. In the analysis of the results, it was found that, even though in the sample analyzed, women have greater training in research and scientific competencies, their perception of their abilities in this aspect is lower than that of men. With the Mann-Whitney U test and rank-biserial correlation, it was possible to test the alternative hypothesis that the female self-perception of capabilities is lower than the male for each question. The instrument was validated with the internal consistency index with an α=0.98. Additionally, the instrument has been validated with a confirmatory factor analysis, obtaining values of comparative fit index (CFI) of 0.869 and Tucker-Lewis's index (TLI) of 0.858 with RMSEA and SRMR of 0.103 and 0.063, respectively. The paper provides insights into the self-perception of scientific competencies among teachers, which can inform teacher training and professional development programs. The study highlighted the gender gap in self-perception of scientific competencies, which can inform policies and interventions to promote gender equity in science education.
... Statistical significance was established at p<0.05. Effect sizes were assessed using the rank-biserial correlation (r_rb) (38) and calculated using JASP 0.11.1.0. ...
Article
Full-text available
Objective: This study aimed at assessing the alterations in upper limb motor impairment and connectivity between motor areas following the post-stroke delivery of cathodal transcranial direct current stimulation sessions. Methods: Modifications in the Fugl-Meyer Assessment scores, connectivity between the primary motor cortex of the unaffected and affected hemispheres, and between the primary motor and premotor cortices of the unaffected hemisphere were compared prior to and following six sessions of cathodal transcranial direct current stimulation application in 13 patients (active = 6; sham = 7); this modality targets the primary motor cortex of the unaffected hemisphere early after a stroke. Results: Clinically relevant distinctions in Fugl-Meyer Assessment scores (≥9 points) were observed more frequently in the Sham Group than in the Active Group. Between-group differences in the alterations in Fugl-Meyer Assessment scores were not statistically significant (Mann-Whitney test, p=0.133). ROI-to-ROI correlations between the primary motor cortices of the affected and unaffected hemispheres post-therapeutically increased in 5/6 and 2/7 participants in the Active and Sham Groups, respectively. Between-group differences in modifications in connectivity between the aforementioned areas were not statistically significant. Motor performance enhancements were more frequent in the Sham Group compared to the Active Group. Conclusion: The results of this hypothesis-generating investigation suggest that heightened connectivity may not translate into early clinical benefits following a stroke and will be crucial in designing larger cohort studies to explore mechanisms underlying the impacts of this intervention. ClinicalTrials.gov Identifier: NCT02455427.
... We treated group as an independent variable, and the scores for each of the stories in ILDT and the total score, and the scores for each of the stages of the decisionmaking process as dependent variables. As a measure of the effect size we used rank-biserial correlation coefficients (Cureton 1956). ...
Article
Full-text available
Decision-making capability is essential in fulfilling the need for autonomy of people with intellectual disability. In this study we aimed to examine decision-making capability regarding important social situations in people with intellectual disability at different stages of decision-making process. We studied 80 vocational school students with mild intellectual disability and 80 students of a similar age from mass vocation schools. We assessed decision-making with Important Life Decisions Task (ILDT). Students with intellectual disability obtained significantly lower scores than controls for each of the stories in ILDT as in each stage and overall final score in the decision-making process. The magnitude of difference in scores between groups varied in different stages of decision-making process. The most notable difficulties in decision-making regarding important social situations in people with intellectual disability are related to the evaluation of alternatives stage. Pattern of differences obtained in our study may be related to the content of decision-making problems.
... Next, the sample distribution was checked with the Shapiro-Wilk normality test to decide on the appropriate statistical tests for inference, and it was concluded that the distribution was non-parametric. Accordingly, findings are reported as the median, first and third quartiles, and interquartile range (IQR). For group differences in hits, omissions, commissions, and reaction times, the Brunner-Munzel test (Karch, 2023) was used with 10,000 random permutations and a 5-second time limit, with the rank-biserial correlation (r_rb) as a non-parametric effect size measure (Cureton, 1956), which was then converted to an effect size (δ) (Ellis, 2010; Friedman, 1968). As a note, multiplying Ո by 100 gives the percentage of overlap (Ո%). Next, the agreement analysis was carried out with its confidence intervals (Cinf | Csup), together with a Bland-Altman plot to check agreement and consistency between the two measures; the closer to 1, the stronger the agreement. ...
Article
Full-text available
Executive functioning is a set of cognitive processes that coordinate and organize other mental processes in order to adapt to environmental demands, voluntary inhibition being one of these processes. Inhibition involves interpreting signals and monitoring conflicts, activating the anterior cingulate cortex for greater cognitive control. Attentional control theory suggests that inhibition and attentional control share brain substrates, and that their dysfunction affects inhibition and switching capacity. In a task in which prepotent responses to certain stimuli must be inhibited, such as the Go/No-Go task, familiarity with the stimuli could influence performance, since exposure to culturally familiar elements may facilitate prepotent responses. However, this effect has not been fully explored. To explore this phenomenon, an observational study was conducted with 15 neurotypical participants who completed two Go/No-Go tasks: one with culturally familiar stimuli and one with culturally unfamiliar stimuli. Correct responses, reaction time, commissions, and omissions were analyzed for both versions. The results suggest a large effect size (δ = 1.16, p < 0.001) for the difference in response time between the two versions, favoring the familiar task, in which responses were faster. Additionally, the agreement and correspondence measures showed low similarity (-0.06 [-0.36 | 0.25]) between response times in the tasks with and without cultural familiarity. It is concluded that a differential effect exists when a culturally familiar stimulus is used versus an unfamiliar one, although this effect is observed in response times. Attention to this factor as an active element of Go/No-Go tasks is suggested for future studies.
Palavras-chave: funcionamiento ejecutivo, tarea neuropsicológica, go/no-go, inhibición, diseño de tareas, familiaridad cultural. Artigo recebido: 05/12/2023; Artigo aceito: 29/04/2024. Correspondencias relacionadas con este artículo deben ser enviadas a Jesua Resumo O funcionamento executivo é uma série de processos cognitivos que coordenam e organizam outros processos mentais para se adaptar às demandas do ambiente, sendo a inibição voluntária um desses processos. A inibição envolve a interpretação de sinais e o monitoramento de conflitos, ativando o córtex cingulado anterior para maior controle cognitivo. A teoria do controle atencional sugere que a inibição e o controle atencional compartilham substratos cerebrais, e sua disfunção afeta a inibição e a capacidade de troca. Em uma tarefa em que as respostas preponderantes a determinados estímulos devem ser inibidas, como a tarefa Go/No-Go, a familiaridade com os estímulos pode influenciar o desempenho, pois a exposição a itens culturalizados pode facilitar as respostas preponderantes. Entretanto, esse efeito ainda não foi totalmente explorado. Para explorar esse fenômeno, foi realizado um estudo observacional com 15 participantes normotípicos que foram submetidos a duas tarefas Go/No-Go: uma tarefa com estímulos culturalmente familiares e outra com estímulos culturalmente desconhecidos. As respostas corretas, o tempo de reação, as comissões e as omissões foram analisadas para ambas as versões. Os resultados sugerem que há um grande diferencial de tempo de resposta de tamanho de efeito (δ = 1,16, p = < 0,001) entre as duas versões em favor da tarefa de familiaridade, em que as respostas foram mais rápidas. Além disso, as medidas de concordância e correspondência mostraram uma baixa similaridade (-0,06 [-0,36 | 0,25]) entre os tempos de resposta das tarefas com e sem familiaridade cultural. 
Conclui-se que existe a influência de um efeito diferencial ao usar um estímulo culturalmente familiar e um não familiar, embora esse efeito seja observado nos tempos de resposta. Sugere-se prestar atenção a esse fator como um elemento ativo das tarefas Go/NoGo em estudos futuros. Palabras clave: funcionamento executivo, tarefa neuropsicológica, go/no-go, inibição, design de tarefa, familiaridade cultural. Résumé Le fonctionnement exécutif est une série de processus cognitifs qui coordonnent et organisent d'autres processus mentaux pour s'adapter aux exigences de l'environnement, et l'inhibition volontaire est l'un de ces processus. L'inhibition implique l'interprétation des signaux et la surveillance des conflits, activant le cortex cingulaire antérieur pour un meilleur contrôle cognitif. La théorie du contrôle attentionnel suggère que l'inhibition et le contrôle attentionnel partagent des substrats cérébraux et que leur dysfonctionnement affecte l'inhibition et la capacité de commutation. Dans une tâche où les réponses prépondérantes à certains stimuli doivent être inhibées, telle que la tâche Go/No-Go, la familiarité avec les stimuli peut influencer la performance, car l'exposition à des éléments culturalisés peut faciliter les réponses prépondérantes. Toutefois, cet effet n'a pas encore été pleinement exploré. Pour explorer ce phénomène, une étude d'observation a été réalisée avec 15 participants normotypiques qui ont été soumis à deux tâches Go/No-Go: une tâche avec des stimuli culturellement familiers et une autre avec des stimuli culturellement non familiers. Les réponses correctes, le temps de réaction, les commissions et les omissions ont été analysés pour les deux versions. Les résultats suggèrent qu'il existe une différence de temps de réponse à effet important (δ = 1,16, p = < 0,001) entre les deux versions en faveur de la tâche de familiarité, pour laquelle les réponses étaient plus rapides. 
En outre, les mesures d'accord et de correspondance ont montré une faible similarité (-0,06 [-0,36 | 0,25]) entre les temps de réponse des tâches avec et sans familiarité culturelle. Il est conclu que l'utilisation d'un stimulus culturellement familier et d'un stimulus non familier peut avoir un effet différentiel, bien que cet effet soit observé dans les temps de réponse. Il est suggéré de prêter attention à ce facteur en tant qu'élément actif des tâches Go/NoGo dans les études futures. Mots-clés : executive functioning, neuropsychological task, go/no-go, inhibition, task design, cultural familiarity. Abstract Executive functioning is a cognitive process that coordinates and organizes other mental processes to adapt to the demands of the environment, with the inhibition prepotent responses being a fundamental skill of this process. Inhibition involves interpreting signals and monitoring conflicts, activating the anterior cingulate cortex for greater cognitive control. The attentional control theory suggests that inhibition and attentional control share brain substrates, and their interruption affects the capacity for inhibition and switching. In a task where preponderant responses to certain stimuli must be inhibited, such as the Go/No-Go task, cultural familiarity could influence performance, since exposure to culturalized elements can become preponderant responses. However, this effect has not been fully explored. To explore this phenomenon, an observational study was carried out with 15 normotypical participants subjected to the Go/No-Go task in two versions: the traditional task and a variant with novel stimuli. The correct answers, the reaction time, the commissions, and omissions of both versions were analyzed. The results suggest that there is a large effect size (δ = 1.16, p = < 0.001) reaction time differential between the two versions in favor of the traditional version, which responded more quickly. 
Additionally, the concordance and correspondence measures obtained a low similarity (-0.06 [-0.36 | 0.25]) between the reaction times of the traditional and alternative tasks. It is concluded that the data suggest the influence of a differential effect when using a culturally learned stimulus and one that is not, although this effect is concentrated in response times. It is suggested to pay attention to this factor as an active element of the Go/NoGo task in future studies.
... (see Figure 1). Results of the rank-biserial correlation indicated the difference between the two proportions, meaning that 49.4% of the proportion of integration frequencies differed between K-6 and 7-12 grades (Cureton, 1956; Kerby, 2014). The median value of 2.1 (mean rank 29.61) at the K-6 level indicates that the technology-integrated activities were implemented only a few times per semester. ...
Article
This study examined technology integration in K–12 public schools based on the PICRAT (Passive, Interactive, Creative, Replace, Amplify, Transformative) matrix classification. Seventy-six K–12 teachers voluntarily completed an online survey between December 2020 and February 2021 that asked about their (a) grade-level teaching (b) implementation frequency of various technology-integrated activities, and (c) support. Results showed that teachers’ real-class implementation of technology-integrated activities and the PICRAT levels of integration differed significantly between K–6 vs. 7–12 grades. More frequent use of overall technology-integrated activities was found in grades 7–12 than in K–6. Further, an analysis based on the PICRAT matrix found higher frequencies for Passive and Replace levels and lower frequencies for Creative and Transformative use for all grades. Finally, teacher support significantly predicted overall technology-integration frequencies for both K–6 and 7–12 grades. With regard to prediction on each level of PICRAT classification, support in K–6 grades was significantly related to all levels of technology integration except for Passive. However, in 7–12 grade, support failed to predict the frequency of technology integration for all levels except Amplify. Future directions are discussed, including the role of professional development and the importance of technology-integration frameworks in practice.
... The Wilcoxon signed-rank test was used for the questionnaire subscales in which the differences between the paired measurements were not normally distributed (Wilcoxon, 1945). The effect size for significant results of the Wilcoxon signed-rank test was calculated as a rank-biserial correlation (Cureton, 1956) and interpreted according to Cohen's recommendations (Cohen, 2008). ...
Preprint
Full-text available
Public Significance Statement The disclosure of a major depression diagnosis to men has the potential to alleviate male-typical externalizing depression symptoms and reduce gender role conflict, which can improve mental health and may increase likelihood for men to be receptive to mental health care. The increasing evidence for traditional masculinity ideologies being trait-like constructs necessitates focused research on interventions that acknowledge their persistent nature and strategically incorporate them to lessen their adverse effects on men's mental health.
... Note that the probability of net superiority ξ, defined by Eq. (5.1), is equivalent to the rank-biserial correlation [26]. ...
Article
This paper proposes a probabilistic measure for the net superiority of one group (or group mean) over another group (or group mean), named “probability of net superiority”. The proposed net superiority probability analysis is an extension of the previously proposed exceedance probability (EP) analysis. It comes in two versions: one for parametric comparisons (distribution-based) and the other for nonparametric comparisons (distribution-free). In practice, the distribution-based net superiority probability analysis can be used as an alternative to the two-sample two-sided z-test or t-test; the distribution-free net superiority probability analysis can be used as an alternative to the Mann–Whitney U-test. Three examples are presented to demonstrate the application of the proposed net superiority probability analysis.
... Patients with significant decreases in pain over time were classified as VAS pain responders. To determine the significance and directionality of power modulation from baseline, we used a non-parametric rank-biserial Spearman's correlation [88]. This method enabled correlation between continuous ranked variables (i.e., EEG data from two conditions) and a dichotomy (i.e., a generated data vector of −1s and 1s corresponding to the lengths of either EEG data conditions). ...
Article
Full-text available
Limitations in chronic pain therapies necessitate novel interventions that are effective, accessible, and safe. Brain–computer interfaces (BCIs) provide a promising modality for targeting neuropathology underlying chronic pain by converting recorded neural activity into perceivable outputs. Recent evidence suggests that increased frontal theta power (4–7 Hz) reflects pain relief from chronic and acute pain. Further studies have suggested that vibrotactile stimulation decreases pain intensity in experimental and clinical models. This longitudinal, non-randomized, open-label pilot study's objective was to reinforce frontal theta activity in six patients with chronic upper extremity pain using a novel vibrotactile neurofeedback BCI system. Patients increased their BCI performance, reflecting thought-driven control of neurofeedback, and showed a significant decrease in pain severity (1.29 ± 0.25 MAD, p = 0.03, q = 0.05) and pain interference (1.79 ± 1.10 MAD p = 0.03, q = 0.05) scores without any adverse events. Pain relief significantly correlated with frontal theta modulation. These findings highlight the potential of BCI-mediated cortico-sensory coupling of frontal theta with vibrotactile stimulation for alleviating chronic pain.
... The effect sizes were derived from the U-statistics using the rank-biserial correlation coefficient (Cureton, 1956): $r = 2U/(n_1 \cdot n_2)$, as well as the area under the receiver operating characteristic curve (AUROC) (Hajian-Tilaki, 2013): $\mathrm{AUROC} = U/(n_1 \cdot n_2)$. A post-hoc EBI-based estimation of the statistical power was then calculated (Nasseroleslami, 2018). ...
Article
Full-text available
Recent electroencephalography (EEG) studies have shown that patterns of brain activity can be used to differentiate amyotrophic lateral sclerosis (ALS) and control groups. These differences can be interrogated by examining EEG microstates, which are distinct, reoccurring topographies of the scalp's electrical potentials. Quantifying the temporal properties of the four canonical microstates can elucidate how the dynamics of functional brain networks are altered in neurological conditions. Here we have analysed the properties of microstates to detect and quantify signal‐based abnormality in ALS. High‐density resting‐state EEG data from 129 people with ALS and 78 HC were recorded longitudinally over a 24‐month period. EEG topographies were extracted at instances of peak global field power to identify four microstate classes (labelled A‐D) using K‐means clustering. Each EEG topography was retrospectively associated with a microstate class based on global map dissimilarity. Changes in microstate properties over the course of the disease were assessed in people with ALS and compared with changes in clinical scores. The topographies of microstate classes remained consistent across participants and conditions. Differences were observed in coverage, occurrence, duration, and transition probabilities between ALS and control groups. The duration of microstate class B and coverage of microstate class C correlated with lower limb functional decline. The transition probabilities A to D, C to B and C to B also correlated with cognitive decline (total ECAS) in those with cognitive and behavioural impairments. Microstate characteristics also significantly changed over the course of the disease. Examining the temporal dependencies in the sequences of microstates revealed that the symmetry and stationarity of transition matrices were increased in people with late‐stage ALS. 
These alterations in the properties of EEG microstates in ALS may reflect abnormalities within the sensory network and higher‐order networks. Microstate properties could also prospectively predict symptom progression in those with cognitive impairments.
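The excerpt for this entry derives both effect sizes from the Mann-Whitney U statistic. A minimal Python sketch of that idea follows. It assumes the widely used convention r = 2U/(n1·n2) − 1, so that r = 2·AUROC − 1 and lies in [−1, 1]. This convention, the tie handling (0.5 per tied pair), and all function names are assumptions made for illustration, not the cited authors' implementation.

```python
from itertools import product


def mann_whitney_u(x, y):
    """U for sample x: count of (x_i, y_j) pairs with x_i > y_j.

    Tied pairs contribute 0.5 each (an assumed, common convention).
    """
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a, b in product(x, y))


def auroc_from_u(u, n1, n2):
    """AUROC = U / (n1 * n2): the share of pairs in which x exceeds y."""
    return u / (n1 * n2)


def rank_biserial_from_u(u, n1, n2):
    """Rank-biserial correlation under the assumed convention 2U/(n1*n2) - 1."""
    return 2.0 * u / (n1 * n2) - 1.0


# Hypothetical data, for illustration only.
x = [3, 5, 7]
y = [1, 2, 6]
u = mann_whitney_u(x, y)
print(u, auroc_from_u(u, len(x), len(y)), rank_biserial_from_u(u, len(x), len(y)))
```

With these hypothetical samples, 7 of the 9 pairs favour `x`, so U = 7, AUROC = 7/9, and r = 5/9. Swapping the roles of `x` and `y` flips the sign of r.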
... Next, we conducted a Wilcoxon signed-ranks test for each distribution to test the hypothesis that the median percent compensation for each group and task was different from 0, indicating significant compensation in response to the perturbation. Effect sizes were measured by calculating the rank-biserial correlation coefficient for each analysis (Cureton, 1956). ...
Article
Full-text available
Purpose The practice of removing “following” responses from speech perturbation analyses is increasingly common, despite no clear evidence as to whether these responses represent a unique response type. This study aimed to determine if the distribution of responses to auditory perturbation paradigms represents a bimodal distribution, consisting of two distinct response types, or a unimodal distribution. Method This mega-analysis pooled data from 22 previous studies to examine the distribution and magnitude of responses to auditory perturbations across four tasks: adaptive pitch, adaptive formant, reflexive pitch, and reflexive formant. Data included at least 150 unique participants for each task, with studies comprising younger adult, older adult, and Parkinson's disease populations. A Silverman's unimodality test followed by a smoothed bootstrap resampling technique was performed for each task to evaluate the number of modes in each distribution. Wilcoxon signed-ranks tests were also performed for each distribution to confirm significant compensation in response to the perturbation. Results Modality analyses were not significant (p > .05) for any group or task, indicating unimodal distributions. Our analyses also confirmed compensatory reflexive responses to pitch and formant perturbations across all groups, as well as adaptive responses to sustained formant perturbations. However, analyses of sustained pitch perturbations only revealed evidence of adaptation in studies with younger adults. Conclusion The demonstration of a clear unimodal distribution across all tasks suggests that following responses do not represent a distinct response pattern, but rather the tail of a unimodal distribution. Supplemental Material https://doi.org/10.23641/asha.24282676
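The excerpt for this entry reports a rank-biserial correlation as the effect size for a Wilcoxon signed-ranks test but does not give a formula. The sketch below shows one common matched-pairs formulation: the sum of ranks of positive differences minus the sum of ranks of negative differences, divided by the total rank sum. Dropping zero differences, using average ranks for ties, and the function names are assumptions here, not the cited study's procedure.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def matched_pairs_rank_biserial(diffs):
    """(sum of positive ranks - sum of negative ranks) / total rank sum."""
    nonzero = [d for d in diffs if d != 0]  # assumed: zero differences dropped
    if not nonzero:
        raise ValueError("all differences are zero")
    ranks = average_ranks([abs(d) for d in nonzero])
    pos = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    neg = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    return (pos - neg) / (pos + neg)


# Hypothetical differences from a reference value of 0, for illustration only.
print(matched_pairs_rank_biserial([4, -1, 3, 2, 0, -2]))
```

For a one-sample test against 0, as in the excerpt, the observed values themselves play the role of the differences.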
... [2] This article discussed alternative hypotheses, including a stochastic ordering. A method of reporting the effect size for the Mann-Whitney U test is with the rank-biserial correlation. Edward Cureton introduced and named the measure [3]. Like other correlation measures, the rank-biserial correlation can range from minus one to plus one, with a value of zero indicating no relationship. Dave Kerby [4] introduced the simple difference formula to compute the rank-biserial correlation from the common language effect size: the correlation is the difference between the proportion of pairs that support the hypothesis minus the proportion that do not. ...
Article
Full-text available
The Mann-Whitney test is a non-parametric test used when two different groups of participants perform the two conditions of a study, i.e., it is appropriate for analysing the data from an independent-measures design with two conditions. The aim of the study was to compare the microcirculation average velocity distribution of clinically normal volunteers and patients with diabetes mellitus (DM). Eighteen females (ages 19-40) and 31 males (ages 20-59) were chosen for a laser Doppler flowmetry study of the dorsalis pedis on the foot of normal volunteers and diabetic patients. Data were statistically analyzed in Matlab using Pearson's tests. Findings: the results of the t-test and F-test showed significant (P<0.05) differences between clinically normal and diabetic females, as well as between normal and diabetic males.
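The simple difference formula described in the excerpt for this entry can be sketched in a few lines of Python: compare every value in one group with every value in the other, then subtract the proportion of unfavourable pairs from the proportion of favourable ones. The function name, the sample data, and the choice to count tied pairs in the denominator but in neither proportion are assumptions made for illustration.

```python
def simple_difference(group_a, group_b):
    """Return (favourable share, unfavourable share, rank-biserial r).

    A pair (a, b) is 'favourable' when a > b, i.e. it supports the
    hypothesis that group_a tends to score higher than group_b.
    """
    pairs = [(a, b) for a in group_a for b in group_b]
    favourable = sum(a > b for a, b in pairs)
    unfavourable = sum(a < b for a, b in pairs)
    total = len(pairs)  # tied pairs stay in the denominator (assumption)
    return favourable / total, unfavourable / total, (favourable - unfavourable) / total


# Hypothetical scores, for illustration only.
f, u, r = simple_difference([10, 12, 15, 20], [8, 11, 13])
print(f, u, r)
```

Here 9 of the 12 pairs are favourable and 3 are unfavourable, giving r = 0.75 − 0.25 = 0.5.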
... Demographic data were compared using t-tests and clinical data were not normally distributed, so the Mann-Whitney U-test was used to compare the clinical and control groups and generate rank-biserial correlation effect sizes, which were interpreted as 0.1 = small effect, 0.3 = medium effect, and 0.5 = large effect (Cureton, 1956). Spearman's ρ was calculated to correlate the frequency of negative interpretations and ARS-Q scores with symptoms of depression, anxiety, stress, and eating pathology. ...
Article
Full-text available
Persons with bulimia nervosa (BN) often experience psychosocial difficulties, in particular heightened sensitivity to social rejection and a negative bias toward their social environment. Conversely, social competence and close friendships are protective against mental ill health. The aims of this study were to evaluate the interpretation of ambiguous social scenarios in females with and without BN and to assess the relationship between interpretation biases and clinical characteristics. Females with BN (n = 35) and controls (n = 35) were recruited via social media. Participants completed the Eating Disorder Examination Questionnaire (EDE‐Q), Adult Rejection Sensitivity Questionnaire, the Depression Anxiety and Stress Scales (DASS), and finished sentence stems depicting ambiguous social scenarios. Completed sentence stems were rated as positive, neutral, or negative by blinded researchers. Females with BN made fewer positive and more negative interpretations of sentence stems than controls. The frequency of negative interpretations correlated positively with clinical symptoms on the EDE‐Q, A‐RSQ, and DASS. A negative interpretation bias was found in females with BN, which aligns with the finding shown by Cardi et al. that females with anorexia nervosa have a negative interpretation bias toward ambiguous social scenarios. This bias was not only associated with eating disorder psychopathology but also with depression, anxiety, and stress, highlighting a potential transdiagnostic role. Interventions that address psychosocial difficulties might prevent the onset, reduce symptoms, and improve prognosis.
... To determine statistically significant differences between the groups, a non-parametric Mann-Whitney U test was used, which does not make any assumptions about the distribution of data. The obtained Mann-Whitney test U statistic was then used to calculate the rank-biserial correlation [19] in order to estimate the effect size [20]. Furthermore, the bootstrapping method was used to estimate the mean difference between samples and the 95% confidence interval of the mean difference between samples. ...
Article
Full-text available
Background and Objectives: Non-steroidal anti-inflammatory drugs (NSAIDs), which have anti-inflammatory and analgesic properties, are commonly used in the treatment of various, particularly frequent, as well as chronic, conditions in older patients. Due to common polypragmasia in these patients and a high risk of adverse drug reactions (ADRs) and drug interactions, pain management poses a therapeutic challenge. This study describes the importance of ADR reports in the identification of polypharmacy and the ensuing interactions. Materials and Methods: Both healthcare professionals (HPs) and non-healthcare professionals (non-HPs) reports collected in the EudraVigilance database of NSAIDs, including most commonly co-reported medications and reported reactions, were analysed and differences between HPs and non-HPs reports were identified. Results: In the analysed period and group, non-HPs reported more reactions but indicated fewer drugs as suspect or concomitant. The outcomes of our analysis indicate more HP engagement and more detailed reports of serious ADRs when compared to non-serious individual case safety reports (ICSRs) by non-HPs, which appeared more detailed. Such reactions as kidney failure and increased risk of bleeding are known adverse reactions to NSAIDs and common symptoms of their interactions, which were described in the available literature. They were much more frequently reported by HPs than by non-HPs. Non-HPs more frequently reported reactions that may have been considered less significant by HPs. Conclusions: The differences between healthcare professionals’ (HPs) and non-healthcare professionals’ (non-HPs) reports may result from the fact that the reports from patients and their caregivers require a professional medical diagnosis based on symptoms described by the patient or additional diagnostic tests. 
This means that when appropriately classified, medically verified, and statistically analysed, the data may provide new evidence for the risks of medication use or drug interactions.
... For this purpose, we conducted the Mann-Whitney U test for independent samples [35], and in the case of multiple comparisons, multiple testing corrections based on the False Discovery Rate (FDR) were applied [36]. Effect sizes were calculated by Rank-Biserial Correlation (r_rb) [37]. Pending, we calculated the interconnections of study signals in vs. out of lockdown days, by Spearman correlation coefficients [38] and compared them by Fisher's Z transformation [39]. ...
Article
Full-text available
Much has been written about the COVID-19 pandemic’s epidemiological, psychological, and sociological consequences. Yet, the question about the role of the lockdown policy from psychological and sociological points of view has not been sufficiently addressed. Using epidemiological, psychological, and sociological daily data, we examined the causal role of lockdown and variation in morbidity referring to emotional and behavioral aspects. Dynamics of support requests to the Sahar organization concerning loneliness, depression, anxiety, family difficulties, and sexual trauma were investigated alongside processes of emergency and domestic violence reports to the Ministry of Welfare and Social Affairs. By exploring the signals and predictive modeling for a situation with no lockdown implementation, the lockdown was found as a critical factor in distress rising among the general population, which could affect long after the improvement in pandemic case counts. Applications and implications are discussed in the context of decision-making in dealing with crises as well as the need to allocate resources for adaptive coping.
... Some authors have suggested as many as seventy variants of effect-size measures (Kirk, 2003). One has even been proposed for nonparametric measures (Cureton, 1956), called the rank-biserial correlation (r_rb). The latter belongs to the "r" family of relationships rather than the δ family of differences, and it has been proposed that the two are homologous (Ellis, 2010). ...
Article
Full-text available
In the behavioral sciences it is essential to interpret results adequately in a way that supports rejection of the null hypothesis. The characteristics of the discipline are known to carry disadvantages for statistical analysis compared with other applied fields, where data are by nature more likely to be normally distributed and therefore suited to parametric techniques. Given this, the article compiles information on the most important elements suggested for improving the robustness of interpretation, for both parametric and nonparametric techniques.
... Although these may be compatible depending on the objective, they can also lead to a series of inferential errors, since their interpretation is based almost exclusively on ranks, calculations that do not always agree with the mean and standard deviation (Siegel & Castellan, 1998). A possible solution to this problem is to calculate the rank-biserial correlation (r_rb) proposed by Cureton (1956) and modified by Wendt (1972), based on the Mann-Whitney U formula, which results from a modification of the biserial approach in which the focus is the correlation coefficient between ranks (X and Y) rather than the product correlation coefficient of X and Y (r_b). This correlation has several advantages, since it does not necessarily assume linear relationships; that is, it uses a monotonic relationship between X and Y, similar to Kendall's tau-b or Spearman's ρ, which follow the mathematical logic of the Mann-Whitney U (Kerby, 2014). ...
Article
Full-text available
Effect size (ES) is a widely used measure for determining the degree to which a variable behaves according to the study condition. It can complement the p-value and strengthen the interpretation of the hypothesis. In behavioral science studies a normal distribution is rarely obtained, given the characteristics of the variables; for this reason, additional calculations have been suggested to increase the reliability of studies. However, some of those calculations use the effect size. In 1972 Wendt proposed calculating the rank-biserial correlation (r_rb) from the Mann-Whitney U. However, it is unknown whether the parametric effect size (δ) and Wendt's r_rb are equivalent. The present study examines the concordance between the parametric effect-size measure (δ) and Wendt's r_rb, as well as their statistical properties. A simulation of 200 different possible results was run, and the correlation, the confidence and prediction intervals, the concordance, and the Vovk-Sellke maximum prediction ratio were calculated for δ, r_rb, and its transformation into a correlation. Nearly perfect correlations and stability were obtained among the three conditions, although with moderate variability in concordance. The bounds of the calculations may have affected the concordance of the results; nevertheless, the concordance should not be dismissed, given its stability and correlational strength. δ, r_rb, and the transformation of δ into a correlation appear to share statistical properties that make them mutually reliable. A level of concordance between r_rb and the classic Cohen's δ is reported with sufficient reliability; the level of concordance between Cohen's δ and its conversion, solved for Pearson's r, is satisfactory.
... Expected values were above 5 and cells were mutually exclusive. Expecting that a higher number of visitors would bring more funding for zoos to invest in in situ projects, we used a rank-biserial correlation (Cureton, 1956), a nonparametric test, to assess the relationship between the number of annual zoo visitors (visitor level: 1; 2; 3; 4; 5; ordinal variable) and whether zoos funded in situ projects (dichotomous categorical variable). We calculated the rank-biserial correlation coefficient and its 95% Confidence Interval (CI). ...
Article
Full-text available
Mismatches between conservation action and conservation needs have been highlighted for diverse species. Lion (Panthera leo) conservation is no exception, raising the question of whether current conservation strategies are always adequate to ensure the long‐term persistence of threatened taxa. To investigate the representation of different lion Evolutionary Significant Units in field research, captive populations, funding allocation, and education, we carried out a literature review and sent an online questionnaire to zoos worldwide. Over 75% of the publications focused on southern and eastern African populations. Uplisting the West African lion to Critically Endangered did not change this result. We received 88 responses from zoos, which reported 346 lions in 83 zoos. Only 14 individuals have West and Central African origins. Over 70% of the respondents reported that they do not include any information on the conservation status or taxonomy of lions from West and Central Africa in their education programs. The minority of zoos funding in situ lion projects did so in Eastern and Southern Africa. We provide recommendations to encourage role‐players involved in lion and other threatened species conservation to address this mismatch by shifting some of their attention and funding to West and Central Africa.
Article
Full-text available
Context In the last decade of data-driven decision-making, Machine Learning (ML) systems reign supreme. Because of the different characteristics between ML and traditional Software Engineering systems, we do not know to what extent the issue-reporting needs are different, and to what extent these differences impact the issue resolution process. Objective We aim to compare the differences between ML and non-ML issues in open-source applied AI projects in terms of resolution time and size of fix. This research aims to enhance the predictability of maintenance tasks by providing valuable insights for issue reporting and task scheduling activities. Method We collect issue reports from Github repositories of open-source ML projects using an automatic approach, filter them using ML keywords and libraries, manually categorize them using an adapted deep learning bug taxonomy, and compare resolution time and fix size for ML and non-ML issues in a controlled sample. Result 147 ML issues and 147 non-ML issues are collected for analysis. We found that ML issues take more time to resolve than non-ML issues, the median difference is 14 days. There is no significant difference in terms of size of fix between ML and non-ML issues. No significant differences are found between different ML issue categories in terms of resolution time and size of fix. Conclusion Our study provided evidence that the life cycle for ML issues is stretched, and thus further work is required to identify the reason. The results also highlighted the need for future work to design custom tooling to support faster resolution of ML issues.
Article
Correlations between altered body temperature and depression have been reported in small samples; greater confidence in these associations would provide a rationale for further examining potential mechanisms of depression related to body temperature regulation. We sought to test the hypotheses that greater depression symptom severity is associated with (1) higher body temperature, (2) smaller differences between body temperature when awake versus asleep, and (3) lower diurnal body temperature amplitude. Data collected included self-reported body temperature (using standard thermometers), wearable sensor-assessed distal body temperature (using an off-the-shelf wearable sensor that collected minute-level physiological data), and self-reported depressive symptoms from >20,000 participants over the course of ~7 months as part of the TemPredict Study. Higher self-reported and wearable sensor-assessed body temperatures when awake were associated with greater depression symptom severity. Lower diurnal body temperature amplitude, computed using wearable sensor-assessed distal body temperature data, tended to be associated with greater depression symptom severity, though this association did not achieve statistical significance. These findings, drawn from a large sample, replicate and expand upon prior data pointing to body temperature alterations as potentially relevant factors in depression etiology and may hold implications for development of novel approaches to the treatment of major depressive disorder.
Article
Automated Program Repair (APR) is defined as the process of fixing a bug/defect in the source code by an automated tool. APR tools have recently achieved promising results by leveraging state-of-the-art Natural Language Processing (NLP) techniques. APR tools such as TFix and CodeXGLUE, which combine text-to-text transformers with software-specific techniques, currently outperform the alternatives. However, in most APR studies the train and test sets are chosen from the same set of projects (i.e., when APR fixes a bug in the test set from project A, the model has already seen example fixed bugs from project A in the training set). In the real world, however, APR models are meant to generalize to new and different projects. Therefore, there is a potential threat that APR models reported as highly effective perform poorly when the characteristics of the new project or its bugs differ from those of the training set (“Domain Shift”). In this study, we first define the problem of domain shift in automated program repair. Next, we measure the potential damage of domain shift on two recent APR models (TFix and CodeXGLUE). Based on this observation, we then propose a domain adaptation framework that can adapt an APR model for a given target project. We conduct an empirical study with three domain adaptation methods (FullFineTuning, TuningWithLightWeightAdapterLayers, and CurriculumLearning) and two APR models on 2672 bugs from 12 projects. The results show that our proposed framework can, on average, improve the effectiveness of TFix by 13.05% and CodeXGLUE by 48.78% in terms of “Exact Match”. Through experiments, we also show that the framework provides high efficiency and reliability (in terms of “Exposure Bias”). Using synthetic data to domain-adapt TFix and CodeXGLUE on projects with no data (zero-shot learning) also results in an average improvement of 5.76% and 17.62% for TFix and CodeXGLUE, respectively.
Chapter
Chapter 4 describes connections, equivalencies, and relationships relating to two-sample tests of null hypotheses. First, Student’s conventional two-sample t-test is described. Second, a permutation two-sample test is presented and the connection linking the two tests is established. An example analysis illustrates the differences in the two approaches and the connection linking the tests. Third, measures of effect size for two-sample tests are presented for both Student’s two-sample t-test and a permutation two-sample test and the connections linking the various measures are set out. Fourth, the Wilcoxon–Mann–Whitney two-sample rank-test is presented with a permutation alternative for rank-score data and the connection linking the two tests is described. Finally, the connection between a conventional two-sample z-test for proportions and Pearson’s chi-square test of independence is delineated.
Article
Muscle fatigue is a complex phenomenon that is influenced by the type of activity performed and often manifests as a decline in motor performance (mechanical failure). The purpose of our study was to investigate the compensatory strategies used to mitigate mechanical failure. A cohort of 21 swimmers underwent a front-crawl swimming task, which required the consistent maintenance of a constant speed for the maximum duration. The evaluation included three phases: non-fatigue, pre-mechanical failure, and mechanical failure. We quantified key kinematic metrics, including velocity, distance travelled, stroke frequency, stroke length, and stroke index. In addition, electromyographic (EMG) metrics, including the Root-Mean-Square amplitude and Mean Frequency of the EMG power spectrum, were obtained for 12 muscles to examine the electrical manifestations of muscle fatigue. Between the first and second phases, the athletes covered a distance of 919.38 ± 147.29 m at an average speed of 1.57 ± 0.08 m/s with an average muscle fatigue level of 12%. Almost all evaluated muscles showed a significant increase (p < 0.001) in their EMG activity, except for the latissimus dorsi, which showed a 17% reduction (ES 0.906, p < 0.001) during the push phase of the stroke cycle. Kinematic parameters showed a 6% decrease in stroke length (ES 0.948, p < 0.001), which was counteracted by a 7% increase in stroke frequency (ES −0.931, p < 0.001). Notably, the stroke index also decreased by 6% (ES 0.965, p < 0.001). In the third phase, characterised by the loss of the ability to maintain the predetermined rhythm, both EMG and kinematic parameters showed reductions compared to the previous two phases. Swimmers employed common compensatory strategies for coping with fatigue; however, the ability to maintain a predetermined motor output proved to be limited at certain levels of fatigue and loss of swimming efficiency (Protocol ID: NCT06069440).
Article
Purpose: The inherent characteristics of lung tissue independent of breathing maneuvers may provide fundamental information for function assessment. This paper attempted to correlate textural signatures from computed tomography (CT) with pulmonary function measurements. Materials and methods: Twenty-one lung cancer patients with thoracic 4-dimensional CT, DTPA-single-photon emission CT ventilation (VNM) scans, and available spirometry measurements (forced expiratory volume in 1 s, FEV1; forced vital capacity, FVC; and FEV1/FVC) were included. In subregional feature discovery, function-correlated candidates were identified from 79 radiomic features based on the statistical strength to differentiate defected/nondefected lung regions. Feature maps (FMs) of selected candidates were generated on 4-dimensional CT phases for a voxel-wise feature distribution study. Quantitative metrics were applied for validations, including the Spearman correlation coefficient (SCC) and the Dice similarity coefficient for FM-VNM spatial agreement assessments, intraclass correlation coefficient for FM interphase robustness evaluations, and FM-spirometry comparisons. Results: At the subregion level, 8 function-correlated features were identified (effect size > 0.330). The FMs of candidates yielded moderate-to-strong voxel-wise correlations with the reference VNM. The FMs of gray level dependence matrix dependence nonuniformity showed the most robust (intraclass correlation coefficient = 0.96, P < 0.0001) spatial correlation, with median SCCs ranging from 0.54 to 0.59 throughout the 10 breathing phases. Its phase-averaged FM achieved a median SCC of 0.60, a median Dice similarity coefficient of 0.60 (0.65) for high (low) functional lung volumes, and a correlation of 0.565 (0.646) between the spatially averaged feature values and FEV1 (FEV1/FVC).
Conclusions: The results provide further insight into the underlying association of specific pulmonary textures with both local (VNM) and global (FEV1/FVC, FEV1) functions. Further validations of the FM generalizability and the standardization of implementation protocols are warranted before clinically relevant investigations.
Preprint
Limitations in chronic pain therapies necessitate novel interventions that are effective, accessible, and safe. Brain-computer interfaces (BCIs) provide a promising modality for targeting neuropathology underlying chronic pain by converting recorded neural activity into perceivable outputs. Recent evidence suggests that increased frontal theta power (4–7 Hz) reflects pain relief from chronic and acute pain. Further studies have suggested that vibrotactile stimulation decreases pain intensity in experimental and clinical models. This longitudinal, non-randomized, open-label pilot study's objective was to reinforce frontal theta activity in six patients with chronic upper extremity pain using a novel vibrotactile neurofeedback BCI system. Patients increased their BCI performance, reflecting thought-driven control of neurofeedback, and showed a significant decrease in pain severity and pain interference scores without any adverse events. Pain relief significantly correlated with frontal theta modulation. These findings highlight the potential of BCI-mediated cortico-sensory coupling of frontal theta with vibrotactile stimulation for alleviating chronic pain.
Article
Purpose. Peppermint is one of the most important medicinal and essential-oil crops in Ukraine and worldwide, so studying how its productivity is formed is an important task of modern agricultural science. The article presents the results of a mathematical analysis of the effectiveness of different rates of NPK mineral fertilizers on peppermint crops with respect to the yield of fresh above-ground biomass. Methods. Generalized results of scientific studies conducted in different parts of the world were processed by multiple regression to build a model of crop yield as a function of NPK rates and to approximate the results of the mathematical modeling. In addition, the model's input data were checked for heteroscedasticity, and rank correlations were calculated. Results. As a result of the statistical calculations, by most criteria of normality of data distribution and heteroscedasticity, the null hypothesis regarding the effect of mineral fertilizers on peppermint yield was rejected. The calculated rank correlation coefficients indicate a decisive role of phosphorus mineral fertilizers in crop productivity, while the role of potassium fertilizers is minimal and, by some statistical criteria, can be described as insignificant. The multiple regression model of peppermint yield as a function of NPK fertilizer rates has moderate adequacy to the input data set, but the predictive accuracy of the model is low. The regression coefficients of the multiple model confirm the strong role of phosphorus fertilizers in forming peppermint productivity. Conclusions. Further field and pot experiments on crop productivity under the application of nitrogen-phosphorus mineral fertilizers specifically are therefore promising, while the application of potassium fertilizers to peppermint crops will, with high probability, have no positive effect on productivity or on the economic indicators of growing the medicinal raw material.
Article
Objectives: Test the hypothesis that within-patient clinical instability measured by deterioration and improvement in mortality risk over 3-, 6-, 9-, and 12-hour time intervals is indicative of increasing severity of illness. Design: Analysis of electronic health data from January 1, 2018, to February 29, 2020. Setting: PICU and cardiac ICU at an academic children's hospital. Patients: All PICU patients. Data included descriptive information, outcome, and independent variables used in the Criticality Index-Mortality. Interventions: None. Measurements and main results: There were 8,399 admissions with 312 deaths (3.7%). Mortality risk was determined every three hours using the Criticality Index-Mortality, a machine learning algorithm calibrated to this hospital. Since the sample sizes were sufficiently large to expect statistical differences, we also used two measures of effect size, the proportion of time deaths had greater instability than survivors and the rank-biserial correlation, to assess the magnitude of the effect and complement our hypothesis tests. Within-patient changes were compared for survivors and deaths. All comparisons of survivors versus deaths had p-values less than 0.001. For all time intervals, two measures of effect size indicated that the differences between deaths and survivors were not clinically important. However, the within-patient maximum risk increase (clinical deterioration) and maximum risk decrease (clinical improvement) were both substantially greater in deaths than survivors for all time intervals. For deaths, the maximum risk increase ranged from 11.1% to 16.1% and the maximum decrease ranged from -7.3% to -10.0%, while the median maximum increases and decreases for survivors were all less than ± 0.1%. Both measures of effect size indicated moderate to high clinical importance. The within-patient volatility was more than 4.5-fold greater in deaths than survivors during the first ICU day, plateauing at ICU days 4-5 at 2.5-fold greater volatility.
Conclusions: Episodic clinical instability measured with mortality risk is a reliable sign of increasing severity of illness. Mortality risk changes during four time intervals demonstrated deaths have greater maximum and within-patient clinical instability than survivors. This observation confirms the clinical teaching that clinical instability is a sign of severity of illness.
Article
Objectives. To estimate effect sizes in quasi-experimental studies. Methods. Methods of the theory of estimation, methods of mathematical statistics. Results. Estimation of the effect size on an ordinal scale, and estimation of the effect size on a binary scale in the case of opposite-direction effects in groups, in quasi-experimental studies using the "differences in differences" analytical method. Conclusion. The paper considers approaches to assessing absolute and standardized effect sizes in experimental and quasi-experimental studies. A brief review of the estimators of absolute and standardized effect sizes for quantitative and binary study variables is provided. An applied approach is proposed for assessing the effect sizes of a binary variable in the case of opposite-direction effects in groups within quasi-experimental studies using the "differences in differences" analytical method. An example of assessing absolute and standardized effect sizes of quantitative and binary variables in quasi-experimental studies in clinical epidemiology is considered.
Article
The cumulative redundancy bias (CRB) refers to people's difficulty in ignoring the redundancy in cumulatively presented information. When people consider which of two competing agents is better, they are influenced by the sequence of events that led to their accumulative total performance. If one agent was ahead most of the time, people consider this agent better – even if the agents are tied eventually. However, we show that an opposite performance-slope bias (PSB) emerges when participants focus on the performance trajectories of agents. When a trailing agent is substantially catching up to the leading agent, people judge the former as the better agent. In four experiments where we manipulated the magnitude of performance slope difference between two agents, we obtained both effects. In Experiments 1 and 2, a large slope difference between agents overshadowed other factors that may have influenced the reversal from CRB to PSB. In Experiments 3 and 4 we showed conclusively with the same design, but for different real-life contexts that the CRB emerges when the slope difference between agents is small while it reverses as the PSB when the slope difference is large. The four experiments demonstrate individuals' ability for flexible cue utilization.
Article
A relation is developed between Spearman's coefficient of rank correlation rs and the inversions in the two rankings. This leads to an expression for the mean value of rs in samples from a finite population, and to the improvement of Daniels' inequality relating rs and Kendall's coefficient t.
Article
Let x and y be two random variables with continuous cumulative distribution functions f and g. A statistic U depending on the relative ranks of the x's and y's is proposed for testing the hypothesis f = g. Wilcoxon proposed an equivalent test in the Biometrics Bulletin, December, 1945, but gave only a few points of the distribution of his statistic. Under the hypothesis f = g the probability of obtaining a given U in a sample of n x's and m y's is the solution of a certain recurrence relation involving n and m. Using this recurrence relation tables have been computed giving the probability of U for samples up to n = m = 8. At this point the distribution is almost normal. From the recurrence relation explicit expressions for the mean, variance, and fourth moment are obtained. The 2rth moment is shown to have a certain form which enabled us to prove that the limit distribution is normal if m, n go to infinity in any arbitrary manner. The test is shown to be consistent with respect to the class of alternatives f(x) > g(x) for every x.
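The statistic and recurrence described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, U is assumed here to count the pairs in which a y precedes an x, ties are given half credit by a common convention that the abstract (which assumes continuous distributions) does not need, and the rank-biserial value is computed as one common form, 2U/(nm) - 1, i.e. the proportion of favorable pairs minus the proportion of unfavorable pairs.

```python
from functools import lru_cache
from math import comb


def mann_whitney_u(xs, ys):
    """Count pairs (x, y) in which y precedes x (y < x).

    Assumption: ties contribute one half, a common convention that is
    not part of the abstract above.
    """
    u = 0.0
    for x in xs:
        for y in ys:
            if y < x:
                u += 1.0
            elif y == x:
                u += 0.5
    return u


def rank_biserial(xs, ys):
    """One common form of the rank-biserial correlation: 2U/(nm) - 1."""
    return 2.0 * mann_whitney_u(xs, ys) / (len(xs) * len(ys)) - 1.0


@lru_cache(maxsize=None)
def count_arrangements(u, n, m):
    """Number of orderings of n x's and m y's that give U == u (no ties).

    Recurrence on the largest observation: if it is an x, all m y's
    precede it (adds m to U); if it is a y, it adds nothing.
    """
    if u < 0:
        return 0
    if n == 0 or m == 0:
        return 1 if u == 0 else 0
    return count_arrangements(u - m, n - 1, m) + count_arrangements(u, n, m - 1)


def null_probability(u, n, m):
    """P(U == u) under f = g, where all C(n+m, n) orderings are equally likely."""
    return count_arrangements(u, n, m) / comb(n + m, n)
```

For example, `rank_biserial([3, 4, 5], [1, 2])` gives 1.0 because every x exceeds every y, and summing `null_probability(u, n, m)` over u from 0 to n*m gives 1.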
M. G. Kendall. Rank correlation methods. London, Charles Griffin and Co.