Article

The Measurement Of Observer Agreement For Categorical Data

Abstract

This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.

... To make the human evaluation more reliable, we calculate the Cohen's Kappa score [37] to assess the consistency among the evaluators. Specifically, we randomly selected 30 questions from the 200 evaluation questions and had responses generated by 7 models (3 global baselines including Samantha-1.11, ...
... We calculate Cohen's Kappa score among all the human evaluators' lists of predictions. The resulting agreement is 0.441, which is larger than the acceptable threshold 0.4 as indicated by [37]. ...
Preprint
Full-text available
We introduce MentalChat16K, an English benchmark dataset combining a synthetic mental health counseling dataset and a dataset of anonymized transcripts from interventions between Behavioral Health Coaches and Caregivers of patients in palliative or hospice care. Covering a diverse range of conditions like depression, anxiety, and grief, this curated dataset is designed to facilitate the development and evaluation of large language models for conversational mental health assistance. By providing a high-quality resource tailored to this critical domain, MentalChat16K aims to advance research on empathetic, personalized AI solutions to improve access to mental health support services. The dataset prioritizes patient privacy, ethical considerations, and responsible data usage. MentalChat16K presents a valuable opportunity for the research community to innovate AI technologies that can positively impact mental well-being.
... Cohen's weighted kappa was assessed between the selection of studies based on title and abstract (Supplementary material, Figure S1) and full-text analysis (Supplementary material, Figure S2). For the selection based on the title and abstract, Cohen's Weighted Kappa was 0.548, indicating a moderate level of agreement [49] between the two independent researchers. In the selection based on the full-text analysis, Cohen's Weighted Kappa was 0.698, indicating a substantial level of agreement [49]. ...
Article
Full-text available
Purpose of Review Prostate cancer (PCa) is the most prevalent cancer and the third deadliest in Europe among men. PCa has several well-established risk factors; however, the influence of lifestyle factors remains under investigation, which may hinder efforts to encourage healthier behavior adoption. Thus, this systematic review explored the general population’s perceptions, knowledge, and attitudes regarding PCa-related risk factors. Recent Findings Eighteen qualitative studies were included after searching PubMed, Scopus, Web of Science, and EMBASE scientific databases between January 2013 and February 2023. Five major themes emerged from the 18 included studies: PCa knowledge, risk factors, lifestyle pattern changes, motivation/barriers to changing habits, and lifestyle advice support. Participants identified age, family history, genetics, and race/ethnicity as risk factors for PCa, but no consensus has been reached regarding lifestyle. However, most of the participants were willing to adopt healthier habits. Support from healthcare professionals (HPs), family, and friends, the desire for more time with loved ones, and fear of PCa consequences were cited as motivators for habit changes. However, poor economic conditions, work schedules, age, and PCa limitations hamper lifestyle changes. Summary Effective interventions require personalized support and credible information from healthcare providers. Collaboration between family, friends, and HPs is crucial for promoting healthier behaviors and enhancing PCa management. This systematic review highlights the need for further research and innovative approaches to empower individuals towards healthier lifestyles, which could help prevent PCa or, at the very least, promote better treatment outcomes.
... Consequently, the inter-rater agreement of the codes yielded a Kappa value of 0.80 (p < 0.010), with a 95% confidence interval of (0.43, 1.16). This indicates substantial agreement for the semi-structured interview for the experts, following the criteria set by [38]. ...
Article
Organic Reaction Mechanism (ORM) is a challenging topic in organic chemistry that students must learn meaningfully. ORM studies are expanding in various ways, yet there is still a scarcity of studies on module development incorporating the teaching and learning ORM. This paper reports the needs analysis phase of the Design and Development research (DDR) study, which aims to explore the need to develop an Organic Module for teaching ORM among experts. Note that five experts were interviewed in this phase. The data were examined using thematic analysis to meet the study's objectives. Based on the findings, experts are having problems teaching the topic of ORM in schools. Most of them claim that students lack basic knowledge of the ORM concept, misunderstand placing the arrow correctly in the mechanism reaction and believe that ORM requires memorization. At the same time, there are not enough learning materials, such as modules appropriate for the pre-university syllabus. Most experts agreed that the ORM module should be developed to enhance and assist students' understanding of organic mechanisms. The study implies that this module may be utilized by chemistry educators, lecturers, or teachers since the ORM module in this study is based on the pre-university syllabus.
... While R² values and κ values both provide information on the predictive power of models, i.e. the amount of variance that they explain, some care is needed when interpreting such values. Relatively low numerical R² values such as 0.090 might already express a medium effect size (Cohen, 1988), while a κ value of 0.200 is needed to speak of fair agreement strengths (Landis and Koch, 1977). Some authors set the bar even higher (Shrout, 1998). ...
Article
Full-text available
When analyzing management behaviors of small-scale private forest owners, demographic variables such as income, age, or profession, and land characteristics such as forest holding size often emerge as important drivers. However, gender is frequently used in targeted outreach, even though the other variables regularly show higher predictive power. To shed light on this discussion, we examined the influences of a broad set of predictors including both land characteristics and sociodemographic factors such as gender on management activities, owner goals, perceived obstacles, and conservation attitudes as response variables. We used a questionnaire survey to collect quantitative data from 1268 small-scale private forest owners in northwestern Germany. Random forest models were used to predict the responses and to rank the predictors according to their variable importance. We found that the size of forest holdings often had a strong influence on economic activities, while the amount of broadleaf forest was important for conservation-oriented management decisions. While gender-specific outreach is a strong tool to empower formerly marginalized forest owner groups, gender was not found to be an important predictor of forest management activities in our analyses. We advocate considering other characteristics when conceiving communication with forest owners. In order to design carefully targeted policy instruments and outreach to forest owners, we propose a set of easily accessible owner parameters and land characteristics. These factors can guide more individualized conservation outreach strategies in small-scale private forests that are embedded in the overall livelihood systems of their owners.
... Cohen's kappa coefficient (κ) was calculated to estimate the agreement between the two quantification techniques [44]. As detailed by [45], κ measures agreement between two methods as follows: below 0.00 is considered poor; 0.00 to 0.20 is slight; 0.21 to 0.40 is fair; 0.41 to 0.60 is moderate; 0.61 to 0.80 is substantial; and 0.81 to 1.00 is almost perfect. ...
Article
Full-text available
Fusarium fungi cause Fusarium head blight (FHB) in oats, reducing yield and contaminating grains with harmful trichothecene mycotoxins. FHB symptoms in oats are often not visually distinct, necessitating alternative detection methods. We developed digital PCR (dPCR) assays as the most accurate DNA-based method to detect trichothecene-producing Fusarium species commonly found in oats. Building on existing quantitative PCR (qPCR) assays, we developed dPCR assays targeting all trichothecene producers (the Tri5 gene), or specific to F. langsethiae (Fl), F. poae (Fp), and F. sporotrichioides (Fs). All targeted single copy genes, except F. poae which targeted rDNA which is a variable and multi-copy target (and hence not as reliable as the other assays for quantification). Optimized dPCR assays showed excellent linearity (R² = 0.99) and greater resilience than qPCR to varying oat DNA concentrations. Overall, when comparing assay sensitivity using both fungal and field oat DNA extracts, dPCR assays were superior to qPCR for Tri5, Fl, and Fs, but the converse was true for Fp. Performance comparisons using field samples showed moderate to perfect agreement between qPCR and dPCR for Tri5 and Fl (κ = 0.5 and 0.86) and poor agreement for Fp (κ = 0.00). Strong correlations were observed between the methods for Tri5, Fl, and Fp (r = 0.88-0.97), but unlike dPCR, qPCR did not detect Fs in any of the field samples. We conclude that the dPCR assays for Tri5, Fl, and Fs offer a reliable method for quantification while that for Fp is reliable for fungal detection but less reliable for quantification of the pathogen in field samples.
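The interpretation bands quoted in the excerpt citing this article (below 0.00 poor; 0.00 to 0.20 slight; 0.21 to 0.40 fair; 0.41 to 0.60 moderate; 0.61 to 0.80 substantial; 0.81 to 1.00 almost perfect) can be sketched as a small lookup. This is an illustrative Python sketch, not code from any of the cited studies: the function name `landis_koch_label` is hypothetical, and assigning values that fall between the published two-decimal bands (for example 0.205) to the lower band is an assumption.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to a descriptive label using the bands quoted above."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0.0:
        return "poor"
    # Upper bound of each band. A value is assigned to the first band whose
    # upper bound it does not exceed (assumption for in-between values).
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # unreachable after the range check; kept for safety


if __name__ == "__main__":
    for value in (0.5, 0.698, 0.86):
        print(value, landis_koch_label(value))
```

The labels are conventions for reporting, so a lookup like this only standardizes wording; it does not say whether a given level of agreement is adequate for a particular study.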
... Two members of the research team manually reviewed the automated fixation corrections independently of each other. Upon a disagreement (275 out of 1,693 trials), the differences were discussed in detail, on 34 occasions with a third team member to reach consensus, resulting in an almost perfect inter-rater reliability of 81% determined by Cohen's kappa [61]. During these reviews, we detected that, for a few participants, there was also a noticeable offset on the x-axis. ...
Preprint
Full-text available
As software pervades more and more areas of our professional and personal lives, there is an ever-increasing need to maintain software, and for programmers to be able to efficiently write and understand program code. In the first study of its kind, we analyze fixation-related potentials (FRPs) to explore the online processing of program code patterns that are ambiguous to programmers, but not the computer (so-called atoms of confusion), and their underlying neurocognitive mechanisms in an ecologically valid setting. Relative to unambiguous counterparts in program code, atoms of confusion elicit a late frontal positivity with a duration of about 400 to 700 ms after first looking at the atom of confusion. As the frontal positivity shows high resemblance with an event-related potential (ERP) component found during natural language processing that is elicited by unexpected but plausible words in sentence context, we take these data to suggest that the brain engages similar neurocognitive mechanisms in response to unexpected and informative inputs in program code and in natural language. In both domains, these inputs lead to an update of a comprehender’s situation model that is essential for information extraction from a quickly unfolding input.
... moderate, 0.71-0.80: substantial, and 0.81-1: almost perfect) [21]. In detail, all ICC estimates with corresponding 95% confidence intervals were calculated using the R package psych v2.1.9 ...
Article
Full-text available
Introduction and Hypothesis To date, levator ani muscle (LAM) morphometry has been classified descriptively and semi-quantitatively. New MRI techniques enabling detailed visualization with the 3D pelvic inclination correction system (3D PICS) could offer a one-stop-shop diagnostic modality for quantitative assessment of LAM subdivisions. The aim of this controlled MRI study was to assess morphometric LAM subdivision characteristics in two distinct groups of premenopausal women, namely nulliparous asymptomatic controls and symptomatic patients (Pelvic Organ Prolapse Quantification [POP-Q] ≥ II). Methods Magnetic resonance imaging scans of the 22 women in each group were analyzed applying the 3D PICS coordinate system. A second reading of MRI was used to calculate interrater reliability (IRR). Origins and insertions were expressed in the 3D-Cartesian coordinate system in relation to point 0/0/0 (inferior pubic point). Distances and angles between muscles and planes were described using mean and standard deviation or median with first and third quartiles for all LAM subdivisions. Results Moderate to good IRR was reported except for points close to point 0/0/0. Origins showed no difference between groups. Insertions differed notably in the vertically oriented pubovaginal, puboperineal, and puboanal muscles, with patients exhibiting lower positions along the superior–inferior axis by 6.1–7.7, 8.8, and 8.0–8.2 mm respectively. In contrast, the insertions of the horizontally oriented puborectal muscle showed a smaller difference of 1.8 mm. Muscle lengths were also 4% to 24% longer in cases. Conclusions This in vivo MRI study reveals first geometric 3D data on LAM morphology in 3D PICS for both cases and controls. Exact 3D coordinates of origin/insertion points, lengths, and angles could serve as a basis for future imaging-based POP diagnostics.
... Interrater reliability was then calculated using Cohen's kappa coefficient (K) to test the agreement between the two researchers. The kappa value was κ = 0.94 for 90% overlap of coded segments, indicating near-perfect agreement (Landis and Koch, 1977). In addition, discrepancies in coding were compared and resolved through consensus. ...
Article
Full-text available
Embracing diversity and the demand for social justice are key concerns in modern societies, and it is imperative for physical education (PE) to address student diversity and promote social justice. However, achieving these goals presents challenges for PE teachers. In this context, social justice pedagogies (SJPs) provide guidance on how to address these demands. Although the concept of SJPs has been extensively theorized, there is a research gap concerning concrete teaching practices related to social justice. Despite a growing literature base, knowledge about specific teaching practices that PE teachers employ in their professional practice remains limited. This study aims to gain a deeper understanding of how SJPs can be realized in PE. Following an exploratory qualitative design, semi-structured interviews were conducted with Austrian secondary school PE teachers ( n = 20) to explore teaching practices aligned with SJPs. Qualitative content analysis, informed by SJPs, was applied to analyze the data. The results reveal various teaching practices related to SJPs, such as considering student diversity when selecting teaching content, making individualized adjustments, and promoting social skills and fairness. These teaching practices include teaching goals, content, didactic-methodical approaches, teacher–student(s) interactions, and grading. Based on the findings, this paper discusses how these teaching practices reflect the theoretical considerations of SJPs. It concludes that the explored teaching practices demonstrate opportunities for enacting SJPs in PE.
... we obtained a kappa value κ = (0.786 − 0.024)/(1 − 0.024) = 0.762/0.976 ≈ 0.78, indicating substantial agreement between the coders (Landis and Koch 1977). The high consistency between raters confirms that our interpretation of the topic is reliable, increasing the findings' rigor and validity in a specialized setting. ...
Article
Full-text available
The authenticity of cultural features in souvenir design is increasingly emphasized by heritage tourism. This study takes souvenirs of the Mogao Caves in Dunhuang, China, as an object of research and explores how designers integrate authentic cultural features of the heritage into product design. The study uses a qualitative approach to interview nine designers and two cultural experts from the souvenir design sector in Dunhuang to obtain perspectives on their consideration of authenticity in the souvenir design process. Thematic analysis was employed to identify critical patterns related to the authenticity of cultural representations in design. The findings revealed five themes: 1) Art Forms and Composition, 2) Contemporary Value, 3) Authenticity and Originality, and 4) Cultural Significance and Representation. The findings indicate that respecting Dunhuang murals’ aesthetic and historical significance while understanding their role in facilitating cross-cultural communication is vital to creating meaningful souvenirs. The study also emphasized the need for collaboration between designers and cultural authorities to ensure authenticity and relevance in the design of souvenirs. This study contributes to a broader discussion of the themes related to cultural heritage, sustainable development, and cultural product design.
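The excerpt citing this article reports a kappa built from the values 0.786 and 0.024, which matches the standard form κ = (p_o − p_e)/(1 − p_e). A minimal Python sketch of that arithmetic follows. Reading 0.786 as the observed agreement p_o and 0.024 as the chance agreement p_e is an assumption based on the shape of the formula; the function names and the idea of computing kappa from two label lists are illustrative and not taken from the cited study.

```python
from collections import Counter
from typing import Hashable, Sequence


def kappa_from_proportions(p_o: float, p_e: float) -> float:
    """Kappa from observed agreement p_o and chance-expected agreement p_e."""
    if p_e >= 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label proportions,
    # summed over categories.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return kappa_from_proportions(p_o, p_e)


if __name__ == "__main__":
    # Reproduces the arithmetic in the excerpt under the assumption above.
    print(round(kappa_from_proportions(0.786, 0.024), 2))
```

Separating the proportion-level formula from the label-level function makes it possible to check a reported kappa from published proportions alone, without access to the raw codes.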
... python package, Cohen's Kappa was calculated. Agreement scores were interpreted based on the thresholds suggested by Landis and Koch (1977), where values 0.21-0.40 indicate fair agreement, 0.41-0.60 ...
Article
Full-text available
Incorporating non-verbal data streams is essential to understanding the dynamics of interaction within collaborative learning environments in which a variety of verbal and non-verbal modes of communication intersect. However, the complexity of non-verbal data — especially gathered in the wild from collaborative learning contexts — demands efficient and effective analysis. Methodological advancements are necessary to handle this complexity, enabling researchers to derive meaningful insights from these data streams. The advancement of Generative Artificial Intelligence (GenAI) has significantly broadened its accessibility, making it available to a diverse array of users and demonstrating its utility in aiding data analytics. However, the application of GenAI in multimodal learning analytics, particularly within the context of feature extraction for studying collaborative learning interactions, remains unexplored. This study aims to explore how multimodal large language models (MLLMs) can be utilized as part of the multimodal learning analytics (MMLA) process, focusing on the extraction of postural behaviour. The study focuses on an illustrative case study involving 52 pre-service teachers engaged in a physics-based collaborative learning task, demonstrating how MLLMs can be used for feature extraction. The integration of GenAI techniques in learning research promises a new horizon in understanding and enhancing collaborative learning interactions.
... Categorization was performed by AJ and HC. The inter-coder reliability (κ) was 0.83 (near perfect [15]). ...
Article
Full-text available
Background Children of parents with depression have an increased risk of mental illness themselves and there is an urgent need to implement effective prevention programmes for this population. "Growing Up Healthy and Happy" (“GuG-Auf-Online") is an online family- and group-based cognitive-behavioural preventive programme with a strong evidence base. The aim of the current study was to understand what factors might hamper parents with depression from participating in the programme. Methods An online cross-sectional survey was conducted in Germany with 274 parents who fulfilled the inclusion criteria for the programme (parental history of depression and a child aged eight to 17 years with no mental illness). The survey included several a priori-defined barriers (e.g. online format, feelings of shame) which parents rated in terms of (a) whether the barrier was relevant to them and if so, (b) how much it held them back from participating. Open-ended questions identified additional barriers. In addition to qualitative content analysis according to Mayring (2008), Pearson correlations were calculated to determine whether the current severity of parents’ symptoms were associated with their responses. Results The following aspects emerged as relevant barriers: (a) shame regarding one's depression, (b) overburden and (c) avoidance (not wanting to be reminded of depression). There was no evidence that the online setting was a significant barrier. Most of the correlations between the current severity of parent’s symptoms and their responses were statistically significant (p < .0037). Conclusions The main barriers to participation in prevention related to individual characteristics/ emotional experiences rather than structural issues. Addressing these barriers in the advertisement of future programmes could improve uptake.
... The interobserver reliability analysis results for WLI yielded a kappa value of 0.591, indicating moderate concordance, while for NBI, a kappa value of 0.674 was obtained, indicating significant concordance [17]. Both observers are endoscopy experts with similar years of experience and expertise in using WLI and NBI techniques. Therefore, this study can use the results from either one of the observers. ...
Article
Full-text available
Background: White light imaging (WLI) is the current standard colonoscopy technique for diagnosing colorectal polyps in Indonesia. Various endoscopic imaging techniques have been developed to improve the accuracy of diagnosing colorectal polyps, one of which is narrow band imaging (NBI). We conducted a diagnostic study comparing the performance of NBI against WLI in distinguishing neoplastic from non-neoplastic colorectal polyps. Methods: This was a diagnostic study that analyzes endoscopic pictures of colorectal polyps in patients who underwent colonoscopy using the WLI and NBI techniques. Previously collected biopsy tissue specimens were re-examined by a single pathologist. Results: There were 117 subjects analyzed, and the proportion of subjects with neoplastic polyps was 65.8%. Common indications for colonoscopy were hematochezia (24.8%) and abdominal pain (23.9%). WLI showed moderate inter-observer reliability (kappa value=0.591), while NBI showed significant reliability (kappa value=0.674). NBI demonstrated better sensitivity (84.4%; 95% CI 74.4%–91.7%) and accuracy (78.6%; 95% CI 70.1%–85.7%) compared with WLI (sensitivity 74%; 95% CI 62.8%–83.4% and accuracy 71.8%; 95% CI 62.7%–79.7%). However, the specificity was the same (67.5%; 95% CI 50.9%–81.4%). Conclusion: NBI has better performance than WLI in distinguishing neoplastic and non-neoplastic colorectal polyps.
... After the examiners completed the training stage to ensure proper standardization of the observations of the mastoid processes, a good level of agreement was observed, with intra- and inter-examiner Kappa values >0.80 (0.848 and 0.846, respectively). Landis and Koch (1977) developed a six-level scale to interpret Kappa values, where values of zero are considered to indicate no agreement; 0.00 to 0.20, slight agreement; 0.21 to 0.40, fair agreement; 0.41 to 0.60, moderate agreement; 0.61 to 0.80, substantial agreement; and values above 0.81, almost perfect agreement. The maximum intra-observer error considered acceptable for the present study was 15%. ...
Article
Full-text available
This study aimed to analyze the accuracy of the macromorphoscopic evaluation of the mastoid process of the temporal bone in identified human skeletons for estimating sex. A total of 832 mastoid processes from individuals of both sexes, aged at least 20 years, were examined. All the mastoid processes were analyzed concerning their anatomical characteristics and classified according to the types established by Buikstra and Ubelaker (1994), assigning scores in which the lowest values (1 and 2) correspond to females, higher values (4 and 5) to males and value 3 represents a zone of indeterminate sex. For the total sample, score 3 was predominant, while score 1 was less frequent for both mastoid processes. In addition, men showed more occurrences of score 4, while women showed more classifications in score 2, both on the right and left, with a statistically significant relationship between sex and mastoid process. Overall accuracy was 76%, with males showing a higher percentage of correct observations than females (89.7% and 61.1%, respectively). The right mastoid process showed greater accuracy in predicting sex (p<0.001). Despite the greater number of studies on metric methods in the literature, morphoscopic methods are important in forensic anthropology, especially for sexual diagnosis, given their easy applicability and potential for good results.
... Inter-rater reliability between RD and CGH for full-text screening was calculated (k = .88) and classified as 'substantial agreement' (Landis & Koch, 1977). Any disagreements regarding eligibility were discussed and resolved between CGH and RD, with input from LF. ...
Article
Full-text available
Background A growing body of evidence demonstrates that school‐based mental health interventions may be potentially harmful. We define potential harm as any negative outcome or adverse event that could plausibly be linked to an intervention. In this scoping review, we examine three areas: the types of potential harms and adverse events reported in school‐based mental health interventions; the subgroups of children and adolescents at heightened risk; and the proposed explanations for these potential harms. Methods We searched eight databases (1960–2023), performed an author search and hand‐searched for published and unpublished studies that evaluated controlled trials of school‐based group mental health interventions based on cognitive‐behavioural therapy and/or mindfulness techniques, with the aim of reducing or preventing internalising symptoms or increasing wellbeing. Two independent raters screened studies for eligibility and assessed study quality using Cochrane tools. From eligible studies, we reviewed those that reported at least one negative outcome. Results Ten out of 112 (8.93%) interventions (described in 120 studies) reported at least one negative outcome such as a decrease in wellbeing or an increase in depression or anxiety. Three out of 112 interventions (2.68%) reported the occurrence of specific adverse events, none of which were linked to the intervention. Of the 15/120 studies rated as high quality (i.e. those with low risk of bias), 5/15 (33.33%) reported at least one negative outcome. Negative outcomes were found for a number of subgroups including individuals deemed at high risk of mental health problems, male participants, younger children and children eligible for free school meals. About half (54.5%) of the studies acknowledged that the content of the intervention itself might have led to the negative outcome. 
Conclusion To design and implement effective school‐based mental health interventions, the issues of potential harm and their related measurement and reporting challenges must be addressed.
... indicated substantial agreement, while 0.81-1.00 signaled almost perfect agreement [14]. Statistical analyses were performed using IBM SPSS Statistics 26.0 (IBM Corp., Armonk, NY). ...
Article
Full-text available
This study aimed to evaluate the utility of ANCA specificity as a primary criterion for classifying AAV subtypes to simplify the diagnostic process without compromising accuracy. A retrospective cohort study was conducted involving 310 patients diagnosed with AAV between January 2015 and December 2023 across three tertiary care centers affiliated with Peking University. Patients were reclassified using three methods: the European Medicines Agency (EMA) algorithm, the 2022 American College of Rheumatology/European Alliance of Associations for Rheumatology (ACR/EULAR) criteria, and ANCA specificity-based classification. Concordance between classification systems was assessed using Cohen’s kappa coefficients. ANCA specificity-based classification demonstrated substantial to almost perfect agreement with the 2022 ACR/EULAR criteria for MPA/MPO-AAV (kappa = 0.806) and GPA/PR3-AAV (kappa = 0.663). Many patients initially classified as GPA under the EMA algorithm were reclassified as MPA when using ANCA specificity. EGPA classification remained consistent across all methods (kappa = 0.725 between EMA and ACR/EULAR), suggesting that ANCA specificity is less critical for EGPA. The use of ANCA specificity simplified the classification process, aligning closely with the underlying pathophysiology of AAV subtypes. ANCA specificity serves as a valuable adjunct in the classification of AAV, particularly for distinguishing between MPA and GPA. Utilizing ANCA serotypes can simplify the diagnostic process, potentially facilitating earlier diagnosis and targeted treatment. For EGPA, traditional classification criteria remain effective. Incorporating ANCA specificity into clinical practice may enhance diagnostic accuracy and improve patient outcomes in AAV management. Key Points • ANCA-based classification aligns strongly with the 2022 ACR/EULAR criteria for MPA and GPA, providing a simplified diagnostic approach. 
• Adopting this approach can streamline the classification process, reduce invasive procedures, and enable earlier diagnosis while maintaining high concordance with established systems.
... Cohen's Kappa coefficient was calculated using the R statistical software and the irr package to assess the agreement between two evaluations performed by the same examiner [35]. The interpretation of the coefficient was based on the scale proposed by Landis and Koch [36]. The result of the intra-observer test was 1, reflecting a perfect level of agreement. ...
... Based on this argument each accuracy was calculated from generated error matrix (Appendix IV) as indicated in Table 16. (Landis & Koch, 1977). They noted that the interpretation of agreement for kappa statistics 0.8 to 1.00 is almost perfect agreement. ...
Thesis
Full-text available
The landfill method is recognized as the cheapest and most widely used solid waste management (SWM) system. However, improper disposal of solid waste is a serious problem in urban areas of Ethiopia like Areka town due to the rapid growth of population and urbanization. The objective of this study is to identify suitable landfill sites for solid waste using a geospatial-based multi-criteria decision analysis technique. To achieve this objective data were collected from field observation, Focus Group Decision, Key informants’ interviews, Global Positioning System, various ministry offices, websites, as well as published and unpublished materials. River, Land use land cover (LULC), soil type, borehole, residential area, protected area, elevation, geology, slope, and road were used as criteria to identify suitable landfill sites. ArcGIS, QGIS, and Erdas Imagine software were used to prepare the criteria maps. An analytic hierarchy process (AHP) was applied to derive the relative weights of criteria maps in Microsoft Excel and IDRIS software. All criteria maps were then combined with the weighted overlay tool in ArcGIS for the preparation of a final suitability map. The study result shows that the practice of SWM in Areka town is not well managed, thus illegal open dumping (41.6%) and burning (25%) were dominant SWM practices. The overall suitability result shows that of the total area of the study area, 5.4% was not suitable, 61.9% was less suitable, 31.4% was moderately suitable, and 1.3% was highly suitable for solid waste disposal. The final result shows that of the five candidate landfill sites, Landfill 1 (3.06 ha) and Landfill 5 (11.5 ha) were the 1st and 2nd most suitable sites for proposing a new landfill with the least negative impact on the environment and human health. In general, GIS-based MCDA is one of the most important technologies for landfill selection. 
Overall, the results of this study will help city planners, environmental managers, decision-makers, and sanitation and beautification offices to develop efficient waste management systems. Thus, the researcher recommends Areka town municipality use proposed landfill sites 1 and 2 for solid waste disposal and future studies should also consider various factors overlooked in this study. Keywords: Analytical Hierarchy Process, GIS, Landfilling, MCDA, Solid Waste Management
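The excerpt from this thesis derives accuracy from an error matrix and reads the resulting kappa against the Landis and Koch (1977) scale. As a minimal sketch of that calculation, the Python below computes Cohen's kappa from a square matrix of counts and maps it to the conventional Landis and Koch labels. The function names and the example matrix are hypothetical illustrations, not data from the thesis.

```python
from typing import Sequence


def cohen_kappa(matrix: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for a square error (confusion) matrix of counts.

    Rows are one rater (or the reference data), columns the other
    rater (or the classification). Undefined when chance agreement is 1.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of cases on the main diagonal.
    p_o = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: sum over categories of the product of marginal shares.
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def landis_koch_label(kappa: float) -> str:
    """Strength-of-agreement label following Landis and Koch (1977)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical 2x2 error matrix, for illustration only.
example = [[45, 5], [10, 40]]
k_value = cohen_kappa(example)
print(round(k_value, 2), landis_koch_label(k_value))
```

The cut-offs are conventions rather than statistical tests, so a label of this kind is best reported alongside the kappa value itself.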
Article
Full-text available
Examples are a fundamental content element that enables students to make abstract ideas concrete and meaningful in learning, teaching, and thinking processes. In concept teaching, examples are frequently used to support students' meaning-making. The aim of this study is to examine 10th-grade students' example elaboration skills regarding the concepts of sexual and asexual reproduction in terms of various variables. The study was designed using a descriptive survey model. The survey model was chosen to obtain information from a large group in order to examine the examples that participating students gave for coordinate concepts and their explanations of those examples. The sample was determined by simple random cluster sampling and consisted of 362 10th-grade students attending five state Anatolian High Schools in the central districts of Adana province in the 2019-2020 academic year. Data were collected using the Example Elaboration Test (Örnek Ayrıntılama Testi, ÖAT) and the Metacognitive Awareness Scale, Form B (Üst Bilişsel Farkındalık Ölçeği, ÜBFÖ-B). The data were analyzed with descriptive analysis, the Kruskal-Wallis H test, and the Mann-Whitney U test. According to the ÖAT findings, most participating students scored above average in giving examples and in establishing the example-term relationship, but below average in explaining the example-attribute relationship. Accordingly, it can be stated that most students had difficulty explaining the reasons for the examples they gave for coordinate concepts. Students who reported liking the course very much and not struggling in it had higher ÖAT scores than other students. Receiving supplementary instruction outside school from a private institution or tutor did not produce a significant difference in students' mean ÖAT scores. No significant difference was found between the mean ÖAT scores of students identified as having high and low metacognitive awareness. In concept teaching, when students are asked to give examples, elaborative questions and explanations that help establish the example-attribute relationship can be included.
Article
Background: Adolescence is a critical period for adopting lifestyle behaviors that influence long-term health. While dietary habits are well-documented, the broader socio-cultural and environmental factors impacting these behaviors are underexplored. This study aimed to develop a dietary adherence tool for adolescents that aligns with the Dietary Guidelines for Koreans, incorporating individual and environmental factors for a comprehensive understanding of dietary behaviors. Methods: A nationwide survey was conducted with 1010 adolescents in Korea to develop and validate a dietary adherence tool based on the Dietary Guidelines for Koreans. Factor analyses and structural equation modeling confirmed the construct validity of the tool, and a grading system was established to evaluate adherence based on survey responses. Results: The survey included participants from 17 regions across South Korea. The original 22 candidate items were revised through factor analysis, resulting in the deletion of 4 items and the addition of 6 new items, leading to a final 24-item tool encompassing three domains: food intake, dietary and physical activity behaviors, and dietary culture. The validity of the revised tool remained intact. The mean dietary guideline adherence score of the participants was 54.5 (SD = 12.1), with domain scores of 39.1 (SD = 14.4) for food intake, 51.6 (SD = 16.6) for dietary and physical activity behaviors, and 66.8 (SD = 15.4) for dietary culture. Conclusions: The dietary adherence tool offers a comprehensive framework for assessing adolescent dietary behaviors by integrating food intake, dietary and physical activity behaviors, and environmental factors. By considering sustainability and family support, it promotes healthier and more sustainable eating patterns among adolescents.
Article
Background/Objectives: Spinal cord injury (SCI) causes profound autonomic and endocrine dysfunctions, giving rise to adrenal insufficiency (AI), which is marked by a reduction in steroid hormone production. Left unaddressed, SCI-related AI (SCI-AI) can lead to life-threatening consequences such as severe hypotension and shock (i.e., adrenal crisis). However, symptoms are often non-specific, making AI challenging to distinguish from similar or overlapping cardiovascular conditions (e.g., orthostatic hypotension). Additionally, the etiology of SCI-AI remains unknown. This review aimed to synthesize the current literature reporting the prevalence, symptomology, and management of SCI-AI. Methods: A systematic search was performed to identify studies reporting AI following the cessation of glucocorticoid treatments in individuals with traumatic SCI. A random-effects meta-analysis was conducted to investigate the overall prevalence of SCI-AI. Results: Thirteen studies involving 545 individuals with traumatic SCI, most with cervical level injuries (n = 256), met the review criteria. A total of 4 studies were included in the meta-analysis. Primary analysis results indicated an SCI-AI pooled prevalence of 24.3% (event rate [ER] = 0.243, 95% confidence interval [CI] = 0.073–0.565, n = 4). Additional sensitivity analyses showed a pooled prevalence of 46.3% (ER = 0.463, 95%CI = 0.348–0.582, n = 2) and 10.8% (ER = 0.108, 95%CI = 0.025–0.368, n = 2) for case–control and retrospective cohort studies, respectively. High-dose glucocorticoid administration after SCI as well as the injury itself appear to contribute to the development of AI. Conclusions: The estimated prevalence of AI in people with traumatic SCI was high (24%). Prevalence was also greater among individuals with cervical SCI than those with lower-level lesions. Clinicians should be vigilant in recognizing the symptomatology and onset of SCI-AI. 
Further research elucidating its underlying pathophysiology is needed to optimize glucocorticoid administration for remediating AI in this vulnerable population.
Article
Assessing shoulder joint range of motion (ROM) is essential for diagnosing musculoskeletal disorders and optimizing treatments. This single-center pilot study evaluated the reliability and validity of iBalance, a single-camera markerless motion capture system, for measuring shoulder ROM. Forty participants (30 healthy individuals and 10 patients with adhesive capsulitis) underwent measurements of seven shoulder joint movements. Each movement was assessed three times by two raters using both iBalance and a goniometer, with measurements repeated after 1 week. The iBalance demonstrated excellent inter- and intra-rater reliability for flexion (ICC = 0.93 [0.91–0.95], 0.91 [0.88–0.94]), abduction (ICC = 0.97 [0.95–0.98], 0.93 [0.91–0.95]), and passive abduction (ICC = 0.97 [0.96–0.98], 0.98 [0.97–0.98]). The system also showed strong validity compared to the goniometer for flexion (ICC = 0.85 [0.68–0.92]), abduction (ICC = 0.95 [0.94–0.96]), and passive abduction (ICC = 0.97 [0.96–0.98]). Bland–Altman plots showed high consistency between the two devices for flexion, abduction, and passive abduction, with most data points falling within the limits of agreement. Patients with adhesive capsulitis exhibited greater variability than healthy individuals. No adverse events were reported, supporting the safety of the system. This study highlights the potential of a single-camera markerless motion capture system for diagnosing and treating shoulder joint disorders. The iBalance showed clinical applicability for measuring flexion, abduction, and passive abduction. Future enhancements to the algorithm and the incorporation of advanced metrics could improve its performance, facilitating broader clinical applications for diagnosing complex shoulder conditions.
Article
Purpose: This study aims to improve the effectiveness of coaching by enhancing the quality of coaching sessions. We developed and validated a formative evaluation measurement scale to assess the quality of coaching based on coaches’ skills, attitudes, and approaches from both coach and coachee perspectives. Design/methodology/approach: We developed this tool through two studies. In Study 1, we generated scale items and conducted item reduction, then evaluated the scale’s factor structure, reliability, and validity using a sample of 478 coach respondents. In Study 2, we assessed the scale’s factor structure and validity with 284 coachee respondents. Findings: The 31-item Coaching Session Evaluation Scale (CSES) showed good model fit and reliability. The validation and nomological network assessment found a positive correlation between CSES scores and coaching relationship quality, goal attainment, and action clarity. Originality/value: CSES enhances coaching evaluation by providing a formative assessment approach. Moreover, it enables evaluation by both parties to the coaching session, coachee and coach, based on the same evaluation framework. The scale contributes to the improvement of the session, which eventually results in a better or desired coaching outcome.
Article
Digital platforms offer unique opportunities for language learning beyond the physical classroom to online contexts, where learners’ autonomy is crucial for effective language learning. The development of one’s self-learning is addressed in the literature through Self-Regulated Language Learning (SRLL), a multifaceted system in which learners actively monitor, regulate, and control their learning process, demonstrating autonomy and responsibility. Evidence of how technology enhances SRLL in online settings has been widely acknowledged, but research on the positive effects of SRLL on online live-streaming platforms is still scarce. This study fills this gap by exploring the potential of Twitch.tv, a live-streaming platform primarily known for gaming and discourse-analytic studies, as a dynamic and interactive linguistic environment in which learners engage in authentic language use, reflexive dialogue, and cultural immersion, thus fostering SRLL strategies. Drawing on the technology-enhanced SRLL phases of forethought, performance, reflection, and evaluation, this study explores (a) how Twitch users engage in SRLL through live chat interactions and (b) what strategies are employed during the SRLL phases. Through a qualitative approach, chat logs were reported using descriptive statistics to provide a clear overview and then coded for thematic analysis. Findings revealed that learners engage in collaborative learning via metalinguistic feedback and peer support by exchanging, for instance, pronunciation tips and contextual knowledge. The affordances and challenges of leveraging Twitch for language learning are discussed, together with pedagogical implications for educators who seek scaffolding tactics for online language acquisition.
Article
Background/Objectives: This study investigates differences in craniofacial morphology including skull thickness, sella turcica morphology, nasal bone length, and posterior cranial fossa dimensions, as well as differences in head posture and deviations in upper spine morphology, in adult OSA patients compared to healthy controls with neutral occlusion. Methods: 51 OSA patients (34 men, 17 women, mean age 51.9 ± 11.3 years) and 74 healthy controls (19 men, 55 women, mean age 38.7 years ± 14.0 years) with neutral occlusion were included. Craniofacial morphology and head posture were investigated using cephalometric measurements on lateral cephalograms and morphological deviations in sella turcica and upper spine were assessed through visual description of lateral cephalograms. Results: OSA patients had significantly more retrognathic maxilla (p = 0.02) and mandible (p = 0.032 and p = 0.009), significantly larger beta-angle (p = 0.006), and significantly smaller jaw angle (p = 0.045) compared to controls. OSA patients had significantly larger length (p = 0.003, p = 0.001, p = 0.044) and depth of the posterior cranial fossa (p < 0.001) compared to controls. OSA patients had a significantly more extended (p < 0.001) and forward-inclined head posture (p < 0.001) and morphological deviations in the upper spine occurred significantly more often in OSA patients compared to controls (p = 0.05). No significant differences in skull thickness, nasal bone length, and morphological deviations in the sella turcica (p = 0.235) were found between the groups. Conclusions: Significant deviations were found in craniofacial morphology, head posture, and morphological deviations in the upper spine. The results may prove valuable in the diagnostics of OSA patients and in considerations regarding etiology and the phenotypic differentiation of OSA patients.
Article
Full-text available
This study examines the use of storytelling as an innovative design tool in accommodation businesses. Storytelling is an effective method in the tourism sector, offering customers memorable experiences and enhancing the competitive advantage of destinations. The research aims to understand the awareness and attitudes of hotel managers and customers in Karaman towards storytelling practices. Data collected through semi-structured interviews revealed that storytelling has the potential to improve customer satisfaction and serve as an innovative tool in promotion and marketing strategies. However, it was found that many hotel managers have limited awareness of this method and expressed concerns regarding costs and implementation challenges. Integrating the region's historical and cultural potential with storytelling could yield remarkable results in both local and international tourism. Based on the research findings, suggestions such as adopting digital storytelling practices, augmented reality, and integrating local cultural elements were proposed. Additionally, initiating low-cost pilot projects was recommended for regional promotion and destination management. Storytelling should be considered a part of sustainable tourism strategies in culturally rich regions like Karaman. The study's findings highlight that storytelling can provide creative differentiation for accommodation businesses, contributing significantly to customer loyalty, brand value, and regional promotion.
Article
Objective: Besides clinical factors, the decision-making process in daily practice is affected by non-clinical factors, such as experience and educational level, that cause variation among clinicians. This study aimed to evaluate the accuracy and agreement of dental students and periodontal experts with different education levels and clinical experience in the diagnosis and treatment planning process. Material and Methods: An anonymous survey of 10 periodontitis cases was given to 15 participants (5 periodontal experts (PE), 5 postgraduate periodontology students (PS), and 5 undergraduate dental students (DS)), who were asked to classify each case and mark their diagnosis and treatment plan. A consensus diagnosis and treatment plan, used as the gold standard, was prepared by two experienced periodontists. The accuracy of responses was determined against the gold standard, and inter-examiner agreement was assessed. Results: Except for the diagnosis of grade in the PE group (p=0.012), no significant difference was found between groups in terms of periodontitis diagnosis. In treatment responses, the PE group gave more accurate answers than the others. The agreement levels of all examiners for stage, grade, and extent were fair (κ=0.366, 0.222, and 0.287, respectively). Treatment planning showed low agreement (κ
Article
Background/Objectives: Posture is a significant indicator of health status in older adults. This study aimed to develop an automatic posture assessment tool based on sagittal photographs by validating recognition models using convolutional neural networks. Methods: A total of 9140 images were collected with data augmentation, and each image was labeled as either Ideal or Non-Ideal posture by physical therapists. The hidden and output layers of the models remained unchanged, while the loss function and optimizer were varied to construct four different model configurations: mean squared error and Adam (MSE & Adam), mean squared error and stochastic gradient descent (MSE & SGD), binary cross-entropy and Adam (BCE & Adam), and binary cross-entropy and stochastic gradient descent (BCE & SGD). Results: All four models demonstrated an improved accuracy in both the training and validation phases. However, the two BCE models exhibited divergence in validation loss, suggesting overfitting. Conversely, the two MSE models showed stability during learning. Therefore, we focused on the MSE models and evaluated their reliability using sensitivity, specificity, and Prevalence-Adjusted Bias-Adjusted Kappa (PABAK) based on the model’s output and correct label. Sensitivity and specificity were 85% and 84% for MSE & Adam and 67% and 77% for MSE & SGD, respectively. Moreover, PABAK values for agreement with the correct label were 0.69 and 0.43 for MSE & Adam and MSE & SGD, respectively. Conclusions: Our findings indicate that the MSE & Adam model, in particular, can serve as a useful tool for screening inspections.
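The abstract above reports sensitivity, specificity, and Prevalence-Adjusted Bias-Adjusted Kappa (PABAK) for a binary classifier. As a minimal sketch, PABAK for two categories reduces to twice the observed agreement minus one, which the Python below computes from the four cells of a 2x2 table. The counts are hypothetical and were chosen only so that sensitivity and specificity equal 85% and 84%; the abstract does not give the actual class balance, so this is not a reconstruction of the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly positive cases that the model labels positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of truly negative cases that the model labels negative."""
    return tn / (tn + fp)


def pabak(tp: int, fp: int, fn: int, tn: int) -> float:
    """Prevalence-adjusted bias-adjusted kappa for a binary 2x2 table.

    With two categories, PABAK = 2 * p_o - 1, where p_o is the
    observed proportion of agreement.
    """
    n = tp + fp + fn + tn
    p_o = (tp + tn) / n
    return 2 * p_o - 1


# Hypothetical counts (assumed balanced classes), for illustration only.
tp, fn, tn, fp = 85, 15, 84, 16
print(sensitivity(tp, fn), specificity(tn, fp), round(pabak(tp, fp, fn, tn), 2))
```

Because PABAK depends only on observed agreement, it does not shrink when one category is rare, which is why it is sometimes reported in place of Cohen's kappa for screening tools.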
Article
Full-text available
Forensic anthropology is a very important auxiliary means of identification for estimating age in cases of remains in an advanced state of decomposition, mutilation, skeletonization, or fragmentation. Objective: To evaluate the applicability of qualitative methods for age estimation in a modern Brazilian sample. Morphological analyses of the pubic symphyses and the medial clavicular epiphyses were performed on photographs (Canon® digital camera, using the ABFO No. 2 scale). The data collected included sex, age, ancestry, and cause of death. The sample selection criteria considered the integrity of the skeletons, the quality of preservation, and an age range of 17 to 65 years. The statistical analysis included chi-square tests to assess the association between morphological phases and age groups, as well as confidence intervals, sensitivity, and specificity. The sample comprised 15 female skeletons (22 to 65 years) and 40 male skeletons (17 to 65 years), with a mean and median age of 37 years. The statistical tests indicated significance only in the male sample for the pubic symphysis analysis. The results were promising, although only the male sample in the pubic symphysis analysis showed statistical significance. This study underscores the importance of regional validation and of applying specific methodologies to diverse population samples.
Article
Aim: To evaluate the prevalence of apical periodontitis (AP) and caries in subjects with psoriasis vulgaris. Methodology: In total, 152 patients with psoriasis vulgaris were included in the study. The severity and extent of psoriasis were assessed according to the Psoriasis Area Severity Index (PASI), the Body Surface Area (BSA) and the Physician's Global Assessment Scale (PGA). Periapical status was assessed through dental examination and periapical radiographs. Data regarding the Periapical Index (PAI), caries experience expressed as the Decayed, Missing, Filled Teeth Index (DMFT) and psoriasis medications were recorded. A predictive logistic regression model for the presence of AP and a linear regression model were then built to relate the severity and extent of AP to the type of drug therapy taken for psoriasis and to the severity and extent of the skin disease. Results: Subjects with severe/moderate psoriasis showed a significantly higher prevalence of AP (p = .002) and a higher PAI score (p = .0035) than subjects with mild psoriasis. No significant correlation was found between AP and caries experience (p = .76). The logistic regression model showed that moderate/severe psoriasis increased the odds of having AP [odds ratio (OR) = 1.30 ± 1.088, 1.55]. A negative linear relationship between biological drug intake and PAI score value was observed (coefficient = −.54; p = .04). Conclusions: The degree of severity of psoriasis is significantly associated with AP, suggesting that psoriasis may play a role in the pathogenesis of AP. However, no significant correlation was observed for caries experience. Furthermore, the immune-modulating drugs taken by these patients did not seem to have different effects on the prevalence of AP.
Article
Objective: Breast MRI affords high sensitivity with intermediate specificity for cancer detection. Ultrafast dynamic contrast-enhanced (DCE) MRI assesses early contrast inflow with potential to supplement or replace conventional DCE-MRI kinetic features. We sought to determine whether radiologists’ evaluation of ultrafast DCE-MRI can increase specificity of a clinical MRI protocol. Methods: In this IRB-approved, HIPAA-compliant study, breast MRIs from March 2019 to August 2020 with a BI-RADS category 3, 4, or 5 lesion were identified. Ultrafast DCE-MRI was acquired during the first 40 seconds after contrast injection and before conventional DCE-MRI postcontrast acquisitions in the clinical breast MRI protocol. Three radiologists masked to outcomes retrospectively determined lesion time to enhancement (TTE) on ultrafast DCE-MRI. Interreader agreement, differences between benign and malignant lesion TTE, and TTE diagnostic performance were evaluated. Results: Ninety-five lesions (20 malignant, 75 benign) were included. Interreader agreement in TTE was moderate to substantial for both ultrafast source images and subtraction maximum intensity projections (overall κ = 0.63). Time to enhancement was greater across benign lesions compared with malignancies (P < .05), and all lesions demonstrating no enhancement during the ultrafast series were benign. With a threshold TTE ≥40 seconds, ultrafast DCE-MRI yielded an average 40% specificity (95% CI, 30%-48%) and 92% sensitivity (95% CI, 81%-100%), yielding a potential reduction in 31% (95% CI, 23%-39%) of benign follow-ups based on conventional DCE-MRI. Conclusion: Ultrafast imaging can be added to conventional DCE-MRI to increase diagnostic accuracy while adding minimal scan time. Future work to standardize evaluation criteria may improve interreader agreement and allow for more robust ultrafast DCE-MRI assessment.
Article
Full-text available
Purpose: The Integrated Palliative care Outcome Scale (IPOS) is a specific tool for assessing needs in palliative care, recording and monitoring physical symptoms, emotional concerns, and communication and practical issues. This study aimed to evaluate whether the IPOS tool was able to assess the impact of an at-home palliative care program on physical symptoms and psychosocial problems in advanced cancer patients. Methods: This observational prospective longitudinal mixed-method study included advanced cancer patients assisted at home. The IPOS questionnaire (patient version, 7-day recall) was administered at entry and after 2 and 4 weeks. A qualitative thematic analysis (TA) of the first open-ended question was performed. Change over time in IPOS scores was analyzed by Friedman’s test for repeated measures. Results: Among the 60 patients included (29 men, 31 women; 68.2 ± 14.0 years), 47 completed the 4-week observation period. TA indicated that the 3 main themes running through the three surveys (at entry, day 14, and day 28) relate to patients’ concerns about symptoms and side effects of treatments, family members, the evolution of the disease, and daily issues. The repeated measures test demonstrated that patients entering with a medium–high IPOS total score (n = 27) showed a significant decrease in IPOS total score (p = 0.003) and in the physical symptoms (p = 0.002) and communication and practice (p = 0.028) subscales after 2 and 4 weeks. Conclusion: Patients entering home care with a higher burden of symptoms and psychosocial problems reported a significant decrease in IPOS scores. In these patients, IPOS was responsive to change, showing substantial clinical improvements after the activation of home assistance.
Article
Full-text available
As a result of the "Digital School" initiative of the Austrian Federal Ministry of Education, Science and Research, and the associated provision of digital devices to lower secondary students, the demand for high-quality digital learning materials, particularly in mathematics, has risen sharply. Empirical research to date on the optimal design of digital learning environments (DLEs), especially in geometry, is limited, however. This study examines central design elements of DLEs for geometry learning at lower secondary level from the students' perspective. In a qualitative study with 17 fifth-grade students, the use and perception of digital geometry learning materials from the project FLINK in Mathe were analyzed using the think-aloud method. The analysis identified three central design elements for DLEs in geometry: i) navigation, ii) contextualization and framing of tasks, and iii) fostering experiences of self-efficacy. The study underscores the importance of actively designing DLEs in a way that takes into account both students' independence and teachers as a source of support. The results provide valuable guidance for developing future digital learning materials that meet students' specific needs and preferences.
Article
Full-text available
Currently, one of the most recurrent environmental problems in the Peruvian Amazon is forest fragmentation as a consequence of deforestation. In this context, the present research aimed to quantify forest cover and other land uses, estimate the change in forest cover, and calculate fragmentation metrics in four buffer zones of protected natural areas located in the San Martín region, Peru, for the years 2017 and 2021. The cartographic inputs were the spatial data on land use and land use change provided by Esri Land Cover and the spatial data on forests and forest loss from the Geobosque platform. As a result, forest cover and other land uses in the four buffer zones were quantified for 2017 and 2021 with considerable thematic accuracy; in all four study zones, forest cover occupies the largest area. The rate of change from forest to non-forest was -0.71 for the buffer zone of Río Abiseo National Park, -1.30 for Cordillera Azul National Park, and -1.87 for the Alto Mayo Protection Forest, while for the Pacaya Samiria National Reserve the area covered by forest remained constant. Finally, the calculated metrics indicate that the forests of the buffer zones in the San Martín region are moderately fragmented.
Article
The clinical manifestations of canine atopic dermatitis (AD) are generally associated with immunoglobulin E (IgE) reactions against environmental allergens, mainly house dust mites. Serological and intradermal tests are indicated for its diagnosis, to guide allergen control and the choice of immunotherapy protocols. The aim of this study was to evaluate the accuracy, predictive values, and agreement of a serological test that uses the mast cell Fc fraction and of a polyclonal test, compared with a high-specificity intradermal (ID) test. ID tests were performed on 78 dogs (55 atopic and 23 healthy). The serological tests were performed on 20 atopic dogs with a positive ID test and 19 healthy dogs with a negative ID test. The polyclonal test showed an accuracy of 0.49, sensitivity (Se) of 74%, specificity (Sp) of 26%, positive predictive value (PPV) of 50%, and negative predictive value (NPV) of 45%. For each mite, accuracy ranged from 0.46 to 0.58, with Se and Sp ranging between 36 and 78%. The FceR1-alpha test showed an accuracy of 0.46, Se of 65%, Sp of 26%, PPV of 48%, and NPV of 42%. Agreement between the serological tests evaluated was moderate, and there was no agreement between them and the ID test. The results indicate that the serological tests evaluated are not suitable for choosing allergens for allergen-specific immunotherapy when a high-specificity ID test is used as the reference.
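The accuracy, sensitivity, specificity, and predictive values reported in this abstract all derive from a 2x2 table comparing an index test with a reference test. The Python below is a minimal sketch of those definitions; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from the study, whose cell counts the abstract does not report.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard accuracy measures for an index test against a reference test.

    tp/fn: reference-positive cases the index test calls positive/negative.
    tn/fp: reference-negative cases the index test calls negative/positive.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=30, fp=15, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Predictive values depend on the proportion of reference-positive cases in the sample, so PPV and NPV from a selected sample may not transfer to a population with a different prevalence.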
Article
Full-text available
Objective: This study evaluates the reliability, usefulness, quality, and readability of ChatGPT’s responses to frequently asked questions about scoliosis. Methods: Sixteen frequently asked questions, identified through an analysis of Google Trends data and clinical feedback, were presented to ChatGPT for evaluation. Two independent experts assessed the responses using a 7-point Likert scale for reliability and usefulness. The overall quality was also rated using the Global Quality Scale (GQS). To assess readability, various established metrics were employed, including the Flesch Reading Ease score (FRE), the Simple Measure of Gobbledygook (SMOG) Index, the Coleman-Liau Index (CLI), the Gunning Fog Index (GFI), the Flesch-Kincaid Grade Level (FKGL), the FORCAST Grade Level, and the Automated Readability Index (ARI). Results: The mean reliability scores were 4.68 ± 0.73 (Median: 5, IQR 4–5), while the mean usefulness scores were 4.84 ± 0.84 (Median: 5, IQR 4–5). Additionally, the mean GQS scores were 4.28 ± 0.58 (Median: 4, IQR 4–5). Inter-rater reliability analysis using the intraclass correlation coefficient showed excellent agreement: 0.942 for reliability, 0.935 for usefulness, and 0.868 for GQS. While general informational questions received high scores, responses to treatment-specific and personalized inquiries required greater depth and comprehensiveness. Readability analysis indicated that ChatGPT’s responses required at least a high school senior to college-level reading ability. Conclusion: ChatGPT provides reliable, useful, and moderate-quality information on scoliosis but has limitations in addressing treatment-specific and personalized inquiries. Caution is essential when using Artificial Intelligence (AI) in patient education and medical decision-making.
Thesis
Full-text available
The overall aim of this thesis was to report on the development and implementation of a standardized psychosocial autopsy to understand and prevent suicide in the Netherlands. Chapter 2 opens with an exploration of stakeholder perceptions and needs concerning the implementation of a standardized psychosocial autopsy in the Netherlands. Standardized herein refers to a specific, predetermined set of guiding principles and conditions relating to processes involved with the psychosocial autopsy, ranging from the interview instrument to data collection and the translation into recommendations for prevention. In the second part of this dissertation, we present findings from psychosocial autopsy studies into adolescent suicides and railway suicides. In chapter 3 we investigated the differences in suicide-related communication between young male and female (aged under 20 years old) suicide decedents. We used a qualitative analysis technique called the ‘Constant Comparative Method’ to investigate 798 suicide-related communication events reported in interviews concerning 35 young male and female decedents. In chapter 4 we explored the meaning of social media in the lives of the adolescents who died by suicide. Interpretative Phenomenological Analysis was performed to assess the role social media played in the lives and deaths of these adolescents, with particular attention to the ways in which social media use affected their wellbeing and distress. Chapter 5 reports a mixed-methods psychosocial autopsy of railway suicides. In this study, we combined data detailing the sociodemographic characteristics of all railway suicide decedents in the Netherlands from 2017 and 2021, with data from in-depth psychosocial autopsy interviews concerning 39 railway suicide decedents.
We started working towards a retrospective, dynamic, cross-sectional cohort that facilitates real-time monitoring of psychosocial characteristics of suicides and allows for an analysis of time-trends and clusters in the future. Chapter 6 describes findings from the pilot study with the new, mixed-methods psychosocial autopsy of suicide in young and middle-aged people. In chapter 7, the use of Large Language Models for automated deductive coding of interview data is explored and evaluated. In chapter 8, we discuss the findings from our research that have so far received little attention in scientific literature, reflect on lessons learned, and discuss the future of the psychosocial autopsy.
Article
Citriculture has worldwide importance, and monitoring the nutritional status of plants through leaf analysis is essential. Recently, proximal sensing has supported this process, although there is a lack of studies conducted specifically for citrus. The objective of this study was to evaluate the application of portable X-ray fluorescence spectrometry (pXRF) combined with machine learning algorithms to predict the nutrient content (B, Ca, Cu, Fe, K, Mg, Mn, P, S, and Zn) of citrus leaves, using inductively coupled plasma optical emission spectrometry (ICP-OES) results as a reference. Additionally, the study aimed to differentiate 15 citrus scion/rootstock combinations via pXRF results and investigate the effect of the sample condition (fresh or dried leaves) on the accuracy of pXRF predictions. The samples were analyzed with pXRF both fresh and after drying and grinding. Subsequently, the samples underwent acid digestion and analysis via ICP-OES. Predictions using dried leaves yielded better results (R² from 0.71 to 0.96) than those using fresh leaves (R² from 0.35 to 0.87) for all analyzed elements. Predictions of scion/rootstock combinations were also more accurate with dry leaves (overall accuracy = 0.64, kappa index = 0.62). The pXRF accurately predicted nutrient contents in citrus leaves and differentiated leaves from 15 scion/rootstock combinations. This can significantly reduce costs and time in the nutritional assessment of citrus crops.
Article
Energy is one important concept in physics, but science education research has repeatedly shown that students struggle to develop a full understanding of energy. Especially challenging for students is the notion of potential energy. Overwhelmed by the sheer number of potential energy forms, students struggle to make connections between them. Students often struggle to develop a conceptual understanding of potential energy, resulting in difficulties in learning about energy in general and their continued learning about energy. To address this issue, scholars have proposed incorporating fields into energy instruction. Through fields, the various forms of potential energy can be connected and synthesized into two simple underlying principles: (1) fields mediate interaction‐at‐a‐distance and (2) the energy is stored in a field with the amount of energy depending on the configuration of the objects. Recent studies suggest that incorporating fields in middle school energy instruction is feasible and effective; however, little is known about whether and how middle school students connect energy and fields ideas to benefit their learning. In response to this research gap, we developed a unit on energy with fields and a comparable unit without fields and compared students' learning on energy in these two units. In a mixed‐methods approach, we examined students' learning on energy during an introductory and a continued learning unit on energy with N = 67 students from grade 7. Our findings suggest that students who learned about energy with fields outperformed students who learned about energy without fields. Furthermore, fields‐based energy instruction seemed to support students in developing better‐connected knowledge networks that reflect deeper conceptual understanding of energy. Our findings suggest that incorporating fields into energy instruction could help students to better understand energy and to better continue learning about energy.
Article
Full-text available
Notes that various procedures are available for measuring agreement among 2 or more observers who classify responses among nominal categories, but that different problem situations require different measures. The general model of a contingency table with fixed margins is used to suggest (a) a measure of level of agreement among several observers when compared internally, (b) a conditional measurement of agreement for several observers compared internally, (c) a test for the joint agreement of several observers when compared with a standard, and (d) a statistic for evaluating the pattern of agreement between 2 observers. Illustrations are presented for each situation, and results of a Monte Carlo study of the behavior of the pattern agreement statistic are discussed. (19 ref.) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
2 statistics, kappa and weighted kappa, are available for measuring agreement between 2 raters on a nominal scale. Formulas for the standard errors of these 2 statistics are in error in the direction of overestimation, so that their use results in conservative significance tests and confidence intervals. Valid formulas for the approximate large-sample variances are given, and their calculation is illustrated using a numerical example. (PsycINFO Database Record (c) 2006 APA, all rights reserved). © 1969 American Psychological Association.
Article
Full-text available
Introduced the statistic kappa to measure nominal scale agreement between a fixed pair of raters. Kappa was generalized to the case where each of a sample of 30 patients was rated on a nominal scale by the same number of psychiatrist raters (n = 6), but where the raters rating one subject were not necessarily the same as those rating another. Large sample standard errors were derived.
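The multi-rater generalization described above can be illustrated with a minimal sketch. It assumes the commonly used formulation of this statistic (mean per-subject pairwise agreement, corrected by the chance agreement implied by the pooled category proportions); the function name `fleiss_kappa` and the `counts` layout are illustrative assumptions, not taken from the cited paper, and standard errors are not computed.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Multi-rater kappa for nominal ratings (illustrative sketch).

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of
    raters (at least 2), but the raters need not be the same people.
    """
    n_subjects = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two ratings per subject are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")

    # Pooled proportion of all assignments that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]

    # Per-subject agreement: share of rater pairs that agree.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    observed = sum(per_subject) / n_subjects
    expected = sum(p * p for p in p_cat)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)
```

For instance, with 6 raters per subject, `fleiss_kappa([[6, 0], [0, 6]])` gives 1.0 (complete agreement), while evenly split ratings such as `[[3, 3], [3, 3]]` give a negative value, since observed agreement falls below the chance level.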
Article
Full-text available
A previously described coefficient of agreement for nominal scales, kappa, treats all disagreements equally. A generalization to weighted kappa (Kw) is presented. The Kw provides for the incorporation of ratio-scaled degrees of disagreement (or agreement) to each of the cells of the k × k table of joint nominal scale assignments such that disagreements of varying gravity (or agreements of varying degree) are weighted accordingly. Although providing for partial credit, Kw is fully chance corrected. Its sampling characteristics and procedures for hypothesis testing and setting confidence limits are given. Under certain conditions, Kw equals product-moment r. The use of unequal weights for symmetrical cells makes Kw suitable as a measure of validity.
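As an illustration of the weighting idea in this abstract, here is a minimal sketch of weighted kappa in its disagreement-weight form, one minus the ratio of observed to chance-expected weighted disagreement. The function name `weighted_kappa` and its arguments are illustrative assumptions; with the default 0/1 weights it reduces to unweighted kappa. Sampling variances and confidence limits are not covered.

```python
from typing import Optional, Sequence


def weighted_kappa(
    table: Sequence[Sequence[float]],
    weights: Optional[Sequence[Sequence[float]]] = None,
) -> float:
    """Weighted kappa for two raters (illustrative sketch).

    table[i][j] is the number of subjects placed in category i by the
    first rater and category j by the second. weights[i][j] is the
    disagreement weight for that cell (0 means full agreement). When
    weights is None, every off-diagonal cell gets weight 1, which gives
    unweighted kappa.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    if weights is None:
        weights = [[0.0 if i == j else 1.0 for j in range(k)] for i in range(k)]

    # Marginal proportions for each rater.
    row_marg = [sum(table[i]) / total for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]

    # Weighted disagreement actually observed.
    observed = sum(
        weights[i][j] * table[i][j] / total for i in range(k) for j in range(k)
    )
    # Weighted disagreement expected if the raters were independent.
    expected = sum(
        weights[i][j] * row_marg[i] * col_marg[j]
        for i in range(k)
        for j in range(k)
    )
    if expected == 0.0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - observed / expected
```

Passing squared-distance weights, for example `[[(i - j) ** 2 for j in range(k)] for i in range(k)]`, penalizes distant disagreements more heavily than adjacent ones, which is the kind of partial credit the abstract describes for ordered categories.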
Article
The statistical analysis of data in multi-dimensional contingency tables is discussed in terms of appropriate underlying probability models. Emphasis is placed on the distinction between ‘factors’ (such as treatments or blocks) which have fixed marginal totals and ‘responses’ (such as category of performance) which have random marginal totals. Hence, four principal cases arise: (i) the ‘multi-response, no factor’ tables, (ii) the ‘multi-response, uni-factor’ tables, (iii) the ‘multi-response, multi-factor’ tables, (iv) the ‘uni-response, multi-factor’ tables. For situations (i) and (ii), the concept of ‘no interaction’ is related to questions regarding the pattern of association among responses. However, for situation (iv), it is related to how factors combine (e.g., additively) to determine the response distribution. Finally, for situation (iii), both types of questions arise. For each of the different types of tables, the problem of formulating appropriate hypotheses of ‘no interaction’ is considered. The corresponding test statistics are based upon a general and computationally simple criterion of Wald [1943]. The suggested methods are illustrated with several numerical examples.
Article
This paper is concerned with contingency tables which are analogous to the well-known mixed model in analysis of variance. The corresponding experimental situation involves exposing each of n subjects to each of the d levels of a given factor and classifying the d responses into one of r categories. The resulting data are represented in an r × r × ⋯ × r contingency table of d dimensions. The hypothesis of principal interest is equality of the one-dimensional marginal distributions. Alternatively, if the r categories may be quantitatively scaled, then attention is directed at the hypothesis of equality of the mean scores over the d first order marginals. Test statistics are developed in terms of minimum Neyman χ² or equivalently weighted least squares analysis of underlying linear models. As such, they bear a strong resemblance to the Hotelling T² procedures used with continuous data in mixed models. Several numerical examples are given to illustrate the use of the various methods discussed.
Article
When populations are cross-classified with respect to two or more classifications or polytomies, questions often arise about the degree of association existing between the several polytomies. Most of the traditional measures or indices of association are based upon the standard chi-square statistic or on an assumption of underlying joint normality. In this paper a number of alternative measures are considered, almost all based upon a probabilistic model for activity to which the cross-classification may typically lead. Only the case in which the population is completely known is considered, so no question of sampling or measurement error appears. We hope, however, to publish before long some approximate distributions for sample estimators of the measures we propose, and approximate tests of hypotheses. Our major theme is that the measures of association used by an empirical investigator should not be blindly chosen because of tradition and convention only, although these factors may properly be given some weight, but should be constructed in a manner having operational meaning within the context of the particular problem.
Article
The estimates of Koch [1967a] have the undesirable property that they may change in value if the same constant is added to each of the observations. In this paper, an alternative procedure based on the same general principles is developed and applied to a variety of models. As before, the estimators obtained are unbiased and consistent. They are also reasonably easy to compute. Finally, in the case of balanced experiments, they coincide with those obtained from the analysis of variance. On the other hand, their structure is more complex than that of the estimators considered in the previous paper. In particular, the derivation of their covariance matrix is much more complicated, and hence no attempt has been made here to study its properties.
Article
A general method of estimation of variance components in random-effects models of the nested and/or classification type is considered. If a given parameter is estimable with respect to some particular experimental design (i.e., an unbiased estimate of the parameter may be obtained from the experiment), then the suggested estimator may be readily computed with only the aid of a desk calculator. The estimates are always unbiased and consistent (with respect to the structure of the experimental design); in the case of balanced experiments, they coincide with those obtained from the analysis of variance.
Article
A very important and yet widely misunderstood concept or problem in science and technology is that of precision and accuracy of measurement. It is therefore necessary to define the terms precision and accuracy (or imprecision and inaccuracy) clearly and analytically if possible. Also, we need to establish and develop appropriate statistical tests of significance for these measures, since generally a relatively small number of measurements will be made or taken in most investigations. In this paper a discussion is given of some of the pertinent literature for estimating variances in errors of measurement, or the “imprecisions” of measurement, when two or three instruments are used to take the same observations on a series of items or characteristics. Also, present techniques for comparing the imprecision of measurement of one instrument with that of a second instrument through the use of statistical tests of significance are reviewed, as well as procedures for detecting the significance of the difference in biases or levels of measurement of two instruments. Finally, we indicate methods of extending present theory to the case of three measuring instruments, for which rather sensitive statistical tests of significance are developed for dealing with the precision and accuracy problem. An example for the three instrument case is given to illustrate the suggested methodology of analysis.
Article
The statistical analysis of multi-dimensional contingency tables is discussed from the point of view of the associated underlying model. Different formulations of hypotheses of ‘no interaction’ are considered. The corresponding test statistics are based on a general and computationally simple criterion originally due to Wald [1943]. The suggested methods are illustrated with several numerical examples.
Article
This paper deals with the theory of a proposed method for the statistical study of measuring processes. The practical aspects of the method, including computational details, are discussed in a companion paper published in the ASTM Bulletin. In the present article a theoretical framework is proposed for the mathematical expression of the sources of variation in measuring methods and a suitable method of statistical analysis is described. Particular attention is given, both here and in the companion paper, to interlaboratory studies of test methods. An illustration based on data taken from the chemical literature is appended.
Article
A generalization is given to the multivariate case of the linear model usually employed in the determination of accuracy of observations. Likelihood ratio tests are derived for testing hypotheses concerning systematic differences among observers, and a criterion is suggested for evaluating the magnitude of errors of measurement.
Article
Department of Biostatistics, School of Public Health, University of North Carolina, Chapel Hill, North Carolina 27514, U.S.A.
Article
Determining the extent of association between 2 ordinal variables is a recurrent problem in psychological research. Several statistics are available and include rho, gamma, and tau. The basic problem with these techniques is that they measure order rather than extent of agreement. As a consequence, 2 quite different sets of ordinal data will produce the same statistical results, providing only that the ordering of each set of rankings is a constant. A new statistic, which can be expressed as a simple percentage of agreement, is proposed as an alternative method, and applied to a hypothetical research problem. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Establishes the property that if Vij = (i − j)² (Vij denotes the disagreement weight in the weighted kappa formula) and if the variables can be scaled 1 and 2, then irrespective of the marginal distributions, weighted kappa is identical with the intraclass correlation coefficient in which the mean difference between the raters is included as a component of variability. A discussion of this property is presented along with an example. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Summary This paper reviews research situations in medicine, epidemiology and psychiatry, in psychological measurement and testing, and in sample surveys in which the observer (rater or interviewer) can be an important source of measurement error. Moreover, most of the statistical literature in observer variability is surveyed with attention given to a notational unification of the various models proposed. In the continuous data case, the usual analysis of variance (ANOVA) components of variance models are presented with an emphasis on the intraclass correlation coefficient as a measure of reliability. Other modified ANOVA models, response error models in sample surveys, and related multivariate extensions are also discussed. For the categorical data case, special attention is given to measures of agreement and tests of hypotheses when the data consist of dichotomous responses. In addition, similarities between the dichotomous and continuous cases are illustrated in terms of intraclass correlation coefficients. Finally, measures of agreement, such as kappa and weighted kappa, are discussed in the context of nominal and ordinal data. A proposed unifying framework for the categorical data case is given in the form of concluding remarks.
Article
This paper is concerned with the analysis of multivariate categorical data which are obtained from repeated measurement experiments. An expository discussion of pertinent hypotheses for such situations is given, and appropriate test statistics are developed through the application of weighted least squares methods. Special consideration is given to computational problems associated with the manipulation of large tables including the treatment of empty cells. Three applications of the methodology are provided.
Article
This paper presents a general statistical methodology for the analysis of multivariate categorical data involving agreement among more than two observers. Since these situations give rise to very large contingency tables in which most of the observed cell frequencies are zero, procedures based on indicator variables of the raw data for individual subjects are used to generate first-order margins and main diagonal sums from the conceptual multidimensional contingency table. From these quantities, estimates are generated to reflect the strength of an internal majority decision on each subject. Moreover, a subset of observers who demonstrate a high level of interobserver agreement can be identified by using pairwise agreement statistics between each observer and the internal majority standard opinion on each subject. These procedures are all illustrated within the context of a clinical diagnosis example involving seven pathologists.
Article
GENCAT is a computer program which implements an extremely general methodology for the analysis of multivariate categorical data. This approach essentially involves the construction of test statistics for hypotheses involving functions of the observed proportions which are directed at the relationships under investigation and the estimation of corresponding model parameters via weighted least squares computations. Any compounded function of the observed proportions which can be formulated as a sequence of the following transformations of the data vector (linear, logarithmic, exponential, or the addition of a vector of constants) can be analyzed within this general framework. This algorithm produces minimum modified chi-square statistics which are obtained by partitioning the sums of squares as in ANOVA. The input data can be any of: (a) frequencies from a multidimensional contingency table; (b) a vector of functions with its estimated covariance matrix; or (c) raw data in the form of integer-valued variables associated with each subject. The input format is completely flexible for the data as well as for the matrices.
Article
At least a dozen indexes have been proposed for measuring agreement between two judges on a categorical scale. Using the binary (positive-negative) case as a model, this paper presents and critically evaluates some of these proposed measures. The importance of correcting for chance-expected agreement is emphasized, and identities with intraclass correlation coefficients are pointed out.
Article
The minimum modified chi-square method of analyzing contingency tables is extended to compounded logarithmic, exponential and linear functions. These compounded functions allow one to consider the following practical situations in terms of a general technique: 1) patterns of association in square contingency tables, as related to functions of diagonal totals, 2) rank correlation coefficients, 3) "ridits", 4) partial association. The derivation of this procedure and examples showing its application are presented.
Article
For epidemiologic and comparative pathologic studies of cerebral atherosclerosis, assessment of reliability of measurements is necessary. Such a study is described along with the measurement method used. The development of the methodology for assessing reliability of data is presented. Within and between coder variability is estimated. For the biometrician, the salient feature is that several methods of determining reliability might have to be explored and tried before arriving at a method which is useful and acceptable to the clinician or clinical pathologist.
Article
This paper illustrates tests for some suitable hypotheses in analysis of contingency tables when some characters are quantitative. For a two-dimensional table tests are given for the hypothesis of homogeneity of mean scores, the hypothesis of linearity of regression of mean scores, and also for testing significance of regression of mean scores on the level of the other character. For a three-dimensional table some similar procedures are offered. It is briefly pointed out how such test criteria can be derived in a systematic manner by an application of a certain generalized least squares technique.
Article
Assume there are n_i (i = 1, 2, …, s) samples from s multinomial distributions, each having r categories of response. Then define any u functions of the unknown true cell probabilities {π_ij : i = 1, 2, …, s; j = 1, 2, …, r, where Σ_{j=1}^{r} π_ij = 1} that have derivatives of order up to the second with respect to π_ij and for which the matrix of first derivatives is of rank u. A general noniterative procedure is described for fitting these functions to a linear model, for testing the goodness-of-fit of the model, and for testing hypotheses about the parameters in the linear model. The special cases of linear functions and logarithmic functions of the π_ij are developed in detail, and some examples of how the general approach can be used to analyze various types of categorical data are presented.
Scheffé, H. [1959]. The Analysis of Variance. Wiley, New York.
Westlund, K. B. and Kurland, L. T. [1953]. Studies on multiple sclerosis in Winnipeg, Manitoba and New Orleans, Louisiana. American Journal of Hygiene 57, 380-396.
Landis, J. R. [1975]. A general methodology for the measurement of observer agreement when the data are categorical. Ph.D. Dissertation, University of North Carolina, Institute of Statistics Mimeo Series No. 1022.
Anderson, R. L. and Bancroft, T. A. [1952]. Statistical Theory in Research. McGraw Hill, New York.
Bhapkar, V. P. [1966]. A note on the equivalence of two test criteria for hypotheses in categorical data. Journal of the American Statistical Association 61, 228-235.