
The reliability of difference scores: A re-examination

... In the context of indirect change measurement, it is possible to compare the pre- and post-treatment measurements and study the relationship between them. For instance, it is assumed that change scores are usually negatively correlated with clients' pretreatment status on the given variable (Castonguay et al., 2021; Chiou & Spreng, 1996). This means that clients with more severe baseline distress tend to demonstrate greater change during treatment, while clients who are better off at baseline show smaller changes. ...
... This suggests that even in the context of a direct, retrospective measurement, clients can reliably compare their current status to their pretreatment status and assess the magnitude of change in a way that is similar to the pre-post change measurement. It also means that the same psychometric effect that can be found in the context of pre-post measurement (i.e., the higher the baseline severity, the more room for improvement; Chiou & Spreng, 1996) applies to retrospective measurement. This feature is important because it makes retrospective and pre-post change scores directly comparable. ...
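... The claim that difference scores have low reliability (e.g., Cronbach & Furby, 1970; Linn & Slinde, 1977; Lord, 1963) ... the difference-score reliability formula presented in Equation (5) (Gulliksen, 1950) ... (Chiou & Spreng, 1996; Gollwitzer et al., 2014; Lord, 1956). In the more general situation in which these two assumptions are not met, the reliability of the difference score can be expressed as in Equation (6) (Gollwitzer et al., 2014; Lord, 1963; Zimmerman & Williams, 1982, 1998). ...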
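Equations (5) and (6) of the citing article are not reproduced in the excerpt above. As a hedged reminder only, the standard classical test theory expressions for the reliability of a difference score D = Y - X are sketched below. The symbols are assumptions of this sketch: sigma_X and sigma_Y are the standard deviations of the two measurements, rho_XX' and rho_YY' their reliabilities, and rho_XY their correlation, with measurement errors taken to be uncorrelated. The special case assumes that the "two assumptions" are equal variances and equal reliabilities, which the truncated excerpt does not state; the mapping to the article's own equation numbers is likewise not confirmed.

```latex
% General case (classical test theory, uncorrelated errors)
\rho_{DD'} = \frac{\sigma_X^2\,\rho_{XX'} + \sigma_Y^2\,\rho_{YY'} - 2\,\rho_{XY}\,\sigma_X\,\sigma_Y}
                  {\sigma_X^2 + \sigma_Y^2 - 2\,\rho_{XY}\,\sigma_X\,\sigma_Y}

% Special case: \sigma_X = \sigma_Y and \rho_{XX'} = \rho_{YY'} = \rho
\rho_{DD'} = \frac{\rho - \rho_{XY}}{1 - \rho_{XY}}
```

In the special case the variances cancel, so the reliability of the difference depends only on how far the component reliability exceeds the correlation between the components.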
Article
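In many areas of psychology, studies frequently examine differences in change between a treatment group and a control group using data measured repeatedly at pretest and posttest. The analytic models that researchers use most widely in this setting are the difference score model and the analysis of covariance (ANCOVA) model. However, because the two models sometimes yield different results, many researchers are confused about when to use which method. This study therefore reviews research that has compared the two models theoretically and empirically and, on that basis, offers guidelines on when each model is appropriate. To this end, we first introduce each model and show with example data that the two models can produce different results. Next, we examine the debate over the use of difference scores and confirm that the traditional criticisms of difference scores rest on oversimplified assumptions and mistaken beliefs. We then examine theoretically which hidden assumptions the two methods carry in the context of causal inference and, based on these assumptions and on the results of simulation studies, present guidelines on which method is appropriate depending on how participants are assigned to experimental groups and on the purpose of the analysis. We expect this study to help researchers choose more appropriate analytic methods and conduct their analyses rigorously and effectively.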
... The "subtraction method" (Donders, 1868) has been a valuable tool for experimental researchers (Chiou & Spreng, 1996). Studies consistently show that participants are slower to respond to incongruent trials than congruent trials, suggesting that incongruent trials are more cognitively demanding than congruent trials (MacLeod, ...
Article
Individual differences in the ability to control attention are correlated with a wide range of important outcomes, from academic achievement and job performance to health behaviors and emotion regulation. Nevertheless, the theoretical nature of attention control as a cognitive construct has been the subject of heated debate, spurred on by psychometric issues that have stymied efforts to reliably measure differences in the ability to control attention. For theory to advance, our measures must improve. We introduce three efficient, reliable, and valid tests of attention control that each take less than 3 min to administer: Stroop Squared, Flanker Squared, and Simon Squared. Two studies (online and in-lab) comprising more than 600 participants demonstrate that the three "Squared" tasks have great internal consistency (avg. = .95) and test-retest reliability across sessions (avg. r = .67). Latent variable analyses revealed that the Squared tasks loaded highly on a common factor (avg. loading = .70), which was strongly correlated with an attention control factor based on established measures (avg. r = .81). Moreover, attention control correlated strongly with fluid intelligence, working memory capacity, and processing speed and helped explain their covariation. We found that the Squared attention control tasks accounted for 75% of the variance in multitasking ability at the latent level, and that fluid intelligence, attention control, and processing speed fully accounted for individual differences in multitasking ability. Our results suggest that Stroop Squared, Flanker Squared, and Simon Squared are reliable and valid measures of attention control. The tasks are freely available online: https://osf.io/7q598/.
... A major reason for this lack of between-participants variance is the subtraction methodology employed in calculating dependent variables in many experimental tasks. Difference scores and measures relying upon them have been criticized by psychometricians for their unreliability (Cronbach & Furby, 1970; Edwards, 2001; Paap & Sawi, 2016), which is a by-product of the correlation of their components (Chiou & Spreng, 1996). The components of an attention control task (e.g. ...
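The dependence of difference-score reliability on the correlation between components can be illustrated numerically. The following is a minimal sketch, assuming the standard classical test theory formula with uncorrelated measurement errors; the function name, parameter names, and input values are hypothetical and are not taken from any of the cited papers.

```python
def difference_score_reliability(sd_x, sd_y, rel_x, rel_y, r_xy):
    """Reliability of D = Y - X under classical test theory.

    Assumes uncorrelated measurement errors (an assumption of this sketch).
    sd_x, sd_y   : standard deviations of the two component scores
    rel_x, rel_y : reliabilities of the two component scores
    r_xy         : observed correlation between the components
    """
    covariance_term = 2.0 * r_xy * sd_x * sd_y
    true_variance = sd_x**2 * rel_x + sd_y**2 * rel_y - covariance_term
    observed_variance = sd_x**2 + sd_y**2 - covariance_term
    if observed_variance <= 0:
        raise ValueError("difference score has no observed variance")
    return true_variance / observed_variance


# Illustrative (made-up) inputs: both components have reliability .80.
highly_correlated = difference_score_reliability(1.0, 1.0, 0.8, 0.8, 0.7)
weakly_correlated = difference_score_reliability(1.0, 1.0, 0.8, 0.8, 0.2)
unequal_variances = difference_score_reliability(1.0, 2.0, 0.8, 0.8, 0.7)

print(f"r_xy = .70, equal SDs:   {highly_correlated:.3f}")
print(f"r_xy = .20, equal SDs:   {weakly_correlated:.3f}")
print(f"r_xy = .70, unequal SDs: {unequal_variances:.3f}")
```

With these inputs, the reliability of the difference falls as the components become more strongly correlated, and it is higher when the component variances differ than when they are equal at the same correlation.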
Article
Working memory refers to how we keep track of what we are doing moment to moment throughout our waking lives. It allows us to remember what we have just done, focus on what we are doing now, to solve problems, be creative, think about what we will be doing in the next few seconds, and continually to update in our mind changes around us throughout the day. This book brings together in one volume, state-of-the-science chapters written by some of the most productive and well-known working memory researchers worldwide. Chapters cover leading-edge research on working memory, using behavioural experimental techniques, neuroimaging, computational modelling, development across the healthy human lifespan, and studies of neurodegenerative disease and focal brain damage. A unique feature of the book is that each chapter starts with answers to a set of common questions for all authors. This allows readers very rapidly to compare key differences in theoretical assumptions and approaches to working memory across chapters, and to understand the theoretical context before going on to read each chapter in detail. All authors also have been asked to consider evidence that is not consistent with their theoretical assumptions. It is very common for authors to ignore contradictory evidence. This approach has led to new interpretations and new hypotheses for future research to greatly enhance our understanding of this crucial human ability.
... The subtraction methodology appears to be a great tool for experimental researchers (see Chiou & Spreng, 1996), but the use of difference scores in individual differences research has been denounced by psychometricians for over half a century (Cronbach & Furby, 1970). Many researchers have noted that difference scores are poorly suited for correlational work because they are often unreliable and minimize between-subjects variance (Ackerman & Hambrick, 2020; Draheim et al., 2016; Friedman & Miyake, 2004; Hedge et al., 2018). ...
Article
In this chapter, we discuss the measurement of working memory capacity and attention control. We begin by examining the origins of complex span measures of working memory capacity, which were created to better understand the cognitive processes underpinning language comprehension. We then review evidence for the executive attention theory of working memory, which places attention control at the center of individual differences in working memory capacity and fluid intelligence. Next, we describe the relationship between working memory capacity, attention control, and language comprehension, and discuss how maintenance and disengagement – two functions supported by the control of attention – contribute to performance across a range of cognitive tasks. We then identify factors that threaten the construct and criterion validity of measures of working memory capacity and attention control and detail the steps our laboratory has taken to refine these measures. We close by providing practical recommendations and resources to researchers who wish to use our new measures of working memory capacity and attention control in their work.
... Thus, it is not entirely clear whether difference scores contributed much to the lower reliability of the clock task. It has also recently been recognised that, in contrast to the popularly held belief that difference scores have low reliability (Thomas & Zumbo, 2012), this is not always the case when incorporating estimates of variability (Chiou & Spreng, 1996; Trafimow, 2015), supporting the findings of the current paper. In addition, many latent processes may contribute to the model-agnostic measures of which some may be reliable. ...
Thesis
Ketamine is a rapidly acting antidepressant and has been shown to be effective in depressed individuals who have previously failed to benefit from other available treatments. An important question is how ketamine works. Addressing this might help inform more targeted and efficient treatments in the future. The aim of this thesis was to examine the neural, cognitive, and computational mechanisms underpinning the antidepressant response to ketamine in treatment-resistant depression. The work has specifically focused on motivational processing, since ketamine is particularly effective in alleviating symptoms of anhedonia, which are thought to be related to impaired reward-related function. Following a general introduction (Chapter 1), the first experimental chapter (Chapter 2) focuses on identifying suitable reward and punishment tasks for repeated testing in a clinical trial. Test-retest properties of various tasks are explored in healthy individuals, assessed by both traditional measures of task performance (e.g., accuracy) and computational parameters. Chapter 3 outlines a pilot simultaneous EEG-fMRI study in healthy individuals probing the neural dynamics of the motivation to exert cognitive effort, an important but understudied process in depression. The third study (Chapter 4) uses resting-state fMRI to examine how ketamine modulates fronto-striatal circuitry, which is known to drive motivational behaviour, in depressed and healthy individuals. The final experimental chapter (Chapter 5) examines which cognitive and computational measures of motivational processing (using tasks identified in Chapter 2) change following a single dose of ketamine compared to placebo in depression, using a crossover design. Based on preliminary findings, it is tentatively proposed that ketamine might affect reward processing by enhancing fronto-striatal circuitry functional connectivity, as well as by increasing exploratory behaviours, and possibly punishment learning rates.
The general discussion (Chapter 6) discusses these findings in relation to contemporary models of anhedonia and antidepressant action, considering both the limitations of the work presented and possible future directions.
... The "subtraction method" (Donders, 1868) has been a valuable tool for experimental researchers (Chiou & Spreng, 1996). Studies consistently show that participants are slower to respond to incongruent trials than congruent trials, suggesting that incongruent trials are more cognitively demanding than congruent trials (MacLeod, 1991). ...
Preprint
Individual differences in the ability to control attention are correlated with a wide range of important outcomes, from academic achievement and job performance to health behaviors and emotion regulation. Nevertheless, the theoretical nature of attention control as a cognitive construct has been the subject of heated debate, spurred on by psychometric issues that have stymied efforts to reliably measure differences in the ability to control attention. For theory to advance, our measures must improve. We introduce three efficient, reliable, and valid tests of attention control that each take less than three minutes to administer: Stroop Squared, Flanker Squared, and Simon Squared. Two studies (online and in-lab) comprising more than 600 participants demonstrate that the three “Squared” tasks have great internal consistency (avg. = .95) and test-retest reliability across sessions (avg. r = .67). Latent variable analyses revealed that the Squared tasks loaded highly on a common factor, which was strongly correlated with an attention control factor based on established measures (avg. r = .81). Moreover, attention control correlated strongly with fluid intelligence, working memory capacity, and processing speed, and helped explain their covariation. We found that the Squared attention control tasks accounted for 75% of the variance in multitasking ability at the latent level, and that fluid intelligence, attention control, and processing speed fully accounted for individual differences in multitasking ability. Our results suggest that Stroop Squared, Flanker Squared, and Simon Squared are reliable and valid indicators of attention control. The tasks are freely available online (https://osf.io/7q598/).
... Oliver, R. L. (1980) studied the concept of satisfaction in the service industry and built the most popular model based on the "expectancy uncertainty model". The model argues that actual performance is measured according to the customer's initial expectations to assess satisfaction. Chiou & Spreng (1996) define "uncertainty" as the calculation of "difference scores" (specifically, the difference between expected performance evaluations and perceived performance evaluations). Anderson & Srinivasan (2003) also ensure that it is not asserted as "a difference between postpurchase evaluation and post-use evaluation of product or service performance and pre-purchase expectations". ...
Article
Customers may switch to a different service provider if they are displeased with the standard, hence tracking service quality is crucial for a firm. The influence of Internet Service Quality (ISQ) on Customer Satisfaction will be studied in detail by the researcher. The relevance of this study is highlighted by the current COVID-19 scenario in Sri Lanka, where government laws and restrictions have been implemented to promote work from home, online learning, and online entertainment. Previous research in other countries has looked at the impact of ISQ on customer satisfaction; however, to our knowledge, no such study has been conducted in Sri Lanka. Furthermore, earlier studies looked at a variety of scenarios, including the discovery of research gaps; however, in the context of COVID-19, the research gaps were identified. Past research identifies the following factors that affect ISQ: Tangibility, Assurance, Empathy, Reliability, Responsiveness and Price. This study was carried out as a deductive study and a quantitative method was employed. The convenience sample approach was used to collect 505 responses by distributing a questionnaire survey to consumers in Sri Lanka's Western region. Statistical Package for Social Science (SPSS) version 23 was used to analyze the data. The study indicated that Internet Service Quality and Customer Satisfaction had a favorable association. The study was also able to provide insights for ISP management by emphasizing areas of ISQ that can satisfy their customer base, as well as actions that might be implemented in response to the observed practice gaps.
Article
Cognitive tasks are capable of providing researchers with crucial insights into the relationship between cognitive processing and psychiatric phenomena. However, many recent studies have found that task measures exhibit poor reliability, which hampers their usefulness for individual-differences research. Here we provide a narrative review of approaches to improve the reliability of cognitive task measures. Specifically, we introduce a taxonomy of experiment design and analysis strategies for improving task reliability. Where appropriate, we highlight studies that are exemplary for improving the reliability of specific task measures. We hope that this article can serve as a helpful guide for experimenters who wish to design a new task, or improve an existing one, to achieve sufficient reliability for use in individual-differences research.
Article
Bringing together cutting-edge research, this Handbook is the first comprehensive text to examine the pivotal role of working memory in first and second language acquisition, processing, impairments, and training. Authored by a stellar cast of distinguished scholars from around the world, the Handbook provides authoritative insights on work from diverse, multi-disciplinary perspectives, and introduces key models of working memory in relation to language. Following an introductory chapter by working memory pioneer Alan Baddeley, the collection is organized into thematic sections that discuss working memory in relation to: Theoretical models and measures; Linguistic theories and frameworks; First language processing; Bilingual acquisition and processing; and Language disorders, interventions, and instruction. The Handbook is sure to interest and benefit researchers, clinicians, speech therapists, and advanced undergraduate and postgraduate students in linguistics, psychology, education, speech therapy, cognitive science, and neuroscience, or anyone seeking to learn more about language, cognition and the human mind.