Article

Validity of the Quality Standards Assessment for Children’s Residential Care

Article
Full-text available
This article builds upon a prominent definition of construct validity that focuses on variation in attributes causing variation in measurement outcomes. It synthesizes that definition and uses Rasch measurement modeling to explicate a modified conceptualization of construct validity for assessments of developmental attributes. If attributes are conceived as developmental, hypotheses can be developed about how new knowledge builds cumulatively upon the cognitive capacity afforded by prior knowledge. This cumulative ordering of the knowledge required to accomplish test items constitutes evidence of a specific form of construct validity. Examples of cumulative ordering appear in the extant literature, but they are rare and confined to the early literature. Furthermore, cumulative ordering has never been explicated, especially its relationship to construct validity. This article describes three of the most complete examples of cumulative ordering in the literature. These examples are used to synthesize a method for assessing cumulative ordering, in which the Rasch model is used to assess the progression of item difficulties, which are in turn used to review developmental theories and hypotheses, and the tests themselves. We discuss how this conceptualization of construct validity can lead to a more direct relationship between developmental theories and tests, which, for practitioners, should result in a clearer understanding of what test results actually mean. Finally, we discuss how cumulative ordering can be used to facilitate decisions about consequential validity.
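The dichotomous Rasch model underlying this approach can be sketched in a few lines. This is a generic illustration, not the authors' analysis: the item difficulties and person ability below are made-up values on a logit scale.

```python
import math

def rasch_probability(ability, difficulty):
    """P(correct response) under the dichotomous Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical item difficulties ordered from easiest to hardest, as a
# developmental hierarchy (cumulative ordering) would predict.
difficulties = [-1.5, -0.5, 0.5, 1.5]
ability = 0.0  # one person's latent ability on the same logit scale

probs = [rasch_probability(ability, d) for d in difficulties]
for d, p in zip(difficulties, probs):
    print(f"difficulty {d:+.1f}: P(correct) = {p:.3f}")

# Cumulative ordering implies monotonically decreasing success probabilities
# as item difficulty increases.
assert all(p1 > p2 for p1, p2 in zip(probs, probs[1:]))
```

Under this model, a coherent progression of estimated item difficulties is what licenses the validity claim: harder items should presuppose the knowledge tapped by easier ones.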
Article
Full-text available
Background Dependent variables in health psychology are often counts, for example, of a behaviour or number of engagements with an intervention. These counts can be very strongly skewed, and/or contain large numbers of zeros as well as extreme outliers. For example, ‘How many cigarettes do you smoke on an average day?’ The modal answer may be zero but may range from 0 to 40+. The same can be true for minutes of moderate-to-vigorous physical activity. For some people, this may be near zero, but take on extreme values for someone training for a marathon. Typical analytical strategies for this data involve explicit (or implied) transformations (smoker v. non-smoker, log transformations). However, these data types are ‘counts’ (i.e. non-negative whole numbers) or quasi-counts (time is ratio but discrete minutes of activity could be analysed as a count), and can be modelled using count distributions – including the Poisson and negative binomial distribution (and their zero-inflated and hurdle extensions, which allow even more zeros). Methods In this tutorial paper I demonstrate (in R, Jamovi, and SPSS) the easy application of these models to health psychology data, and their advantages over alternative ways of analysing this type of data, using two datasets – one with a highly dispersed dependent variable (number of views on YouTube), and another with a large number of zeros (number of days on which symptoms were reported over a month). Results The negative binomial distribution had the best fit for the overdispersed number of views on YouTube. Negative binomial and zero-inflated negative binomial were both good fits for the symptom data with over-abundant zeros. Conclusions In both cases, count distributions provided not just a better fit but would lead to different conclusions compared to the poorly fitting traditional regression/linear models.
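The overdispersion that motivates the negative binomial model can be illustrated with a quick diagnostic. The counts below are simulated (a hypothetical mix of many zeros and a heavy right tail, the shape described above), not the paper's datasets.

```python
import random
import statistics

random.seed(1)

# Simulated "cigarettes per day" counts: 60 zeros, some light smokers,
# and a long right tail of heavy smokers (hypothetical data).
counts = [0] * 60 + [random.randint(1, 5) for _ in range(25)] \
                  + [random.randint(20, 40) for _ in range(15)]

mean = statistics.mean(counts)
variance = statistics.pvariance(counts)

# A Poisson model assumes variance == mean; a dispersion ratio well above 1
# signals overdispersion and favours a negative binomial model instead.
dispersion = variance / mean
print(f"mean = {mean:.2f}, variance = {variance:.2f}, dispersion = {dispersion:.2f}")
```

In practice one would fit both models and compare fit (e.g. via AIC); this variance-to-mean ratio is only the first warning sign that a plain Poisson or linear model will fit poorly.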
Article
Full-text available
Background: The Work Stress Questionnaire (WSQ) was developed as a self-administered questionnaire with the purpose of early identification of individuals at risk of being sick-listed due to work-related stress. It has previously been tested for reliability and face validity among women with satisfying results. The aim of the study was to test the reliability and face validity of the WSQ among male workers. Method: For testing reliability, a test-retest study was performed in which 41 male workers filled out the questionnaire on two occasions at 2-week intervals. For evaluating face validity, seven male workers filled out the questionnaire and gave their opinions on the questions, the scale steps, and how the items corresponded to their perception of stress at work. Results: The WSQ was, for all but one item, found to be stable over time. The item Supervisor considers one's views showed a systematic disagreement, i.e. there was a change common to the group for this item. Face validity was confirmed by the male pilot group. Conclusion: The reliability and face validity of the WSQ were found to be satisfactory when used with a male population. This indicates that the questionnaire can also be used for a male target group.
Article
Full-text available
With the increased emphasis on accountability at the federal and state levels, efforts to identify and address issues impacting the quality and effectiveness of residential care programs are needed. Establishing quality practice standards with measurable performance indicators is a useful means for promoting and evaluating the quality of care in residential programs and informing a process of continuous quality improvement. In this article, we describe a state-wide initiative to enhance the quality of residential group care in Florida. Specifically, we describe efforts to establish a set of quality standards for residential programs and to operationalize the standards by developing and piloting an assessment designed to measure the extent to which group home practices align with the standards. In addition to describing steps taken, we highlight important conceptual and practical considerations for translating standards generated from extant research, best practices, and field experts into clearly defined domains and measurable standards that can be meaningfully transported into a complex practice setting. Finally, we discuss lessons learned and recommendations that may guide similar efforts beyond the state of Florida.
Article
Full-text available
Reliability and validity describe desirable psychometric characteristics of research instruments. The concept of validity is also applied to research studies and their findings. Internal validity examines whether the study design, conduct, and analysis answer the research questions without bias. External validity examines whether the study findings can be generalized to other contexts. Ecological validity examines, specifically, whether the study findings can be generalized to real-life settings; thus ecological validity is a subtype of external validity. These concepts are explained using examples so that readers may understand why the consideration of internal, external, and ecological validity is important for designing and conducting studies, and for understanding the merits of published research.
Article
Full-text available
There are three recent developments in the field of therapeutic residential care (TRC) that provide a major leap forward for new policy and practice directions. These developments further promote TRC as an essential component in the system of care for youth services. First, an international consensus statement provides, for the first time, a definition of the key elements of the TRC level of care. Second, research reviews documenting the effectiveness of TRC practices and models provide a clear base for determining current program quality and establishing future research, program development, and policy directions. Third, a public/private partnership involving providers, lead agencies, research leaders, and state agencies is establishing new quality standards for out-of-home care in Florida. These standards draw on both the consensus statement and the TRC empirical base and seek to elevate the quality of individual TRC programs as well as out-of-home care statewide. We provide the experience of one agency, a national provider of TRC using a model of care with promising research evidence, to suggest that these three developments give practitioners, policy makers, and researchers a fresh perspective on how to fit TRC programs into an integrated continuum of care.
Article
Full-text available
Purpose: Service user involvement in instrument development is increasingly recognised as important, but is often not done and seldom reported. This has adverse implications for the content validity of a measure. The aim of this paper is to identify the types of items that service users felt were important to include in, or exclude from, a new Recovering Quality of Life measure for people with mental health difficulties. Methods: Potential items were presented to service users in face-to-face structured individual interviews and focus groups. The items were primarily taken or adapted from current measures and covered themes identified from earlier qualitative work as being important to quality of life. Content and thematic analysis was undertaken to identify the types of items which were either important or unacceptable to service users. Results: We identified five key themes in the types of items that service users found acceptable or unacceptable: items should be relevant and meaningful, unambiguous, easy to answer (particularly when distressed), should not cause further upset, and should be non-judgemental. Importantly, this was from the perspective of the service user. Conclusions: This research has underlined the importance of service users' views on the acceptability and validity of items for use in developing a new measure. Whether or not service users favoured an item was associated with their ability or intention to respond accurately and honestly to the item, which will impact on the validity and sensitivity of the measure.
Article
Full-text available
Social scientists frequently study complex constructs. Despite the plethora of measures for these constructs, researchers may need to create their own measure for a particular study. When a measure is created, psychometric testing is required, and the first step is to study the content validity of the measure. The purpose of this article is to demonstrate how to conduct a content validity study, including how to elicit the most from a panel of experts by collecting specific data. Instructions on how to calculate a content validity index, factorial validity index, and an interrater reliability index and guide for interpreting these indices are included. Implications regarding the value of conducting a content validity study for practitioners and researchers are discussed.
Article
Full-text available
This study aimed to review the literature describing and quantifying time lags in the health research translation process. Papers were included in the review if they quantified time lags in the development of health interventions. The study identified 23 papers. Few were comparable, as different studies used different measures, of different things, at different time points. We concluded that the current state of knowledge of time lags is of limited use to those responsible for R&D and knowledge transfer who face difficulties in knowing what they should or can do to reduce time lags. This effectively 'blindfolds' investment decisions and risks wasting effort. The study concludes that understanding lags first requires agreeing on models, definitions, and measures which can be applied in practice. A second task would be to develop a process by which to gather these data.
Article
Full-text available
Using the term ecological validity, in a recent issue of Ecological Psychology, Rogers, Kadar, and Costall (2005) discussed how the simulator they used could provide data by replicating natural road-driving behaviors. However, ecological validity, as Brunswik (1956) conceived it, refers to the validity of a cue (i.e., perceptual variable) in predicting a criterion state of the environment. Like other psychologists in the past, Rogers et al. (2005) confused this term with another of Brunswik's terms: representative design. In this comment, the authors clarify the distinction between these concepts and also discuss how Gibsonian ideas can strengthen understanding of the correspondence between experimental task constraints and behavioral settings outside the laboratory. The main implication of this theoretical rationalization is the development of a measurable correspondence between experimental and behavioral contexts, enabling defensible generalization to both organisms and environments beyond the bounds of particular experiments.
Article
Ecological validity refers to how closely an experiment aligns with real‐world phenomena. In applied behavioral research, ecological validity may guide decisions about experimental settings, stimuli, people, and other design features. However, inconsistent use of the term ecological validity in the published literature has led to a somewhat disjointed technology. The purposes of this paper were to review current uses of the term “ecological validity” in the Journal of Applied Behavior Analysis, propose ways to make a study more ecologically valid, and develop a checklist to assist in identifying the type and degree of ecological validity in any given study.
Article
Cognitive neuroscience has traditionally focused on simple tasks, presented sparsely and using abstract stimuli. While this approach has yielded fundamental insights into functional specialisation in the brain, its ecological validity remains uncertain. Do these tasks capture how brains function 'in the wild', where stimuli are dynamic, multimodal, and crowded? Ecologically valid paradigms that approximate real-life scenarios, using stimuli such as films, spoken narratives, music, and multiperson games, emerged in response to these concerns over a decade ago. We critically appraise whether this approach has delivered on its promise to provide new insights into brain function. We highlight the challenges, technological innovations, and clinical opportunities that must be addressed if this field is to meet its full potential.
Article
Advanced practice registered nurses implement evidence-based care guidelines and assess the quality of care delivered to pediatric and adolescent populations to ensure that the highest standards of care are provided to the patients and their families. Standardized health care quality measures allow for assessment of clinical competence, monitoring of equitable health care distribution, improvement of provider/institutional accountability, development of standards for accreditation and certification, informing of quality improvement efforts, and creation of criteria for provider incentive payments. The purpose of this article is to explain why health care quality measures are established, what agencies oversee the development of meaningful pediatric quality measures, and how these measures inform and improve the care provided by pediatric-focused advanced practice registered nurses.
Article
There is a growing emphasis on the need for quality in residential care programs for children and adolescents. The present review is an effort to determine what practices have been identified in published sources specifically focused on quality standards for residential care for children and adolescents. Published quality standards for residential care from seven organizations or government agencies were identified and included in the review. Sixty-five quality standards within 8 domains were identified, and a crosswalk table linking each standard to the seven source documents was produced. Overall, there was 72.5% agreement across the seven sources for the quality standards. The identified quality standards clearly show the common elements and multidimensional nature of quality issues for residential care for children and adolescents. It is imperative that a comprehensive approach to, and measurement of, quality standards become pervasive and fully integrated into the services provided to troubled youth and their families.
Chapter
This chapter introduces the techniques of outcome assessment and program evaluation as they might be employed by nonprofit organizations. Outcome assessment is a goals-based process; programs are assessed relative to the goals they are designed to achieve. Once the goals have been defined, attention must turn to how to measure them. Before thinking about specific measures, decision makers should become familiar with some basic measurement concepts and with the various types of measures, such as client questionnaire surveys. Evaluation of most nonprofit programs calls for multiple measures, including both quantitative and qualitative ones. The chapter discusses two approaches to program evaluation: the objective scientist approach and utilization-focused evaluation. Nonprofit executives should plan to involve themselves and the program staff extensively in the analysis and review of evaluation findings. This involvement is necessary first for accuracy: staff review of data and reports minimizes the risk of outside evaluators reporting inaccurate conclusions.
Article
Research Findings: There is a growing need for accurate and efficient measures of classroom quality in early childhood education and care (ECEC) settings. Observational measures are costly, as their administration generally takes 3–5 hr per classroom. This article outlines the process of development and preliminary concurrent validity testing of the Assessment for Quality Improvement (AQI), a new measure of global quality. The AQI is a classroom-level measure of structural and process quality. It consists of 24 items on a 5-point scale designed for use in ECEC infant and toddler classrooms. At between 60 and 90 min per room, the AQI is a relatively efficient measure. Item response theory modeling was used to ensure logical and coherent ordering of subitems. Exploratory factor analysis supported the use of the AQI total score and the Interactions section as a stand-alone measure. Correlations between the Infant and Toddler versions of the AQI and the Infant/Toddler Environment Rating Scale–Revised were moderate, providing preliminary support for the concurrent validity of both versions. Practice or Policy: Our results suggest that the AQI is a promising, efficient measure of global quality in infant and toddler ECEC environments. This may be especially relevant for Quality Rating and Improvement Systems, for which the observational component is a major cost driver.
Article
This book provides an overview of scale and test development. From conceptualization through design, data collection, analysis, and interpretation, critical concerns are identified and grounded in the increasingly sophisticated psychometric literature. Measurement within the health, social, and behavioral sciences is addressed, and technical and practical guidance is provided. Acknowledging the increasingly sophisticated contributions in social work, psychology, education, nursing, and medicine, the book balances condensation of complex conceptual challenges with focused recommendations for conceiving, planning, and implementing psychometric study. Primary points are carefully referenced and consistently illustrated to illuminate complicated or abstract principles. Basics of construct conceptualization and establishing evidence of validity are complemented with introductions to concept mapping and cross-cultural translation. In-depth discussion of cutting-edge topics like bias and invariance in item responses is provided. Exploratory and confirmatory factor analytic strategies are illustrated and critiqued, and step-by-step guidance is offered for anticipating elements of a complete data collection instrument, determining sampling frame and size, and interpreting resulting coefficients. Much good work has been done by RAI developers to date. Too often, practitioners or researchers either underestimate the skills and effort required, or become overwhelmed by the complexities involved.
Article
Psychology is one of the main disciplines that have been involved in the development of cognitive ergonomics. For a long time, at least from the 1960s, some researchers in psychology have contributed to research in cognitive ergonomics with the aim of elaborating basic psychological knowledge, (a) with high ecological validity, and (b) with clear relevance to application. This paper stresses the value of this perspective for psychology as well as cognitive ergonomics, and evaluates the results of such an enterprise. Ecological validity is considered as a particular aspect of external validity that enables researchers to transfer findings from experimental situations ('artificial' ones, or ones designed for research purposes) to real work situations ('natural' ones, or ones imposed by comprehension needs; obviously, in this context 'natural' includes 'cultural'). This aspect is discussed as regards classical distinctions like basic/applied research and research/practice. Attention is particularly devoted to the necessary (ecological) context needed by expert operators to implement their work expertise, which is the target of the comprehension aim of cognitive ergonomics. Conclusions are drawn in terms of methods to design and evaluate ecological validity, not only to understand cognitive mechanisms, but also to improve cognitive work conditions and the overall performance of human-machine systems.
Article
Purpose The purpose of this paper is to examine the performance of five alternative measures of service quality in the high education sector – service quality (SERVQUAL), importance‐weighted SERVQUAL, service performance (SERVPERF), importance‐weighted SERVPERF, and higher education performance (HEdPERF). Design/methodology/approach Data were collected by means of a structured questionnaire containing perception items enhanced from the SERVPERF and HEdPERF scales and expectation items from the SERVQUAL scale, modified to fit into the higher education (HE) sector. The first draft of the questionnaire was subject to a pilot testing through a focus group and an expert evaluation. Data were gathered from a sample of 360 students of a Portuguese University in Lisbon. Scales were compared in terms of unidimensionality, reliability, validity and explained variance. Findings It can be concluded that SERVPERF and HEdPERF present the best measurement capability, but it is not possible to identify which one is the best. Research limitations/implications Since the study only examined the measurement capabilities of the five instruments at a single faculty, the collection of more data in other institutions is required in order to provide more general results. Practical implications The current results do make available some important insights into how the five alternative instruments of service quality in an HE context compare with one another. Originality/value The paper attempts to develop insights into comparative evaluations of five measuring instruments of service quality in an HE setting.
Article
The arbitrary assertion of two of three experts does not establish content validity. Application of a two-stage process that incorporates rigorous instrument development practices and quantifies the aspects of content validity is required. In the first stage of this process, the content domain or dimensions are identified and items are generated to reflect the scope of the content domain of a cognitive variable or each of the dimensions of an affective variable. Once generated, the items are assembled in a usable, testable format. The instrument and domain or dimension specifications are then presented to a panel of experts, the size of which is an a priori decision, for their judgment of the items using a 4-point ordinal rating scale. Using the item evaluation, CVI calculations are applied to both the items and the entire instrument. The experts are asked, as a part of the content validity assessment, to identify areas of omission and to suggest areas of item improvement or modification. Admittedly, there are times when adherence to such rigor may not be feasible. When less stringent methods of determining validity are applied, it should not be said that content validity has been determined. Opponents of the process described in this article might argue that these applications and expectations exceed practical application and that this process is therefore too rigorous. Content validity, by its nature and definition, demands rigor in its assessment, and its assessment is, in fact, critical. Such a rigorous process for content validity determination is offered because content validity is an inexpendable form of validity which is rapidly losing credibility due to its less than standardized and rigorous assessments. Content validity, different from all other forms of validity, is crucial to the understanding of research findings and their practical or theoretical applications. It is worth the rigor.
Article
Measurement is necessary but not sufficient for quality improvement. Because the purpose of the national quality measurement and reporting system (NQMRS) is to improve quality, a discussion of the link between measurement and improvement is critical for ensuring an appropriate system design. This descriptive, conceptual discussion classifies approaches to the use of measurement in improvement into two different, although linked and potentially synergistic, agendas, or "pathways"; it discusses the barriers encountered in each pathway and identifies the steps needed to motivate improvement in both. The barriers to the use of information to motivate change include, in Pathway I (selection), the lack of skill, knowledge, and motivation on the part of those who could drive change by using data to choose from among competing providers, and, in Pathway II (change in care delivery), the deficiencies in organizational and professional capacity in health care to lead change and improvement itself. Neither the dynamics of selection nor the dynamics of improvement work reliably today. The barriers are not just in the lack of uniform, simple, and reliable measurements; they also include a lack of capacity among the organizations and individuals acting on both pathways.
Article
In the past 50 years we have made substantial progress in understanding the biology of disease and in devising new ways to prevent or treat it. However, there has been a substantial lag in applying what we know to actual patient care. In this article, based on his Shattuck Lecture, Claude Lenfant outlines the magnitude of the problem of translating research knowledge into clinical practice and offers suggestions for closing this gap.
Article
Scale developers often provide evidence of content validity by computing a content validity index (CVI), using ratings of item relevance by content experts. We analyzed how nurse researchers have defined and calculated the CVI, and found considerable consistency for item-level CVIs (I-CVIs). However, there are two alternative, but unacknowledged, methods of computing the scale-level index (S-CVI). One method requires universal agreement among experts, but a less conservative method averages the item-level CVIs. Using backward inference with a purposive sample of scale development studies, we found that both methods are being used by nurse researchers, although it was not always possible to infer the calculation method. The two approaches can lead to different values, making it risky to draw conclusions about content validity. Scale developers should indicate which method was used to provide readers with interpretable content validity information.
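The two scale-level indices contrasted above can be made concrete with a small worked example. The expert ratings below are hypothetical: five experts rate four items on the standard 4-point relevance scale, with ratings of 3 or 4 counted as "relevant".

```python
# Hypothetical ratings: 5 experts rate 4 items on a 4-point relevance scale.
ratings = {
    "item1": [4, 4, 3, 4, 3],
    "item2": [4, 3, 3, 4, 4],
    "item3": [2, 4, 3, 4, 3],
    "item4": [4, 4, 4, 3, 4],
}

def i_cvi(item_ratings):
    """Item-level CVI: proportion of experts rating the item 3 or 4."""
    return sum(r >= 3 for r in item_ratings) / len(item_ratings)

i_cvis = {item: i_cvi(r) for item, r in ratings.items()}

# The two scale-level computation methods described above:
# universal agreement (proportion of items every expert endorsed) ...
s_cvi_ua = sum(v == 1.0 for v in i_cvis.values()) / len(i_cvis)
# ... versus the less conservative average of item-level CVIs.
s_cvi_ave = sum(i_cvis.values()) / len(i_cvis)

print(i_cvis)
print(f"S-CVI/UA = {s_cvi_ua:.2f}, S-CVI/Ave = {s_cvi_ave:.2f}")
```

A single dissenting rating on one item (item3 here) lowers S-CVI/UA from 1.00 to 0.75 while S-CVI/Ave stays at 0.95, which is why reports should state which method was used.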
Establishing effective quality improvement systems for children’s residential care programs
  • S Boel-Studt
  • J C Huefner
  • A Greenwald
Scale development: Theory and applications
  • R F DeVellis
  • C T Thorpe
Excerpt from validity: Theory into practice
  • C A Dwyer
The meaning of state supervision in the social protection of children
  • K H Welch
MPlus statistical analysis with latent variables: User’s guide
  • L K Muthén
  • B O Muthén