Article

Supervising intake diagnosis. A psychiatric 'Rashomon'


Abstract

Psychiatric diagnoses based on data collected during routine clinical intake evaluations done by trainees are often later used in research studies and in program evaluation. It is commonly assumed that the supervisory process can effectively overcome errors that trainees make in diagnosis. We designed a study to assess the adequacy of patient-in-absentia supervision for ensuring accurate psychiatric diagnoses. In 30% of the cases there were major diagnostic disagreements between the supervised diagnoses and consensus diagnoses based on information provided by both the trainee and an experienced clinician who sat in on the trainee's initial interview. These findings have implications for clinical care, training, and research.


... These data do not provide any direct indication of the validity of the ratings, but the base rates of certain key symptoms (e.g., Schneiderian first-rank symptoms, depressed mood, and catatonic symptoms) are generally what would be expected in a sample of young psychotic patients, and this provides an index of comparative validity. Spitzer and Williams (1985) emphasize the importance of demonstrating the procedural validity of a new procedure for arriving at psychiatric diagnoses, whether or not the ultimate purpose of the new procedure is to replace an existing procedure. Procedural validity "concerns the extent to which the new diagnostic procedure yields results similar to the results of an established diagnostic procedure that is used as a criterion" (Spitzer and Williams 1985, p. 595). ...
... This interpretation is supported by the excess in these categories in the consensus data. Some of these factors are beginning to be recognized as general threats to the reliable and valid use of diagnostic criteria (Spitzer et al. 1982; Jampala et al. 1988; Winokur et al. 1988), and their influence needs to be considered in the design and interpretation of the present type of procedural validity study. It needs to be remembered that there is indeed no gold standard, that one is probably comparing a better diagnosis with a worse one in this type of study, and that as a result moderate levels of agreement are to be expected. ...
Article
The Royal Park Multidiagnostic Instrument for Psychosis is a validity-oriented assessment procedure developed for the acute psychotic episode using serial interviews and multiple information sources. This article describes the development and structure of the RPMIP and reports the findings of an interrater reliability study (n = 50). In addition, results are presented from a study that examined aspects of the procedural validity of the instrument when contrasted with consensus diagnoses made by a team of clinicians applying operational criteria in a less formal way to a common sample of patients (n = 87). Finally, the role of assessment procedures of this type in research into psychiatric disorders is briefly discussed.
... DSM-III diagnoses of the patients were made or supervised by members of the Biometrics Department at New York State Psychiatric Institute using clinical records and unstructured clinical interviews. A study on supervising intake diagnosis (Spitzer, Skodol, Williams, Gibbon, & Kass, 1982) determined that the biometrics group used DSM-III reliably, with demonstrated agreement between pairs of team members on the diagnostic class of patients' principal diagnoses in 21 of 24 (87%) jointly interviewed cases. In contrast, the rate of agreement between a biometrics rater and a clinic rater was only 66% over 50 cases. ...
Article
This study suggests that the inconsistent findings from previous research on the relation between self-disclosure and adjustment may be due to the inclusion of items on the Jourard Self-Disclosure Questionnaire (JSDQ) that are confounded with symptoms of poor adjustment or psychopathology. One-hundred-twenty-two cases of Diagnostic and Statistical Manual of Mental Disorders (American Psychiatric Association, 1980) major depression were compared with 197 well controls on the degree to which they reported self-disclosure on a 27-topic version of the JSDQ. JSDQ items were separated into symptom-independent and symptom-dependent subscales on the basis of ratings by clinical experts. Results confirmed the initial hypothesis, demonstrating that well controls were more likely than depressed cases to disclose symptom-independent topics but were not more likely to disclose symptom-dependent topics.
... On the other hand, patients with similar symptom profiles will not receive the same diagnosis if they fall on opposite sides of the arbitrary cutoff point that defines presence versus absence of the disorder. To be sure, clinicians have the option of noting the presence of personality disorder features, but this is seldom done in clinical practice (Spitzer, Skodol, Williams, Gibbon, & Kass, 1982). ...
Article
Examines the reliability and validity of 2 dimensional methods for the assessment of personality disorder symptoms and traits. In Study 1, 3 groups that varied in personality pathology level completed the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. A. Clark, 1993), a self-report questionnaire that measures traits relevant to Axis II pathology. Differences among the groups, which were patterned in theoretically interesting ways, are discussed. In Study 2, 2 independent judges rated 22 clusters of Axis II symptoms in 56 state hospital inpatients based on chart information. Good interrater reliability was obtained (median coefficient = .71), and personality-related pathology was quite prevalent. Relations among symptom ratings, SNAP scores, and chart diagnoses were generally systematic, but anomalous findings also emerged. Implications for the dimensional assessment of personality-related pathology are discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
... Studies by various authors have shown that, even when precise diagnostic criteria and sufficiently elaborate nosological systems are used, diagnostic reliability drops markedly when structured or semistructured psychiatric interviews are not used (Spitzer et al., 1982; Lipton and Simon, 1985). These findings justify the effort being made to structure the different phases of the diagnostic process within the psychiatric interview. ...
... Standard interviews were not used in clinical settings. Without standard interviews, Spitzer (23) has suggested that residents' diagnoses are probably not accurate enough for research purposes, and it is likely this is also true for practising psychiatrists (24). Some of the clinicians' preferred diagnoses, assumed to be Axis I, could actually have been ICD-9 diagnoses. ...
Article
The authors report on the use of the DSM-III, several years after its introduction, in the clinical diagnosis of 154 subjects with first onset psychosis. Clinicians usually assigned Axis I diagnoses but used the remainder of the multiaxial system less than one time in three; if a standard recording form was in place, the multiaxial system was used more often. Trainees used the DSM-III most, followed by psychiatrists affiliated with a university and community based clinicians. Agreement between researchers and clinicians on diagnoses was fair to poor. The authors discuss the implications of the acceptance of the complex diagnostic system in routine clinical practice.
... Traditional methods of determining IRR typically have trainees watch the same video and then compare ratings. This methodology is flawed, as two people passively watching a video done by a third person (usually an expert) artificially inflates inter-rater reliability by reducing the "information variance" that would occur if the two raters had to interview the patient independently (Spitzer & Williams, 1980; Spitzer et al., 1982). We determined IRR using videoconferencing, which allows pairs of trainees at different locations to independently interview the same patient at a third (central) location for purposes of determining inter-rater reliability. ...
Article
Poor inter-rater reliability is a major concern, contributing to error variance, which decreases power and increases the risk for failed trials. This is particularly problematic with the Hamilton Depression Scale (HAMD), due to lack of standardized questions or explicit scoring procedures. Establishing standardized procedures for administering and scoring the HAMD is typically done at study initiation meetings. However, the format and time allotted are usually insufficient, and evaluation of the trainee's ability to actually conduct a clinical interview is limited. To address this problem, we developed a web-based, interactive rater education program for standardized training to diverse sites in multi-center trials. The program includes both didactic training on scoring conventions and live, remote observation of trainees' applied skills. The program was pilot tested with nine raters from a single site. Results found a significant increase in didactic knowledge pre-to-post testing, with the mean number of incorrect answers decreasing from 6.5 (S.D.=1.64) to 1.3 (S.D.=1.03), t(5)=7.35, P=0.001 (20 item exam). Seventy-five percent of the trainees' interviews were within two points of the trainer's score. Inter-rater reliability (intraclass correlation) (based on trainees' actual interviews) was 0.97, P<0.0001. Results support the feasibility of this methodology for improving rater training. An NIMH funded study is currently underway examining this methodology in a multi-site trial.
Article
Complex issues are inherent in psychology training. This bibliography addresses diverse aspects of supervision in psychotherapy and more generally in professional psychology, especially during internship. English-language articles, books, and chapters are listed in eight categories: books; administrative, ethical, and legal issues; evaluation; internship; professional standards and training; supervisee development, perspectives, and issues; supervisor issues and the supervisory relationship; and supervision approaches, issues, research, techniques, and theories. Citations from the counselor education, marital/family therapy, psychiatry, psychoanalysis, school psychology, and social work literatures are also included. Most references are from the 1970s and 1980s; older citations are included to provide historical perspectives on supervision.
Article
Examined the impact of the Diagnostic and Statistical Manual of Mental Disorders (DSM-III) on residency and undergraduate training in psychiatry by surveying 208 directors of residency training and 94 directors of medical student education in psychiatry. Questionnaires examined the use and teaching of the DSM-III, positive and negative effects of the DSM-III, and usefulness of DSM-III criteria. Results indicate that the DSM-III has had a major and generally positive impact on psychiatric training. Most Ss believed that DSM-III offers a common language for diagnostic discussions, facilitates learning basic psychopathology, encourages paying attention to specific patient behaviors, and helps in formulating a differential diagnosis; negative consequences of using the DSM-III noted include its atheoretical approach and a perceived overemphasis on signs and symptoms. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Our central question: is environmentally induced stress a causal factor in the occurrence of episodes of major depression and schizophrenia? If so, what are the processes involved? [Includes discussion questions and responses by Bruce Dohrenwend and Patrick Shrout.] The most direct evidence . . . comes from a handful of retrospective case-control case studies of stressful life events and these two disorders [schizophrenia and depression]. Review what we regard as the most important studies before reporting results from our own research on 122 major depressives, 65 persons with schizophrenia or schizophrenia-like disorders [between 19 and 59 yrs of age]. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
In this taxonomic article we explore the metaphor of comparing a psychiatric classification to a psychological test. Structurally, diagnostic criteria are like test items; diagnostic categories are like scales; and classifications are like tests. Analytically, the ideas of reliability and validity are the primary concepts invoked in the empirical evaluation of both classifications and tests. However, when the metaphor is explored in more detail, the differences between classifications and tests become clear. These differences are discussed in terms of the structural and analytical relations between tests and classifications. This metaphorical analysis of classifications as tests suggests that certain issues that have been discussed in regard to psychological tests, particularly reliability and validity, may require modification when applied to psychiatric classification.
Article
This paper reviews recent progress in the assessment of the psychopathology of psychosis and focuses upon a number of persistent limitations. In particular the over-emphasis upon reliability at the expense of validity is highlighted and the relationship between reliability and validity is reconsidered. A number of strategies aimed at improving the validity of psychopathological measures are outlined. Proposals to deploy such variables differently in research strategies and analyses are endorsed.
Article
This is a progress report on some of the research that was planned and begun with Barbara Dohrenwend before she died in 1982. The main focus is on two of the studies. One was conducted in New York City; the other is still underway in Israel. The New York study is a retrospective case/control study of social and psychological factors that may put people at risk for developing schizophrenic episodes and episodes of major depression. The Israel research consists of epidemiological, case/control, and family studies of these two disorders together with other types of psychopathology that are inversely related to social class. Preliminary findings from both studies are reported, and their implications for primary prevention are discussed.
Article
Agreement between two structured, criterion-based, diagnostic interviews was determined for 86 psychiatric inpatients. The patients were given the Diagnostic Interview Schedule and the Psychiatric Diagnostic Interview on two separate occasions by two teams of trained interviewers who were blind to the patients' hospital diagnoses. Agreement between the two procedures was quite high with respect to (a) total number of positive syndromes, (b) identification of specific syndromes, (c) final diagnoses, and (d) ratings of clinical compatibility by experienced psychiatrists. Diagnostic discrepancies were primarily due to differences in the way the two instruments approach patients who fulfilled inclusive diagnostic criteria for two or more psychiatric syndromes. Although both interviews provide similar diagnostic information, the relative ease of administration and scoring of the PDI recommends it for clinical use.
Article
We studied the ability of psychiatric practitioners to recognize and treat major depression in standard clinical practice in Finland. A questionnaire with 18 items (including, e.g., physicians' characteristics, two case reports, and diagnostic and treatment proposals for both of them) was sent to 255 physicians in communal psychiatric outpatient care; 216 physicians responded (85%). Results suggest that diagnostic accuracy was good. Treatment proposals showed high sensitivity and lower specificity when the use of antidepressive medication was examined. This may reflect increased education concerning the illness and the effect of the new antidepressants, which are probably considered easier to initiate, or may be partly due to systematic error. Physicians' characteristics determined neither diagnostic nor treatment decisions.
Article
This study examined the validity and utility of two types of computer-administered versions of a screening interview, PRIME-MD (Primary Care Evaluation of Mental Disorders), in a mental health setting: one administered by desktop computer and one by computer using a touch-tone telephone and interactive voice response (IVR) technology. Fifty-one outpatients at a community mental health clinic were given both IVR and desktop PRIME-MD and the Structured Clinical Interview for DSM-IV (SCID-IV), which was administered by a clinician, in a counterbalanced order. Diagnoses were also obtained from charts. Prevalence rates found by both computer interviews were similar to those obtained by the SCID-IV for the presence of any diagnosis, any affective disorder, and any anxiety disorder. Prevalence rates for specific diagnoses were also similar to those found by the SCID-IV except for dysthymia, obsessive-compulsive disorder, and panic disorder; the first two conditions were found to be more prevalent by the computer, and panic disorder was more prevalent by the SCID. Compared with the prevalence rates in the charts, the rates found by the computer were higher for anxiety disorders, particularly for obsessive-compulsive disorder and social phobia. Using the SCID-IV as the criterion, both computer-administered versions of PRIME-MD had high sensitivity, specificity, and positive predictive value for most diagnoses. No significant difference was found in how well patients liked each form of interview. Results support the validity and utility of both desktop and IVR PRIME-MD for gathering information from mental health patients about certain diagnoses.
Article
The aim of this study was to determine if the diagnostic profile of inpatients of a psychiatric unit in a general hospital influences the length of stay. The results of a retrospective survey comprising the first 16 years of operation of the Psychiatric Unit of the Ribeirão Preto General Hospital (PURP) showed that the progressive increase observed in the length of stay correlated with the increase in percentage of schizophrenia diagnosis, after the 8th year of hospital operation, and of affective disorders, after the 12th year. The length of hospitalization kept increasing until the 16th year, even though there was no change in the diagnostic profile of the patients admitted to the unit. In a prospective study encompassing the next six months, 61 inpatients were evaluated with the Structured Clinical Interview for DSM-III-R and the Brief Psychiatric Rating Scale (BPRS). The results showed that 82% of the inpatients fulfilled the diagnostic criteria for the schizophrenic or affective disorder spectrum at admission, with a discharge rate slower than for other diagnoses, although the length of hospitalization did not significantly differ among diagnostic categories. The results further demonstrated that in every diagnostic category more than 50% of the patients stayed in hospital for more than one week after reaching a BPRS score equal to 6, indicative of discharge. Overall, these data suggest that the increase in length of hospitalization may be due to a higher percentage of patients with a diagnosis of schizophrenia and affective disorder admitted to the PURP. In addition, patients with low symptomatic levels remained in hospital longer than they should have.
Article
Clinical computing is a natural tool for evidence-based practice. Automated self-report produces accurate clinical assessments both in research and clinical settings, thus assuring that patients in each satisfy the same symptom criteria. The Electronic Medical Record (EMR) eventually will form a real-time information bridge between research and clinical settings. Despite substantial literature demonstrating the efficacy of clinical computing in psychiatric care and research, however, psychiatrists have been slow to adopt computers, and research has dwindled. The steady emergence of system-wide EMRs will spark a resurgence.
Article
This work aimed to compare the accuracy of psychiatric diagnoses made under indirect supervision with diagnoses obtained through the Structured Clinical Interview for DSM-III-R (SCID). The study was conducted in 3 university services (outpatient, inpatient and emergency). Data from the emergency service were collected 3 years later, after changes in the training process of the medical staff in psychiatric diagnosis. The sensitivity for Major Depression (outpatient 10.0%; inpatient 60.0%; emergency 90.0%) and Schizophrenia (44.4%; 55.0%; 80.0%) improved over time. The reliability was poor in the outpatient service (Kw = 0.18) and at admission to the inpatient service (Kw = 0.38). The reliability of diagnoses made at discharge from the inpatient service (Kw = 0.55) and in the emergency service (Kw = 0.63) was good. Systematic training of supervisors and residents in operational diagnostic criteria increased the accuracy of psychiatric diagnoses made under indirect supervision, although excellent reliability was not achieved.