Test Review: GARS-2: Gilliam Autism Rating Scale-Second Edition
Gilliam, J. (2006). GARS-2: Gilliam Autism Rating Scale–Second
Edition. Austin, TX: PRO-ED.
The Gilliam Autism Rating Scale–Second Edition (GARS-2) is a screening tool for
autism spectrum disorders for individuals between the ages of 3 and 22. It was designed to
help differentiate those with autism from those with severe behavioral disorders as well as
from those who are typically developing. It is a norm-referenced instrument that reflects the
conceptualizations of autism per the Diagnostic and Statistical Manual of Mental Disorders
(fourth edition, text revision; American Psychiatric Association, 2000) and the Autism
Society of America (1994). The GARS-2 was designed to be a supplementary tool for the
diagnosis of autism, and it is intended to be used with a variety of diagnostic tools and
relevant information in comprehensive assessment protocols for autism. Furthermore, it is
intended to facilitate data-driven approaches to assessment and intervention by incorporating
a resource that links instructional objectives to assessment findings.
Screens and diagnostic instruments for autism spectrum disorders are increasingly
required by professionals in various fields to enable timely and appropriate interventions
(National Research Council, 2001). There is a need for easy-to-use instruments that possess
good psychometric properties. Moreover, clinicians benefit from using tools that provide
an obvious and explicit link to intervention. However, few existing instruments
incorporate this much-needed feature.
The GARS-2 is a measure that has the potential to link assessment to intervention. It can
be completed by parents and professionals in a variety of settings, including the home and
school. The examiner is responsible for selecting the most appropriate person to complete
the rating. The rater should be someone who knows the individual well and has had sustained
contact with him or her for at least 2 weeks. Furthermore, it is helpful if the rater
is privy to early development information; as such, a parent is often the most appropriate
rater for this measure. In cases where a professional regularly works with the individual
for a significant part of each day, he or she may be the most appropriate rater to summarize
current behaviors and functioning. In this type of scenario, one may want to include
ratings from both professionals and parents to enable the most comprehensive and accurate
descriptions. However, Gilliam suggests that if this approach is adopted, the examiner
may need to average the ratings to resolve discrepant reports. In addition, raters should be
instructed to withhold ratings on items for which they believe they do not have enough
information.
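The averaging of ratings described above can be sketched as follows. This is only an illustration under stated assumptions: the review does not describe Gilliam's exact procedure, so the function name `average_item_ratings`, the use of `None` for a withheld rating, and the sample ratings are all hypothetical.

```python
from statistics import mean

def average_item_ratings(ratings_by_rater):
    """Average each item's 0-3 rating across raters.

    Ratings a rater withheld are represented as None (an assumption of this
    sketch) and are left out of that item's average.
    """
    n_items = len(next(iter(ratings_by_rater.values())))
    averaged = []
    for i in range(n_items):
        observed = [r[i] for r in ratings_by_rater.values() if r[i] is not None]
        # If no rater could rate the item, keep it missing rather than guess.
        averaged.append(mean(observed) if observed else None)
    return averaged

# Hypothetical ratings on four items from two raters.
ratings = {
    "parent": [3, 2, None, 1],
    "teacher": [1, 2, 2, None],
}
combined = average_item_ratings(ratings)
```

As the review notes later, no reliability data are reported for averaged ratings, so a combined profile of this kind should be interpreted with caution.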
Journal of Psychoeducational Assessment
© 2008 Sage Publications
According to the manual, the GARS-2 can be completed in the absence of the examiner.
However, Gilliam advises that raters should be trained on how to use the instrument, which
may in many cases preclude completion away from the examiner. Furthermore, if required,
the examiner can administer the test using a structured interview format in which the
parent or professional is asked how each item should be scored.
Although the test is divided into nine sections, it has three key components: subscale and
composite scores, a parent interview, and key questions to enable diagnostic accuracy. The
three subscales of the GARS-2 measure a series of negative behaviors reflecting the three
primary areas of the DSM-IV-TR criteria for the diagnosis of autism. Scores are generated
for stereotyped behaviors, communication, and social interaction. In addition, an autism
index provides a composite indication of autism severity. Respondents choose one of
four response options for each of 42 Likert-type items, scored from 0 (never observed)
to 3 (frequently observed).
The last two sections of the GARS-2 are completed via an interview with a parent or
caregiver who has had sustained contact with the individual. In the first part of the
interview, the respondent is asked to answer yes or no to a series of questions pertaining to the
child’s development in his or her first 3 years. In the final section of the GARS-2, the
respondent is prompted to answer a series of open-ended questions regarding medical
history, behavior, symptoms of autism spectrum disorders, and parental concerns. The total
time for administration of the test is approximately 5–10 min. However, the user’s manual
for the measure does not clarify whether the stated completion time is for only the
Likert-type items or all nine sections of the measure.
Finally, a unique aspect of the GARS-2 is the inclusion of a companion resource entitled
Instructional Objectives for Children Who Have Autism. This resource is intended to
provide teachers, parents, and other professionals with a method to link areas of need
(particularly with reference to behavioral concerns) to instructional objectives. Samples are
provided to guide the school team in planning goals and objectives for students. These
examples can be modified to suit the needs of the individual child. This aspect of the tool
translates easily to individualized education plan goals and objectives, and it will be
especially helpful to school teams who plan and monitor educational goals. The inclusion of
criteria for acceptable performance for many of the objectives can be used to simplify the
process of measuring growth. This aspect is especially useful for monitoring and updating
individualized education plans to enable efficient instructional programs. Moreover,
researchers evaluating interventions may wish to use these guidelines to objectively
quantify growth in response to treatment.
According to Gilliam, scoring and interpretation of the GARS-2 should be completed by
a professional trained in psychometrics and test analysis. The GARS-2 uses a standardized
score referred to as the autism index. It has a mean of 100 and a standard deviation of 15.
The autism index is calculated by first calculating the raw scores of each subscale and then
converting them into derived standard scores. The derived standard scores for each subscale
have a mean of 10 and a standard deviation of 3. The standard scores of the subscales are
then summed, and the sum is converted into an autism index using the table found in the
appendix of the examiner’s manual. Scores must be produced in this way because many
individuals with autism lack verbal communication skills; therefore, the Communication
subscale of the test must often be omitted, leaving only two scores. The table in the
appendix conveniently adjusts the scores into an appropriate autism index based on all
three subscales or on the two that are available.
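To make the two score metrics concrete, the sketch below expresses a score on either scale as a distance from the normative mean in standard deviation units. It is not the manual's scoring procedure: the GARS-2 converts scores with normative lookup tables that are not reproduced here, and the function names and example values are hypothetical.

```python
# Metric parameters as stated in the review.
SUBSCALE_MEAN, SUBSCALE_SD = 10, 3
INDEX_MEAN, INDEX_SD = 100, 15

def z_from_scaled(score, mean, sd):
    """Distance of a scaled score from the normative mean, in SD units."""
    return (score - mean) / sd

def scaled_from_z(z, mean, sd):
    """Score on a given metric that sits z SDs from that metric's mean."""
    return mean + z * sd

# A subscale standard score of 13 lies one SD above the normative mean...
z = z_from_scaled(13, SUBSCALE_MEAN, SUBSCALE_SD)
# ...which is the same relative standing as 115 on the autism index metric.
equivalent_index = scaled_from_z(z, INDEX_MEAN, INDEX_SD)

# The interpretive cutoffs expressed in SD units of the normative sample.
cutoff_likely_z = z_from_scaled(85, INDEX_MEAN, INDEX_SD)
cutoff_possible_z = z_from_scaled(70, INDEX_MEAN, INDEX_SD)
```

Because the normative sample consists of individuals reported to have autism, an autism index of 100 corresponds to the average rating within that group rather than within the general population.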
Scores of 85 or higher on the autism index indicate that an individual is likely to have
autism. Scores of 70 to 84 indicate that an individual may have autism, and any score of 69
or less suggests that it is unlikely that the individual has autism. This is a major
improvement over the original Gilliam Autism Rating Scale (GARS), in which individuals with
scores of 90 or less were classified as having a below-average chance of having autism.
Several studies have shown that the original rating system can result in a large number of
false negatives when screening for autism. For instance, South et al. (2002) and Lecavalier
(2005) each tested a sample of individuals diagnosed with autism using the GARS, and both
samples produced means less than 90. It is clear that this was taken into consideration in
the development of the GARS-2, and the appropriate changes have been made.
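The interpretive ranges above, and the effect of a lower cutoff on false negatives, can be sketched in a few lines. The three ranges come from the review; the function names, the category labels, and the list of scores are hypothetical and serve only to illustrate the arithmetic.

```python
def classify_autism_index(index):
    """Map an autism index to the interpretive ranges described in the review."""
    if index >= 85:
        return "likely"
    if index >= 70:
        return "possibly"
    return "unlikely"

def sensitivity(indexes, cutoff):
    """Proportion of individuals with a confirmed diagnosis scoring at or above cutoff."""
    flagged = sum(1 for x in indexes if x >= cutoff)
    return flagged / len(indexes)

# Hypothetical autism index scores for ten individuals who all have autism.
diagnosed_sample = [72, 81, 86, 88, 90, 93, 97, 104, 110, 68]

labels = [classify_autism_index(x) for x in diagnosed_sample]
strict = sensitivity(diagnosed_sample, 91)   # stricter threshold (above 90)
lenient = sensitivity(diagnosed_sample, 85)  # GARS-2 "likely" threshold
```

In this made-up sample, moving the threshold from 91 to 85 flags three more of the ten diagnosed individuals, which is the kind of reduction in false negatives the revised cutoffs are meant to achieve.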
Normative scores are not available for the parent interview section of the GARS-2.
However, the more “no” responses that appear in this section, the more likely it is that the
individual in question has autism. To be diagnosed with autism, an individual has to show
signs of delays or abnormal functioning in at least one of the following areas before the
age of 3 (American Psychiatric Association, 2000): social interaction, language use in
social communication, and symbolic or imaginative play. The parent interview addresses each
area by prompting the parent or primary caregiver to recall the developmental milestones and
anomalies in the child’s first 3 years.
Development and Standardization
The first version of the GARS was published in 1995. The norms were obtained using
data collected from 1,092 children, adolescents, and young adults from the United States
and Canada. Each participant in the norm-referenced sample was reported to have autism.
This information, however, was reported by the individuals’ parents or teachers, and at no
time was it verified by a clinician (South et al., 2002).
As mentioned, since the release of the original GARS, several studies have challenged the
normative sample and claimed that the test scores resulted in too many false negatives
(Lecavalier, 2005; Mazefsky & Oswald, 2006; South et al., 2002). In response to these
criticisms, the GARS-2 was created with a new set of norms based on 1,107 children and young
adults. As specified by the selection criteria, all participants in the normative sample were
between the ages of 3 and 22, had a diagnosis of autism, and resided in the United States. It
is interesting to note that although the target group comprised those with autism, a
Web-based Asperger’s group was also targeted for recruiting members of the normative sample.
For this subset of the sample, group members completed a secure form of the questionnaire
for their children. Additional data were collected from children and young adults who had
disabilities other than autism, as well as from children and young adults who were
nondisabled. The data from these two groups were used only to study the discriminative
ability of the GARS-2 and were thus not used as part of the normative sample.
The first version of the GARS contains four subscales used to produce a total autism
quotient: Stereotyped Behaviors, Communication, Social Interaction, and Developmental
Disturbances. Although significant correlations exist between the three subscales that
evaluate current behavior, the Developmental Disturbances subscale is not significantly
correlated with any other subscale in the GARS (South et al., 2002). Furthermore, the
Developmental Disturbances subscale has only a weak-to-moderate correlation (r = .34)
with the total autism quotient. The other three subscales demonstrate moderate-to-high
correlations with the total score (South et al., 2002). Consequently, the Developmental
Disturbances subscale was dropped from the autism index in the latest version but has
been revised and now appears in the GARS-2 in the form of a parental interview.
The internal consistency of each subscale, as well as of the GARS-2 as a whole, was
determined via Cronbach’s coefficient alpha. The coefficients reveal that each subscale, as
well as the total autism index, is highly consistent and thus sufficient for contributing to
the diagnosis of autism. To determine scale stability, 37 individuals with autism were rated
twice with the GARS-2 over a 1-week period. Test–retest coefficients for each subscale and
the total score are all significant beyond the .01 level, and age-corrected coefficients
range from .70 to .90. Correlation coefficients were corrected for restriction in range. The
results show a fairly high correlation for the autism index. Likewise, the coefficients for
the Stereotyped Behaviors and Social Interaction subscales are more than sufficient. The
correlation coefficient for the Communication subscale, however, although still significant,
is in the moderate range, suggesting that caution be used when interpreting its results.
Gilliam suggests that more than one rater complete this measure for a child; however, there
is no information included about interrater reliability. Gilliam further suggests a
procedure to average the scores of multiple informants but, again, provides no information
about reliability in using this procedure.
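For readers less familiar with the two reliability statistics named above, a minimal sketch follows. It computes Cronbach's coefficient alpha from item ratings and a Pearson correlation between two administrations. The data are invented, the function names are hypothetical, and the sketch does not apply the age or restriction-of-range corrections reported in the manual.

```python
from math import sqrt
from statistics import mean, variance

def cronbach_alpha(rows):
    """Coefficient alpha; rows is a list of respondents, each a list of item ratings."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical 0-3 ratings on three items for five individuals.
item_ratings = [[3, 2, 3], [1, 1, 0], [2, 2, 2], [0, 1, 1], [3, 3, 2]]
alpha = cronbach_alpha(item_ratings)

# Hypothetical total scores for the same five individuals one week apart.
time1 = [8, 2, 6, 2, 8]
time2 = [7, 3, 6, 2, 9]
retest_r = pearson_r(time1, time2)
```

Alpha rises when items covary strongly relative to their individual variances, whereas the test-retest coefficient reflects how well the rank order and spacing of scores are preserved across the two ratings.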
As mentioned earlier, the three subscales of the GARS-2 reflect two of the most commonly
referenced definitions of autism, taken from the DSM-IV-TR and the Autism Society
of America (1994). Both sources suggest that children with autism display deficits in
communication and social interaction while demonstrating stereotypical behaviors. The
subscales included in the GARS-2 are intended to test for the presence of each criterion. In
addition, item discrimination coefficients were analyzed for each test item. To be included,
each item needed to be statistically significant at the .05 level. A minimum item
discrimination coefficient of .35 was set as the criterion for item validity. Furthermore,
at least half the correlation coefficients had to reach or exceed this level, as recommended
by Hammill et al. in A Consumer’s Guide to Tests in Print (as cited by Gilliam, 2006). Median
item discrimination coefficients for each subscale are as follows: .53 for Stereotyped
Behaviors, .53 for Communication, and .55 for Social Interaction (Gilliam, 2006).
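The item screening described above can be illustrated with a short sketch. It assumes that the item discrimination coefficient is a corrected item-total correlation (each item correlated with the sum of the remaining items), which is a common definition but is not confirmed by the review. The ratings and function names are hypothetical, and the .05 significance test is omitted.

```python
from math import sqrt
from statistics import mean, median

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def item_discrimination(rows):
    """Correlate each item with the sum of the remaining items (assumed definition)."""
    k = len(rows[0])
    coefficients = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        coefficients.append(pearson_r(item, rest))
    return coefficients

# Hypothetical 0-3 ratings on four items; the last item runs against the others.
ratings = [[3, 2, 3, 1], [1, 1, 0, 2], [2, 2, 2, 1], [0, 1, 1, 2], [3, 3, 2, 1]]
coefficients = item_discrimination(ratings)

THRESHOLD = 0.35  # minimum coefficient named in the review
retained = [i for i, r in enumerate(coefficients) if r >= THRESHOLD]
median_coefficient = median(coefficients)
```

In this invented data set the first three items clear the .35 criterion and the fourth does not, so only the first three would be retained under the assumed rule.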
Concurrent validity was analyzed by correlating GARS-2 scores with the Autism
Behavior Checklist (ABC), a 57-item checklist used for screening individuals with autism
and for differentiating them from individuals with other severe behavioral disorders.
Sixty-three children were rated by their parents on both the GARS-2 and the ABC. The
subscales on the GARS-2 were matched to five subtests on the ABC. Correlations on matched
subtest pairings (GARS-2 subscales and corresponding ABC subtests) and on total scores are
all significant. By Hopkins’s criteria (as cited by Gilliam, 2006), the correlations for all
matched subtests are large to very large, ranging from .56 for Social Interaction (GARS-2)
and Social and Self-Help (ABC) to a high of .78 for Stereotyped Behaviors (GARS-2) and
Body/Object Use (ABC).
Additionally, correlations between subscales within the GARS-2 are significant at the
.01 level, as are correlations between each subscale and the autism index. Analyses also
revealed that individuals with autism scored significantly higher than did individuals in
the control group as well as those with other disabilities, suggesting that the GARS-2
has a high discriminative ability.
Few changes were made to GARS test items in the creation of the GARS-2. Psychometric
properties of the test remain acceptable, and any changes that did occur appear to offer
improved clarity of certain test items. However, it is noteworthy that many items reflect
behaviors characteristic of younger children with autism. As such, parents of older children
may experience greater difficulty answering these questions. In our experience, many
parents of youth and young adults with autism spectrum disorders reply that their children
may have engaged in these behaviors at one time but no longer exhibit them. Furthermore,
Gilliam suggests that exposure to appropriate interventions can lead to improved scores for
individuals with autism. Individuals aged 16 years and older have likely experienced many
educational interventions addressing learning and behavioral issues at school. As such, the
GARS-2 may not be as sensitive to changes in these individuals’ behaviors.
Commentary and Recommendations
The GARS-2 is intended as a screening tool for autism spectrum disorders in individuals
between the ages of 3 and 22. Based on a solid theoretical foundation, the GARS-2 was
standardized using an updated set of norms provided by a sample of individuals diagnosed
with autism. Since the original publication in 1995, several changes have been made in
response to earlier criticisms. The revision now includes a chapter discussing the
application of GARS-2 test items for applied behavior analysis and research. In an attempt
to link assessment to intervention, the manual clarifies test items on each subscale
(Stereotyped Behaviors, Communication, Social Interaction) by providing detailed
behavioral descriptors. Finally, the revision addresses the potential for high rates of
false-negative autism diagnoses on the GARS by lowering the cutoff scores on the autism index.
Aside from being relatively simple and quick to complete, the GARS-2 has the added
advantage of a flexible format. Parents need not be the sole raters; ratings can be provided
by anyone who knows the individual well. Furthermore, the instrument can be completed
in the absence of the examiner. The manual provides detailed information regarding the
psychometric properties, administration, and scoring aspects of the test, as well as a nicely
written overview of autism and issues related to differential diagnosis.
Despite the improvements, several disadvantages remain with the revision of this test.
Although it claims to be useful as an autism-screening tool for individuals up to the age of
22, older age groups constituted only a small portion of the standardization sample (i.e.,
only 9% of the sample was between the ages of 16 and 22). Although Gilliam states that
age does not affect the characteristics of autism as measured by the GARS-2, this assertion
may be difficult to support given the small sample size of older individuals. Although
limited information is available describing developmental changes in behaviors characteristic
of autism, some literature has suggested that the severity of symptoms decreases with age
(see Shea & Mesibov, 2005, for a review).
According to the manual, individuals with autism were selected for inclusion in the
standardization sample, but further clarification regarding the characteristics of these
individuals would be beneficial. Specifically, it would be useful to know whether individuals
without a history of language delay were excluded. Related to this concern, the manual
states that some participants for the standardization study were recruited through an online
organization for Asperger’s disorder. This is concerning because it suggests that the sample
includes individuals along the autism spectrum, as opposed to only those meeting the
DSM-IV-TR’s criteria for autistic disorder. Additional information about the inclusion of
individuals who fall along the autism spectrum, with respect to differences in IQ and
language development, is important to the utility of the GARS-2 as a screening instrument
for autism.
Finally, Gilliam suggests that when one rater is unable to provide complete information
about a child, other raters may complete the forms and then average the scores of duplicate
items. However, psychometric information about this suggested procedure is absent from
the manual at this time. Thus, caution is warranted when using this approach.
Overall, the GARS-2 offers a number of improvements over the original edition. Given
its technical adequacy, it appears that the GARS-2 can be a useful screening tool for autism
when used as part of a comprehensive assessment or in evaluating treatment.
Furthermore, the addition of behavioral descriptors for test items and the inclusion of the
Instructional Objectives for Children Who Have Autism manual promote linking assessment
to intervention, facilitate goal setting and program development in home and school
settings, and provide opportunities for researchers interested in evaluating interventions.
In addition to this instrument’s respectable psychometric properties, its flexibility and
unique elements make it a practical and welcome addition to the field.
Janine Marie Montgomery
University of Manitoba
University of Saskatchewan
University of Manitoba
References
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text
rev.). Washington, DC: Author.
Gilliam, J. (2006). GARS-2: Gilliam Autism Rating Scale–Second Edition. Austin, TX: PRO-ED.
Lecavalier, L. (2005). An evaluation of the Gilliam Autism Rating Scale. Journal of Autism and Developmental
Disorders, 35(6), 795-805.
Mazefsky, C. A., & Oswald, D. P. (2006). The discriminative ability and diagnostic utility of the ADOS-G, ADI-R,
and GARS for children in a clinical setting. Autism: The International Journal of Research & Practice,
10(6), 533-549.
National Research Council. (2001). Educating children with autism. Washington, DC: National Academy Press.
Shea, V., & Mesibov, G. B. (2005). Adolescents and adults with autism. In F. Volkmar, R. Paul, A. Klin, &
D. Cohen (Eds.), Handbook of autism and pervasive developmental disorders (Vol. 1, pp. 288-311). New York:
South, M., Williams, B. J., McMahon, W. M., Owley, T., Filipek, P. A., Shernoff, E., et al. (2002). Utility of the
Gilliam Autism Rating Scale in research and clinical populations. Journal of Autism and Developmental
Disorders, 32(6), 593-599.