British Journal of Developmental Psychology (2017)
©2017 The British Psychological Society
www.wileyonlinelibrary.com
Brief report
Reliability and validity of advanced theory-of-mind
measures in middle childhood and adolescence
Elizabeth O. Hayward¹* and Bruce D. Homer²
¹New York University, New York, USA
²The Graduate Center, City University of New York, New York, USA
Although theory-of-mind (ToM) development is well documented for early childhood,
there is increasing research investigating changes in ToM reasoning in middle childhood and
adolescence. However, the psychometric properties of most advanced ToM measures for
use with older children and adolescents have not been firmly established. We report on the
reliability and validity of widely used, conventional measures of advanced ToM with this age
group. Notable issues with both reliability and validity of several of the measures were
evident in the findings. With regard to construct validity, results do not reveal a clear
empirical commonality between tasks, and, after accounting for comprehension,
developmental trends were evident in only one of the tasks investigated.
Statement of contribution
What is already known on this subject?
• Second-order false belief tasks have acceptable internal consistency.
• The Eyes Test has poor internal consistency.
• Validity of advanced theory-of-mind tasks is often based on the ability to distinguish clinical from
typical groups.
What does this study add?
• This study examines internal consistency across six widely used advanced theory-of-mind tasks.
• It investigates validity of tasks based on comprehension of items by typically developing individuals.
• It further assesses construct validity, or commonality between tasks.
Despite consensus regarding measurement of theory of mind (ToM) in preschool, how
best to assess mental state reasoning beyond preschool is a topic of debate (Miller, 2012).
Two widely used measures of advanced ToM, second-order false belief and interpretive
tasks, assess an individual’s ability to reconcile multiple beliefs (Astington, Pelletier, &
Homer, 2002; Carpendale & Chandler, 1996). Second-order false belief tasks are concerned
with the understanding that a particular belief can motivate behaviour. In these tasks,
children are asked to predict a character’s actions based on that character’s false belief about
another character’s belief. Perner and Wimmer (1985) first developed this type of task,
extending the false belief paradigm from beliefs about locations to beliefs about beliefs.
In an interpretive task, a child is shown two interpretations of ambiguous stimuli and
then asked to judge others’ interpretations of those stimuli. Interpretive tasks are
concerned with the understanding that multiple people can have many different beliefs.
Carpendale and Chandler (1996) hypothesized that children achieve false belief
understanding several years before developing an appreciation of the interpretive nature
of the knowing process. These measures introduce stimuli that provide equal support for
two distinct interpretations.
*Correspondence should be addressed to Elizabeth O. Hayward, New York University, New York, NY, USA (email:
elizabeth.hayward@nyu.edu).
DOI:10.1111/bjdp.12186
Advanced ToM has also been assessed using the Strange Stories task, which requires
interpreting non-literal statements, such as ironic jokes, lies, and gaffes, in the context of
social narratives (Happé, 1994). Similarly, the Faux-pas Recognition task assesses whether
children can accurately recognize when a social faux pas has occurred (Baron-Cohen,
O’Riordan, Stone, Jones, & Plaisted, 1999). In both tasks, given a short story, participants
are asked to make an inference about the beliefs of the characters. The Strange Stories and
Faux Pas tasks were both originally developed to illuminate the difference in ToM
reasoning between children with autism and those who are typically developing.
The Reading-the-Mind-in-the-Eyes test seeks to measure ToM ability through accuracy in
reading states of the mind from images of eyes (Baron-Cohen, Wheelwright, Spong, Scahill, &
Lawson, 2001). As a measure of advanced ToM (Baron-Cohen et al., 2015), the Eyes Test is
concerned with children’s ability to infer mental states, typically affective states, given
minimal information about the individual or social context. In this 28-item test, participants
must select the appropriate affective term matching an image of eyes. This task was originally
designed to capture subtle deficits in social cognition in individuals on the autism spectrum.
The internal consistency reliability of second-order false belief tasks, similar to those
designed by Perner and Wimmer (1985), has been assessed and found to be acceptable
(Hughes et al., 2000). There is a scarcity of research establishing internal consistency in other
advanced ToM measures. The internal consistency of the Eyes Test has been found to be poor
(Harkness, Jacobson, Duong, & Sabbagh, 2010; Olderbak et al., 2015; Vellante et al., 2013;
Voracek & Dressler, 2006). Regarding validity, recent work has made a case for the validity of
the Strange Stories in middle childhood (Devine & Hughes, 2016). Although several of these
tasks identify social-cognitive deficits in clinical populations, little research has explored the
validity of most of these measures when used with typically developing groups.
Therefore, this study aimed to evaluate the reliability and validity of widely used
measures of advanced ToM with typically developing children ages 7–13. Although other
measures of advanced ToM have been developed, and in some cases validated (e.g., Bosco,
Gabbatore, Tirassa, & Testa, 2016; Devine & Hughes, 2013; Hayward, Homer, & Sprung,
2016; Hutchins, Prelock, & Bonazinga, 2012; Sivaratnam, Cornish, Gray, Howlin, &
Rinehart, 2012), second-order false belief tasks, interpretive tasks, the Strange Stories, the
Faux Pas test, and the Reading-the-Mind-in-the-Eyes task remain some of the most broadly
used ToM measures. The overall comprehension of these tasks was also examined, as valid
measurement hinges on the assumption that participants comprehend the materials
(Fantuzzo, McDermott, Manz, & Hampton, 1996). Regarding construct validity, as evident
in associations between tasks, we predicted that these tasks would be moderately
intercorrelated. Finally, developmental changes in performance on the advanced ToM tasks
were examined, and we predicted task performance would reflect age-related trends.
Materials and method
Participants
Children (N = 112) aged 7:5–13:5 years were recruited from an independent school and
summer day camp programme in New York City. There were 64 (57%) females and 48
(43%) males. The population from which the sample was recruited is primarily middle-
and upper-class. Demographic data on the participants’ race and ethnicity were not
collected; school-level data indicate that the student population is 72% White/Caucasian,
10% Black/African American, 6% Asian/Pacific Islander, and 4% Hispanic. Two of the 112
participants were lost due to attrition after a single testing session, resulting in partial data.
Measures
The measures for this study were commonly used advanced ToM tasks. Except where
noted, all tasks were administered and scored as described in the original studies. The
measures were as follows: two second-order false belief tasks (Astington et al., 2002);
two interpretive ambiguous figure tasks, in which the child is asked what a character will
think in response to an ambiguous line drawing (Carpendale & Chandler, 1996); two
interpretive restricted-view tasks, in which the child is asked to guess what a character
will think of a restricted-view picture (Lalonde & Chandler, 2002); the original 24 Strange
Stories vignettes (Happé, 1994); the 10 Faux Pas vignettes (Baron-Cohen et al., 1999);
and the 28-item children’s Reading-the-Mind-in-the-Eyes test (Baron-Cohen et al., 2001).
The Astington et al. (2002) second-order false belief tasks were used in an attempt to
adequately capture age-related variation in second-order reasoning in children over age 7,
while also minimizing information processing demands (Sullivan, Zaitchik, & Tager-
Flusberg, 1994). For all tasks, participants were awarded 1 point for each item answered
correctly, as scored by their original authors. This resulted in one total score for each task,
with the exception of the Strange Stories, for which an additional ‘mentalizing’ score was
included. This score reflects the presence or absence of mental states employed to justify
the utterances of characters in each story, as outlined by Happé (1994).
Comprehension scores were calculated for those tasks that included control
questions. The range for comprehension scores varied by task, as follows: second-order
false belief tasks, 0–9; interpretive restricted-view tasks, 0–2; the Strange Stories, 0–24;
and the Faux Pas task, 0–20. The interpretive ambiguous figure tasks and Eyes Test did not
include comprehension questions.
Procedure
Children were tested individually (ages 7–8) or in a small group (ages 9–13) in a quiet room
in their school or camp by one of three researchers. Each participant received a packet
containing two second-order false belief tasks, two interpretive ambiguous figure tasks,
two interpretive restricted-view tasks, the Strange Stories test, the Faux Pas vignettes, and
the Eyes Test, which was completed over the course of two 30- to 45-minute sessions.
Task order was counterbalanced. Materials were read aloud to all participants.
Participants ages 9–13 were randomly assigned to groups.
Results
Reliability
An α level of .70 or above is recognized as indicating acceptable internal consistency,
while those between .60 and .70 are considered undesirable or minimally acceptable, and
those below .60 are unacceptable (Devellis, 2012). Cronbach’s alpha coefficients for the
tasks were as follows: second-order false belief, α = .53; interpretive ambiguous figure,
α = .77; the interpretive restricted-view, α = .62; the Strange Stories, α = .65; the Strange
Stories mentalizing, α = .73; the Faux Pas, α = .78; and the Eyes, α = .41.
To examine whether internal consistency varied by age, we calculated Cronbach’s
alpha coefficients for three age groups (7–8 years, n = 37; 9–10 years, n = 37; and
11–12 years, n = 38) for each measure. Results are presented in Table 1. Among the
9- to 10-year-olds, there was insufficient variance to calculate the Cronbach’s alpha
coefficient for the second-order false belief, as all but one response to one item were correct.
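The reliability coefficients reported above follow the standard Cronbach's alpha formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The following is a minimal sketch of that calculation for dichotomously scored items. It is an illustration only and not the analysis code used in this study: the function name cronbach_alpha, the toy data, and the decision to return None when an item or the total score has zero variance are assumptions made here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants-by-items score matrix.

    scores: list of rows, one per participant; each row holds that
    participant's item scores (e.g., 0/1 for incorrect/correct).
    Returns None when alpha cannot be meaningfully computed because an
    item or the total score has zero variance (an assumption of this
    sketch, chosen to mirror the 'insufficient variance' case).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")

    # Variance of each item across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])

    if total_var == 0 or any(v == 0 for v in item_vars):
        return None

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical two-item task scored 0/1 for six participants.
toy = [[1, 1], [1, 1], [0, 0], [1, 0], [0, 0], [1, 1]]
print(cronbach_alpha(toy))  # about 0.83 for this toy data

# One item answered correctly by everyone: no variance, so no alpha.
print(cronbach_alpha([[1, 1], [1, 1], [1, 0]]))  # None
```

With very short scales, such as the two-item tasks here, alpha is sensitive to near-ceiling performance, which is consistent with the missing coefficient noted above.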
Task comprehension
Proficiency on the comprehension or memory questions was assessed for those tasks that
included such questions. For the second-order false belief tasks, participants performed
well on comprehension questions (M = 8.64, SD = 0.63), as most participants (71.4%)
responded correctly to all nine questions. On the interpretive restricted-view task,
performance was also strong (M = 1.92, SD = 0.32), with 93% of participants answering
both memory questions correctly.
Notable issues with comprehension were evident for both the Strange Stories and the
Faux Pas tasks. On the Strange Stories task, which included a single comprehension
question per story, the average score for the comprehension questions was 21.84
(SD = 2.4). However, only 12.5% of participants answered all comprehension questions
correctly, suggesting significant issues of comprehension. On one item, the percentage of
participants who correctly answered the comprehension question was as low as 56.8%,
suggesting that many participants were not recognizing the non-literal statement. On the
Faux Pas task, which included two comprehension questions per story, responses to
comprehension questions were variable (M = 18.8, SD = 1.6), with 49.1% of participants
answering all comprehension questions correctly.
Table 1. Descriptive statistics and Cronbach’s alpha coefficients by task and age
                              7–8 years           9–10 years          11–12 years
Task                          M (SD)        α     M (SD)        α     M (SD)        α
Second-Order FB (2)           1.81 (0.52)   .72   1.89 (0.31)   a     1.89 (0.39)   .66
Ambiguous Figure (2)          1.84 (1.32)   .68   1.78 (1.33)   .81   1.51 (1.43)   .81
Restricted-view Task (2)      1.05 (0.91)   .77   1.53 (0.70)   .51   1.38 (0.76)   .47
Strange Stories (24)          17.76 (3.34)  .73   19.08 (1.92)  .25   19.92 (2.63)  .65
Strange Stories Mental (24)   18.03 (3.86)  .79   19.36 (3.16)  .70   18.57 (3.30)  .70
Faux Pas (10)                 6.30 (2.59)   .76   6.83 (2.55)   .77   6.37 (2.91)   .82
Eyes Task (28)                19.28 (3.00)  .44   19.69 (2.98)  .44   20.29 (2.65)  .33
a Insufficient variance in responses to calculate Cronbach’s alpha coefficient.
To circumvent these comprehension concerns in assessing associations between
tasks, items for which the comprehension questions were answered correctly by <95% of
the sample were excluded. This resulted in an 11-story Strange Stories set, excluding two
each of the Pretend, Joke, Misunderstanding, and Double Bluff stories, and one each of
the Figure of Speech, Persuade, Contrary Emotions, Appearance/Reality, and Forget
stories; the remaining stories showed strong comprehension (M = 10.78, SD = 0.44). An
abbreviated six-vignette Faux Pas task was formed, excluding the Story Competition,
Lunch Lady, Sally-Mary, and Surprise Party stories, again with strong comprehension
(M = 11.64, SD = 0.96).
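The exclusion rule described above (drop any item whose comprehension questions were answered correctly by fewer than 95% of the sample) can be sketched as follows. This is a hypothetical illustration, not the procedure as implemented in this study: the function name retain_items, the data layout, and the story labels are assumptions, and each flag is assumed to already indicate whether a participant answered all of that item's comprehension questions correctly.

```python
def retain_items(comprehension, threshold_pct=95):
    """Return labels of items that meet the comprehension threshold.

    comprehension: dict mapping an item label to a list of 0/1 flags,
    one per participant (1 = comprehension question(s) answered
    correctly). Items answered correctly by fewer than threshold_pct
    percent of participants are excluded.
    """
    kept = []
    for label, flags in comprehension.items():
        # Integer arithmetic avoids floating-point issues at the cut-off.
        if sum(flags) * 100 >= threshold_pct * len(flags):
            kept.append(label)
    return kept


# Hypothetical data for 20 participants and three stories.
toy = {
    "story_A": [1] * 20,           # 100% correct: retained
    "story_B": [1] * 19 + [0],     # 95% correct: retained
    "story_C": [1] * 18 + [0, 0],  # 90% correct: excluded
}
print(retain_items(toy))  # ['story_A', 'story_B']
```

Scores on the abbreviated task would then be computed from the retained items only.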
Task associations
Correlational analyses were conducted to investigate associations between tasks (see
Table 2). Three outliers (scores more than 2 SD below the mean) were excluded, resulting
in a sample of 107 for these analyses. Bivariate correlation analyses revealed a significant
association between age and the interpretive restricted-view task, r(106) = .19, p = .048.
Partial correlation analyses
between the six tasks, controlling for age, revealed associations between the second-order
false belief task and the interpretive restricted-view task, the abbreviated Strange Stories
mentalizing score and the Eyes Test, and the abbreviated Strange Stories total score and
the abbreviated Strange Stories mentalizing score. Spearman rho correlations were also
conducted to account for the varied range of the measures (i.e., from 0–2 to 0–28); results
confirmed similar associations.
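A partial correlation between two task scores, controlling for age, can be obtained from the three pairwise Pearson correlations as pr = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)), where z is age. The sketch below illustrates this first-order formula. It is offered as an illustration under stated assumptions and is not the analysis code used here: the function names pearson_r and partial_r and the toy values are hypothetical, and no significance test is included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    r_xy = pearson_r(x, y)
    r_xz = pearson_r(x, z)
    r_yz = pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical scores on two tasks and a covariate (e.g., age).
task_a = [1, 2, 3, 4]
task_b = [2, 1, 4, 3]
age = [1, -1, -1, 1]  # constructed to be uncorrelated with both tasks

# Because the covariate is unrelated to either task in this toy example,
# the partial correlation equals the simple correlation between the tasks.
print(pearson_r(task_a, task_b))
print(partial_r(task_a, task_b, age))
```

When the covariate is related to both tasks, the partial correlation can be noticeably smaller than the simple correlation, which is why age was controlled in the analyses above.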
Developmental change
To further examine whether there were significant developmental changes in performance
on the tasks, a series of one-way ANOVAs was conducted with age group (7–8 years,
n = 37; 9–10 years, n = 37; and 11–12 years, n = 38) as the independent
variable and scores on the advanced ToM tasks as dependent variables. The abbreviated
Strange Stories and abbreviated Faux Pas sets were employed in place of the original tasks.
Age differences were found only for the interpretive restricted-view task,
F(2, 107) = 3.40, p = .037. Planned contrasts (one-tailed) indicated that the youngest age
group (7–8 years; M = 1.05, SD = .91) performed more weakly than
both the middle age group (9–10 years; M = 1.53, SD = .70) and the oldest age group
(11–12 years; M = 1.38, SD = .76), which were statistically equivalent.
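The F statistic in a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square, with k − 1 and N − k degrees of freedom for k groups and N participants. The following minimal sketch shows that computation. It is an illustration rather than the analysis code used in this study: the function name one_way_anova and the toy scores are assumptions, and the p value is omitted because it requires the F distribution, which is not in the Python standard library.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists.

    groups: one list of scores per group (e.g., per age group).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (score - m) ** 2 for g, m in zip(groups, group_means) for score in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three age groups of three children each.
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(toy)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 13.00
```

With three age groups, degrees of freedom of 2 and 107 correspond to 110 participants contributing data to the analysis.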
Discussion
These results raise questions about the reliability and validity of several measures of
advanced ToM. The internal consistency for these six tasks ranged widely. The second-
order false belief task was found to have unacceptable internal consistency, although this
is likely due to lack of variance because of a ceiling effect. Two measures, the interpretive
ambiguous figures and the Faux Pas test, demonstrated acceptable internal consistency.
The Strange Stories mentalizing score also demonstrated acceptable internal consistency.
The interpretive restricted-view task and the Strange Stories task had undesirable levels of
internal consistency; in the case of the interpretive restricted-view task, internal
consistency decreased with age across the three groups. The Eyes Test demonstrated
unacceptably low internal consistency across age groups, confirming previous findings
(Olderbak et al., 2015). No clear age-related trends emerged, casting doubt on the notion
that any single measure is particularly more reliable with younger versus older children, or
vice versa.
Comprehension performance on the Strange Stories and the Faux Pas vignettes raised
concerns around the validity of these widely used measures (Fantuzzo et al., 1996). The
current data suggest these tasks in their original form may not be appropriate with this
population. Future research with older children and adolescents should employ only
those items with the highest rates of comprehension.
Table 2. Descriptive statistics, correlations with age, and partial correlations between tasks
Task                                    M (SD)        Age (r)  Ambig        Restricted-     Strange       Strange stories  Faux pas  Eyes task
                                                               Figure (pr)  view task (pr)  stories (pr)  mental (pr)      (pr)      (pr)
Second-Order FB (2)                     1.86 (0.42)   .09      .05          .22*            .12           .06              −.002     .12
Ambig Figure (2)                        1.71 (1.36)   −.06     –            .11             .07           −.05             .002      .002
Restricted-view Task (2)                1.32 (0.81)   .19*                  –               −.02          −.03             −.09      −.07
Abbr. Strange Stories (11)              9.06 (1.34)   .18                                   –             .22*             .03       .12
Abbr. Strange Stories Mentalizing (11)  9.00 (1.67)   −.04                                                –                .09       .19*
Abbr. Faux Pas (6)                      4.23 (1.77)   −.08                                                                 –         .02
Eyes Task (28)                          19.68 (2.90)  .17                                                                            –
*p < .05
Previous research presents mixed results on associations between these types of tasks
(Brent, Rios, Happé, & Charman, 2004; Mitroff, Sobel, & Gopnik, 2006). When
considering only those tasks with sound comprehension, the current results fail to
provide evidence of validity of a unified advanced ToM construct in children between the
ages of 7 and 13. It is possible that the underlying abilities assessed by these measures form
at best a constellation of loosely related social-cognitive skills.
Developmental trends were found only for the interpretive restricted-view task:
Performance on the other advanced ToM tasks did not improve with age. Older children
evidently approach ceiling on second-order false belief and interpretive ambiguous figure
tasks, limiting the utility of these as measures of advanced ToM with older groups.
Previous research has demonstrated an effect for age on the Strange Stories and Faux Pas
task (Banerjee, 2000; Banerjee & Watling, 2005; Baron-Cohen et al., 1999; O’Hare,
Bremner, Nash, Happé, & Pettigrew, 2009). However, not all research has identified age-
related trends in Strange Stories performance during adolescence (Bosco, Gabbatore, &
Tirassa, 2014). Given the comprehension difficulties documented here, the variability that
was attributed to age in some research may have been related to issues of comprehension.
Consistent with the current results, typically developing children tend to perform well on
the Eyes Test from 7 years of age, limiting variability due to age in older groups (Brent
et al., 2004; Dorris, Espie, Knott, & Salt, 2004). However, Baron-Cohen et al. (2001) did
find an effect for age in performance on the Eyes Test, such that 8- to 10-year-olds
outperformed 6- to 8-year-olds, in contrast to the current findings.
The socioeconomic and cultural homogeneity of the current sample is a limitation with
regard to the generalizability of the findings. However, the minimal diversity in the current
sample does ensure that these findings can be interpreted in the context of previous work
in this field (Miller, 2012). Nonetheless, future work should employ these tasks with more
heterogeneous populations. Despite these limitations, the current study highlights the
issues with both reliability and validity of several advanced ToM tasks. These results
emphasize the need for advanced ToM measures that accurately capture developments in
social cognition beyond early childhood.
Acknowledgements
The authors would like to thank the individuals who participated in this research. They would
also like to express their gratitude to Yolanta Kornak and Seamus Donnelly for their assistance
with data collection.
References
Astington, J. W., Pelletier, J., & Homer, B. (2002). ToM and epistemological development: The relation between children’s second-order false-belief understanding and their ability to reason about evidence. New Ideas in Psychology. Special Issue: Folk Epistemology, 20(2–3), 131–144. https://doi.org/10.1016/s0732-118x(02)00005-3
Banerjee, R. (2000). The development of an understanding of modesty. British Journal of Developmental Psychology, 18(4), 499–517. https://doi.org/10.1348/026151000165823
Banerjee, R., & Watling, D. (2005). Children’s understanding of faux pas: Associations with peer relations. Hellenic Journal of Psychology, 2(1), 27–45.
Baron-Cohen, S., Bowen, D. C., Holt, R. J., Allison, C., Auyeung, B., Lombardo, M. V., ... Lai, M. (2015). The “reading the mind in the eyes” Test: complete absence of typical sex difference in ~400 men and women with autism. PLoS ONE, 10(8), e0136521. https://doi.org/10.1371/journal.pone.0136521
Baron-Cohen, S., O’Riordan, M., Stone, V., Jones, R., & Plaisted, K. (1999). Recognition of faux pas by normally developing children and children with Asperger syndrome or high-functioning autism. Journal of Autism and Developmental Disorders, 29(5), 407–418. https://doi.org/10.1023/A:1023035012436
Baron-Cohen, S., Wheelwright, S., Spong, A., Scahill, V., & Lawson, J. (2001). Studies of ToM: Are intuitive physics and intuitive psychology independent? Journal of Developmental and Learning Disorders, 5(1), 51–82.
Bosco, F. M., Gabbatore, I., & Tirassa, M. (2014). A broad assessment of theory of mind in adolescence: The complexity of mindreading. Consciousness and Cognition, 24, 84–97. https://doi.org/10.1016/j.concog.2014.01.003
Bosco, F. M., Gabbatore, I., Tirassa, M., & Testa, S. (2016). Psychometric properties of the theory of mind assessment scale in a sample of adolescents and adults. Frontiers in Psychology, 7, 566. https://doi.org/10.3389/fpsyg.2016.00566
Brent, E., Rios, P., Happé, F., & Charman, T. (2004). Performance of children with autism spectrum disorder on advanced ToM tasks. Autism: the International Journal of Research and Practice, 8(3), 283–299. https://doi.org/10.1177/1362361304045217
Carpendale, J. I., & Chandler, M. J. (1996). On the distinction between false belief understanding and subscribing to an interpretive ToM. Child Development, 67, 1686–1706. https://doi.org/10.2307/1131725
Devellis, R. F. (2012). Scale development (3rd ed.). Thousand Oaks, CA: Sage Publications.
Devine, R. T., & Hughes, C. (2013). Silent films and strange stories: Theory of mind, gender, and social experiences in middle childhood. Child Development, 84, 989–1003. https://doi.org/10.1111/cdev.12017
Devine, R. T., & Hughes, C. (2016). Measuring theory of mind across middle childhood: Reliability and validity of the Silent Films and Strange Stories tasks. Journal of Experimental Child Psychology, 149, 23–40. https://doi.org/10.1016/j.jecp.2015.07.011
Dorris, L., Espie, C. A. E., Knott, F., & Salt, J. (2004). Mind-reading difficulties in the siblings of people with Asperger’s syndrome: Evidence for a genetic influence in the abnormal development of a specific cognitive domain. Journal of Child Psychology and Psychiatry, 45, 412–418. https://doi.org/10.1111/j.1469-7610.2004.00232.x
Fantuzzo, J. W., McDermott, P. A., Manz, P. H., & Hampton, V. R. (1996). The pictorial scale of perceived competence and social acceptance: Does it work with low-income urban children? Child Development, 67, 1071–1084. https://doi.org/10.2307/1129772
Happé, F. G. (1994). An advanced test of ToM: Understanding of story characters’ thoughts and feelings by able Autistic, Mentally Handicapped, and normal children and adults. Journal of Autism and Developmental Disorders, 24(2), 129–154. https://doi.org/10.1007/BF02172093
Harkness, K. L., Jacobson, J. A., Duong, D., & Sabbagh, M. A. (2010). Mental state decoding in past major depression: Effect of sad versus happy mood induction. Cognition & Emotion, 24(3), 497–513. https://doi.org/10.1080/02699930902750249
Hayward, E. O., Homer, B. D., & Sprung, M. (2016). Developmental trends in flexibility and automaticity of social cognition. Child Development. Advance online publication. https://doi.org/10.1111/cdev.12705
Hughes, C., Adlam, A., Happé, F., Jackson, J., Taylor, A., & Caspi, A. (2000). Good test–retest reliability for standard and advanced false-belief tasks across a wide range of abilities. Journal of Child Psychology and Psychiatry, 41(4), 483–490. https://doi.org/10.1017/s0021963099005533
Hutchins, T. L., Prelock, P. A., & Bonazinga, L. (2012). Psychometric evaluation of the theory of mind inventory (ToMI): A study of typically developing children and children with autism spectrum disorder. Journal of Autism and Developmental Disorders, 42, 327–341. https://doi.org/10.1007/s10803-011-1244-7
Lalonde, C. E., & Chandler, M. J. (2002). Children’s understanding of interpretation. New Ideas in Psychology Special Issue: Folk Epistemology, 20, 163–198. https://doi.org/10.1016/S0732-118X(02)00007-7
Miller, S. A. (2012). Theory of mind: Beyond the preschool years. New York, NY: Psychology Press.
Mitroff, S. R., Sobel, D. M., & Gopnik, A. (2006). Reversing how to think about ambiguous figure reversals: Spontaneous alternating by uninformed observers. Perception, 35, 709–715. https://doi.org/10.1167/6.6.52
O’Hare, A. E., Bremner, L., Nash, M., Happé, F., & Pettigrew, L. M. (2009). A clinical assessment tool for advanced ToM performance in 5 to 12 year olds. Journal of Autism and Developmental Disorders, 39, 916–928. https://doi.org/10.1007/s10803-009-0699-2
Olderbak, S., Wilhelm, O., Olaru, G., Geiger, M., Brenneman, M. W., & Roberts, R. D. (2015). A psychometric analysis of the reading the mind in the Eyes Test: Toward a brief form for research and applied settings. Frontiers in Psychology, 6, 1503. https://doi.org/10.3389/fpsyg.2015.01503
Perner, J., & Wimmer, H. (1985). “John thinks that Mary thinks that.”: Attribution of second-order beliefs by 5- to 10-year-old children. Journal of Experimental Child Psychology, 39, 437–471. https://doi.org/10.1016/0022-0965(85)90051-7
Sivaratnam, C. S., Cornish, K., Gray, K. M., Howlin, P., & Rinehart, N. J. (2012). Brief report: Assessment of the social-emotional profile in children with autism spectrum disorders using a novel comic strip task. Journal of Autism and Developmental Disorders, 42, 2505–2512. https://doi.org/10.1007/s10803-012-1498-8
Sullivan, K., Zaitchik, D., & Tager-Flusberg, H. (1994). Preschoolers can attribute second-order beliefs. Developmental Psychology, 30(3), 395–402. https://doi.org/10.1037/0012-1649.30.3.395
Vellante, M., Baron-Cohen, S., Melis, M., Marrone, M., Petretto, D. R., Masala, C., & Preti, A. (2013). The “Reading the Mind in the Eyes” test: Systematic review of psychometric properties and a validation study in Italy. Cognitive Neuropsychiatry, 18, 326–354. https://doi.org/10.1080/13546805.2012.721728
Voracek, M., & Dressler, S. G. (2006). High (feminized) digit ratio (2D: 4D) in Danish men: A question of measurement method? Human Reproduction, 21, 1329–1331. https://doi.org/10.1093/humrep/dei464
Received 16 August 2016; revised version received 13 March 2017