Journal of Police and Criminal Psychology
https://doi.org/10.1007/s11896-022-09509-7
A Mega‑Analysis oftheEffects ofFeedback ontheQuality ofSimulated
Child Sexual Abuse Interviews withAvatars
FrancescoPompedda1 · YikangZhang2 · ShumpeiHaginoya3,5 · PekkaSanttila4
Accepted: 23 March 2022
© The Author(s) 2022
Abstract
The present study aimed to test the effectiveness of giving feedback on simulated avatar interview training (Avatar Training) across different experiments and participant groups and to explore the effect of professional training and parenting experience by conducting a mega-analysis of previous studies. A total of 2,208 interviews containing 39,950 recommended and 36,622 non-recommended questions from 394 participants including European and Japanese students, psychologists, and police officers from nine studies were included in the mega-analysis. Experimental conditions were dummy-coded, and all dependent variables were coded in the same way as in the previously published studies. Professional experience and parenting experience were coded as dichotomous variables and used in moderation analyses. Linear mixed effects analyses demonstrated robust effects of feedback on increasing recommended questions and decreasing non-recommended questions, improving the quality of details elicited from the avatar, and reaching a correct conclusion regarding the suspected abuse. Round-wise comparisons in the interviews involving feedback showed a continued increase of recommended questions and a continued decrease of non-recommended questions. Those with (vs. without) professional and parenting experience improved faster in the feedback group. These findings provide strong support for the efficacy of Avatar Training.
Keywords Child sexual abuse (CSA) · Investigative interviewing · Simulation training · Feedback · Serious gaming
Child sexual abuse (CSA) is prevalent in all societies, with prevalence estimates ranging from 8 to 31% for girls and 3 to 17% for boys (Barth et al. 2013). It is also clear that CSA is associated with a plethora of negative psychological, relational, and somatic health consequences (Hailes et al. 2019). It is, therefore, important to investigate suspected CSA cases effectively. Unfortunately, these investigations present special challenges. In almost 70% of suspected CSA cases, the child's statement is the only available evidence (Elliott and Briere 1994; Herman 2009). Due to the usual lack of corroborating evidence, investigative interviews with alleged victims are of central importance in CSA investigations. While most experts agree that children can in principle provide accurate reports, there is also little doubt that accounts can be distorted both by improper interviewing and by normal memory decay (Ceci and Bruck 1995). It is, therefore, worrying that the quality of investigative interviews in these cases is still poor in many regions across the world: closed questions remain the most common question type, regardless of expert warnings against their use (Cederborg et al. 2000; Korkman et al. 2008; Sternberg et al. 2001).
Francesco Pompedda and Yikang Zhang are shared first authors.

* Yikang Zhang
kang.y.zhang@outlook.com

Francesco Pompedda
fpompedda@glos.ac.uk

Shumpei Haginoya
haginoya@psy.meijigakuin.ac.jp

Pekka Santtila
pekka.santtila@nyu.edu

1 School of Natural & Social Sciences, University of Gloucestershire, Cheltenham, UK
2 Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands
3 Faculty of Psychology, Meiji Gakuin University, Tokyo, Japan
4 NYU Shanghai and NYU-ECNU Institute for Social Development, Shanghai, China
5 Mykolas Romeris University, Vilnius, Lithuania
Recommendations forInvestigative
Interview
To improve interview quality, a lot of effort has been directed at training interviewers. The following main rules are recommended when interviewing children: First, questions should be non-leading. Leading questions can have a negative impact on children, creating less accurate statements and contaminated memories (Bruck and Ceci 1999; Ceci and Bruck 1993, 1995). For example, using a realistic mock event, Finnilä et al. (2003) showed that children who had been visited at daycare by a clown gave false affirmative responses at alarming rates (e.g., a 20% false-positive rate to the question "He told you that what you did together was a secret and that you couldn't tell anyone, didn't he?"). Second, open-ended rather than option-posing questions (i.e., asking for a yes/no response or providing a list of alternatives) should be used. Whereas option-posing questions tap less accurate recognition processes, open-ended questions rely on recall memory and are, therefore, more likely to elicit useful answers (Lamb et al. 2003, 2008; Lyon 2014). Even though these recommendations have been widely disseminated over the years in theoretical training programs targeting professionals in the field, they are not always followed in practice (e.g., Johnson et al. 2015).
Implementation Through Training Programs
Despite the lack of proven efficacy of most training programs, Lamb et al. (2002) have shown that the use of a structured protocol giving the interviewers clear guidance on the questions to use in the different phases of the interview, coupled with extensive feedback, has resulted in improvements. Unfortunately, performance deteriorates rapidly after feedback is discontinued (Lamb et al. 2002a, b). Another limitation is that, as it is usually impossible to know what actually happened in real cases, feedback can only be given on the questions used by the interviewer but not on whether the elicited responses are true or false; that is, there is no outcome feedback. This means that interviewers are not alerted to when their questions have resulted in a false allegation of CSA; therefore, they may not experience the need to change their interviewing behaviors. The training program Specialist Vulnerable Witness Forensic Interview Training (Benson and Powell 2015; Powell et al. 2016) has also shown promising results in improving CSA interview quality in terms of question use. However, it is a comprehensive training project that includes 15 modules and takes months to implement. The feedback provided usually consists of process feedback on questions used and behaviors employed during the interviews, with no feedback on whether the interviewer reached the correct conclusion. Some trainings employ actors who play the role of the allegedly abused child, which, while introducing an interactive component to the training, may not optimally mimic the behavior of actual children in interview situations in terms of memory recall and suggestibility. The reliance on actors and experts also poses difficulties for the scalability of the program.
The Simulated Avatar Interview with Feedback Approach
To address the situation illustrated above, a series of experiments exploring the efficacy of simulated avatar interview training programs have been conducted, in which individuals interviewed child avatars and received feedback on the questions used and the correctness of the elicited information (Haginoya et al. 2020, 2022b; Krause et al. 2017; Pompedda et al. 2015, 2020). The algorithms in the program were set to mimic the behavioral pattern of real children during interviews. That is, the avatars have predefined "memories" of an event of interest and respond to questions in a way that is consistent with research on the suggestibility of children of different ages (4- and 6-year-olds). For example, if the interviewer asks a question about a detail that is absent from the avatar's memory, the avatar responds "No." But if the question is repeated, a 4-year-old avatar will change the response to "Yes" with a probability of 0.50. This way, suggestive questions (and other types of non-recommended questions) can lead to inaccurate details being contained in the avatars' responses, similar to what may happen in actual CSA interviews. The correlations between question types and types of details elicited and the correctness of the conclusions are used to inspect whether the algorithms function as expected. Consistent with analyses of real-life interviews in previous studies (e.g., Lamb et al. 2007), correct conclusions were positively predicted by recommended questions as well as relevant details and negatively predicted by non-recommended questions and wrong details in the avatar interviews. Evidence has repeatedly shown that CSA Avatar Training coupled with feedback on the interviewers' performance results in improvement of interview quality compared to controls who receive no feedback (Haginoya et al. 2020, 2021; Krause et al. 2017; Pompedda et al. 2015, 2020). Subsequently, additional studies have focused on further factors that have been expected to improve the training effect by incorporating new features in the program.
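To make the response logic described above concrete, the following is a minimal illustrative sketch, not the published simulation code; the function name `respond`, the two-item memory, and the single yield probability are simplifying assumptions.

```r
# Toy model of one avatar response rule: details present in memory are
# affirmed; a detail absent from memory is first denied, but a repeated
# question makes a 4-year-old avatar switch to "Yes" with p = .50,
# producing a wrong detail in the response.
avatar <- list(age = 4,
               memory = c("played a game", "uncle was present"),
               asked_before = character(0))

respond <- function(avatar, detail) {
  if (detail %in% avatar$memory) {
    return(list(avatar = avatar, answer = "Yes"))
  }
  repeated <- detail %in% avatar$asked_before
  avatar$asked_before <- c(avatar$asked_before, detail)
  p_yield <- if (repeated && avatar$age == 4) 0.50 else 0
  answer <- if (runif(1) < p_yield) "Yes" else "No"  # false "Yes" = wrong detail
  list(avatar = avatar, answer = answer)
}

out1 <- respond(avatar, "dad touched you")       # first time: "No"
out2 <- respond(out1$avatar, "dad touched you")  # repeated: "Yes" with p = .50
```

In the actual system, an operator classifies each spoken question in real time, and the triggered algorithm selects the video clip containing the avatar's response.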
Specifically, Pompedda et al. (2017) compared the effects of types of feedback on interview quality improvement. Compared with outcome feedback (i.e., feedback on the correctness of the information elicited from the avatar) and process feedback (i.e., feedback on the usage of recommended and non-recommended questions), combined feedback with both outcome and process information produced the largest improvement. Krause et al. (2017) further examined whether instructions to reflect could contribute to improvement above and beyond combined feedback. Though empirical evidence supports the efficacy of reflection on task performance in other contexts such as education (Espinet et al. 2013), military leadership (Matthew and Sternberg 2009), and aircraft navigation (Ron et al. 2006), Krause et al. (2017) did not find clear evidence favoring the reflection design in CSA interview training.
More recently, Haginoya et al. (2020) extended this line of research to an Asian population and an online context, with results showing that Avatar Training with feedback improves interview quality across cultural contexts and implementation settings. After establishing the effectiveness of the approach, Haginoya et al. (2021) further examined the effect of behavioral modeling and its combination with feedback. Behavioral modeling training (BMT) originates from social learning theory (Bandura and McClelland 1977). This approach includes five components: (1) identifying well-defined behaviors, (2) showing the effective use of those behaviors through model(s), (3) giving opportunities to practice those behaviors, (4) providing feedback and social reinforcement, and (5) taking measures to maximize the transfer of those behaviors to practical tasks (Taylor et al. 2005), the latter three of which have been an integral part of the Avatar Training approach used in the earlier studies. By also incorporating the first and second components into the Avatar Training and providing the participants with both negative and positive models and the consequences of these behaviors, Haginoya et al. (2021) showed that the combination of feedback and modeling improves interview quality more than feedback alone. In addition to improving interview quality directly, Haginoya et al. (2022a) also tested whether feedback on supportive statements could be added while still improving the use of appropriate question types. The use of supportive statements would be helpful in enhancing rapport-building and, consequently, abuse disclosure by reluctant children (Blasbalg et al. 2018). The results confirmed that it is possible to improve the use of recommended questions while also improving the use of supportive statements by providing feedback.
Potential Moderators of the Training Effect
Whether experience of interacting with children, either as a parent, a babysitter, or a professional child interviewer, can have an impact on training has not been exhaustively examined before. In many of the situations in which children interact with adults, closed questions are used to ask children about topics that adults are already aware of, for example, a teacher assessing knowledge after teaching (e.g., Pate 2012). It has been proposed that past experience may thus negatively affect training outcomes, as the interviewers could have trouble refraining from using non-recommended questions or be reluctant to change (Pompedda 2018). According to proactive interference theory, previously learned information might interfere with newly learned information. In the case of investigative interviews, the previously learned use of closed questions might interfere with the use of open questions (Powell et al. 2014). In addition, the frequent lack of knowledge regarding the ground truth in alleged CSA cases, and the use of the judicial outcome as proof of the quality of the interview, might exacerbate the effects of proactive interference (Jacoby et al. 2001). However, the literature shows mixed results: in field studies there is generally no association between experience and use of open questions (e.g., Wolfman et al. 2016), with some exceptions (e.g., Lafontaine and Cyr 2017), whereas in simulated interviews there is evidence of a negative association between experience and the use of open questions (e.g., Powell et al. 2014), with some studies showing no differences after training (e.g., Benson and Powell 2015). Moreover, experienced interviewers, while potentially more inclined to use closed questions, could also have other skills that can help them during the interview, for example, a better ability to create rapport (Hershkowitz et al. 2017) or better communication skills (e.g., MacDonald et al. 2017).
Rationale forConducting aMega‑Analysis
Though each single study has already provided evidence for the efficacy of the program, the current research intended to offer more insight through a systematic mega-analysis of all the studies conducted so far. A mega-analysis integrates raw data from individual studies, followed by new analyses using multilevel models that account for the heterogeneity among studies to reach robust conclusions. Compared with meta-analyses, where summary statistics from the original studies are integrated, mega-analyses allow evidence synthesis using the raw data, avoiding potential bias and errors related to the analyses in the reports of the original studies. Moreover, as the original studies employing the avatars were limited by individually low sample sizes and included only particular participant types, previous examinations of potential moderators of the reported training effects have been underpowered. In the present analyses, we were able to explore the effects of individual differences, research design, and relevant demographic variables such as professional training or parenting experience on the outcomes, with the pooled data providing adequate power to do so. Importantly, this research also allows a better estimate of the effects of providing a combination of outcome and process feedback across different professional groups and countries.
Method
Participants
The nine studies included in this mega-analysis collected training data using participant samples of European and Japanese students, psychologists, and police officers. Detailed information regarding all of the samples can be found in Table 1.

Table 1 Descriptive statistics of the nine data sets included in the mega-analysis

Study | Experimental conditions | Rounds of interviews | Participant population | n female | M age | SD age
Pompedda et al. (2017) | Control (n = 12); Outcome feedback (n = 12); Process feedback (n = 12); Feedback (n = 12) | 4 | Europe; students | 38 | 27.9 | 9.1
Krause et al. (2017) | Control (n = 19); Feedback (n = 19); Feedback + reflection (n = 21) | 8 | Europe; students | 35 | 24.4 | 3.7
Haginoya et al. (2020) | Control (n = 15); Feedback (n = 17) | 6 | Japan; students | 23 | 20.5 | 0.6
Pompedda et al. (2020) Study 1 | Control (n = 20); Feedback (n = 20) | 6 | Europe; psychologists | 37 | 27.4 | 2.2
Pompedda et al. (2020) Study 2 | Control (n = 32); Feedback (n = 32) | 6 | Europe; students | 44 | 23.1 | 3.6
Haginoya et al. (2021) | Modeling (n = 11); Feedback (n = 10); Feedback + modeling (n = 11) | 5 | Japan; psychologists | 22 | 35.1 | 8.7
Haginoya et al. (2022b) | Control (n = 10); Control + feedback (n = 11) | 4/8 | Japan; police | 8 | 35.5 | 5.4
Kask et al. (2022) | Control (n = 11); Control + feedback (n = 11) | 4/8 | Europe; police | 3 | 41.2 | 6.2
Haginoya et al. (2022a) | Control (n = 20); Feedback (n = 20); Supportive (n = 20); Feedback + supportive (n = 20) | 4 | Japan; mixed | 53 | 35.6 | 9.9
Materials
Avatar Training
Simulated interviews with avatars were conducted in different languages based on the country where the interviews took place. The different language versions were identical with the exception of a small number of cultural adaptations (e.g., religious settings and some games played by the child avatar were changed). The simulation comprised 16 different avatars equally divided between age (4 vs. 6 years), ground truth (abused vs. not abused), and emotions displayed (crying vs. no crying). For each case, a series of details were created, both related to the alleged abuse (e.g., details that describe the abuse or that provide an alternative explanation) and not relevant for the investigation (e.g., a favorite toy or other activities the avatar had experienced). Interviewers faced a screen where one of the avatars was presented and vocally asked the avatar questions as in a real interview. Meanwhile, an operator listened to the questions asked and categorized them in real time by clicking the appropriate button in the simulation interface. The categorization triggered the algorithms (different between 4-year-old and 6-year-old avatars), which were based on research on children's memory and suggestibility as well as the available details in the memory of the avatar, and resulted in the launch of the appropriate video clip containing the avatar's response. In each study, a randomized selection of avatars was used (providing an equal balance between abused vs. not abused avatars and 4-year-olds vs. 6-year-olds) and presented in a random order.
Data Coding
Experimental Conditions All experimental conditions were dummy-coded (see Table S1). Feedback referred to the condition where participants received both process feedback (i.e., which of the questions they had used were appropriate and which were inappropriate and why) and outcome feedback (whether they had reached the correct conclusion after the interview; they were told what had really happened in the case, i.e., which memory contents the avatars had). The supportive statement manipulation was not aimed at improving recommended question use or conclusion accuracy, and it was therefore ignored in the current analyses.
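As a small illustration of dummy coding, each manipulation becomes a 0/1 indicator so that conditions can enter the models as separate fixed effects; the data frame and condition labels below are hypothetical, not the coding scheme of Table S1.

```r
# Hypothetical condition labels; feedback and modeling indicators are
# derived as 0/1 dummies from the condition a participant was assigned to.
d <- data.frame(condition = c("control", "feedback",
                              "feedback_modeling", "modeling"))
d$feedback <- as.integer(d$condition %in% c("feedback", "feedback_modeling"))
d$modeling <- as.integer(d$condition %in% c("modeling", "feedback_modeling"))
d
```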
Interview Round In all but the two police studies (Kask et al. 2022; Haginoya et al. 2022b), participants were assigned either to a feedback condition or a no feedback condition, with interview rounds ranging from 4 to 8. Instead, to maximize the utility of simulation training in the police force, participants in the two studies using police samples were assigned either to a condition where they finished four rounds of interviews with feedback or to a condition where they first finished four rounds of interviews without feedback and then finished four rounds of interviews with feedback. In the current analyses, we recoded the latter condition so that the first four rounds without feedback were coded as belonging to the no feedback condition, and the second four rounds with feedback were coded as belonging to the feedback condition. Rounds 5–8 of interviews in the control + feedback condition were thus coded as rounds 1–4 in the feedback condition.
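The recoding just described might look as follows in R; the data frame `interviews` and its column names are hypothetical stand-ins, not the variable names of the shared analysis code.

```r
library(dplyr)

# Rounds 5-8 of the control + feedback condition become rounds 1-4 of the
# feedback condition; rounds 1-4 of that condition count as no feedback.
recoded <- interviews %>%
  mutate(
    feedback = case_when(
      condition == "feedback" ~ 1,
      condition == "control_feedback" & round >= 5 ~ 1,
      TRUE ~ 0
    ),
    round = if_else(condition == "control_feedback" & round >= 5,
                    round - 4, round)
  )
```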
Recommended and Non-Recommended Questions Recommended and non-recommended questions were coded as continuous variables, with the value indicating the number of questions asked in an interview. The current analysis coded the data in the same way as in the previously published studies (see Table 2).

Table 2 Question-type coding used in the studies

Category | Definition

Recommended questions
Facilitators | Open-ended and non-suggestive questions that encourage the child to continue with the previous answer
Invitations | Open-ended and non-suggestive questions. They are broad and let the child talk freely
Directive | Open-ended and non-suggestive questions that focus the child's attention on a previously mentioned detail, asking for a specific explanation

Non-recommended questions
Option-posing | Closed-ended questions that focus on an unmentioned detail (without implying a particular type of response) or on a mentioned detail, asking the child to provide a yes/no answer
Specific suggestive | Open-ended or closed-ended questions that are based on an unmentioned detail and express the expected response
Unspecific suggestive | Open-ended or closed-ended questions that are not based on an unmentioned detail but express the expected response
Repetitions | Repetitions of a previous recommended or non-recommended question
Too long/unclear | Questions that use a logical structure that is too complicated for the cognitive level of the child and/or are formulated in a haphazard manner and/or contain more than one concept at a time
Multiple choice | Questions that provide a predetermined list from which the child is requested (explicitly or implicitly) to pick
Time | Open-ended or closed-ended questions that require the child to provide or recollect precise time-related information
Fantasy | Open-ended or closed-ended questions that move the discussion from the reality level to the fantasy level
Feelings | Open-ended or closed-ended questions that require the child to provide accounts regarding their own or others' feelings
Relevant, Neutral, and Wrong Details Relevant, neutral, and wrong details were coded as continuous variables, with the value indicating the number of details elicited from the avatar in an interview. Relevant details were the forensically relevant details related to the alleged abusive situation (e.g., details that would clarify whether the abuse happened or not). Neutral details were details related to other situations that the avatar had experienced but that were not forensically relevant to the investigation of the alleged abuse (e.g., games played with other persons). Finally, wrong details were details that contradicted the predefined memories of each avatar (e.g., through repeated suggestive questions, the interviewer found that the dad would have touched the child, while in reality it was the uncle). One hundred and twenty cases were missing the number of wrong details in the combined data set.
Conclusion Content Correctness In all the studies, participants were deemed as having reached a correct conclusion only if they (1) first provided a correct answer regarding the presence (or absence) of an abuse and then (2) offered a correct account of the sexual abuse (who, when, where, and what transpired) in the former case or an explanation of what happened instead of an abuse in the latter case. Conclusion content correctness was coded as a dichotomous variable in all except two data sets (Study 2 from Haginoya et al. 2022b; Kask et al. 2022). In these two studies, conclusion content correctness was coded with three categories: correct, incorrect, and not enough information to reach a conclusion. Therefore, in the current mega-analysis, we coded conclusion content correctness with two categories: correct and not correct, with the latter including both incorrect and fail-to-reach-conclusion cases. As the data set containing the Japanese police responses did not record conclusions, one hundred and twenty-four cases were missing the information on conclusion correctness in the combined data set.
Professional Experience Regarding Child Interviews The data sets included in the current mega-analysis employed several non-identical measures to assess participants' professional training and/or experience with child interviews. Haginoya et al. (2021) and Haginoya et al. (2022a) employed three questions about child interview training, child interview experience, and child sexual abuse interview experience, specifically. Krause et al. (2017), Pompedda et al. (2020), and Haginoya et al. (2022a) used one item to assess child sexual abuse interview experience. Kask et al. (2022) documented years of conducting child interviews, years of conducting child sexual abuse interviews, and the number of interviews conducted in a continuous manner. Pompedda et al. (2017) did not report child interview experience data in their published manuscript nor documented the information in the data set available to us; we obtained the child interview experience information through private communication with the authors. Therefore, in the current analysis, we coded child interview experience as a dichotomous variable, with yes indicating having interviewing experience with children. The combined data set contained 33 participants (resulting in 149 interviews) in the no feedback condition and 62 participants (resulting in 276 interviews) in the feedback condition who had child interview experience before participating in one of the included studies.
Parenting Experience All data sets except that of Kask et al. (2022) contained parenting experience, though the operationalization was not always the same. Pompedda et al. (2017), Krause et al. (2017), and Pompedda et al. (2020) asked participants whether they had children or not. Haginoya et al. (2020, 2021) and Haginoya et al. (2022a, b) asked participants whether they had child-rearing experience. In the current mega-analysis, we coded parenting experience as a dichotomous variable, with yes indicating either having children or having child-rearing experience. There were 32 participants with parenting experience (150 interviews) in the no feedback condition and 45 participants with parenting experience (158 interviews) in the feedback condition in the combined data set.
Statistical Analyses
All statistical analyses were conducted in R (version 4.0.5). We first employed correlational analysis to investigate the validity of the algorithms used in the studies. Then we used lme4 (Bates et al. 2014) to perform a series of linear mixed effects analyses to examine the efficacy of simulation training on CSA interview quality. As fixed effects, we entered interview round, feedback, process feedback, outcome feedback, reflection, modeling, and the interaction term between interview round and feedback into the model. As random effects, we had intercepts for participants and studies. In subsequent analyses, we also examined potential moderating effects of the demographic variables, professional training and experience, and parenting experience by including their interaction terms with interview round as fixed effects and study, feedback condition, and participants as random effects. Since professional experience and parenting experience were correlated, χ2(1) = 238.04, p < 0.001, separate models were run for the two potential moderators. Confidence intervals (95%) of the parameters in the linear mixed models were calculated using the bootstrap method with 5,000 draws. The 95% confidence intervals of the parameters in the generalized linear mixed model were calculated using the Wald method.
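A minimal sketch of this model specification is shown below; the data frame `d` and its column names are hypothetical, and the exact syntax may differ from the shared analysis code.

```r
library(lme4)

# Linear mixed model for one outcome (number of recommended questions):
# experimental manipulations as fixed effects, random intercepts for
# studies and for participants.
m_rec <- lmer(
  recommended ~ round + feedback + process_fb + outcome_fb +
    reflection + modeling + round:feedback +
    (1 | study) + (1 | participant),
  data = d
)

# Bootstrapped 95% CIs with 5,000 draws, as described above
confint(m_rec, method = "boot", nsim = 5000)

# Conclusion correctness is binary, so a generalized linear mixed model
# with Wald CIs is used instead
m_con <- glmer(correct ~ round * feedback + (1 | study) + (1 | participant),
               data = d, family = binomial)
confint(m_con, method = "Wald")
```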
To make the results more interpretable, interview round in the analyses was coded starting from zero; that is, the first round was designated with the value of 0, the second round with the value of 1, and so on. This way, the intercepts of the models represented the estimates of first-round performance in the baseline conditions.
In addition, we calculated a Reliable Change Index (RCI) for each participant in the feedback condition for question use and details elicited, to provide more nuanced information regarding individual differences in the training effect as well as how training design could have an impact on reliable change. As there is no established norm for these measurements, the reliability of the measurement (r) was operationalized as the correlation between first-round performance and last-round performance in the no feedback condition, excluding cases who received modeling instructions or process feedback instructions. The standard deviation (SD) of the measure was operationalized as the standard deviation of first-round performance in the feedback group, which was used to calculate the standard error of the difference.
The RCI formula is as follows:

RCI = (Performance_last round − Performance_first round) / [2 × (1 − r) × SD²]^(1/2)

An RCI greater than 1.96 (the z score corresponding to a distance of 2 standard deviations from the mean) in the case of recommended questions, or smaller than −1.96 in the case of non-recommended questions, would indicate that the participant had a reliable change in their interview quality.
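The following sketch shows this computation in R; the function and its example values are illustrative, with r and SD obtained as operationalized above.

```r
# RCI = (last - first) / sqrt(2 * (1 - r) * SD^2), where r is the
# first-to-last round correlation in the no feedback condition and SD is
# the SD of first-round performance in the feedback group.
rci <- function(first, last, r, sd_first) {
  (last - first) / sqrt(2 * (1 - r) * sd_first^2)
}

# Hypothetical example: 10 -> 25 recommended questions, r = .60, SD = 8
rci(first = 10, last = 25, r = 0.60, sd_first = 8)  # ~2.10: reliable change
```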
Results
Correlations Between Questions, Details, and Conclusion Correctness
Descriptive statistics for interview quality indicators are presented in the supplementary materials (Table S2). The number of recommended questions was positively correlated with the number of relevant details elicited (r = 0.79, p < 0.001) and the number of neutral details elicited (r = 0.79, p < 0.001), and negatively correlated with the number of wrong details elicited (r = −0.25, p < 0.001). The number of recommended questions was also positively correlated with conclusion correctness (r = 0.36, p < 0.001). The number of non-recommended questions was negatively correlated with the number of relevant details elicited (r = −0.69, p < 0.001) and the number of neutral details elicited (r = −0.20, p < 0.001), and positively correlated with the number of wrong details elicited (r = 0.57, p < 0.001). The number of non-recommended questions was negatively correlated with conclusion correctness (r = −0.13, p < 0.001). The correlational structure provided robust evidence that the algorithms used in this series of studies functioned as expected (for the correlation matrix, see Table S3 in the supplementary materials).
The Effect ofSimulation Training withFeedback
onInterview Quality
The complete results of the linear mixed models for question use, details elicited, and conclusion correctness can be accessed in the supplementary materials (Tables S4, S5, and S6, respectively). There were considerable individual differences and within-study variation in interview quality, as indicated by the random effects and intraclass correlations (ICCs) of the models. Simulation with feedback had robust effects on increasing recommended question use and decreasing non-recommended question use, improving detail retrieval from the avatar, and finally reaching a correct conclusion regarding the suspected abuse. For the models predicting question use, our main interest, the interaction term between feedback condition and round significantly predicted increased interview quality (recommended questions: B = 2.03, SE = 0.16, 95% CI [1.71, 2.34]; non-recommended questions: B = −2.37, SE = 0.17, 95% CI [−2.68, −2.05]; percentage of recommended questions: B = 5.34, SE = 0.32, 95% CI [4.74, 5.95]). Similar patterns emerged for the details elicited during interviews, with increasingly improved interview quality in the feedback condition (relevant details: B = 0.40, SE = 0.05, 95% CI [0.30, 0.50]; neutral details: B = 0.30, SE = 0.05, 95% CI [0.20, 0.40]; wrong details: B = −0.40, SE = 0.05, 95% CI [−0.50, −0.30]). In the generalized linear mixed model predicting conclusion correctness, the significant interaction between feedback and round showed an increased correct rate in the feedback condition as the training progressed (interaction term: odds ratio = 1.39, SE = 0.11, 95% CI [1.20, 1.62]). The trends of interview quality improvement in the feedback condition can be seen in Figs. 1, 2, and 3. Note that these plots are not estimates from the mixed models but the actual data.
As for the effects of the other experimental conditions on interview quality, outcome feedback did not seem to have a robust influence on question use, details elicited, or conclusion correctness, and neither did reflection (see Tables S4, S5, and S6 in the supplementary materials). All eight 95% CIs of outcome feedback indicated that no significant effect was found. Reflection only had a positive effect on neutral detail elicitation. Process feedback had positive effects on recommended question use (B = 6.97, SE = 2.58, 95% CI [1.88, 11.92]), relevant detail elicitation (B = 1.61, SE = 0.65, 95% CI [0.35, 2.85]), percentage of recommended questions (B = 15.56, SE = 5.26, 95% CI [5.05, 25.99]), and percentage of relevant details (B = 26.98, SE = 7.40, 95% CI [12.73, 41.40]), but no significant effect was detected on conclusion correctness (odds ratio = 1.44, SE = 0.78, 95% CI [0.49, 4.19]). More importantly, modeling had significant effects on all interview quality indicators except for the number of wrong details elicited. Modeling increased recommended question use (B = 14.70, SE = 2.65, 95% CI [9.47, 20.04]) while decreasing non-recommended question use (B = −6.46, SE = 2.89, 95% CI [−12.22, −0.82]), leading to a higher percentage of recommended questions (B = 27.06, SE = 5.35, 95% CI [16.34, 37.45]). Modeling also significantly increased the number of relevant (B = 2.72, SE = 0.63, 95% CI [1.49, 3.99]) and neutral details (B = 2.86, SE = 0.61, 95% CI [1.64, 4.04]) without increasing the number of wrong details, resulting in a higher percentage of relevant details (B = 22.15, SE = 7.17, 95% CI [8.07, 36.38]). As for the conclusions, modeling significantly increased conclusion correctness above and beyond the provision of feedback (odds ratio = 5.05, SE = 2.81, 95% CI [1.69, 15.05]).
We also conducted round-wise comparisons (i.e., comparing each round's performance with the performance of the previous round) in the interviews that received either combined feedback or process/outcome feedback to examine the trend of training. The data contained 247 participants and 1,307 interviews in total, and the results are presented in the supplementary materials (Tables S7, S8, and S9). Overall, the round-wise increase of recommended questions and decrease of non-recommended questions were, for several comparisons, significant, suggesting continued improvement. The training effect did not seem to continue to improve in terms of details elicited, especially for wrong details. The round-wise difference in conclusion correctness was only significant in the first comparison (round 1 vs. round 2). Notably, none of the comparisons between the 8th round and the 7th round were significant. Whether this is an indicator of reaching a plateau or a result of insufficient power demands further investigation.
Individual Differences inInterview Training:
Reliable Change
The RCI results showed that only a minority of participants in the feedback group exhibited reliable change at the end of the training (see Table 3). For recommended questions, 41.7% (93/223) of participants had an RCI greater than 1.96, but only 18.8% (42/223) had an RCI smaller than −1.96 in the case of non-recommended questions. Similar patterns emerged when using the details elicited to examine reliable change: 26.0% (58/223) of participants achieved reliable change in terms of relevant detail elicitation and 30.5% (68/223) for neutral detail elicitation. As for wrong detail elicitation, only 8.5% (19/223) of participants had an RCI smaller than −1.96.
A closer examination of the relationship between training design and RCI showed that the number of interviews had an impact on the percentage of reliable change. As shown in Table 3, participants who completed a greater number of interviews were more likely to achieve a reliable change. These results corresponded to the round-wise comparison analyses, showing continuous improvement at the individual and the group level. Note that the higher percentage in the 5-round design from Haginoya et al. (2021) could be a result of the small sample size and the added feature of modeling rather than indicating a non-linear trend for improvement. Combined, these results suggest that there are great individual differences in the training effect, but with more practice, it is possible to improve interview quality even among those who learn at a relatively slow pace.

Table 3 Percentage of reliable change in designs with different rounds

Training design | Recommended questions | Non-recommended questions | Relevant details | Neutral details | Wrong details
4 rounds | 22.6% (21/93) | 12.9% (12/93) | 17.2% (16/93) | 14.0% (13/93) | 9.7% (9/93)
5 rounds | 71.4% (15/21) | 14.3% (3/21) | 47.6% (10/21) | 71.4% (15/21) | 9.5% (2/21)
6 rounds | 36.2% (25/69) | 17.4% (12/69) | 23.2% (16/69) | 36.2% (25/69) | 2.9% (2/69)
8 rounds | 80.0% (32/40) | 37.5% (15/40) | 40.0% (16/40) | 37.5% (15/40) | 15.0% (6/40)
χ2 statistics | χ2(3) = 46.60, p < .001 | χ2(3) = 11.64, p = .009 | χ2(3) = 13.20, p = .004 | χ2(3) = 30.57, p < .001 | χ2(3) = 5.14, p = .162
Fig. 1 Effects of simulated interviews with feedback on recommended and non-recommended question use. Note. Numbers in parentheses refer
to the numbers of interviews in each condition. Error bars show standard error
Professional andParenting Experience Moderates
Improvements overInterviews
To examine whether professional experience and parenting experience moderated the improvement, additional mixed models with professional experience, parenting experience, and their interaction terms with round were run for all quality indicators. In terms of professional experience (for detailed results, see Tables S10, S11, and S12 in the supplementary materials), having professional experience positively predicted relevant details (B = 0.82, SE = 0.35, 95% CI [0.12, 1.53]) and neutral details elicited (B = 0.72, SE = 0.34, 95% CI [0.05, 1.40]); that is, all else being equal, individuals with professional experience were better at eliciting information from the avatars in the first round of the interview. More importantly, the interaction term between professional experience and round was a significant predictor of the number of recommended questions (B = 0.96, SE = 0.25, 95% CI [0.43, 1.45]), the number of non-recommended questions (B = −0.68, SE = 0.27, 95% CI [−1.21, −0.17]), the percentage of recommended questions (B = 2.47, SE = 0.52, 95% CI [1.40, 3.49]), relevant details elicited (B = 0.18, SE = 0.08, 95% CI [0.02, 0.33]), and neutral details elicited (B = 0.26, SE = 0.08, 95% CI [0.11, 0.42]). Professional experience also predicted a higher correct rate (odds ratio = 2.59, SE = 0.99, 95% CI [1.22, 5.49]), but the interaction with round was not significant (odds ratio = 0.88, SE = 0.10, 95% CI [0.70, 1.09]). After controlling for study-level, condition-level, and individual-level variances, compared with those who had no experience in interviewing children, individuals with professional experience improved more over rounds of practice, as suggested by the significant interaction terms between professional experience and practice round.
Individuals with parenting experience asked more non-recommended questions (B = 3.13, SE = 1.28, 95% CI [0.63, 5.68]) and obtained more wrong details at the beginning of the training (B = 0.97, SE = 0.30, 95% CI [0.38, 1.55]). Parenting experience also interacted with practice round to predict interview quality (for detailed results, see Tables S13, S14, and S15 in the supplementary materials). The 95% CI of the estimate of the interaction term between parenting experience and round did not include zero for the number of non-recommended questions (B = −0.73, SE = 0.29, 95% CI [−1.30, −0.15]), relevant details elicited (B = 0.20, SE = 0.09, 95% CI [0.02, 0.36]), neutral details elicited (B = 0.24, SE = 0.08, 95% CI [0.07, 0.40]), wrong details elicited (B = −0.30, SE = 0.08, 95% CI [−0.46, −0.14]), and the percentage of relevant details elicited (B = 3.16, SE = 1.09, 95% CI [1.02, 5.28]). All significant effects were in the expected direction. Parenting experience had no impact on conclusion correctness regardless of training rounds. With all experimental conditions controlled for, compared with those without parenting experience, individuals with parenting experience showed greater improvement of interview quality over rounds of practice, using fewer non-recommended questions and eliciting more relevant and neutral but fewer wrong details during interviews. Importantly, though professional and parenting experience interacted with interview rounds to positively predict interview quality, the effect sizes were smaller compared to those of the experimental manipulations, as suggested by the smaller estimates of the fixed effects of the moderation models compared with models having all experimental manipulations as fixed effects.
Discussion
Through a systematic mega-analysis, the current research first examined the efficacy of Avatar Training with feedback in terms of recommended question use, details elicited, and, most of all, conclusion correctness, confirming a strong and clear effect of feedback. Round-wise comparisons suggested continuous improvement in the use of recommended questions, while plateaus were reached in accurate information elicitation. Reliable change analyses offered insights into individual differences in training efficacy but also pointed out the potential of achieving reliable change through more practice. Moderation analyses also revealed that both professional experience and parenting experience may be conducive to the learning process.
The Efficacy ofSimulated Interviews withFeedback
Interviewers, both legal and psychological professionals and university students, showed significant improvement in interview quality when undergoing the simulated training combined with feedback on their question use and interview outcome. Interviewers increased the use of recommended questions and reduced the use of non-recommended questions, which then led to better information gathering, that is, more relevant and neutral details and fewer wrong details being elicited from the avatars. Importantly, even though the program did not include explicit instructions on how to correctly utilize the elicited details to draw a conclusion, these variables had robust associations with reaching a correct conclusion (see Table S2 in the supplementary materials). Note that in all the studies included in this mega-analysis, the criteria for a correct conclusion were very strict. The interviewers not only had to reach a correct judgment of whether an abuse had taken place but also had to provide a coherent account of how the abuse took place that was consistent with the ground truth of the case. Under this stringent standard, interviewers in the feedback condition were able to reach a correct conclusion at a 22% rate on average and a 50% rate in the 8th round.
The results are in line with previous research on the beneficial effect of feedback on the outcomes of child sexual abuse interviews (Benson and Powell 2015; Cederborg et al. 2013; Lamb et al. 2002a, b). While a perfect comparison is not possible, Benson and Powell (2015) reported on average between 57 and 79% recommended question use. Within one training session that took 1–2 h, the Avatar Training program achieved improvement of interview quality comparable, in terms of proportions of recommended questions, to other successful programs that last at least a few days.
How Many Interviews Will Be Enough and What Is the Goal?
Round-wise comparisons showed different patterns of development in question use, details elicited, and conclusion correctness. The use of recommended (vs. non-recommended) questions continued to improve, while the improvement in details elicited reached a plateau earlier. This is not surprising, as there is no limit to the number of questions a participant can ask (within the 10-min timeframe), while relevant and neutral details are finite within each avatar. As shown in Fig. 2, the average numbers of relevant and neutral details are close to 7 (the maximum is 9). The average number of wrong details was close to 1 in the later rounds of training in the feedback condition, suggesting a floor effect. This means that the interviewers used very few non-recommended questions that could have elicited wrong details at this point. Most of the round-wise comparisons for conclusion correctness were also not significant, which may suggest there is room for improvement in terms of using the elicited information appropriately. However, it is also of note that all the estimates of the odds ratio were greater than 1, indicating a trend for continued improvement over interviews.
Though none of the round-wise comparisons between the 7th and 8th interview were significant, we should be cautious in drawing the conclusion that the training reached a plateau, given the small number of interviews available in these comparisons. Also, it is important to note that even if the training effect did not reach a plateau, it may be neither necessary nor optimal to add more interviews to a training session. Instead, a more appropriate next step would be to focus on how learning during the simulated interviews transfers to actual interviews and then develop training plans, possibly also including refresher training sessions (Cyr et al. 2021).
From the reliable change analyses, it is clear that there are large individual differences in the training effect. Only a minority of participants in the feedback group achieved reliable change. But it is also important to note that as the number of interviews increases, so do the percentages of reliable change. Therefore, by incorporating the RCI into the design and offering individuals training of different lengths and intensities, future training programs can have greater impact.
Experience withChildren ontheTraining Effect
The use of questions, both recommended and non-recommended, did not differ in the first round of the simulated interview between those who had previous experience in interviewing children and those who did not. However, individuals with interview experience were better at eliciting relevant and neutral information at the beginning, while individuals with parenting experience were more likely to use non-recommended questions and elicited more wrong details at the beginning. More importantly, both professional experience of child interviews and parenting experience interacted with interview round to predict interview quality. Interviewers who had experience with children improved faster, compared with those who did not, in terms of question use and information elicitation. Additional analyses were also run to probe whether there were three-way interactions between experience, feedback, and interview round, but the results did not support the existence of a three-way interaction (see Tables S16–S21 in the supplementary materials). This is a surprising result: it goes against previous literature suggesting that professional experience is negatively associated with the use of open questions in simulated interviews, and also against field studies that show no relationship (for a review, see Lamb et al. 2018). However, it is in line with other studies showing how the type of training can overcome the effect of some a priori characteristics (e.g., Benson and Powell 2015). A possible explanation for this result is that parents and experienced interviewers, while not necessarily better at using open questions, might possess superior communication skills. The interactive nature of the training might have had a role in boosting the improvements in these groups of participants.
Limitations andFuture Directions
Notwithstanding the strengths of the mega-analysis approach, the current study has several limitations. First, we only examined the training efficacy within the Avatar Training system. That is, the scope of the current analysis did not include training efficacy in improving interviews outside of the simulated environment. The reasons for not analyzing the transfer effect are, first, to keep the analysis concise and focused, and, second, a lack of sufficient data for providing reliable results: at the moment of this analysis, only two studies have examined the transfer effect (Kask et al. 2022; Haginoya et al. 2022a). Second, also outside the scope of this article is how interview performance in previous rounds can influence learning in subsequent rounds, and whether professionals and lay people respond to interview failures in the same way. Despite these limitations, the current study not only re-examined previous conclusions with a large sample, offering more reliable estimates of the effects of combining process and outcome feedback, but also advanced our understanding of CSA interview training through its novel round-wise, RCI, and moderation analyses.
The focus of this research has been on testing the efficacy of the approach in improving interview quality in a variety of different samples. However, the approach has the potential to be applied in other research areas in investigative psychology, given that its algorithms are based on empirical research and have been proven to work as intended. The effects of contextual factors such as time pressure and fatigue, as well as individual difference factors, on interview quality can be examined within this training. That is, when not providing the interviewers with feedback, the avatar system could also function as a standardized assessment tool for the impact of contextual factors and of an individual's interview style and quality.
An interview is a dynamic interactive process between the
interviewer and the interviewee; therefore, one other direc-
tion is to further develop the avatar training by tailoring the
response patterns of the avatars based on family background,
mental ability, and other factors that could make the children
more or less vulnerable to suggestibility or compliance.
Conclusion
The present research demonstrated the robustness of the Avatar Training program in improving interview quality among interviewers with different backgrounds (e.g., working experience and specialty) and in different training environments (face-to-face and remote online). This allows trainers in various fields to integrate Avatar Training into their interviewer training programs flexibly. Moreover, this flexibility may imply successful training even when all procedures of the Avatar Training are automated to scale it to a large number of potential trainees such as police officers, clinical psychologists, child support center staff, and even school teachers.
Findings regarding interviewer background provided encouraging knowledge for interviewers who have experience with children. Experienced interviewers may improve faster than those without experience in the interactive training environment. Although potentially relevant factors (e.g., motivation to improve) still need to be investigated, this suggests that providers of training programs may need to consider an environment that encourages trainees to make the best use of their abilities.
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s11896-022-09509-7.
Funding Yikang Zhang's work is supported by the China Scholarship Council (No. 202106140025). Shumpei Haginoya's work is partially funded by the European Regional Development Fund (No 01.2.2-LMT-K-718-03-0067) under grant agreement with the Research Council of Lithuania (LMTLT).
Data and Code Availability Data are available upon request to the respective authors. Code can be accessed at the Open Science Framework (https://osf.io/hx2dr/).
Declarations
Ethics Approval The study utilized published data. All studies included in this mega-analysis obtained ethical approval from their respective institutions.

Consent to Participate All studies included in this mega-analysis obtained informed consent from their participants before collecting data.
Conflict of Interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher's Note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.