This article describes the Blueprints database of evidence-based programmes (EBPs) and its potential application in children's services in European countries. It outlines relevant aspects of the European context, including a tendency to be skeptical about programmes imported from the US, and the need for a pan-European source of information about EBPs across multiple outcome areas. It then describes the standards of evidence used by Blueprints, which cover intervention specificity, evaluation quality, intervention impact, and dissemination readiness. The criteria for determining that a programme is ‘Model’ or ‘Promising’ are outlined. The article then summarizes the process by which the standards were developed and some of the issues that were harder to resolve. It also sketches the process by which a programme reaches the Blueprints database, and provides three examples of programmes approved by Blueprints and implemented in Europe: a home-visiting programme for mothers of infants; a parent skills training programme; and a therapeutic intervention for families of chronic offenders. A brief indication is also given of how the wider pool of programmes reviewed fares against the standards of evidence. Finally, the article summarizes future directions for the work, with a particular emphasis on how Blueprints might become widely used in Europe.
Blueprints for Europe:
Promoting Evidence-Based Programmes in Children’s Services*
Blueprints para Europa:
Promoviendo Programas Basados en la Evidencia en los
Servicios de Atención a la Infancia
Nick Axford1, Delbert S. Elliott2, and Michael Little1
1The Social Research Unit, Dartington, UK
2University of Colorado, and Founding Director of Blueprints, USA
Keywords: child well-being, Europe, evidence-based program, program evaluation.
The last decade has seen growing interest in evi-
dence-based programmes (EBPs) in developed coun-
tries. A ‘programme’ is a discrete, organized package of
practices, spelled out in guidance – sometimes called a
manual or protocol – that explains what should be deliv-
ered to whom, when, where and how. A programme is
‘evidence-based’ when it has been evaluated robustly,
typically by randomized controlled trials (RCTs) or
quasi-experimental designs (QEDs), and found
unequivocally to have a positive effect on one or more
relevant child outcomes (Social Research Unit, 2012a).
There are now many such programmes, covering all
areas of children’s lives, including education, behav-
iour, health, relationships and emotional well-being.
However, their market penetration is very poor. They
are rarely adopted, and, when they are, implementation
is often poor and programmes fizzle out when initial
funding ends (Bumbarger & Perkins, 2008). This holds
in the US, where most EBPs originate, and even more
so in Europe, where countries are increasingly import-
ing these programmes.
There are various possible reasons for this state of
affairs (Little, 2010). First, there is a lack of knowledge
among policy makers and senior practitioners regard-
ing the existence and nature of EBPs. Some arguably
place little value on evidence. Second, there is confu-
sion about what constitutes ‘evidence-based’ and what
is an appropriate standard. Third, there is anxiety about
whether these programmes can be implemented in
real-world settings, and what human and financial
resources this requires.
Such obstacles to implementing EBPs are magnified
in Europe, where there is some resistance both to the
concept of a programme and the fact that most of the
best-known EBPs originate in the US (the two ideas
tend to get conflated in debate). States with social-
democratic or Catholic welfare regimes tend to be
skeptical about programmes from the US (Grietens,
2010). This is partly because some programmes have
had mixed or disappointing outcomes when imple-
mented in Europe. For instance, Multisystemic
Therapy performed only a little better than services as
usual in the UK (Butler, Baruch, Hickey, & Fonagy,
2011), whereas in Sweden regular services did equally
well (Sundell et al., 2008).
The hesitancy also reflects cultural differences in
service provision between Europe and the US. The US
has a minimal welfare state. North European countries,
by contrast, have more redistributive policies and pro-
vide universal welfare (Rowlands, 2010). They also
invest in professionalizing children’s services staff,
notably through social pedagogues who are trained in
child development and work therapeutically with chil-
dren and families in many settings (Petrie, Boddy,
Cameron, Wigfall, & Simon, 2006). There is a reluc-
tance to adopt practices from a country (the US) that
routinely performs poorly in league tables of child
well-being in developed countries, particularly since
many European countries, especially those in
Scandinavia, perform relatively well (UNICEF, 2007).
Compounding this skepticism, there is no pan-
European source of information for a European audi-
ence on EBPs that cover a range of outcome areas. The
best-known clearinghouses of EBPs are published in
English and aimed primarily at American providers.
Very few European programmes feature on them.
Similar ventures in Europe tend to be country-specific,1
or system-specific (e.g. education),2 or focused on
a single subject (e.g. drug prevention and treatment),3
or concerned with a class of programmes (e.g. parenting
interventions).4 There is a recognized need for outlets
in diverse languages and aimed at diverse cultures
(Soydan, Mullen, Alexandra, Rehnman, & Li, 2010).
European children’s services providers also need guid-
ance on how to select and adapt programmes for a dif-
ferent context. This includes information on which
children and families programmes succeed with.
This article discusses Blueprints, a resource that will
provide policy makers and practitioners in Europe with
high-quality information about programmes that meet
a high standard of evidence and are ready for imple-
mentation in service systems. Blueprints is designed to
lead to the greater awareness and use of EBPs and
improved well-being for the children and families who
receive them. We believe it fills an important niche.
Before going any further it is helpful to provide some
historical background.
Blueprints started in the US with a focus on violence
prevention following the Columbine school massacre
in 1999, in which two high school students
killed 12 fellow students and a teacher. This was the
catalyst for the Center for the Study and Prevention of
Violence, part of the University of Colorado in nearby
Boulder, to start compiling a list of evidence-based
programmes specifically aimed at preventing violence
(Elliott, 2010). In 2010, the Annie E. Casey
Foundation funded the Social Research Unit (SRU) at
Dartington, UK, and the Social Development Research
Group (SDRG) at the University of Washington, US,
to develop a method to help system leaders and com-
munities to work better together to implement EBPs at
scale. The idea was to bring together the centres’
respective Common Language and Communities that
Care methods (Axford & Morpeth, 2012; Hawkins &
Catalano, 2002). The project necessitated the develop-
ment of a menu of programmes that cover all key
developmental outcomes, and so work began on a
database of EBPs.
The researchers behind Blueprints for Violence
Prevention were involved in developing the standards
of evidence to underpin the Evidence2Success data-
base (see below). They also recognized the value in
broadening the scope of Blueprints, both in terms of
outcomes and geography, hence the partnership with
Casey and the Social Research Unit. With funding
from Casey, Blueprints for Violence Prevention:
• Was re-named ‘Blueprints for Problem Behavior
and Healthy Youth Development’, becoming the
database for Evidence2Success (there is no separate
Evidence2Success database).
• Essentially adopted the standard of evidence
developed for Evidence2Success (which involved
making slight changes to both the existing
Blueprints standards and the Evidence2Success
standards in order to align them).
• Broadened its remit to include programmes seeking
to improve children’s health, education, relationships,
emotional well-being, and behaviour.
• Opened a European office in London, staffed by
the Social Research Unit.
• Appointed its first European representative on the
Board, and
• Re-designed its US website and started work on a
European website (ensuring consistency in key
The remainder of this article describes the standards
of evidence used to select programmes for the new
Blueprints, how they were developed, the process by
which a programme reaches the website, the kind of
programmes that have been approved, how pro-
grammes fare against the standards, and future direc-
tions for the work with particular reference to Europe.
The standards of evidence
The standards of evidence that underpin Blueprints
today cover four dimensions:
1. Evaluation quality – whether the investigation
into the efficacy and effectiveness of the pro-
gramme produces valid and reliable findings.
2. Intervention impact – how much positive change
in key developmental outcomes can be attributed
to the programme.
3. Intervention specificity – whether the pro-
gramme is focused, practical and logical.
4. Dissemination readiness – whether the pro-
gramme is accompanied by the necessary sup-
port and information to enable its successful
implementation in communities and public serv-
ice systems.
Within each dimension, the Standards contain
‘Promising’ and ‘Model’ criteria. The Promising crite-
ria set a basic minimum standard, while the Model cri-
teria strengthen confidence in a programme’s scientif-
ic rigour, impact on outcomes, intervention specificity
or readiness to be taken to scale. The four dimensions
are now elaborated.
Evaluation quality
In order for a programme to appear on Blueprints, it
must have been evaluated by at least one good random-
ized controlled trial (RCT) or two good quasi-experi-
mental design (QED) studies. ‘Good’ refers to aspects
of methodological quality, specifically that:
• The method of assignment to intervention and
control is at the appropriate level (eg. individual,
• The measurement instruments are suitable for the
intervention population of focus and desired outcomes.
• Analysis is based on ‘intent-to-treat’.
• The statistical procedures and tests are appropriate.
• Intervention and control groups are equivalent at
baseline on key outcomes or appropriate controls
for differences are included in the analysis.
Such studies must also meet the following criteria.
It must be clear with whom the programme was tested
and what was actually received by the intervention and
comparison groups. The measures used must be valid
and reliable, capture a relevant outcome, and not be
tied to the programme under scrutiny. In order to min-
imize bias, someone without a vested interest in the
programme must have provided the observations, rat-
ings or assessments.
The extent to which participants dropped out during
the study is also considered. Some drop-out is com-
mon, but it is problematic if many youth drop out, if
some categories of youth drop out a lot more than oth-
ers (eg. boys more than girls), or if the drop-out rate
and type of person dropping out differ significantly
between the programme and control groups.
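
The attrition checks described above can be made concrete with a small calculation. What follows is a minimal sketch rather than part of the Blueprints standards: the function names and the example counts are hypothetical, and no numeric cut-off is applied because the text does not specify one.

```python
def attrition_rate(n_baseline: int, n_followup: int) -> float:
    """Share of baseline participants no longer in the study at follow-up."""
    if n_baseline <= 0:
        raise ValueError("n_baseline must be positive")
    return (n_baseline - n_followup) / n_baseline


def attrition_summary(prog_base: int, prog_follow: int,
                      ctrl_base: int, ctrl_follow: int) -> dict:
    """Overall attrition and the gap between programme and control groups."""
    prog = attrition_rate(prog_base, prog_follow)
    ctrl = attrition_rate(ctrl_base, ctrl_follow)
    overall = attrition_rate(prog_base + ctrl_base, prog_follow + ctrl_follow)
    return {
        "programme": prog,
        "control": ctrl,
        "overall": overall,
        # Differential attrition: absolute gap between the two arms.
        "differential": abs(prog - ctrl),
    }


# Hypothetical trial: 200 youth per arm at baseline.
summary = attrition_summary(200, 150, 200, 180)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

The same function could be applied to sub-groups (for example boys and girls separately) to check whether some categories of youth drop out more than others.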
There are several Model evaluation quality criteria.
One is simply that at least two RCTs, or one
RCT and one QED, meet the quality criteria.
Generally there can be more confidence in findings if
a programme has been evaluated well more than once.
Many evaluations only look at impact at the end of the
programme, which is a problem as impact often fades
with time. For this reason, evidence of a longer-term
effect – at least 12 months after the programme ends –
is also credited. These two criteria must be met for cer-
tification as a Model Programme.
Signs that evaluators have sought a finer-grained
understanding of programme impact are also recog-
nized. For instance, they might have tested for the
relationship between implementation fidelity and out-
come, or between the amount of programme received
and outcome. If a programme has been delivered well,
or if some youth or families have had more of it, a
stronger effect would be expected. Some studies
examine whether the programme works better for
some sub-groups than others, focusing on gender,
ethnicity and socio-economic status. Studies may also
test whether the logic that underpins the programme
actually holds up; do effects take place for the reasons
that were expected? Both of these are also Model criteria.
Intervention impact
Programmes that appear on Blueprints must have a
positive effect on a relevant outcome, the size of which
is known, and have no known harmful effects. Only
evaluation studies that meet all Promising evaluation
quality criteria may be considered when making this
A majority of studies that meet this threshold must
show that the programme has a positive effect on a rel-
evant outcome in order for it to be judged to have an
overall positive effect. A ‘positive effect’ means that
programme group youth or families did better relative
to youth or families in the comparison group. It is
important that this effect is not likely to be the result of
chance, so it has to be ‘statistically significant’. The
size of this positive effect should also be provided.
There should be no evidence of the programme hav-
ing a harmful effect on youth or families. This includes
all outcomes and all sub-groups. For example, a pro-
gramme would not be approved if it improves adoles-
cents’ relationships with peers but at the expense of
their use of illicit substances increasing. Equally, if the
programme decreases substance use for boys but not
girls, it would be approved for boys only.
There are two Model criteria for Intervention
impact. One is the existence of several studies meeting
the Promising evaluation quality criteria, a majority of
which show a statistically significant positive impact.
The other is evidence that children who received a
larger amount of the programme did better than those
who received a smaller amount: in other words, there
is a positive dose-effect relationship.
Intervention specificity
Programmes need to be clear about what outcomes
they target and which group of children will benefit.
There should be a clear description of what the pro-
gramme comprises, and an explanation of why and
how the programme should work – in other words,
how the programme will address the risk and protec-
tive factors as a means of achieving the outcomes.
There is one Model criterion for Intervention speci-
ficity, which is the existence of compelling research
evidence to support the programme logic. This must
explain why and how the programme is likely to bene-
fit the children and youth it is aimed at. For example,
if a parenting programme encourages parents to prac-
tise certain skills to deal with their children’s poor
behaviour, have other studies shown that doing this
Dissemination readiness
Programmes that are accepted for the database also
need to demonstrate that they can be implemented at
scale in communities and service systems. At the sim-
plest level, the programme that was evaluated should
still be available. Next, it should be clear how to get
the programme to the right children, youth and fami-
lies. A manual and training and implementation mate-
rials are also needed, because these will help ensure
that the programme is implemented consistently (or
with fidelity). The financial and human resources
needed for implementation should be stated.
There are several Model dissemination readiness
criteria, starting with the availability of technical sup-
port with implementation and a checklist to help mon-
itor fidelity. Recognition is given to programmes that
are currently being disseminated widely, or that have
been tested and found effective when delivered by reg-
ular practitioners in normal settings. Many pro-
grammes are tested initially under special conditions –
for example, they are delivered in university clinics by
research staff. Policy makers can have more confi-
dence in programmes that have been tested when
delivered by the kinds of people who normally provide
similar services in their daily work.
Developing the standards
In their work on Evidence2Success, the SRU,
SDRG and Annie E. Casey Foundation recognized that
there are already over 25 databases of evidence-based
programmes and, accordingly, several sets of standards.5
In an attempt to build some consensus, the
decision was taken to develop the Evidence2Success
standards of evidence – since adopted by Blueprints
with some amendments, as described above – by
involving experts who had previously developed other
sets of standards of evidence, all of which but the last
listed below have been used to inform databases of
• Best Evidence Encyclopedia6 (Robert E. Slavin,
Johns Hopkins University, US).
• Blueprints for Violence Prevention7 (Delbert S.
Elliott, University of Colorado, US).
• LINKS (Lifecourse Interventions Nurturing Kids
Successfully)8 (Kristen Moore, Child Trends,
• Communities that Care9 (J. David Hawkins and
Richard F. Catalano, SDRG, University of
Washington, US).
• Greater London Authority Project Oracle10
(Michael Little, SRU at Dartington, UK).
These experts met regularly over a six-month peri-
od and tested prototype standards empirically to see
how easy they were to apply and which programmes
would meet them. Consideration was also given to
other sets of standards, such as those developed by the
Society for Prevention Research (Flay et al., 2005) and
the CONSORT guidelines on reporting RCTs.11 During
this period several issues arose that required discus-
sion. Most concerned evaluation quality and interven-
tion impact. Some of the more important ones are now
outlined briefly.
There was discussion about whether the amount of
attrition from a study was important and whether a
level at which it becomes problematic should be spec-
ified. However, it was argued that this would penalise
follow-up studies, which the standards encourage,
since these tend to have higher attrition. The expert
group therefore decided to focus mainly on differential
rather than overall attrition, although the Blueprints
Board still considers attrition rates and expects some
controls or adjustment (e.g. propensity scoring) when
this rate is high.
Another issue concerned the value of independent
replication, in other words a study not involving the
programme developer that nevertheless shows an
impact. The rationale was that developer involvement
seems to introduce bias (Eisner, 2009). The expert
group agreed that while independent replication is
desirable, and should be encouraged, to insist on it now
would result in a very short list and remove stronger
programmes, such as Nurse Family Partnership.
There was considerable discussion about the accep-
tability of evidence of impact derived solely from self-
report measures. In criminal justice, such measures are
generally considered acceptable. In education, they are
unacceptable if used alone; observations, teacher
ratings, and academic test scores are preferred. The
expert group decided that the focus should be on whe-
ther measures are appropriate: no form of measure-
ment is wrong per se. They insisted, however, that
assessments cannot be limited to those made by per-
sons who are not blind to the experimental condition or
who are providing the intervention, since this can
introduce bias.
There was also extensive debate over the require-
ment for evidence of sustained impact 12 months after
the intervention ended. The worry was expressed that
very few educational programmes would qualify
because the last measure is usually taken at the end of
the intervention. Evidence of a sustained impact was
therefore made a ‘best’ rather than a ‘good enough’ criterion.
The main issue on intervention impact concerned
the requirement for a statistically significant effect
size. It was argued that in many studies with large clus-
ters (such as schools) it is difficult to obtain large
enough sample sizes to permit analysis at the cluster
level. However, unbiased and meaningful estimates
can be obtained using participant-level analyses when
sample sizes at the participant level are large. The
expert group therefore agreed to have as an alternative
a sample size weighted mean effect size of 0.2, with a
sample size of more than 500 individuals across all
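
The alternative threshold described in this paragraph can be illustrated with a short calculation. This is a sketch under stated assumptions: the function names and the example studies are hypothetical, the threshold of 0.2 is read as "at least 0.2", and the sample size is summed across the studies supplied.

```python
from typing import Sequence, Tuple

MIN_MEAN_EFFECT = 0.2    # threshold quoted in the text (assumed to mean >= 0.2)
MIN_TOTAL_SAMPLE = 500   # the text requires more than 500 individuals

# Each study is represented as (effect_size, sample_size).
Study = Tuple[float, int]


def weighted_mean_effect(studies: Sequence[Study]) -> Tuple[float, int]:
    """Return the sample-size weighted mean effect size and the total sample."""
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    mean = sum(effect * n for effect, n in studies) / total_n
    return mean, total_n


def meets_alternative_threshold(studies: Sequence[Study]) -> bool:
    """Check the alternative to a statistically significant effect."""
    mean, total_n = weighted_mean_effect(studies)
    return mean >= MIN_MEAN_EFFECT and total_n > MIN_TOTAL_SAMPLE


# Hypothetical example: two studies with 200 and 400 participants.
example = [(0.35, 200), (0.20, 400)]
mean_effect, total = weighted_mean_effect(example)
print(round(mean_effect, 3), total, meets_alternative_threshold(example))
```

Weighting by sample size means that larger studies count for more, so a single small study with a large effect cannot carry the programme over the threshold on its own.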
Finally, a decision needed to be made about the cri-
teria for determining a programme’s overall status.
Blueprints categorizes programmes as either ‘Model’
or ‘Promising’. It was agreed that a programme must
meet all ‘good enough’13 criteria across all four dimen-
sions to be deemed ‘Promising’ and that a programme
is designated ‘Model’ if it meets these criteria and:
• It has (a) two or more good enough randomised
controlled trials or (b) at least one good enough
randomised controlled trial and one good enough
quasi-experimental design evaluation, and
• There is evidence of a sustained impact (at least
12 months after the end of the programme).
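
The decision rule for overall status can be expressed as a short function. The following is an illustrative sketch only: the class, field names and the ‘Not approved’ label are hypothetical, and the many individual ‘good enough’ criteria are collapsed into a single flag for brevity. The minimum design requirement (one good RCT or two good QEDs) is taken from the evaluation quality section above.

```python
from dataclasses import dataclass


@dataclass
class ProgrammeEvidence:
    good_rcts: int                 # RCTs meeting the 'good enough' quality criteria
    good_qeds: int                 # QEDs meeting the 'good enough' quality criteria
    sustained_impact_12m: bool     # effect shown at least 12 months after programme end
    meets_other_good_enough: bool  # all remaining 'good enough' criteria, all four dimensions


def classify(evidence: ProgrammeEvidence) -> str:
    """Return 'Model', 'Promising' or 'Not approved' for a programme."""
    # Minimum design requirement: one good RCT or two good QEDs.
    has_minimum_design = evidence.good_rcts >= 1 or evidence.good_qeds >= 2
    if not (has_minimum_design and evidence.meets_other_good_enough):
        return "Not approved"

    # Model status additionally needs replication and a sustained effect.
    replicated = evidence.good_rcts >= 2 or (
        evidence.good_rcts >= 1 and evidence.good_qeds >= 1
    )
    if replicated and evidence.sustained_impact_12m:
        return "Model"
    return "Promising"


print(classify(ProgrammeEvidence(2, 0, True, True)))
print(classify(ProgrammeEvidence(1, 0, True, True)))
print(classify(ProgrammeEvidence(0, 1, False, True)))
```

Note that two good QEDs with sustained impact would still be ‘Promising’ under this reading, because both routes to ‘Model’ status require at least one good RCT.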
How programmes get onto Blueprints
These standards are applied to programmes that seek
to achieve outcomes in the areas of behaviour, emotion-
al well-being, educational skills and attainment, health
(particularly as it relates to behavioural issues, such as
smoking, eating, drinking), and relationships (primarily
with parents and peers). There are four steps in the
process for Blueprints to approve a programme.
First, all relevant scientific literature on the pro-
gramme is identified. (At present this is restricted to
English-language publications.) The research team
sifts through the primary journals in all areas of prob-
lem behaviour and child health and development on a
regular basis to identify literature that might suggest
new programmes for inclusion or add to or challenge
programmes on the database. Information submitted
by programme developers or purveyors is also considered.
Second, this literature is analyzed against the stan-
dards of evidence by a team of trained reviewers based
at the University of Colorado in the US and the Social
Research Unit in the UK. The result is a structured nar-
rative description of each study and a quantitative sum-
mary of whether overall the programme meets each of
the criteria contained within the standards. The
reviews focus primarily on intervention specificity,
evaluation quality and intervention impact. Each
review must be approved for quality by a review coordinator.
Third, programmes that are deemed to have a good
chance of meeting the standards of evidence are for-
warded to the Blueprints Board for consideration. The
Board comprises eight leading prevention scientists
from the US and Europe and meets twice a year. The
Board decides whether or not programmes meet the
standards in terms of evaluation quality and interven-
tion impact and can therefore potentially be recom-
mended for dissemination.
Fourth, the review team checks the system readiness
of programmes approved by the Blueprints Board.
This is done by consulting programme websites and by
asking developers or purveyors to complete a written
questionnaire. The questionnaire covers subjects such
as the availability of materials and training, fidelity
monitoring procedures, and human resource require-
ments. If extra information is needed once this is sub-
mitted then follow-up questions are sent and, often, a
telephone discussion is held. Two members of the
review team – at least one with extensive experience of
delivering and managing services – discuss the infor-
mation received and determine if the programme is
‘system ready’. At this stage a programme is formally
approved and the developer is informed that it will
appear on the Blueprints website.
The list of approved programmes is updated regu-
larly. Regular literature searches are undertaken using
a consistent process to identify new studies showing
positive or even negative findings for programmes
already on the list. Similarly, studies on new pro-
grammes are reviewed if they seem likely to meet the
standards of evidence. Particular efforts are being
made to identify programmes originating in Europe
since most programmes approved to date were devel-
oped in the US. Programme developers and evaluators
may also submit their programme for consideration.
Programmes approved for Blueprints
At the time of writing there are 11 ‘Model’ pro-
grammes and 22 ‘Promising’ programmes on Blue-
prints. This is from over 1000 programmes reviewed.
However, several new programmes will shortly be
added to the list, largely as the result of the decision
described earlier in this article to extend the remit of
Blueprints beyond violence prevention (the initial
focus) to encompass other areas of child and youth
development, such as education and health.
Some examples of programmes appearing on the
Blueprints website follow. They represent different
types and levels of intervention and have all been
implemented in Europe. The descriptions outline
briefly how each of the programmes meets the stan-
dards and where they are delivered.
Nurse Family Partnership is a home-visiting pro-
gramme that involves nurses making home-visits to
young, often teenage, vulnerable first-time parents,
starting in early pregnancy and lasting until children
are 24 months old. The programme aims to promote
prenatal health, improve child well-being and develop-
ment through better parenting, and encourage parental
self-sufficiency through education, employment, or
planning future pregnancies. Specially trained nurses
pay weekly or fortnightly structured home-visits to
families. Home-visits allow nurses to prepare young
people for parenthood and guide them to adopt health-
ier lifestyles, take good care of their babies, and plan
for their future. Key to the programme is the strong
therapeutic relationship built between nurse and family.
Rigorous scientific evaluations show that NFP leads
to a range of improvements in child health and devel-
opment, such as better child behaviour and academic
achievement, more positive parenting practices, reduc-
tions in child maltreatment, and increased parental
independence – including reduced welfare use (eg.
Olds, Henderson, Chamberlin, & Tatelbaum, 1986; Olds
& Kitzman, 1990). These impacts are sustained long
after the programme finishes; for example, children in
NFP are less likely to be involved in the juvenile justice
system in adolescence (eg. Eckenrode et al., 2010).
Nurse Family Partnership is accompanied by an
extensive package of support, including manuals,
training and technical assistance. It has been imple-
mented in the UK and the Netherlands. Every dollar
invested in the US version of the programme for low-
income families yields a return of $3.23 (Aos et al.,
The second programme described here, Incredible
Years BASIC, is designed for parents of children aged
2-10 years with conduct problems. It seeks to improve
family interaction and prevent early and persistent
anti-social behaviour in these children.
The programme comprises a 12-week course of
two-hour sessions delivered to a group of about 12 par-
ents by two specially trained leaders. Parents are
taught strategies to help them manage their child’s
problem behaviours, such as aggression, tantrums, and
acting out. They also learn how to promote their child’s
social skills through emotion regulation. Sessions
involve group discussion, videotape modelling, and
the rehearsal of parenting techniques.
Incredible Years BASIC has been evaluated by RCT
in several countries, including the US, UK, and Norway.
These evaluations show consistently that the pro-
gramme increases the use of positive parenting strate-
gies, reduces the use of harsh and inconsistent disci-
pline, and reduces deviant behaviour in children (eg.
Webster-Stratton, Kolpacoff, & Hollinsworth, 1988;
Hutchings et al., 2007; Larsson et al., 2009; Scott et al.,
2010; Little et al., 2012; McGilloway et al., 2012).
Incredible Years BASIC has extensive group leader
manuals, DVDs, books, CDs, handouts, and recom-
mended activities and reading between sessions.
Group leaders receive initial three-day training and
ongoing technical support and supervision to assist
successful implementation. The programme has been
provided in mental health agencies, public health centres
and schools in the US, UK, Ireland, Norway,
Germany, Denmark, the Netherlands, Portugal
and Sweden. For every $1 invested the programme
produces a return of $1.20 (Aos et al., 2011).15
The third programme, Multisystemic Therapy
(MST), is an intensive family-based intervention for
adolescents who are chronic offenders; typically they
have committed serious crimes and have substance
abuse problems. MST aims to reduce anti-social
behaviour and criminal activity, as well as improve
parenting skills, family relations, school grades and
involvement with positive peers and activities. A ther-
apist works with the adolescent in their daily surround-
ings – with their family, friends, at school and in their
community. Together with the family, the therapist
designs a treatment plan to tackle identified risks and
encourage protective influences in the adolescent’s
environment. Various strategies are employed, such as
CBT or coaching. The therapist becomes a single point
of contact for the family, available 24/7. A typical MST
intervention lasts 3-5 months and involves 3-6 sessions
weekly, each up to two hours long.
MST has been found effective in multiple high-quality
RCTs. It reduces criminal recidivism rates and
anti-social behaviour, including conduct problems and
aggression, and also improves emotion management
and family cohesion (eg. Henggeler, Melton, & Smith,
1992; Timmons-Mitchell, Bender, Kishna, & Mitchell,
2006). Some effects are long-lasting, with
improvements still visible several years after treatment
(eg. Henggeler, Clingempeel, Brondino, & Pickrel,
A US-based organisation ‘MST Services’ provides
training, technical support, monitoring, materials
including treatment manuals, and licensing. The programme
is delivered by experienced therapists, each of whom receives
five days’ training and ongoing support. MST has been
delivered extensively in the US, and in several
European countries, including Norway, Spain,
Sweden, Denmark, the Netherlands, Iceland and the UK.
A UK cost-benefit analysis reveals that every pound
spent on MST produces a return of £1.77 (Social
Research Unit, 2012b)16.
How programmes fare against the standards of evidence
It is instructive to reflect briefly on how pro-
grammes perform against the standards of evidence.
Here the focus is on the 100 programmes reviewed
against the standards in 2011 as part of the Annie E.
Casey Foundation Evidence2Success project. These
represent a spread of programmes in terms of child
developmental stage and outcomes targeted, and were
deliberately selected because the expert group deemed
them to be the best available.
Intervention specificity was generally good, with
most programmes reviewed meeting each of the ‘good
enough’ criteria on this dimension. Evaluation quality
was much more variable. The ‘good enough’ evalua-
tion quality criteria that tended to be better addressed
in programme evaluations include: the appropriateness
of measures (reflecting outcomes, not being tied to
intervention, not being rated solely by the imple-
menter); having a clear statement of demographics;
and assigning cases to programme and comparison
groups at the appropriate level (although they are not
always analyzed at the correct level). There was less
clarity about: what the control group received; how the
intervention that was actually delivered compares with
intervention as it was designed to be delivered; if or
how clustering is controlled for in analyses, for exam-
ple when the unit of allocation is schools; whether
there is equivalence between the programme and con-
trol groups at baseline on outcome measures; whether
analysis was intent-to-treat or not; and whether there
was differential attrition.
Regarding the ‘Best’17 evaluation quality criteria,
12-month follow-up was available in fewer than half of
cases, as was sub-group analysis. It was rare for there
to be any analysis of the relationship between fidelity
and outcomes or of the role played by mediating fac-
tors. Dose-response analysis – in the proper sense of
setting out deliberately to vary the dose and compar-
ing, for example, a full-length version and a shorter
version of the programme – was extremely rare.
On intervention impact, about half of the programmes
reported an effect size. System readiness was generally
difficult to establish without contacting the programme
developer; the information supplied on programme websites,
for example, is inadequate for that purpose. Usually
the programme that was evaluated is still available,
although this can be difficult to detect as programmes
‘morph’ over time, notably to make improvements or
to be more suitable for a different population or setting.
Most programmes reviewed had a manual and training
but information about financial and human resources
was much less readily available. The extent of dissem-
ination and ‘real world’ testing was often unclear, and
although many programmes purport to have a fidelity
protocol it was less clear whether this is suitable for
use beyond research studies, in other words in ortho-
dox service settings.
The Blueprints websites
Until now, the work of Blueprints has been dissem-
inated through a website managed by the Blueprints
for Violence Prevention team at the University of
Colorado Boulder. The new Blueprints will have two
websites. The main site is aimed primarily at a US
provider audience. It is designed and maintained to
enable policy makers and providers to access readily
the information they need on each approved pro-
gramme. The approach taken is similar to that used in
‘Consumer Reports’ (in the US) or ‘Which?’ (UK) magazines.
Instead of searching for and comparing cameras
or washing machines, website users can search for
suitable programmes by outcome, target group, and
risk and protective factors. It is not possible here to
give an exhaustive list of the information supplied
about each programme but the main fields include:
• Programme objectives.
• Programme recipients.
• Level of intervention (eg. universal prevention,
selected prevention, treatment).
• Setting (eg. school, community, home).
• Targeted risk and protective factors.
• A brief description of the programme.
• A brief description of the outcomes achieved by
the programme.
• A brief description of the methodology used in the
relevant evaluation studies.
• Financial information (eg. unit cost, cost-benefit
ratio, potential funding strategies).
• Training and technical assistance information.
• Contact information for both the programme
designer and the purveyor.
Users can print a fact sheet containing this informa-
tion and also compare different programmes against
the same criteria.
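The search function described above can be pictured as a simple filter over structured programme records. The following is a minimal sketch only: the field names, the `search` helper and the three catalogue entries are hypothetical and invented for illustration, and they do not represent the actual schema or content of the Blueprints websites.

```python
from dataclasses import dataclass, field


@dataclass
class Programme:
    """One database entry; field names are illustrative, not the site's schema."""
    name: str
    outcomes: set
    target_group: str
    level: str      # eg. "universal prevention", "selected prevention", "treatment"
    setting: str    # eg. "school", "community", "home"
    risk_factors: set = field(default_factory=set)


def search(programmes, outcome=None, target_group=None, risk_factor=None):
    """Return the programmes that match every criterion that was supplied."""
    matches = []
    for p in programmes:
        if outcome is not None and outcome not in p.outcomes:
            continue
        if target_group is not None and p.target_group != target_group:
            continue
        if risk_factor is not None and risk_factor not in p.risk_factors:
            continue
        matches.append(p)
    return matches


# Hypothetical entries, invented purely for illustration.
CATALOGUE = [
    Programme("Programme A", {"antisocial behaviour"}, "adolescents",
              "treatment", "home", {"family conflict"}),
    Programme("Programme B", {"school readiness"}, "infants",
              "selected prevention", "home", {"poor parenting"}),
    Programme("Programme C", {"antisocial behaviour", "substance use"},
              "adolescents", "universal prevention", "school", {"peer influence"}),
]

hits = search(CATALOGUE, outcome="antisocial behaviour", target_group="adolescents")
print([p.name for p in hits])
```

Criteria left unspecified are ignored, so a user can narrow the list step by step, first by outcome and then by target group or risk factor, before comparing the remaining programmes side by side.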
A sister website aimed at European providers will
have essentially the same functions and be consistent in
content, but it will contain less text, include a handful
of different fields (eg. evidence of dissemination of the
programme in question in Europe) and be translated
into European languages (initially Spanish, French and
German). Users wishing to obtain further information
will be able to link to the US website.
Both websites are ongoing projects. It is planned in
due course to add new fields, including a visual logic
model, video content (eg. the programme developer
summarizing the programme), the facility to explore
the likely impact of implementing a portfolio of programmes
on costs, benefits and outcomes, and subjective
feedback from policy makers, practitioners, children
and families who have experience with the programme.
The task of producing the websites requires generat-
ing high-quality content for each approved programme.
Some of this comes direct from the completed reviews,
including outcomes targeted, target group, and logic
model. However, additional data collection is required.
First, a consistent and comparable indicator of the size
of effect will be generated using the meta-analytic
methods in operation at the Washington State Institute
for Public Policy (Aos et al., 2011; Lee, Drake,
Pennucci, Bjornstad, & Edovald, 2012). In the case of
programmes and outcomes not yet examined by the
Washington centre this requires coding studies, con-
ducting meta-analyses and applying the effect size for-
mula utilized by the Washington State Institute for
Public Policy. Second, financial data is also collected on
the cost of implementation. This is broken down into
costs for start-up, materials, delivery, training and tech-
nical assistance. Potential strategies and sources for
funding the programme are also listed. This information
is obtained via questionnaires and interviews with pro-
gramme developers/purveyors. Third, the Washington
centre uses the effect size and unit cost data to generate
a cost-benefit ratio, which will be used by Blueprints.
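The three steps above (a comparable effect size, a unit cost, and a cost-benefit ratio) can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses a generic standardized mean difference and fixed-effect inverse-variance pooling, which are standard meta-analytic tools but are not claimed to be the specific formula applied by the Washington State Institute for Public Policy. All function names and numbers are hypothetical.

```python
import math


def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    )
    return (mean_t - mean_c) / pooled_sd


def smd_variance(d, n_t, n_c):
    """Common large-sample approximation to the variance of the effect size."""
    return (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))


def pooled_effect(effects, variances):
    """Fixed-effect pooling: weight each study by the inverse of its variance."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def benefit_cost_ratio(benefit_per_participant, cost_per_participant):
    """Monetary benefit returned per unit of currency spent."""
    return benefit_per_participant / cost_per_participant


# Hypothetical study results (programme group vs. control group).
d1 = standardized_mean_difference(10.0, 8.0, 4.0, 4.0, 50, 50)
d2 = standardized_mean_difference(9.0, 8.0, 4.0, 4.0, 100, 100)
pooled = pooled_effect([d1, d2], [smd_variance(d1, 50, 50), smd_variance(d2, 100, 100)])

# Hypothetical monetary figures per participant.
ratio = benefit_cost_ratio(1770.0, 1000.0)
print(round(d1, 2), round(d2, 2), round(pooled, 3), round(ratio, 2))
```

Expressing every programme's impact on the same standardized scale is what allows effect sizes, and the cost-benefit ratios derived from them, to be compared across programmes that were evaluated with different outcome measures.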
Future developments
The work described here will develop in at least six
ways. First, it is expected that the Blueprints standards,
which already set a high bar in the child welfare field,
will become higher as understanding of the science of
evidence-based programmes and their implementation
improves and as the quality of studies improves in
response to standards such as these. For example, the
stipulation of sustained impact at 12-month follow-up
might move from being a Model criterion to become a
Promising criterion, and the requirement for an inde-
pendent replication might be added.
Second, continuing efforts will be made to build
understanding of how the Blueprints standards and
database link to those used by other groups. For exam-
ple, there will be ongoing discussions with groups such
as the Society for Prevention Research, and in Europe
work has started on building connections with research
and intervention communities in participating coun-
tries – including those representing existing clearing-
houses. Such collaboration will contribute to the wider
use of Blueprints and other sources. For example,
some clearinghouses, such as MOVISIE in the
Netherlands, track innovation, whereas Blueprints
focuses on proven programmes. Both are important,
and the work of the former should contribute to the latter.
Third, new programmes will be approved. More
programmes developed in Europe need to be rated
against the standards of evidence. Several are known that
stand a reasonable chance of meeting the standards,
but they need to be reviewed in full and, if approved
by the Blueprints Board, disseminated more widely
(eg. Atria & Spiel, 2007; Faggiano et al., 2008;
Salmivalli, Kärnä, & Poskiparta, 2011). In addition,
and with a view to the longer term, work is underway
to show how the standards can be used to help practitioners
to take programmes on the journey from innovation
to model programme. This work argues that innovations
should be strengthened and tested with a level of
rigor appropriate to their stage of gestation (Little,
2012a). For example, new programmes might warrant
a pre-post or even a small comparison group study,
with progress to larger RCTs conditional on positive results.
Fourth, Blueprints will need promoting in Europe.
In the first instance, the standards of evidence should
be shared with colleagues in relevant European organ-
izations, including pan-European centres, clearing-
houses and country-specific research institutes. This
will promote discussion about the differences and sim-
ilarities between different standards/databases (e.g. in
terms of focus, function and audience) and ways of
connecting different initiatives to achieve greater syn-
ergy. Eventually, specified European states and/or the
European Union might be encouraged to adopt the
standards, database and economic model. The strong
alliances forged by Blueprints with key experts and
decision makers in the US need to be replicated in
Europe. This will require written materials as well as
conferences and meetings to understand potential
users’ needs and concerns and encourage them to use
the website. Training materials on the standards and
using the websites might be developed for end-users.
Fifth, research is needed on the transportability of
programmes from one context to another – in particu-
lar from the US to Europe, or from one European coun-
try to another. The success of imported programmes is
mixed, as indicated already. What is effective in one
context might not be effective in another, and what is
culturally appropriate in one context may not be in
another. Various factors might account for this, includ-
ing the extent to which programmes need adaptation
and are adapted carefully (Kumpfer et al., 2012) and
the policy and service delivery context of the new site
(Sundell et al., 2008). But more research is needed, and
European providers need information about whether
imported programmes are likely to ‘fit’ or work in their
country and how the chances of this happening can be
improved. Further, the costs and benefits of pro-
grammes may differ across countries due to differ-
ences in welfare systems, hence the importance of the
translation of the Washington economic model referred
to above.
Lastly, Blueprints focuses on programmes but there
is growing interest in the idea of kite-marking policy
and in identifying effective practices (eg. Barth et al.,
2012), given the frequent difficulties and limitations of
implementing programmes in children’s services sys-
tems (Little, 2010). For this reason, new standards of
evidence will be developed and used to identify evi-
dence-based practices, policies and processes (eg.
assessment methods) that should be recommended for
widespread dissemination.
Multisystemic therapy (MST) delivered through a community mental health center was compared with usual services delivered by a Department of Youth Services in the treatment of 84 serious juvenile offenders and their multiproblem families. Offenders were assigned randomly to treatment conditions. Pretreatment and posttreatment assessment batteries evaluating family relations, peer relations, symptomatology, social competence, and self-reported delinquency were completed by the youth and a parent, and archival records were searched at 59 weeks postreferral to obtain data on rearrest and incarceration. In comparison with youths who received usual services, youths who received MST had fewer arrests and self-reported offenses and spent an average of 10 fewer weeks incarcerated. In addition, families in the MST condition reported increased family cohesion and decreased youth aggression in peer relations. The relative effectiveness of MST was neither moderated by demographic characteristics nor mediated by psychosocial variables.