Sample size calculations for cluster randomised controlled trials with a fixed number of clusters

Springer Nature
BMC Medical Research Methodology
COMMENTARY Open Access
Sample size calculations for cluster randomised
controlled trials with a fixed number of clusters
Karla Hemming*, Alan J Girling, Alice J Sitch, Jennifer Marsh and Richard J Lilford
Abstract
Background: Cluster randomised controlled trials (CRCTs) are frequently used in health service evaluation.
Assuming an average cluster size, required sample sizes are readily computed for both binary and continuous
outcomes, by estimating a design effect or inflation factor. However, where the number of clusters is fixed in
advance, but where it is possible to increase the number of individuals within each cluster, as is frequently the
case in health service evaluation, sample size formulae have been less well studied.
Methods: We systematically outline sample size formulae (including required number of randomisation units,
detectable difference and power) for CRCTs with a fixed number of clusters, to provide a concise summary for
both binary and continuous outcomes. Extensions to the case of unequal cluster sizes are provided.
Results: For trials with a fixed number of equal sized clusters (k), the trial will be feasible provided the number of
clusters is greater than the product of the number of individuals required under individual randomisation ($n_I$) and
the estimated intra-cluster correlation ($\rho$). So, a simple rule is that the number of clusters (k) will be sufficient
provided:
$$k > n_I \times \rho.$$
Where this is not the case, investigators can determine the maximum available power to detect the pre-specified
difference, or the minimum detectable difference under the pre-specified value for power.
Conclusions: Designing a CRCT with a fixed number of clusters might mean that the study will not be feasible,
leading to the notion of a minimum detectable difference (or a maximum achievable power), irrespective of how
many individuals are included within each cluster.
Introduction
Cluster randomised controlled trials (CRCTs), in which
clusters of individuals are randomised to intervention
groups, are frequently used in the evaluation of service
delivery interventions, primarily to avoid contamination
but also for logistic and economic reasons [1-3]. Whilst
a well conducted individually Randomised Controlled
Trial (RCT) is the gold standard for assessing the effec-
tiveness of pharmacological treatments, the evaluation
of many health care service delivery interventions is dif-
ficult or impossible without recourse to cluster trials.
Standard sample size formulae for CRCTs require the
investigator to pre-specify an average cluster size, to
determine the number of clusters required. In so doing,
these sample size formulae implicitly assume that the
number of clusters can be increased as required [1,3-5].
However, when evaluating health care service delivery
interventions the number of clusters might be limited to
a fixed number even though the sample size within each
cluster can be increased. In a real example, evaluating
lay pregnancy support workers, clusters consisted of
groups of pregnant women under the care of different
midwifery teams [6,7]. The available number of clusters
was restricted to the midwifery teams within a particular
geographical region. Yet within each midwifery team it
was possible to recruit any reasonable number of individuals
by extending the recruitment period. In another
real example, a CRCT to evaluate the effectiveness of a
combined polypill (statin, aspirin and blood pressure
lowering drugs) in Iran was limited to a fixed number of
villages participating in an existing cohort study [8].
Other such examples of designs in which a limited number
of clusters were available include trials of community
based diabetes educational programs [9] and
general practice based interventions to reduce primary
care prescribing errors [10], both of which were limited
to the number of general practices which agreed to
participate.
* Correspondence: k.hemming@bham.ac.uk
Department of Public Health, Epidemiology and Biostatistics, University of
Birmingham, UK
Hemming et al. BMC Medical Research Methodology 2011, 11:102
http://www.biomedcentral.com/1471-2288/11/102
© 2011 Hemming et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.
The existing literature on sample size formulae for
CRCTs focuses largely on the case where there is no
limit on the number of available clusters [3-5,11,12].
Whilst it is well known that the statistical power that can
be achieved by additional recruitment within clusters is
limited, and that this depends on the intra-cluster corre-
lation [11-13], little attention has been paid to the limita-
tions imposed when the number of clusters is fixed in
advance. This paper aims to fill this gap by exploring the
range of effect sizes, and differences between proportions,
that can be detected when the number of clusters is fixed.
We describe a simple check to determine whether it is
feasible to detect a specified effect size (or difference
between proportions) when the number of clusters is
fixed in advance; and for those cases in which it is infeasible,
we determine the minimum detectable difference
possible under the required power and the maximum
achievable power to detect the required difference. We
illustrate these ideas by considering the design of a
CRCT to detect an increase in breastfeeding rates where
the number of clusters is fixed.
For completeness we outline formulae for simpler
designs for which the sample size formulae are relatively
well known, or easily derived, as an important prelude.
In so doing, the simple relationships between the formu-
lae are clear and this allows progressive development to
the less simple situation (that of binary detectable differ-
ence or power). It is hoped that by developing the for-
mulae in this way the material will be accessible to
applied statisticians and more mathematically minded
health care researchers. We also provide a set of guide-
lines useful for investigators when designing trials of
this nature.
Background
Generally, suppose a trial is to be designed to test the
null hypothesis $H_0: \mu_0 = \mu_1$, where $\mu_0$ and $\mu_1$ represent
the means of some variable in the control and intervention
arms respectively; and where it is assumed that
$\mathrm{var}(\mu_0) = \mathrm{var}(\mu_1) = \sigma^2$. Suppose further that there are an
equal number of individuals to be randomised to both
arms, letting $n$ denote the number of individuals per
arm and letting $d$ denote the difference to be detected
such that $d = \mu_0 - \mu_1$, $1 - \beta$ denotes the power and $\alpha$
the significance level. We limit our consideration to
trials with two equal sized parallel arms, with common
standard deviation, two-sided test, and assume normality
of outcomes and approximate the variance of the difference
of two proportions. The sub-script, I (for Individual
randomisation), is used throughout to highlight any
quantities which are specific to individual randomisation;
and likewise the sub-script, C (for Cluster randomisation),
is used throughout to highlight any quantities
which are specific to cluster randomisation. No sub-scripts
are used to distinguish cluster from individual
randomisation for variables which are pre-specified by
the user.
RCT: sample size formulae under individual
randomisation
Following standard formulae, for a trial using individual
randomisation [14], for fixed power ($1 - \beta$) and fixed
sample size ($n$) per arm, the detectable difference, $d_I$,
with variance $\mathrm{var}(d_I) = 2\sigma^2/n_I$, is:
$$d_I = \sqrt{\frac{2\sigma^2}{n}}\,(z_{\alpha/2} + z_\beta) \tag{1}$$
where $z_{\alpha/2}$ denotes the upper $100\alpha/2$ standard normal
centile.
For a trial with $n$ individuals per arm, the power to
detect a pre-specified difference of $d$ is $1 - \beta_I$, such that:
$$z_{\beta_I} = \sqrt{\frac{n}{2}}\,\frac{d}{\sigma} - z_{\alpha/2}$$
or equivalently:
$$1 - \beta_I = \Phi\left(\sqrt{\frac{n}{2}}\,\frac{d}{\sigma} - z_{\alpha/2}\right) \tag{2}$$
where $\Phi$ is the cumulative standardised Normal
distribution.
And, finally, the required sample size per arm for a
trial at pre-specified power $1 - \beta$ to detect a pre-specified
difference of $d$ is $n_I$, where:
$$n_I = \frac{2\sigma^2 (z_{\alpha/2} + z_\beta)^2}{d^2} \tag{3}$$
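As a minimal sketch of equations 1 to 3 (not code from the paper), the three quantities can be computed with Python's standard library. The function names, the default two-sided significance level of 0.05 and the illustrative inputs (d = 0.2, σ = 1) are assumptions made here for illustration only.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def z_upper(p):
    """Upper 100*p centile of the standard normal, e.g. z_upper(0.025) is about 1.96."""
    return _Z.inv_cdf(1.0 - p)


def detectable_difference_individual(n, sigma, alpha=0.05, power=0.80):
    """Equation 1: detectable difference for n individuals per arm."""
    return sqrt(2.0 * sigma**2 / n) * (z_upper(alpha / 2) + z_upper(1 - power))


def power_individual(n, d, sigma, alpha=0.05):
    """Equation 2: power to detect a difference d with n individuals per arm."""
    return _Z.cdf(sqrt(n / 2.0) * d / sigma - z_upper(alpha / 2))


def n_individual(d, sigma, alpha=0.05, power=0.80):
    """Equation 3: required sample size per arm (left unrounded)."""
    return 2.0 * sigma**2 * (z_upper(alpha / 2) + z_upper(1 - power)) ** 2 / d**2


n = n_individual(d=0.2, sigma=1.0)           # roughly 392 per arm
print(round(n, 1), round(power_individual(n, 0.2, 1.0), 3))
```

Feeding the unrounded sample size back into equations 1 and 2 recovers the original difference and power, which is a convenient consistency check on the three formulae.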
Using Normal approximations, the above formulae can
be used for binary outcomes, by approximating the variance
($\sigma^2$) of the proportions $\pi_1$ and $\pi_2$, by:
$$\sigma^2 \approx \tfrac{1}{2}\,[\pi_1(1 - \pi_1) + \pi_2(1 - \pi_2)] \tag{4}$$
for testing the two sided hypothesis $H_0: \pi_1 = \pi_2$.
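To illustrate how equation 4 feeds into equation 3 for a binary outcome, the sketch below computes the per-arm sample size for a change in proportions. The helper names are hypothetical, and the proportions 0.40 and 0.50 with a two-sided significance level of 0.05 are chosen here purely as an illustration.

```python
from math import ceil
from statistics import NormalDist


def binary_variance(p1, p2):
    """Equation 4: average of the two binomial variances."""
    return 0.5 * (p1 * (1 - p1) + p2 * (1 - p2))


def n_individual_binary(p1, p2, alpha=0.05, power=0.80):
    """Equation 3 with sigma^2 taken from equation 4 (left unrounded)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return 2 * binary_variance(p1, p2) * z_sum**2 / (p2 - p1) ** 2


# Per-arm sample size, rounded up to a whole individual; 385 at 80% power.
print(ceil(n_individual_binary(0.40, 0.50, power=0.80)))
```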
CRCTs: standard sample size formulae under cluster
randomisation
Suppose, instead of randomising over individuals, the
trial will randomise the intervention over $k$ clusters per
arm each of size $m$, to provide a total of $n_C = mk$ individuals
per arm. Then, by standard results [1], the variance
of the difference to be detected $d_C$ is inflated by
the Variance Inflation Factor (VIF):
$$VIF = 1 + (m - 1)\rho \tag{5}$$
where $\rho$ is the Intra-Cluster Correlation (ICC) coefficient,
which represents how strongly individuals within
clusters are related to each other. Where the cluster
sizes are unequal this variance inflation factor can be
approximated by:
$$VIF = 1 + ((cv^2 + 1)\bar{m} - 1)\rho \tag{6}$$
where $cv$ represents the coefficient of variation of the
cluster sizes and $\bar{m}$ is the average cluster size [15]. Thus,
the variance of $d_C$ (for fixed cluster sizes) becomes:
$$\mathrm{var}(d_C) = \frac{2\sigma^2 [1 + (m - 1)\rho]}{mk} \tag{7}$$
and this is simply extended for varying cluster sizes
using equation 6. To determine the required sample size
for a CRCT with a pre-specified power $1 - \beta$, to detect
the pre-specified difference $d$, and where there are $m$
individuals within each cluster, then the required sample
size $n_C = km$ per arm, follows straightforwardly from
equations 3 and 5 and is:
$$n_C = \frac{2\sigma^2 (z_{\alpha/2} + z_\beta)^2 [1 + (m - 1)\rho]}{d^2} = n_I [1 + (m - 1)\rho] = n_I \times VIF \tag{8}$$
where $n_I$ is the required sample size per arm using a
trial with individual randomization to detect a difference
$d$, and VIF can be modified to allow for variation in
cluster sizes (equation 6). This is the standard result,
that the required sample size for a CRCT is that
required under individual randomisation, inflated by the
variance inflation factor [1]. The number of clusters
required per arm is then:
$$k = \left\lceil \frac{n_I (1 + (m - 1)\rho)}{m} \right\rceil \tag{9}$$
assuming equal cluster sizes. This slight modification
of the common formula for the number of required
clusters (over that say presented in [2]), has rounded up
the total sample size to a multiple of the cluster size
(using the ceiling function). For unequal cluster sizes
(using the VIF at equation 6) this becomes:
$$k = \left\lceil \frac{n_I [1 + ((cv^2 + 1)\bar{m} - 1)\rho]}{\bar{m}} \right\rceil \tag{10}$$
again with rounding up to the average cluster size.
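The standard calculation in equations 5, 6, 9 and 10 can be sketched as follows. This is an illustrative sketch only: the function names and the example inputs (385 individuals per arm under individual randomisation, clusters of 30, an ICC of 0.05) are assumptions chosen for demonstration.

```python
from math import ceil


def variance_inflation_factor(m, rho, cv=0.0):
    """Equation 6; with cv = 0 it reduces to equation 5."""
    return 1.0 + ((cv**2 + 1.0) * m - 1.0) * rho


def clusters_per_arm(n_i, m, rho, cv=0.0):
    """Equations 9 and 10: clusters needed per arm for (average) cluster size m."""
    return ceil(n_i * variance_inflation_factor(m, rho, cv) / m)


print(variance_inflation_factor(30, 0.05))      # equal cluster sizes
print(clusters_per_arm(385, 30, 0.05))          # equal cluster sizes
print(clusters_per_arm(385, 30, 0.05, cv=0.5))  # unequal cluster sizes
```

Allowing for variation in cluster size (cv > 0) can only increase the inflation factor, so the number of clusters required never falls below the equal-size answer.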
CRCTs of fixed size: fixed number of clusters each
of fixed size
Where a CRCT is to be designed with a completely
fixed size, that is with a fixed number of clusters, each
of a fixed size (although this size may vary between
clusters), then it is possible to evaluate both the detect-
able difference and the power, as would be the case in a
design using individual randomisation. CRCTs of fixed
size might not be the commonest of designs, but formu-
lae presented below: are an important prelude to later
formulae, might be useful for retrospectively computing
power once a trial has commenced (and thus the size
has been determined), and will also be useful in those
limited number of studies for which the trial sample
size is indeed completely fixed (for example within a
cohort study) [9,10].
CRCT of fixed size: detectable difference
For a CRCT with a fixed number of clusters $k$ per arm,
with a fixed number of individuals per cluster $m$ and
with power $1 - \beta$, then the detectable difference, $d_C$, follows
straightforwardly from equation 1:
$$d_C = \sqrt{\frac{2\sigma^2 [1 + (m - 1)\rho]}{mk}}\,(z_{\alpha/2} + z_\beta) = d_I \sqrt{1 + (m - 1)\rho} = d_I \sqrt{VIF} \tag{11}$$
where $d_I$ is the detectable difference using individual
randomisation and VIF might be either of those presented
at equations 5 and 6. So the detectable difference
in a CRCT can be thought of as the detectable difference
in a trial using individual randomisation, inflated
by the square-root of the variance inflation factor.
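A short sketch of equation 11 is given below, under assumed inputs (20 clusters per arm of 25 individuals, σ = 1, ICC 0.02, two-sided significance 0.05 and 80% power). The function name is hypothetical; setting the ICC to zero recovers the individually randomised detectable difference for the same total sample size.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference_cluster(k, m, sigma, rho, cv=0.0, alpha=0.05, power=0.80):
    """Equation 11: detectable difference with k clusters per arm of (average) size m."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    vif = 1.0 + ((cv**2 + 1.0) * m - 1.0) * rho   # equation 5 or 6
    return sqrt(2.0 * sigma**2 * vif / (m * k)) * z_sum


d_i = detectable_difference_cluster(k=20, m=25, sigma=1.0, rho=0.0)   # no clustering
d_c = detectable_difference_cluster(k=20, m=25, sigma=1.0, rho=0.02)
print(round(d_i, 3), round(d_c, 3), round(d_c / d_i, 3))  # ratio equals sqrt(VIF)
```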
CRCTs of fixed size: power
The power $1 - \beta_C$ of a trial designed to detect a difference
of $d$ with fixed sample size $n_C = mk$ per arm, following
equation 2, is:
$$z_{\beta_C} = \sqrt{\frac{mk}{2}}\,\frac{d}{\sigma\sqrt{VIF}} - z_{\alpha/2}$$
or equivalently, that:
$$1 - \beta_C = \Phi\left(\sqrt{\frac{mk}{2}}\,\frac{d}{\sigma\sqrt{VIF}} - z_{\alpha/2}\right) \tag{12}$$
where again, VIF might be either of those presented at
equations 5 and 6. So, power in a CRCT can be thought
of as the power available under individual randomisation
for a standardised effect size which is deflated by the
square-root of the variance inflation factor.
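Equation 12 can be sketched in the same way. The function name and the inputs (20 clusters per arm of 25 individuals, a difference of 0.25 with σ = 1, ICC 0.02) are assumptions for illustration, not values from the paper.

```python
from math import sqrt
from statistics import NormalDist


def power_cluster(k, m, d, sigma, rho, cv=0.0, alpha=0.05):
    """Equation 12: power of a CRCT with k clusters per arm of (average) size m."""
    z = NormalDist()
    vif = 1.0 + ((cv**2 + 1.0) * m - 1.0) * rho   # equation 5 or 6
    z_beta = sqrt(m * k / 2.0) * d / (sigma * sqrt(vif)) - z.inv_cdf(1 - alpha / 2)
    return z.cdf(z_beta)


print(round(power_cluster(k=20, m=25, d=0.25, sigma=1.0, rho=0.0), 3))   # no clustering
print(round(power_cluster(k=20, m=25, d=0.25, sigma=1.0, rho=0.02), 3))  # with clustering
```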
CRCTs with fixed number of clusters but flexible
cluster size
Standard sample size formulae for CRCTs, by assuming
knowledge of the cluster size (m) and determining the
required number of clusters (k), implicitly assume that
the number of clusters can be increased as required.
However, in the design of health service interventions, it
is often the case that the number of clusters will be lim-
ited by the number of cluster units willing or able to
participate. So for example, in two general practice
based CRCTs (one to evaluate lay education in diabetes
and the other to evaluate a general practice-based inter-
vention to reduce primary care prescribing errors), the
number of clusters was limited to the number of pri-
mary care practices that agreed to participate in the
study. From an estimate of the number of clusters avail-
able, it is relatively straightforward to determine the
required cluster size for each of the clusters. However,
due to the limited increase in precision available by
increasing cluster sizes, it might not always be feasible
to detect the required difference at required power
under a design with a fixed number of k clusters. These
issues are explored below.
CRCTs with a fixed number of clusters: sample size per
cluster
The standard sample size formulae for CRCTs assume
knowledge of cluster size ($m$) and consequently determine
the number of clusters ($k$) required. For a pre-specified
available number of clusters ($k$), investigators
need instead to determine the required cluster size ($m$).
Whilst this sample size formula is not commonly presented
in the literature, it consists of a simple re-arrangement
of the above formulae presented at equation 8 [2].
So, for a trial with a fixed number of equal
sized clusters ($k$) the required sample size per arm for a
trial with pre-specified power $1 - \beta$, to detect a difference
of $d$, is $n_C$, such that:
$$n_C = n_I [1 + (m - 1)\rho] = n_I \left[1 + \left(\frac{n_C}{k} - 1\right)\rho\right] = \frac{n_I k [1 - \rho]}{[k - n_I \rho]} \tag{13}$$
where $n_I$ is the sample size required under individual
randomisation. This increase in sample size, over that
required under individual randomisation, is no longer a
simple inflation, as the inflation required is now dependent
on the sample size required under individual
randomisation.
The corresponding number of individuals in each of
the $k$ equally sized clusters is:
$$m = \left\lceil \frac{n_I (1 - \rho)}{(k - n_I \rho)} \right\rceil \tag{14}$$
this time rounding up the total sample size to a multiple
of the number of clusters ($k$) available (using the
ceiling function).
For unequal cluster sizes, using the VIF from equation
6, the required sample size is:
$$n_C = \frac{n_I k [1 - \rho]}{[k - n_I (cv^2 + 1)\rho]} \tag{15}$$
and the average number of individuals per cluster
becomes:
$$\bar{m} = \left\lceil \frac{n_I (1 - \rho)}{k - n_I (cv^2 + 1)\rho} \right\rceil \tag{16}$$
again rounding up to a multiple of the number of
clusters ($k$) available.
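Equations 14 and 16 translate directly into a small helper, sketched below. The function name and the convention of returning None when the denominator is not positive (no finite cluster size achieves the target) are choices made for this sketch; the inputs are illustrative (385 or 515 individuals per arm under individual randomisation, 20 clusters per arm).

```python
from math import ceil


def cluster_size(n_i, k, rho, cv=0.0):
    """Equations 14 and 16: (average) individuals needed in each of k clusters per arm.

    Returns None when k <= n_i * (cv^2 + 1) * rho, since the denominator is then
    zero or negative and no cluster size can deliver the required precision.
    """
    denominator = k - n_i * (cv**2 + 1.0) * rho
    if denominator <= 0:
        return None
    return ceil(n_i * (1.0 - rho) / denominator)


print(cluster_size(385, 20, 0.005))   # 22
print(cluster_size(515, 20, 0.005))   # 30
print(cluster_size(385, 20, 0.07))    # None: not achievable with 20 clusters per arm
```

The total per arm (equations 13 and 15) is then k times the returned cluster size.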
CRCT with a fixed number of clusters: feasibility check
When designing a CRCT with a fixed number of clusters,
because of the diminishing returns that set in
when the sample size of each cluster is increased, it may
not be possible to detect the required difference at pre-specified
power [2]. In a CRCT with a fixed number of
individuals per cluster, but no limit on the number of
clusters, no such limit will exist. This limit on the difference
detectable (or alternatively available power) stems
from the maximum precision available within a CRCT
with a limited number of clusters. Recall that the precision
of the estimate of the difference is:
$$\mathrm{var}^{-1}(d_C) = \frac{mk}{2\sigma^2\,VIF} = \frac{mk}{2\sigma^2 [1 + (m - 1)\rho]} \tag{17}$$
As the cluster size ($m$) becomes large, this precision
reaches a theoretical limit:
$$\lim_{m \to \infty} \mathrm{var}^{-1}(d_C) = \lim_{m \to \infty} \frac{mk}{2\sigma^2 [1 + (m - 1)\rho]} = \frac{k}{2\sigma^2 \rho} \tag{18}$$
This limit therefore provides an upper bound on the
precision of an estimate from a CRCT. If the CRCT is
to achieve the same or greater power as a corresponding
individually randomised design, it is required that:
$$\frac{k}{2\sigma^2 \rho} > \frac{n_I}{2\sigma^2} \tag{19}$$
for equal cluster sizes; and:
$$\frac{k}{2\sigma^2 \rho (cv^2 + 1)} > \frac{n_I}{2\sigma^2} \tag{20}$$
for unequal cluster sizes. A simple feasibility check, to
determine whether a fixed number of available clusters
will enable a trial to detect a required difference at
required power, therefore consists of evaluating whether
the following inequality holds:
$$k > n_I \rho \tag{21}$$
for equal cluster sizes [2], and
$$k > n_I \rho (cv^2 + 1) \tag{22}$$
for unequal cluster sizes. Here, $n_I$ is the required sample
size under individual randomisation, $k$ is the available
number of clusters, $\rho$ is the estimated intra-cluster
correlation coefficient, and $cv$ represents the coefficient
of variation of cluster sizes. When this inequality does
not hold, it will be necessary to re-evaluate the specifications
of this sample size calculation. This might consist
of a re-evaluation of the power and significance level of
the trial, or it might consist of a re-evaluation of the
detectable difference. Bounds, imposed as a result of the
limited precision, on the detectable difference and
power are derived below.
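The feasibility check of equations 21 and 22 is a one-line comparison, sketched here with hypothetical function names. The second helper, which returns the smallest whole number of clusters per arm satisfying the strict inequality, is a convenience added for this sketch rather than a formula from the paper.

```python
from math import floor


def is_feasible(n_i, k, rho, cv=0.0):
    """Equations 21 and 22: True when k clusters per arm can reach the target."""
    return k > n_i * rho * (cv**2 + 1.0)


def min_clusters_per_arm(n_i, rho, cv=0.0):
    """Smallest integer k satisfying the strict inequality k > n_i * rho * (cv^2 + 1)."""
    return floor(n_i * rho * (cv**2 + 1.0)) + 1


print(is_feasible(385, 20, 0.005))        # True:  385 * 0.005 = 1.925 < 20
print(is_feasible(385, 20, 0.07))         # False: 385 * 0.07 = 26.95 > 20
print(min_clusters_per_arm(385, 0.07))    # 27
```

Note that a design sitting only just above the threshold is feasible in principle but may need very large clusters, because the denominator of equation 14 is then close to zero.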
CRCT with a fixed number of clusters: minimum
detectable difference
For a trial with a fixed number of clusters ($k$), and
power $1 - \beta$, the theoretical Minimum Detectable Difference
(MDD) for an infinite cluster size is $d_{MDD}$, where:
$$d_{MDD} = \lim_{m \to \infty} \sqrt{\frac{2\sigma^2 [1 + (m - 1)\rho]}{mk}}\,(z_{\alpha/2} + z_\beta) = \sqrt{\frac{2\sigma^2 \rho}{k}}\,(z_{\alpha/2} + z_\beta) \tag{23}$$
which follows naturally from the formula for detectable
difference (equation 1) and the bound on precision
(equation 18). This therefore gives a bound on the
detectable difference achievable in a trial with a fixed
number of clusters.
For the case of two binary outcomes, where $\pi_1$ is fixed
(and $\pi_2 > \pi_1$), then the minimum detectable difference
for a fixed number of clusters per arm $k$ is $d_{MDD} = \pi_2 - \pi_1$
such that:
$$d_{MDD}^2 = \frac{\rho (z_{\alpha/2} + z_\beta)^2 [\pi_1(1 - \pi_1) + \pi_2(1 - \pi_2)]}{k}. \tag{24}$$
Re-arranging this as a function of $\pi_2$ is:
$$0 = a\pi_2^2 + b\pi_2 + c \tag{25}$$
where $a = -(1 + w)$, $b = 2\pi_1 + w$,
$c = w\pi_1(1 - \pi_1) - \pi_1^2$, and $w = \rho (z_{\alpha/2} + z_\beta)^2 / k$. Solving
this quadratic gives:
$$\pi_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \tag{26}$$
Each of these two solutions to this quadratic will provide
the limit on $\pi_2$ for two sided tests.
CRCT with a fixed number of clusters: maximum
achievable power
For a trial again with a fixed number of clusters ($k$), the
theoretical Maximum Achievable Power (MAP) to
detect a difference $d$ is $1 - \beta_{MAP}$ where:
$$\lim_{m \to \infty} z_{\beta_{MAP}} = \lim_{m \to \infty} \sqrt{\frac{km}{2(1 + (m - 1)\rho)}}\,\frac{d}{\sigma} - z_{\alpha/2} = \sqrt{\frac{k}{2\rho}}\,\frac{d}{\sigma} - z_{\alpha/2} \tag{27}$$
which again follows from the formula for power
(equation 2) and the bound on precision (equation 18).
So the maximum achievable power is $1 - \beta_{MAP}$ where:
$$1 - \beta_{MAP} = \Phi\left(\sqrt{\frac{k}{2\rho}}\,\frac{d}{\sigma} - z_{\alpha/2}\right). \tag{28}$$
This therefore provides an upper limit on the power
available under a design with a fixed number of clusters $k$.
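A minimal sketch of equation 28 follows. The function name is hypothetical, and the inputs (20 clusters per arm, a change in proportions from 0.40 to 0.50 with σ² approximated by equation 4) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def max_achievable_power(k, rho, d, sigma, alpha=0.05):
    """Equation 28: limiting power as the cluster size grows without bound."""
    z = NormalDist()
    return z.cdf(sqrt(k / (2.0 * rho)) * d / sigma - z.inv_cdf(1 - alpha / 2))


# Binary outcome: sigma^2 from equation 4 for proportions 0.40 and 0.50.
sigma = sqrt(0.5 * (0.40 * 0.60 + 0.50 * 0.50))
print(round(max_achievable_power(k=20, rho=0.07, d=0.10, sigma=sigma), 3))
print(round(max_achievable_power(k=20, rho=0.005, d=0.10, sigma=sigma), 3))
```

With the larger ICC the ceiling on power falls below 80%, however many individuals are recruited per cluster.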
CRCT with a fixed number of clusters: practical advice
When designing a CRCT with a fixed number of clusters,
researchers should be aware that such trials will
have a limited available power, even when it is possible
to increase the number of individuals per cluster. In
such circumstances, it will be necessary to:
(a) Determine the required number of individuals
per arm in a trial using individual randomisation
($n_I$).
(b) Determine whether a sufficient number of clusters
are available. For equal sized clusters, this will
occur when:
$$k > n_I \rho$$
where $n_I$ is the sample size required under individual
randomisation, $\rho$ is the intra-cluster correlation coefficient,
and $k$ is the number of clusters available in
each arm. For unequal sized clusters:
$$k > n_I \rho (cv^2 + 1)$$
where $cv$ is the coefficient of variation of cluster
sizes.
(c) Where the design is not feasible and cluster sizes
are unequal, determine whether the design becomes
feasible with equal cluster sizes (i.e. if $k > n_I \rho$).
(d) Where the design is still not feasible:
(i) Either: the power must be reset at a value
lower than the maximum available power (equation 28),
(ii) Or: the detectable difference must be set
greater than the minimum detectable difference
(equations 23 (continuous outcomes) and 26
(binary outcomes)),
(iii) Or: both power and detectable difference are
adjusted in combination.
(e) Once a feasible design is found, determine the
required number of individuals per cluster from
equations 14 (for equal cluster sizes) and 16 (for
varying cluster sizes).
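Steps (a) to (e) can be strung together as in the sketch below, written for a continuous outcome. The function name, the returned dictionary keys and the example inputs are all hypothetical; when the design is infeasible the sketch reports the equal-cluster-size bounds of equations 23 and 28 and leaves the choice in step (d) to the investigator.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()


def design_fixed_clusters(d, sigma, k, rho, cv=0.0, alpha=0.05, power=0.80):
    """Walk through steps (a) to (e) for k clusters per arm and a continuous outcome."""
    z_a, z_b = Z.inv_cdf(1 - alpha / 2), Z.inv_cdf(power)
    n_i = 2 * sigma**2 * (z_a + z_b) ** 2 / d**2               # (a) equation 3
    inflation = (cv**2 + 1) * rho
    if k > n_i * inflation:                                     # (b) equations 21 and 22
        m = ceil(n_i * (1 - rho) / (k - n_i * inflation))       # (e) equations 14 and 16
        return {"feasible": True, "n_individual": n_i, "cluster_size": m}
    return {
        "feasible": False,
        "n_individual": n_i,
        "feasible_if_equal_sizes": k > n_i * rho,               # (c)
        "max_power": Z.cdf(sqrt(k / (2 * rho)) * d / sigma - z_a),                 # (d)(i) equation 28
        "min_detectable_difference": sqrt(2 * sigma**2 * rho / k) * (z_a + z_b),   # (d)(ii) equation 23
    }


print(design_fixed_clusters(d=0.2, sigma=1.0, k=30, rho=0.05))
print(design_fixed_clusters(d=0.2, sigma=1.0, k=10, rho=0.05))
```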
General examples
Maximum achievable power for cluster designs with 10,
20, 30, 50 or 100 clusters per arm are presented in Fig-
ure 1 for standardised effect sizes ranging from 0.05 to
0.30 and for ICCs in the range 0 to 0.1 (which are com-
mon ICCs in the medical literature [16]). As expected,
achievable power increases with increasing numbers of
clusters and increasing effect size. For the smallest effect
size considered, 0.05, even 100 clusters per arm is not
sufficient to obtain anywhere near an acceptable power
level for ICCs above about 0.02. For less extreme effect
sizes, such as 0.2 when there are 50 or 100 clusters
available per arm, power of around 80% will be obtainable
for ICCs less than about 0.1; yet where there are
just 10 or 20 clusters available, 80% power will only be
attainable for ICCs less than about 0.06. Figure 2 shows
similar estimates of maximum achievable power for bin-
ary comparisons at baseline proportions ranging from
0.05 to 0.5 to detect increases of 0.1 (i.e. 10 percentage
points on a percentage scale).
Minimal detectable differences are also presented for
both standardised effect sizes (Figure 3) and proportions
(Figure 4) for 80% power. As expected, increasing the
number of clusters reduces the minimum detectable dif-
ference. Therefore with a large number of clusters avail-
able and sufficient numbers of individuals per cluster,
trials are possible to detect small changes in proportions
and standardised effect sizes. On the other hand, for
trials with few clusters (say 10 or 20 per arm), minimum
detectable differences become large. So, for example for
continuous outcomes, with say 10 clusters per arm and
an ICC in the region of 0.02, then the MDD is in the
region of 0.2 standardised effect sizes (Figure 3). For
binary outcomes (Figure 4) with 10 clusters per arm and
ICC in the region of 0.02 the minimum detectable dif-
ference is in the region of about a 10 percentage point
change (i.e. from about 15% to 25%).
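Limiting values of the kind plotted in Figures 1 and 3 can be tabulated directly from equations 23 and 28. The sketch below is not the code used to produce the figures; the grid of ICCs, the effect size of 0.2 and the function names are assumptions chosen to mirror the ranges described above.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
Z_ALPHA = Z.inv_cdf(0.975)   # two-sided 5% significance level


def max_power(k, rho, effect_size):
    """Equation 28 for a standardised effect size d / sigma."""
    return Z.cdf(sqrt(k / (2 * rho)) * effect_size - Z_ALPHA)


def min_effect_size(k, rho, power=0.80):
    """Equation 23 on the standardised scale (sigma = 1)."""
    return sqrt(2 * rho / k) * (Z_ALPHA + Z.inv_cdf(power))


for k in (10, 20, 30, 50, 100):
    powers = [round(max_power(k, rho, 0.2), 2) for rho in (0.01, 0.05, 0.10)]
    mdds = [round(min_effect_size(k, rho), 2) for rho in (0.01, 0.05, 0.10)]
    print(k, powers, mdds)
```

An ICC of exactly zero is excluded from the grid because the limiting precision in equation 18 is then unbounded.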
Example
In a real example, a CRCT is to be designed to evaluate
the effectiveness of lay support workers to promote
breastfeeding initiation and sustainability until 6 weeks
postpartum. Due to fears of contamination, whereby
new mothers inadvertently gain access to and support from
the lay workers, the intervention is to be randomised
over cluster units. Cluster randomisation will also
ensure that the trial is logistically simpler to run, as ran-
domisation will be carried out at a single point in time,
and midwives will have the benefit of remaining in
either the intervention or control arm for the duration
of the trial. The cluster units to be used are midwifery
teams, which are teams of midwives who visit a set
number of primary care general practices to deliver
antenatal and postnatal care. The trial is to be carried
out within a single primary care trust within the West
Midlands. The nature of this design therefore means
that the number of clusters available is fixed at the
number of midwifery teams delivering care within the
region.
At the time of designing the trial, current breastfeeding
rates, at 6 weeks postpartum, in the region were
around 40%. National targets had been set to encourage
all regions to increase rates to around 50%. It was
known that 40 clusters are available (i.e. there are 40
midwifery teams within the region), so that the number
of clusters per arm was fixed to k= 20. Estimates of
ICC range from 0.005 to 0.07 in similar trials [6,7].
Firstly, the feasibility check is implemented to determine
whether the 20 available clusters per arm are sufficient
to detect the 10 percentage point change assuming
the lower estimated ICC (0.005). Where the power is set
at 80%, the required sample size per arm to detect an
increase in percentages from 40% to 50%, under individual
randomisation, is $n_I = 385$ (Table 1). When multiplied
by the ICC this gives 385 × 0.005 = 1.925 which is
less than k = 20. This therefore means that 20 clusters
per arm will be sufficient for this design (provided an
adequate number of individuals are recruited in each
cluster). A similar design with 90% power would require
515 individuals per arm using individual randomisation.
Again, because 515 × 0.005 = 2.575 < 20, this also means
that 20 clusters per arm will be sufficient to detect an
increase from 40% to 50% with 90% power (again provided
an adequate number of individuals are recruited
in each cluster). Equation 14 shows that under the
assumption that $\rho = 0.005$, either 22 or 30 individuals
will be required per cluster (for 80% and 90% power
respectively).
Secondly, the feasibility check is evaluated to determine
whether the 20 available clusters per arm are sufficient
to detect the 10 percentage point change assuming
the higher estimated ICC (0.07). In this case
385 × 0.07 = 26.95 > 20, so the condition is not met
at the 80% power level (and therefore not at the 90%
power level either). Therefore, 20 clusters per arm is not a
sufficient number of clusters, however many individuals are
included within each cluster, to detect the required
effect size at the pre-specified power and significance.
Since this latter design is not feasible, formulae at
equation 25 allow determination of the minimum
detectable difference (or maximum achievable power
from equation 27). For a cluster trial with 80% power,
and assuming a baseline event rate of $\pi_1 = 0.40$, the
minimum detectable difference is 0.12 (to 2 d.p.). That
is, a change from 40% to 52%. To detect a change from
40% to 52% with 80% power, 189 individuals would be
required per cluster. For a trial with 90% power, the
minimum detectable difference is 0.14 (i.e. a change
from 40% to 54%). To detect a change from 40% to 54%
with 90% power, 146 individuals would be required per
cluster.
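The arithmetic of this example can be retraced with the sketch below, which chains equations 3, 4, 14, 21 and 26. It is an illustrative reconstruction rather than the authors' code: the function names are hypothetical, a two-sided significance level of 0.05 is assumed, and the sample size under individual randomisation is left unrounded when passed to equation 14. The limiting values of π2 come out near 0.516 and 0.534, broadly in line with the differences of about 0.12 and 0.14 quoted in the text.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()
ALPHA, P1, K = 0.05, 0.40, 20   # baseline rate and clusters per arm from the example


def z_sum(power):
    return Z.inv_cdf(1 - ALPHA / 2) + Z.inv_cdf(power)


def n_individual(p1, p2, power):
    variance = 0.5 * (p1 * (1 - p1) + p2 * (1 - p2))            # equation 4
    return 2 * variance * z_sum(power) ** 2 / (p2 - p1) ** 2    # equation 3


def upper_pi2(p1, k, rho, power):
    w = rho * z_sum(power) ** 2 / k
    a, b, c = -(1 + w), 2 * p1 + w, w * p1 * (1 - p1) - p1**2   # equation 25
    return (-b - sqrt(b * b - 4 * a * c)) / (2 * a)             # larger root, since a < 0


def cluster_size(n_i, k, rho):
    return ceil(n_i * (1 - rho) / (k - n_i * rho))              # equation 14


n80 = n_individual(P1, 0.50, 0.80)
print(ceil(n80), K > n80 * 0.005, K > n80 * 0.07)               # 385, feasible, not feasible
print(round(upper_pi2(P1, K, 0.07, 0.80), 3), round(upper_pi2(P1, K, 0.07, 0.90), 3))
print(cluster_size(n_individual(P1, 0.52, 0.80), K, 0.07))      # 189 per cluster
print(cluster_size(n_individual(P1, 0.54, 0.90), K, 0.07))      # 146 per cluster
```

Because the targets of 52% and 54% sit only just above the limiting values, the denominator in equation 14 is small and the required cluster size is sensitive to rounding of the individually randomised sample size.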
Discussion
In health care service evaluation cluster RCTs, pre-spe-
cifying the numbers of clusters available, are frequently
used. That is, trials are designed based on a limited
number of cluster units (e.g. GP practices) willing or
able to participate [6,7,9,10]. In contrast, sample size
methods are almost exclusively based on pre-specified
average cluster sizes, as opposed to number of clusters
available [1,4]. Whilst mapping sample size formulae
from one method to the other is straightforward, a limit
Figure 1 Maximum achievable power for various different standardised effect sizes: limiting values as the cluster size approaches
infinity.
Hemming et al.BMC Medical Research Methodology 2011, 11:102
http://www.biomedcentral.com/1471-2288/11/102
Page 7 of 11
on the precision of estimates in such designs leads to a
maximum available power (that is, a limit on the power
available irrespective of how large the clusters are) and
minimum detectable differences (that is, a limit on the
difference detectable irrespective of how large the clus-
ters are).
For example, with just 15 clusters available per arm
and an ICC of 0.05, power achievable for a trial aiming
to detect an increase in percentage change from 40% to
50% is limited to about 62%, irrespective of how large
the clusters are made. Cluster trials with just 15 clusters
available per arm are not uncommon and a 10 percen-
tage point change not an unrealistic goal in many set-
tings. However, power levels as low as 60% are clearly
sub-optimal, and might not be regarded as sufficiently
high to warrant the costs of a clinical trial. Formulae
provided here for minimum detectable differences show
that to retain a power level in the region of 80%, trial-
lists would have to be content with detecting a differ-
ence above a twelve percentage point change. Re-
formulation of the problem in terms of minimum
detectable difference can thus be used to compare the
difference which is statistically detectable (at acceptable
power levels) to that which is clinically, or managerially,
important.
Should the situation arise in which the postulated ICC
suggests that it is not possible to detect the required dif-
ference (at pre-specified power), it might be tempting to
lower the estimated ICC. Such an approach should be
strongly discouraged, since loss of power will most likely
result, potentially leading to a non-significant finding
[12]. Rather, formulae here allow sensitivity of the
design to be explored in light of possible variations in
the ICC. However, other avenues to increase available
power might reasonably be considered. For example, it
may be plausible to consider relaxing alpha and even to
set alpha and beta equivalent [17]. Or alternatively,
incorporating prior information in a Bayesian framework
may lead to increases in power. It might further be
argued that studies of limited power are of importance
as they contribute to the evidence framework by ulti-
mately becoming part of future systematic reviews [18],
Figure 2 Maximum achievable power to detect increases in 10 percentage points for various different baseline proportions (π
1
):
limiting values as the cluster size approaches infinity.
Hemming et al.BMC Medical Research Methodology 2011, 11:102
http://www.biomedcentral.com/1471-2288/11/102
Page 8 of 11
and the methods presented here thus allow for the
achievable power to be computed. Before-and-after type
studies offer a further avenue of exploration, as by their
very nature they induce smaller intra-cluster correlations.
Methodological limitations of the work presented here
include the assumption of equal sized arms; equal stan-
dard deviations; Normality assumptions (which might
not be tenable for small numbers of clusters as well as
small numbers of individuals); and lack of continuity
correction for binary variables. Furthermore, CRCTs
with a small number of clusters are controversial, primarily
because the small number of units randomised
opens results to the possibility of bias, and approximations
to Normality become questionable. However,
despite this, CRCTs with a small number of clusters are
frequently reported. The Medical Research Council, for
instance, has issued guidelines that cluster trials with
fewer than 5 clusters per arm are inadvisable [19].
Figure 3 Minimum detectable difference (effect size) at 80%
power for continuous outcomes: limiting values as the cluster
size approaches infinity.
Figure 4 Minimum detectable difference (π₂) at 80% power for various different baseline proportions (π₁): limiting values as the cluster size approaches infinity.
Others have considered some of the issues involved in
community based intervention trials with a small num-
ber of clusters, but have focused on issues of restricted
randomisation and whether the analysis should be at the
individual or cluster level [20].
Conclusions
Evaluations of health service interventions using CRCTs
are frequently designed with a limited available number
of clusters. Sample size formulae for CRCTs are almost
exclusively evaluated as a function of the average cluster
size. Where no formal limits exist on the number of
individuals enrolled within each cluster, increasing the
numbers of individuals leads to a limited increase in the
study power. This in turn means that for a trial with a
fixed number of clusters, some designs will not be feasi-
ble, and we have provided simple guidelines to evaluate
feasibility. A simple rule is that the number of clusters
(k) will be sufficient provided:
$k > n_I \times \rho$.
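This rule can be checked against the entries of Table 1 with a few lines of code. The sketch assumes the standard variance inflation factor 1 + (m − 1)ρ. Setting the per-arm total km equal to n_I × VIF and solving for m gives m = n_I(1 − ρ)/(k − n_I ρ), which is finite and positive only when k > n_I ρ. The function names are illustrative, and n_I is computed here with the usual unpooled normal-approximation formula rather than with the Stata function.

```python
from math import ceil
from statistics import NormalDist

def n_individual_binary(p1, p2, power=0.80, alpha=0.05):
    """Per-arm sample size under individual randomisation (unrounded)."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    return z_total ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2

def cluster_size_fixed_k(n_i, k, icc):
    """Cluster size m needed with k clusters per arm, or None if infeasible.

    Solves k * m = n_i * (1 + (m - 1) * icc) for m; a finite positive
    solution exists only when k > n_i * icc.
    """
    if k <= n_i * icc:
        return None  # no cluster size, however large, reaches the target power
    return ceil(n_i * (1 - icc) / (k - n_i * icc))

n_i = n_individual_binary(0.40, 0.50, power=0.80)   # about 385 per arm
print(ceil(n_i), cluster_size_fixed_k(n_i, k=20, icc=0.005))  # 385 22
print(cluster_size_fixed_k(n_i, k=20, icc=0.07))              # None
```

With 20 clusters per arm the design is feasible at an ICC of 0.005 (22 individuals per cluster) but not at an ICC of 0.07, where n_I × ρ is roughly 27 and exceeds k.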
For infeasible designs to retain acceptable levels of
power, detectable difference might not be as small as
desired, leading to the notion of a minimum detectable
difference. Useful aides-mémoire are that the detectable
difference in a CRCT is that of an individual RCT
inflated by the square root of the variance inflation fac-
tor; and the power is that under individual randomisa-
tion with the standardised effect size deflated by the
square root of the variance inflation factor. A STATA
function, clusterSampleSize.ado, allows practical imple-
mentation of all formulae discussed here and is available
from the author.
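The two rules of thumb above can be written out as a short sketch for a continuous outcome. It assumes equal arms, a common standard deviation, a two-sided test under the normal approximation and the variance inflation factor VIF = 1 + (m − 1)ρ. The function names are illustrative and are independent of clusterSampleSize.ado.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def vif(m, icc):
    """Variance inflation factor (design effect) for clusters of size m."""
    return 1 + (m - 1) * icc

def mdd_individual(n, sd=1.0, power=0.80, alpha=0.05):
    """Minimum detectable difference in means with n individuals per arm."""
    return (_N.inv_cdf(1 - alpha / 2) + _N.inv_cdf(power)) * sd * sqrt(2 / n)

def power_individual(delta, n, sd=1.0, alpha=0.05):
    """Power to detect a difference in means of delta with n per arm."""
    return _N.cdf(delta / (sd * sqrt(2 / n)) - _N.inv_cdf(1 - alpha / 2))

def mdd_cluster(k, m, icc, **kw):
    """Individual-RCT MDD with k*m per arm, inflated by sqrt(VIF)."""
    return mdd_individual(k * m, **kw) * sqrt(vif(m, icc))

def power_cluster(delta, k, m, icc, sd=1.0, alpha=0.05):
    """Individual-RCT power with the effect size deflated by sqrt(VIF)."""
    return power_individual(delta / sqrt(vif(m, icc)), k * m, sd, alpha)

# Hypothetical design: 20 clusters of 50 per arm, assumed ICC of 0.05.
d = mdd_cluster(k=20, m=50, icc=0.05)
print(round(d, 3), round(power_cluster(d, k=20, m=50, icc=0.05), 2))
```

The two functions are consistent with each other: the power evaluated at the minimum detectable difference returns the 80% used to define it. Letting m grow very large gives the limiting detectable difference for a fixed number of clusters.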
Acknowledgements
K. Hemming, R. J. Lilford and A. Girling were funded by the Engineering and
Physical Sciences Research Council of the UK through the MATCH
programme (grant GR/S29874/01) and by a National Institute of Health
Research grant for Collaborations for Leadership in Applied Health Research
and Care (CLAHRC), for the duration of this work. The views expressed in
this publication are not necessarily those of the NIHR or the Department of
Health. The authors would like to express their gratitude to Monica Taljaard
and Sandra Eldridge for review comments which helped to develop the
material.
Authors' contributions
KH, JM and AS conceived the idea. KH wrote the first and subsequent drafts.
AG and RJL helped develop the ideas. All authors read and approved the
final manuscript.
Competing interests
The authors declare that they have no competing interests.
Received: 15 February 2011 Accepted: 30 June 2011
Published: 30 June 2011
References
1. Murray DM: The Design and Analysis of Group-Randomized Trials.
London: Oxford University Press; 1998.
2. Donner A, Klar N: Design and Analysis of Cluster Randomised Trials in
Health Research. London: Arnold; 2000.
3. Campbell MK, Thomson S, Ramsay CR, MacLennan GS, Grimshaw JM:
Sample size calculator for cluster randomised trials. Computers in Biology
and Medicine 2004, 34:113-125.
4. Donner A, Birkett N, Buck C: Randomization by cluster: sample size
requirements and analysis. American Journal of Epidemiology 1981,
114:906-914.
5. Kerry SM, Bland MJ: Sample size in cluster randomisation. BMJ 1998,
316:549.
6. MacArthur C, Winter HR, Bick DE, Lilford RJ, Lancashire RJ, Knowles H,
et al: Redesigning postnatal care: a randomised controlled trial of
protocol-based midwifery-led care focused on individual women's
physical and psychological health needs. Health Technology Assessment
2003, 7:37.
7. MacArthur C, K KJ, Ingram L, Freemantle N, Dennis CL, Hamburger R, et al:
Antenatal peer support workers and breastfeeding initiation: a cluster
randomised controlled trial. BMJ 2009, 338.
8. Pourshams A, Khademi H, Malekshah AF, Islami F, Nouraei M, Sadjadi AR,
et al: Cohort profile: The Golestan Cohort Study - a prospective study of
oesophageal cancer in northern Iran. International Journal of Epidemiology
2009, 39:52-59.
9. Davies MJ, Heller S, Skinner TC, Campbell MJ, Carey ME, Cradock S, et al:
Effectiveness of the diabetes education and self management for
ongoing and newly diagnosed (DESMOND) programme for people with
newly diagnosed type 2 diabetes: cluster randomised controlled trial.
BMJ 2008, 336:491-495.
10. Avery AJ, Rodgers S, Cantrill JA, Armstrong S, Elliott R, Howard R, et al:
Protocol for the PINCER trial: a cluster randomised trial comparing the
effectiveness of a pharmacist-led IT-based intervention with simple
feedback in reducing rates of clinically important errors in medicines
management in general practices. Trials 2009, 10:28.
11. Feng Z, Diehr P, Peterson A, McLerran D: Selected issues in group
randomized trials. Annual Review of Public Health 2001, 22:167-87.
12. Guittet L, Giraudeau B, Ravaud P: A priori postulated and real power in
cluster randomised trials: mind the gap. BMC Medical Research
Methodology 2005, 5:25.
13. Brown C, Hofer T, Johal A, Thomson R, Nicholl J, Franklin BD, et al: An
epistemology of patient safety research: a framework for study design
and interpretation. Part 2. Study design. Qual Saf Health Care 2008,
17:163-169.
14. Armitage P, Berry G, Matthews JNS: Statistical methods in medical
research. London: Blackwell Publishing; 2002.
15. Kerry SM, Bland MJ: Sample size in cluster randomised trials: effect of
coefficient of variation of cluster size and cluster analysis method.
International Journal of Epidemiology 2006, 35:1292-1300.
16. Campbell MK, Fayers PM, Grimshaw JM: Determinants of the intracluster
correlation coefficient in cluster randomized trials: the case of
implementation research. Clinical Trials 2005, 2:99-107.
17. Lilford RJ, Johnson N: The alpha and beta errors in randomized trials.
New England Journal of Medicine 1990, 322:780-781.
18. Edwards SJ, Braunholtz D, Jackson J: Why underpowered trials are not
necessarily unethical. Lancet 2001, 350:804-807.
Table 1 Estimates of the Minimum Detectable Difference (MDD) for trial with 20 clusters per arm, to detect an increase in an event rate from 40%
              Power = 80%                  Power = 90%
              40% vs 50%    MDD            40% vs 50%    MDD
ICC = 0.005   n_I = 385     N/A            n_I = 515     N/A
              n_C = 440                    n_C = 600
              m = 22                       m = 30
ICC = 0.07                  MDD = 12%                    MDD = 14%
                            n_I = 267                    n_I = 262
                            n_C = 3,780                  n_C = 2,920
                            m = 189                      m = 146
19. Medical Research Council: Cluster randomised trials: methodological and
ethical considerations; 2002. [http://www.mrc.ac.uk].
20. Yudkin PL, Moher M: Putting theory into practice: a cluster randomised
trial with a small number of clusters. Statistics in Medicine 2001,
20:341-349.
Pre-publication history
The pre-publication history for this paper can be accessed here:
http://www.biomedcentral.com/1471-2288/11/102/prepub
doi:10.1186/1471-2288-11-102
Cite this article as: Hemming et al.: Sample size calculations for cluster
randomised controlled trials with a fixed number of clusters. BMC
Medical Research Methodology 2011 11:102.