All content in this area was uploaded by Peter Hegarty on Dec 19, 2024
A DETAILED ANALYSIS OF CLAIMS OF MISCATEGORIZATION BIAS IN STUDIES
OF COVID-19 VACCINE EFFECTIVENESS
Peter Hegarty
Department of Mathematical Sciences
Chalmers University of Technology and University of Gothenburg
41296 Gothenburg, Sweden
hegarty@chalmers.se
Abstract. In a recent paper, Neil, Fenton and McLachlan (NFM) claimed that a certain kind
of statistical bias, which they termed “miscategorization”, was ubiquitous in studies of Covid-19
vaccine effectiveness. Their literature search yielded 39 peer-reviewed studies and they claimed
that every single one had this bias. Our own analysis of these 39 articles revealed that only about
a quarter of them in fact had the specific bias claimed by NFM. What was much more common
was something closer to what NFM termed “exclusion bias”, since the primary focus of such
studies was typically to compare an unvaccinated cohort with those deemed “fully vaccinated”.
However, many studies also contained at least some information about the effectiveness of the
vaccines amongst “partially vaccinated” participants. Three further types of bias that NFM
identified were also found to be much less prevalent than they claimed. NFM concluded that
any claims of Covid-19 vaccine effectiveness based on these studies are likely to be a statistical
illusion. Overall, we found no basis for this claim. In studies that, in part or in whole, correct
for the biases of exclusion or miscategorization that are in fact present, substantial evidence for
vaccine effectiveness against a spectrum of Covid-19-related outcomes remains.
1. Introduction
Since the rollout of the Covid-19 vaccines there has been a steady stream of academic studies
to assess their real-world effectiveness in preventing Covid-19 outcomes of varying severity,
from asymptomatic infection all the way through to Covid-19-related death. In messaging to
the general public, vaccine effectiveness (VE) against a particular outcome is typically presented
as a single percentage $P \in [0,100]$, whose naive interpretation is that the risk of that outcome,
over a given time interval and conditioned on not having any prior immunity to the virus, for
a typical vaccinee is $1 - \frac{P}{100}$ times that for a non-vaccinee.
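Written out, and assuming the notation $R_{\mathrm{vax}}$ and $R_{\mathrm{unvax}}$ (symbols introduced here for illustration only) for the risks of the outcome in the two groups over the same time interval, the naive interpretation reads:

```latex
\[
  P \;=\; 100\left(1 - \frac{R_{\mathrm{vax}}}{R_{\mathrm{unvax}}}\right)
  \qquad\Longleftrightarrow\qquad
  \frac{R_{\mathrm{vax}}}{R_{\mathrm{unvax}}} \;=\; 1 - \frac{P}{100}.
\]
```

For instance, $P = 95$ corresponds to a risk ratio of $0.05$. If the risk among vaccinees exceeded that among non-vaccinees, the same formula would give a negative $P$, outside the interval $[0,100]$ used in public messaging.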
Among the many reasons why assessing the effectiveness of the Covid-19 vaccines is in reality
a more complicated affair, one in particular is that there is assumed to be a time lag after
administration of the vaccine before it provides significant protection. We are all now familiar
with the notion of being “fully” as against “partially” vaccinated. The vast majority of studies
of VE have as their primary goal to compare outcomes in “fully vaccinated” versus unvaccinated
cohorts. In such studies, the starting point for this comparison is the choice of a time period
t, such that an individual is considered “fully vaccinated” only from time t after receipt of the
final dose in a primary vaccination schedule (usually one or two doses), or of the most recent
“booster dose” if the intent is to compare boosted individuals with other groups. The choice of
t is obviously somewhat arbitrary, but typically it is either 7 or 14 days.
Even if publicized estimates of VE are high (P close to 100) and the public are aware that
they only apply to fully vaccinated individuals, it is still important to know what the situation
is like for those “partially vaccinated”. If someone can expect an initial extended period of low
protection, it will influence their assessment of the overall benefits of vaccination versus the risk
of side effects. In an extreme case, if there were a period when VE were negative, then this
would affect one’s assessment even in the case of a completely safe vaccine.
Date: December 19, 2024.
Key words and phrases. Covid-19, Covid-19 vaccines, vaccine effectiveness, statistical bias, miscategorization
bias.
That the most widely used Covid-19 vaccines are “safe” has also been a central and consistent
part of public messaging throughout their deployment. It is also well-known that this claim has
been extensively challenged in certain quarters, in particular regarding those products employing
the novel mRNA-based technology. Since our concern in this article is only with effectiveness,
we adopt an agnostic attitude on the issue of safety and will not comment further on it. This
leaves the issue of how studies of VE deal with partially vaccinated individuals.
In a recent preprint, Neil, Fenton and McLachlan (NFM) [10] alleged that studies of Covid-19
vaccine effectiveness ubiquitously miscategorized partially vaccinated individuals as unvaccinated.
Their full definition of what they term “miscategorization (bias)” is as follows:
Miscategorization: During the arbitrarily defined period, the vaccinated are categorized
as unvaccinated, twice vaccinated categorized as single vaccinated, or boosted categorized as
twice vaccinated.
In the case of a vaccine whose initial effectiveness is low but positive, miscategorization would
generally lower VE estimates for fully vaccinated individuals. More seriously, a negative VE for
partially vaccinated individuals would instead show up as an artificially high VE for those fully
vaccinated.
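The direction of both effects can be checked with a minimal person-day sketch. Everything in it is an assumption made for illustration: the baseline rate, the person-day totals, the VE values and the function names are hypothetical and are not taken from NFM or from any of the 39 studies.

```python
def ve_percent(rate_vaccinated: float, rate_reference: float) -> float:
    """VE in percent, computed as 100 * (1 - rate ratio)."""
    return 100.0 * (1.0 - rate_vaccinated / rate_reference)


def full_ve_with_miscategorization(ve_partial: float, ve_full: float,
                                   base_rate: float = 0.001,
                                   days_unvax: float = 1_000_000,
                                   days_partial: float = 200_000,
                                   days_full: float = 800_000) -> float:
    """Estimated VE (percent) for the fully vaccinated when the partially
    vaccinated are pooled with the unvaccinated. All defaults are hypothetical."""
    # Expected case counts in each group, given the assumed true VE values.
    cases_unvax = base_rate * days_unvax
    cases_partial = base_rate * (1 - ve_partial / 100) * days_partial
    cases_full = base_rate * (1 - ve_full / 100) * days_full
    # Miscategorization: partially vaccinated cases AND person-days are both
    # added to the unvaccinated reference group.
    rate_reference = (cases_unvax + cases_partial) / (days_unvax + days_partial)
    rate_full = cases_full / days_full
    return ve_percent(rate_full, rate_reference)


# True VE for the fully vaccinated is 90 in both scenarios.
low_positive = full_ve_with_miscategorization(ve_partial=20, ve_full=90)
negative = full_ve_with_miscategorization(ve_partial=-50, ve_full=90)
print(f"partial VE +20: estimated full VE {low_positive:.1f}")
print(f"partial VE -50: estimated full VE {negative:.1f}")
```

With these made-up inputs the estimate falls slightly below the true value of 90 in the first scenario and slightly above it in the second. The size of the shift depends on how many person-days fall in the partially vaccinated window relative to the unvaccinated group.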
NFM identified 39 studies of vaccine effectiveness from a search of Pubmed and Scopus. In-
credibly, they claimed that every single one had the miscategorization error. From the text of
[10] it seems clear that what primarily concerned them was the more serious situation above,
though they do not directly allege that any of the 39 studies has evidence for this happening.
Moreover, their text and simulations suggest they were concerned with an even more egregious
form of the bias, whereby only Covid cases amongst the partially vaccinated contributed to
the infection rate for the unvaccinated. In other words, partially vaccinated cases were counted
in the numerator of the infection rate for the unvaccinated, but partially vaccinated individuals
were counted in the denominator for the fully vaccinated. We can state categorically that we
found no evidence in any of the 39 studies of this “trick” being employed. From now on, we will
assume NFM’s allegation of ubiquitous miscategorization bias refers only to what is covered by
the definition above.
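To see why this stronger form would be so much more serious, consider a sketch in which the vaccine has no effect at all. The numbers and the function name are hypothetical, and the accounting shown is the one described above, which we did not find in any of the 39 studies.

```python
def apparent_ve_numerator_only(base_rate: float = 0.001,
                               days_unvax: float = 1_000_000,
                               days_partial: float = 200_000,
                               days_full: float = 800_000) -> float:
    """Apparent VE (percent) of a vaccine with ZERO true effect when cases in
    the partially vaccinated window are moved to the unvaccinated numerator
    while the corresponding person-days stay in the vaccinated denominator.
    All defaults are hypothetical."""
    # Same underlying rate in every group, since the vaccine does nothing.
    cases_unvax = base_rate * days_unvax
    cases_partial = base_rate * days_partial
    cases_full = base_rate * days_full
    # Numerator of the unvaccinated rate is inflated by the shifted cases.
    rate_unvax = (cases_unvax + cases_partial) / days_unvax
    # Denominator of the vaccinated rate is inflated by the partial person-days.
    rate_vax = cases_full / (days_full + days_partial)
    return 100.0 * (1.0 - rate_vax / rate_unvax)


apparent = apparent_ve_numerator_only()
print(f"apparent VE of a zero-effect vaccine: {apparent:.1f}%")
```

Under these assumed inputs a vaccine with no effect appears to have a VE of about 33%. This is the kind of "statistical illusion" NFM's simulations concern, and it requires the mismatched numerator and denominator, not merely the pooling covered by their written definition.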
In an earlier paper [3], we compared their database with that underlying another recent meta-
analysis of VE studies [8] performed under the auspices of the WHO. The latter paper, instead
of either Pubmed or Scopus, employed the COVID-19 Study Explorer from the International
Vaccine Access Center at Johns Hopkins University. Our main finding in [3] was, in fact,
the lack of overlap between the two databases. Only 2 of the 39 articles identified by NFM
could possibly have contributed¹ to the meta-analysis in [8]. But when we further investigated
these two studies, it turned out that only one of them [1] in fact had a bias which met the
above definition of Miscategorization. The other study [2], which compared fully vaccinated and
unvaccinated individuals, did not miscategorize partially vaccinated individuals as unvaccinated,
but instead effectively excluded them from the primary analysis. More precisely, this meant one
of three things:
(a) if an individual had a diagnosed Covid-19 infection while partially vaccinated, they were
excluded completely,
(b) similarly, if an individual was still only partially vaccinated when the study ended, they
were excluded completely,
(c) the comparison of fully vaccinated and unvaccinated individuals employed the standard
concept of person-days, and the days during which an ultimately fully vaccinated individual was
partially vaccinated were not counted.
¹ 18 of the 39 articles were present in the Johns Hopkins database, but [8] was only concerned with estimating
VE against the outcome “Covid-19-related death”. Thus filtered, we were left with only 2 items.
This “bias” meets the definition of another category identified by NFM and defined by them as
follows:
Exclusion:² Participants who are vaccinated but who become infected or died during the
arbitrarily defined period are neither categorized as unvaccinated or vaccinated but are instead
simply removed from analysis.
Exclusion works differently from Miscategorization. Both suppress information about partially
vaccinated individuals, but the former does not yield an incorrect estimate of VE for the
fully vaccinated. Furthermore, enough quantitative data on the excluded individuals was
provided in [2] to allow for computation of a floor³ for “corrected” VE estimates, which sought
only to compare the unvaccinated, i.e. the never vaccinated, with everyone else. It turned out that
VE estimates remained high when thus corrected. This contrasted sharply with the overall
conclusion made by NFM, that
“any claims of Covid-19 vaccine efficacy⁴ based on these (39) studies are likely to be a statistical
illusion”.
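The floor computation described in the accompanying footnote can be sketched as follows. This is one reading of that description, not the exact calculation performed in [3]: the person-days in (c) are added back to the vaccinated denominator, and every excluded individual is treated as a case of type (a) contributing no person-days. The function name and all input numbers are hypothetical.

```python
def ve_floor(cases_unvax: float, days_unvax: float,
             cases_full: float, days_full: float,
             days_partial: float, n_excluded: float) -> float:
    """Lower bound (percent) on VE for 'ever vaccinated' versus 'never
    vaccinated', under worst-case assumptions about excluded individuals."""
    rate_unvax = cases_unvax / days_unvax
    # Worst case: every excluded person counts as a case, infected at the
    # moment of vaccination, so they add cases but no person-days.
    worst_case_cases = cases_full + n_excluded
    # Person-days are counted from the day of vaccination, i.e. the partially
    # vaccinated window is added back to the denominator.
    vaccinated_days = days_full + days_partial
    rate_vax = worst_case_cases / vaccinated_days
    return 100.0 * (1.0 - rate_vax / rate_unvax)


# Hypothetical inputs, chosen only to show the mechanics.
reported = 100.0 * (1.0 - (50 / 900_000) / (1000 / 1_000_000))
floor = ve_floor(cases_unvax=1000, days_unvax=1_000_000,
                 cases_full=50, days_full=900_000,
                 days_partial=100_000, n_excluded=150)
print(f"reported VE {reported:.1f}%, floor after correction {floor:.1f}%")
```

Because the assumptions are deliberately pessimistic, the true corrected VE can only lie at or above the floor. A floor that remains high is therefore strong evidence that Exclusion is not manufacturing the reported effect.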
Note that, in addition to Miscategorization and Exclusion, NFM defined three further types
of statistical bias, termed Unverified, Uncontrolled and Undefined, and their conclusion is based
on the full extent of all 5 categories of bias. However, only 12 of the 39 papers are claimed
to possess at least one of the categories other than Miscategorization, so their conclusion is
effectively based on the claim that this particular bias is ubiquitous.
The long and the short of the above discussion is that my initial investigation in [3] left me
sceptical about the claims made by NFM, so I decided to examine in detail all 39 papers in
their bibliography. In the next section, I will describe the results of that investigation. I close
this section with the definitions of the remaining three categories of bias identified by NFM.
Unverified: Participants whose vaccination status is unknown or unverified are categorized
as unvaccinated.
Uncontrolled: Participants are allowed to self-administer or self-report their vaccination
or infection status, became unblinded or sought vaccination outside the study.
Undefined: The authors of the study fail to provide definitions for either or both vaccinated
and unvaccinated cohorts.
2. Summary of findings
A detailed discussion of each of the 39 studies identified by NFM can be found in the Appen-
dix. Numbering from 1-39 follows that in the Appendix. For full bibliographic references to all
39 studies, see the appendix in [10].
First of all, there are really only 38 studies, since Polack et al (#31) and Thomas et al (#37)
are the same paper.
Eight (8) of them (#1,2,4,16,17,23,31,35) are randomized controlled trials (RCTs). Here the
participants are chosen at the outset and followed throughout the study. Some are administered
doses of vaccine, others of placebo, with due attention paid to blinding. The initial choice of
participants is intended to deal from the outset with the problem of matching cohorts.
² NFM write “Excluded”.
³ The floor is obtained by counting all person-days in (c) from the day of vaccination, but assuming that
everyone who was excluded was of type (a) and became infected instantaneously upon vaccination.
⁴ Technically, the terms “efficacy” and “effectiveness” are not synonymous, but the literature doesn’t seem to
be too careful about the distinction. We shall use the two terms synonymously: we are always referring to the
performance of a vaccine in a real-world study.
What should immediately strike one as dubious about NFM’s assertions is that, in an RCT,
miscategorizing a partially vaccinated participant as unvaccinated doesn’t even make
sense, as it would involve placing them in the other cohort, and the cohorts are fixed at the
outset. If one doesn’t want to take account of such individuals when one computes VE, the only
thing that makes sense in an RCT is to exclude them entirely. Indeed, all 8 of the RCTs had
Exclusion bias in their primary efficacy analysis, but 4 of the 8 corrected for it in part or in full
in secondary analyses (see below). Note that the Unverified and Undefined categories of bias
don’t really make sense in an RCT setting either, but NFM don’t claim any of these 8 studies
to possess either bias.
Twenty-eight (28) of the studies, hence the clear majority, are observational. Such studies
typically involve much larger cohorts (often the whole population of a specific region), rely on
public databases and are often retrospective. Here we see a greater variety of study designs
(many employ the test-negative case-control design, for example), of approaches to matching
cohorts with different vaccination status and subsequent statistical techniques for adjusting VE
estimates. All five categories of bias defined by NFM are at least a priori meaningful in an
observational setting. In particular, one might imagine that the Unverified bias is ubiquitous,
due to incompleteness of public vaccination records. Only one study (#38) explicitly addresses
this issue and claims to be the first study of Covid-19 vaccine effectiveness which specifically
tries to correct for it. However, NFM only assign Unverified bias to 4 of the 38 studies, so they
must have had in mind something beyond this “unintentional” biasing which, in the case of a
vaccine with positive effectiveness, would generally lead to its being underestimated. In fact,
we identified only one study (#26) that seemed to involve a stronger form of Unverified bias,
which could be described as “intentional”; see the Appendix for details.
There was also one study (#33) that admitted to a kind of Miscategorization bias not covered
by NFM’s definition. In that study from New York State, a tiny proportion of individuals who
had received a vaccine not approved by the FDA were classified as unvaccinated. Otherwise, we
found that 10 of the 28 observational studies (#3,7,8,21,22,26,28,34,36,39) had Miscategorization
bias covered by NFM’s definition. Only one of them (#8) attempted to correct for the
bias in secondary analysis, but only did so in part and thereby actually introduced Exclusion
bias as well. Only in #21 was it completely clear, from the data provided, that the proportion
of miscategorized individuals was too small to be able to substantially affect VE estimates, even
in a worst-case scenario.
Three (3) of these 10 studies (#8,22,26) also had Exclusion bias. Sixteen (16) of the remaining
18 observational studies, including #33, definitely had Exclusion bias in their primary
efficacy analysis. Only #30 appeared not to have any of NFM’s biases at all. We were not
really sure about #29, but decided to assign it Exclusion bias. Together with the 8 RCTs,
this gives a total of 28 out of 36 studies so far that had Exclusion bias. We already mentioned
that 4 of the 8 RCTs corrected for this bias, in part or in full, in secondary analyses. The
same is true for 6 observational studies. Thus, 10 out of 28 studies with Exclusion bias in
the primary efficacy analysis corrected for it, at least partially, in secondary analyses: these
are #2,4,5,9,13,14,15,16,31,38. In fact, in all but three (#2,15,16), the correction was essentially
done in full. In all ten, it was clear that all previously high VE estimates remained
significantly positive after correction. Furthermore, in studies #10,11, while no correction for
Exclusion bias was performed, it was highly likely from the data provided that the proportion
of excluded individuals was too small to significantly affect the high VE estimates presented.
Including #21,30, we thus arrived at a total of 14 studies, out of 36 so far discussed, in
which evidence for high VE remained after correction for Exclusion and/or Miscategorization
bias. None of the 10 studies that actually had such bias and corrected for it supported the
assertion of NFM that claims of VE based on these studies are likely to be a “statistical illusion”.
However, even in the remaining 22 studies, there was very little direct evidence for NFM’s
assertion. Typically, they simply did not provide enough information about the miscategorized
or excluded participants for any firm conclusion to be drawn on how bias correction would
affect VE estimates. In a few cases, it was clear that the proportion of excluded individuals
affect VE estimates. In a few cases, it was clear that the proportion of excluded individuals
was large enough so that it might have a significant effect on VE, but just as often no data was
provided on excluded participants whatsoever. What seems clear is that researchers studying
the effectiveness of Covid-19 vaccines generally operated from an assumption that effectiveness
was close to zero in the early days after vaccination. Very few studies explicitly acknowledged
making this assumption, which seems to have thus had the character of an “unwritten rule”.
But we did find the occasional passage like the following, in #13:
“The period immediately after the first dose, when immunity is gradually building, was excluded
in the main analyses because the risk ratio is expected to be close to one during this
period”.
In fact, there was only one study (#35) which had evidence directly pointing to negative VE
soon after a first vaccination, but here even the primary analysis only yielded moderate VE. This
Iranian study involved SpikoGen, a recombinant protein subunit vaccine not used in Western countries. In
all, only 6 of the 36 studies discussed above did not involve mRNA vaccines: #1,4,16,17,24,35.
One study (#28) had CoronaVac as its primary focus, but did a secondary study on BNT162b2
(Pfizer mRNA). The remaining 29 studies all had mRNA vaccines as their primary focus, though
some also included other vaccines. Only one of these, a Singaporean study (#36), had any direct
evidence of negative VE, with estimates for efficacy against symptomatic Covid-19 infection
being generally negative, while strongly positive against more severe outcomes.
This leaves two of the 38 studies which are not included in the above discussion. One of these
(#18) was a meta-analysis while the other (#6) was not a VE study at all; rather, it considered
the effect of treatment of Covid-19 patients with an oral protease inhibitor called Nirmatrelvir.
We don’t think NFM should have included these two in their list in the first place.
Finally, we consider the remaining three categories of bias identified by NFM. They assigned
at least one of the three biases Unverified, Uncontrolled and Undefined to 11 of the 36 studies
discussed above. By contrast, we determined that only #26 definitely had Unverified bias (over
and above the “unintentional” biasing discussed in #38), while #33 may have had it; we were
not sure. None at all had either Uncontrolled or Undefined bias. In #26, the Unverified bias
involved a large enough proportion of individuals to be potentially significant, but it could not
be corrected for.
Here is a summary of our findings for the 39 papers on NFM’s list:
• 2 of the items are the same, thus there are 38 different items.
• 1 item is a meta-analysis and 1 is not a VE study at all. We omit both from consideration.
• NFM claim that each of the remaining 36 items has Miscategorization bias. We found
that only 10 items had such bias. Indeed, 8 items were RCTs, for which such bias
doesn’t even make sense.
• In addition, 1 item had a type of miscategorization bias not covered by NFM’s definition,
but it affected a tiny proportion of the cohort and was not significant.
• NFM claimed that only 2 items had Exclusion bias. In fact, 28 items had this bias in
their primary analysis and it was by far the most common form of statistical bias.
• Of the 28 items with Exclusion bias, 7 had secondary analyses which corrected for it
more or less in full, while 3 more corrected for it in part. In all 10 cases, corrected VE
remained significant.
• In a further 2 items with Exclusion bias and 1 with Miscategorization bias, it was clear
that the bias affected too small a proportion of the cohort to be significant.
• 1 item had none of the biases defined by NFM at all.
• NFM claimed that 4 items had Unverified bias, whereas we found only 1 item definitely
had that bias, and for 1 other item we were not sure.
• NFM claimed that 5 items had Uncontrolled bias, but we found no item with that bias.
• NFM claimed that 2 items had Undefined bias, but we found no item with that bias.
• Overall, we found that NFM correctly assigned categories of bias to only 6 out of 39
items (#3,7,28,34,36,39).
• Of the 22 items with either Miscategorization or Exclusion bias which could not be
corrected for from the data provided in the study, only 1 gave a clear suggestion of
negative VE in the period soon after a first vaccination. But VE was already modest in
this study for fully vaccinated individuals, and the vaccine involved was SpikoGen, a
non-mRNA vaccine deployed outside of Western countries. Of the 29 studies mainly focused
on mRNA vaccines, 1 had evidence for negative VE, even for fully vaccinated individuals,
against Covid-19 infection but not against more severe outcomes. Otherwise, there was
scant evidence that, in NFM’s words, “any claims of vaccine efficacy are likely to be a
statistical illusion” resulting from the biases they defined.
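The counts in this summary can be cross-checked mechanically from the study numbers quoted in this section. The sketch below is only a bookkeeping aid: the variable names are hypothetical, the meta-analysis is taken to be #18, and the set of studies with Exclusion bias is inferred from the text (all RCTs, the three Miscategorization studies that also had it, and every other observational study except #30).

```python
all_items = set(range(1, 40))            # NFM's list, items #1 to #39
duplicates = {37}                        # same paper as #31
omitted = {6, 18}                        # not a VE study; meta-analysis (assumed #18)
studies = all_items - duplicates - omitted

rcts = {1, 2, 4, 16, 17, 23, 31, 35}
observational = studies - rcts
miscategorization = {3, 7, 8, 21, 22, 26, 28, 34, 36, 39}
no_bias = {30}

# Inferred from the text, not listed explicitly there.
exclusion = rcts | {8, 22, 26} | (observational - miscategorization - no_bias)

corrected = {2, 4, 5, 9, 13, 14, 15, 16, 31, 38}   # corrected in secondary analyses
partly_corrected = {2, 15, 16}
small_share = {10, 11, 21}               # bias affects too few participants to matter
robust = corrected | small_share | no_bias

non_mrna = {1, 4, 16, 17, 24, 35}
coronavac_primary = {28}
mrna_focused = studies - non_mrna - coronavac_primary

# Each assertion mirrors one count stated in the summary.
assert len(all_items - duplicates) == 38
assert len(studies) == 36
assert len(rcts) == 8 and len(observational) == 28
assert len(miscategorization) == 10 and miscategorization <= observational
assert len(exclusion) == 28 and corrected <= exclusion
assert len(corrected & rcts) == 4 and len(corrected - rcts) == 6
assert len(corrected - partly_corrected) == 7
assert len(robust) == 14 and len(studies - robust) == 22
assert len(mrna_focused) == 29
print("all counts in the summary are mutually consistent")
```

The check confirms only that the tallies agree with one another. It says nothing about whether each individual study was assigned to the right category, which is the subject of the Appendix.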
Our overall impression is that the claims of NFM are essentially baseless, but beyond that
their work is riddled with errors and very shoddy. Just how shoddy is perhaps best exemplified
by their treatment of items #31,37. Not only did these turn out to be the exact same study,
but they claimed #31 had both Miscategorization and Exclusion bias, and that #37 had only
the former. Moreover, this is probably the best-known and most important study on the list,
since it is the phase-3 RCT conducted by Pfizer/BioNTech which led to FDA approval of its
BNT162b2 vaccine and which received widespread media publicity as having established that
this novel mRNA-vaccine “was 95% effective”. The study’s primary efficacy analysis, comparing
fully vaccinated with unvaccinated individuals, did indeed arrive at such a figure, for effective-
ness against laboratory-confirmed Covid-19 disease. It thus had Exclusion bias, but this was one
of the studies that completely corrected for this bias in secondary analyses, all nicely presented
in a single Figure (see the Appendix) and obtaining an aggregate VE for vaccinees of 82% -
somewhat less impressive than 95% for sure, but still significant, especially since the study also
included a safety analysis which found no signal for serious adverse events. Of all the studies
on NFM’s list, one could argue that this is, ironically, the one least deserving of their specific
criticisms.
None of the above excludes the possibility that the studies on NFM’s list, and studies of
Covid-19 vaccine efficacy in general, have serious shortcomings other than the statistical biases
defined by NFM. For example:
• Every study we looked at formulated VE in terms of relative rather than absolute risk
reduction.
• Only 1 of the 38 studies (#27) defines efficacy in terms of all-cause health outcomes
rather than Covid-19-related ones - specifically, it compares all-cause mortality outcomes
between cohorts of differing Covid-19 vaccination status⁵. Now it is true that many of
the other studies also conduct a safety analysis and none of those that do so reports a strong
signal for serious adverse events. Only the meta-analysis (#18) acknowledges the existence
of evidence for such events to the extent that it attempts a “risk-benefit” analysis to
determine the net worth of the vaccines. However, as time goes on, it makes more and
more sense to focus on all-cause health outcomes in order to capture potential long-term
effects. This perspective seems to be very underdeveloped in the existing literature.
• Quite a few of the studies on the list make reference to the possibility of VE waning over
time. Some are explicitly concerned with investigating this issue, and reach a variety of
conclusions, though a common thread is that waning is seen as an argument in favour
of boosting (since safety is almost universally postulated, as already mentioned above).
Many others, especially the earlier ones, have a short follow-up time and usually the
authors acknowledge that the possibility of waning is something their study was not
able to investigate.
• Many of the observational studies in particular apply sophisticated statistical techniques
to adjust their VE estimates to take account of unmatched cohorts. While our clear
overall impression was that authors were very meticulous, the complexity of the task
obviously leaves room for doubt as to whether there are significant confounding factors
systematically not being accounted for.
⁵ Study #19 also conducts an all-cause mortality (ACM) analysis, but its primary focus is on Covid-19-related outcomes.
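The distinction in the first item above, between relative and absolute risk reduction, can be made concrete with a small sketch. The function name and the counts are hypothetical and do not come from any study on the list.

```python
def risk_reductions(cases_vax: int, n_vax: int,
                    cases_unvax: int, n_unvax: int) -> tuple[float, float]:
    """Return (relative risk reduction in %, absolute risk reduction in
    percentage points) for two cohorts followed over the same period."""
    risk_vax = cases_vax / n_vax
    risk_unvax = cases_unvax / n_unvax
    relative = 100.0 * (1.0 - risk_vax / risk_unvax)   # what VE usually reports
    absolute = 100.0 * (risk_unvax - risk_vax)         # depends on baseline risk
    return relative, absolute


# Hypothetical cohorts of 20,000 each with 10 versus 200 cases.
rrr, arr = risk_reductions(cases_vax=10, n_vax=20_000,
                           cases_unvax=200, n_unvax=20_000)
print(f"relative risk reduction: {rrr:.1f}%")
print(f"absolute risk reduction: {arr:.2f} percentage points")
```

In this made-up example a relative reduction of 95% corresponds to an absolute reduction of under one percentage point, because the baseline risk over the follow-up period is only 1%. The two figures answer different questions, which is why reporting only the relative one can be regarded as a shortcoming.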
It is not our goal here to evaluate the quality of Covid-19 vaccine effectiveness studies in
general, only to evaluate the specific criticisms against such studies levelled by NFM. We have
found that their criticisms are largely without merit. I do want to finish with a few remarks
mitigating my critique, however.
Firstly, of the ten studies which we identified as indeed having Miscategorization bias, four
(#3,7,8,34) were observational studies performed in the UK which used data from English and
Scottish public health authorities. It seems to indeed be the case that the UK public health
authorities systematically had miscategorization bias in their data. Since NFM are all from
the UK themselves, they have paid particular attention to the UK data, especially in media
comments directed at the general public. To the extent that their criticism is levelled at the
UK public health authorities, it carries more weight.
Secondly, study #32 includes some (side) remarks which suggest that Miscategorization bias
might in fact be quite common in the Covid-19 vaccine literature overall, even if its authors provide
no references to compare with the list of papers considered here. See the discussion of that
paper in the Appendix below.
3. Context of the present work
It is widely accepted in the scientific mainstream that mass vaccination against Covid-19
was a tremendous success. In particular, the award of the Nobel Prize in 2023 to Karikó and
Weissman formally declared the success of the novel mRNA-based vaccine technology, with
the press release for the award stating that⁶: “the (mRNA-based) vaccines have saved millions
of lives and prevented severe disease in many more”.
It is normal for a Nobel Prize, in any scientific field, to be awarded a long time - perhaps
several decades - after the relevant breakthrough was achieved. While it is true that the scientific
advances leading to the mRNA vaccines occurred over several decades, the award specifically
refers to the successful real-world deployment of the technology itself. It should be obvious that
such an evaluation should not be made hastily, especially in the case of a medical innovation
which, as people like Bret Weinstein have pointed out, involves a large-scale intervention in a
multi-layered complex system. A lay observer would be justified, on this basis alone, in regarding
the hasty award of this prize as reckless.
In fact, persistent trends in international all-cause mortality data raise concrete, legitimate
concerns about the long term effects of the mass deployment of Covid-19 vaccines. I have myself
analyzed this data in detail [4] [5]. There is in no way a consensus on how to interpret this data,
in particular because of the difficulty of defining excess mortality, but I am also convinced that
it is at least concerning enough to warrant urgent public discussion. Such discussion remains,
however, almost non-existent and very difficult to conduct. In a recent high-profile case, a
paper in the British Medical Journal by a team of Dutch medical professionals [9] pointed out
the persistence of all-cause excess mortality in many Western countries, mentioned the Covid-19
vaccines as one of several possible factors, and recommended further investigation. It did not
investigate, or even mention, correlation between rates of excess mortality and those of vaccine
uptake. Despite such circumspection, it was upon publication immediately subjected to a
concerted media onslaught, leading to both the journal and the authors’ university conducting an
investigation of the paper’s genesis. The paper remains published, but along with an “expression
of concern”⁷.
⁶ https://www.nobelprize.org/prizes/medicine/2023/press-release
⁷ https://bmjpublichealth.bmj.com/content/2/1/e000282eoc
Suppression of open scientific debate has two complementary, equally detrimental, implications.
On the one hand, and this is obvious, any flaws in the mainstream consensus are unlikely
to be promptly recognized and corrected, potentially leading to additional harm. But equally
important, and this is less obvious, is that pushing dissident voices to the margins risks creating
a selection pressure whereby the views that gain prominence are those which most radically
dissent from the mainstream rather than those which are scientifically most robust, since the latter
are likely to be more nuanced. Ultimately, this means that even if the mainstream consensus
were eventually to shift, it risks being replaced by a different consensus which is equally rigid
and flawed.
I believe there is already considerable evidence of this unhealthy, echo-chamber atmosphere
taking root amongst scientists who are critical of the mainstream narratives on Covid-era
interventions⁸, in particular mass vaccination⁹. We also saw this, for example, in the wake of
publication of the Dutch paper mentioned above. While the work was being assailed in the
mainstream, some prominent social media accounts were promoting it as having shown that “35
million deaths were caused by the Covid-19 vaccines”, an outrageous misrepresentation of the
authors’ findings¹⁰.
While I think, as my own work details, that there is evidence in the all-cause mortality data for
the possibility of the Covid-19 vaccines already having caused net harm, I have written several
critical reviews ([5, Section 5], [6], [7]) of other widely disseminated statistical analyses of this
data. This paper, which builds on [3], is concerned with a different claim that has gained wide
acceptance amongst “Covid-19 dissidents”. It alleges that in published academic studies of the
effectiveness of the vaccines against Covid-19 itself, there is a ubiquitous, purely statistical, bias
resulting from miscategorization of some vaccinated participants as unvaccinated. Moreover, it
is alleged that evidence in these studies for vaccine effectiveness is a “statistical illusion” which
would disappear upon accurately correcting the data in them for these biases. My analysis
shows that both allegations are wildly exaggerated.
References
[1] N. Andrews et al, Duration of Protection against Mild and Severe Disease by Covid-19 Vaccines, The New
England Journal of Medicine, 386, No. 4, (2022).
[2] E. J. Haas et al, Impact and effectiveness of mRNA BNT162b2 vaccine against SARS-CoV-2 infections
and COVID-19 cases, hospitalisations, and deaths following a nationwide vaccination campaign in Israel: an
observational study using national surveillance data, The Lancet, 397, (2021), pp. 1819-29.
[3] P. Hegarty, To what extent do miscategorisation biases affect estimates of Covid-19 vaccine effectiveness ? A
preliminary analysis of competing claims, Preprint (2024).
https://www.researchgate.net/publication/384189219 To what extent do miscategorisation biases affect
estimates of Covid-19 vaccine effectiveness A preliminary analysis of competing claims
[4] P. Hegarty, Excess mortality and the effect of the Covid-19 vaccines. Part 1: European data, Preprint (2023).
https://www.preprints.org/manuscript/202309.0674/v2
[5] P. Hegarty, Excess mortality and the effect of the Covid-19 vaccines. Part 2: Global data, Preprint (2024).
https://www.researchgate.net/publication/379815723 EXCESS MORTALITY AND THE EFFECT OF THE COVID-
19 VACCINES PART 2 GLOBAL DATA
[6] P. Hegarty, Review of the paper: “Spatiotemporal variation of excess all-cause mortality in the world (125
countries) during the Covid period 2020-2023 regarding socio-economic factors and public-health and medical
interventions”, by D.G. Rancourt, J. Hickey and C. Linard, Preprint (2024).
https://www.researchgate.net/publication/387204423 Review of the paper Spatiotemporal variation of excess all-
cause mortality in the world 125 countries during the Covid period 2020-2023 regarding socio-
economic factors and public-health and medical i
8This is especially apparent on social media, where much of this activity takes place, since it remains largely
excluded from academic journals. Most of the dissident academic literature is still confined to preprint servers,
which I expect will also be the fate of this paper.
9Criticism of so-called “non-pharmaceutical interventions” has become more acceptable as time has gone on.
10For example: https://x.com/CartlandDavid/status/1831932684035068337
[7] P. Hegarty, Review of the paper: “The correlation between Australian excess deaths by State and booster
vaccinations”, by D.E. Allen, Preprint (2024).
https://www.researchgate.net/publication/387207098 Review of the paper The correlation between Australian
excess deaths by State and booster vaccinations by DE Allen
[8] M. M. I. Meslé et al, Estimated number of lives directly saved by COVID-19 vaccination programmes in the
WHO European Region from December, 2020, to March, 2023: a retrospective surveillance study, The Lancet
Respiratory Medicine, 12, No. 9, (2024), pp. 714-727.
[9] S. Mostert, M. Hoogland, M. Huibers and G. Kaspers, Excess mortality across countries in the Western World
since the COVID-19 pandemic: “Our World in Data” estimates of January 2020 to December 2022, BMJ Public
Health, 2024 2:e000282. doi:10.1136/bmjph-2023-000282
[10] M. Neil, N. Fenton and S. McLachlan, The extent and impact of vaccine status miscategorisation on covid-19
vaccine efficacy studies, Preprint (2024).
https://www.researchgate.net/publication/378831039 The extent and impact of vaccine status
miscategorisation on covid-19 vaccine efficacy studies
Appendix: Detailed discussion of all 39 papers identified by NFM
1. Al Kaabi et al
This is an RCT and is intended to study the effects of a booster dose of two different
inactivated-virus vaccines, denoted WIV04 and HB02. It appears to be a continuation of an
earlier study of the effects of an initial primary schedule of both vaccines. The original study
appears to have had 9370 participants, of which 9309 received the booster dose (of either vaccine
or placebo). What is termed the modified full analysis set (mFAS) consists of 9071 participants.
From the text in Section 3.1 and in the Supplementary Materials, the mFAS appears to ex-
clude all participants who tested positive for Covid-19 at least once, from the start of the study
to 14 days after receipt of the booster dose. Thus, this study has Exclusion bias rather than
Miscategorization bias. In order to correct for this bias, we would need to know exactly how
many of the 9309 − 9071 = 238 excluded boostees were Covid-19 negative before being boosted,
and not excluded during the 14 days post-boosting for any reason other than testing positive
(there are other exclusion criteria given in the Supplementary Materials). As far as I can see,
such a breakdown of those excluded from the mFAS is not provided anywhere, hence we cannot
properly correct for Exclusion bias in this study.
We can, however, consider a worst-case scenario, since the figures in Section 3.1 suggest that,
of the 238 excluded individuals, 77 and 95 were in the two vaccine groups and 66 in the placebo
group. In the mFAS, there were 36 and 28 Covid-19 cases in the respective vaccine groups,
and 193 in the placebo group. As far as the efficacy of the booster is concerned, the worst-case
scenario is that all excluded vaccinees were excluded only because of a positive Covid-19 test
in the first 14 days post-boosting, whereas none of the 66 in the placebo group were excluded
for this reason. This would yield a maximum of 113 and 123 post-booster cases in the two
vaccine groups, compared to a minimum of 193 in the placebo group. Since total person-years
are similar in all three groups, this would still yield positive, but very modest VE.
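As a rough illustration, the worst-case arithmetic above can be sketched as follows. This is a minimal sketch and not the study's own method: it assumes exactly equal person-years in all three arms (the text only says they are similar), and it pairs the excluded counts (77, 95) with the mFAS case counts (36, 28) in the order in which they are listed. The labels `vaccine_1` and `vaccine_2` are hypothetical names introduced for this sketch.

```python
# Worst-case booster VE for Al Kaabi et al, under the ASSUMPTION of equal
# person-years in all three arms (the paper reports them only as similar).
mfas_cases = {"vaccine_1": 36, "vaccine_2": 28, "placebo": 193}
excluded = {"vaccine_1": 77, "vaccine_2": 95, "placebo": 66}


def worst_case_ve(arm: str) -> float:
    """Count every excluded vaccinee as a post-booster case and no excluded
    placebo recipient as a case, then compare raw case counts."""
    vaccine_cases = mfas_cases[arm] + excluded[arm]
    placebo_cases = mfas_cases["placebo"]
    return 1 - vaccine_cases / placebo_cases


worst_case = {arm: worst_case_ve(arm) for arm in ("vaccine_1", "vaccine_2")}
for arm, ve in worst_case.items():
    print(f"{arm}: worst-case VE of about {ve:.1%}")
```

Under these assumptions the two worst-case figures come out at roughly 41% and 36%, i.e. still positive but far below the headline estimates.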
On the other hand, I am not sure I am interpreting these numbers for excluded individuals
correctly. If we assume the vast majority of the above 238 individuals were excluded because
of at least one positive Covid-19 diagnosis after the first dose, then it suggests no long-term
effectiveness at all, and indeed somewhat negative effectiveness, for the primary vaccination.
This is suspicious, but I can find no comment whatsoever in the paper related to this. But it
seems equally strange that there would be 193 Covid-19 cases amongst non-vaccinees from 14
days after boosting and only 66 before that, since the mean follow-up time after boosting was
less than 2 months, whereas most study participants received their booster at least 6 months
after the primary schedule. I therefore suspect I am misinterpreting these numbers, but I could
not find any further information in the paper to shed light on them.
Summary: The study does not have Miscategorization bias, but does have Exclusion bias.
We could not extract enough information from the paper to make any meaningful attempt to
correct for the bias.
2. Ali et al
This is an RCT, which investigates the effect of a 2-dose primary schedule of the mRNA-1273
(Moderna) vaccine specifically amongst adolescents. In the primary analysis, Covid cases are
only counted, in both the vaccine and placebo groups, from 14 days after receipt of the 2nd
dose. Hence the study has Exclusion bias, but not Miscategorization bias. In part because the
number of symptomatic Covid cases, in both groups, is so small, the authors are motivated to
perform several secondary analyses. They consider a less stringent definition of Covid cases, and
even asymptomatic cases. More interestingly, from the point of view of correcting for Exclusion
bias, they also compare outcomes in the vaccine and control groups from 14 days after receipt
of the first dose. The results are summarized in Figure 3. Interestingly, this partial correction
for Exclusion bias does not yield lower VE; in fact, it is somewhat higher for the less stringent
definitions of Covid cases.
As far as I can see, no information is provided in the paper about case numbers during the
first 14 days post-first dose, hence we cannot correct completely for Exclusion bias.
Summary: This paper does not have Miscategorization bias, but does have Exclusion bias.
It corrects partially, but not completely for the bias in secondary analyses. There is no evidence
for this correction yielding substantially lower VE.
3. Andrews et al
This paper was already discussed by me in [3]. It is an observational study and does have
Miscategorization bias. We could not find in the paper any information about the numbers of
miscategorized individuals, so cannot correct for the bias. The claims of NFM are thus accurate
in regard to the statistical biases in this paper.
4. Anez et al
This is an RCT, evaluating a primary 2-dose schedule of the Novavax recombinant S-protein
vaccine in adolescents. Approximately 2/3 of participants received active vaccine. The pri-
mary efficacy analysis was conducted from 7 days post-second dose, as asserted by NFM. It was
performed on the so-called per protocol efficacy population (PPEP), which excluded, amongst
others, all those with a prior Covid infection. Thus, the primary efficacy analysis has Exclusion
bias. However, the paper seems to mostly correct for it since, in Figure 3 (also in the Supplementary
Material), the total numbers of cases from receipt of the first dose are also given. There
were a total of 18 cases in the placebo group, of which 14 occurred in the PPEP, and 11 in the
vaccine group, of which 6 were in the PPEP. Thus, correcting for Exclusion bias reduces VE
but it remains significant overall. The authors are quite open about this, stating in relation to
Figure 3 that the incidence curves begin to separate only from about day 21 post-first dose.
The only reason this correction of Exclusion bias is not perfect is that the secondary
analysis is performed on a bigger set of participants, the so-called full analysis set (FAS), which
includes even those who had a Covid infection before the receipt of the first dose. These 359
individuals comprised 16% of the FAS. 234 of them received active vaccine and 125 received
placebo. This is close to the ratio in the entire FAS (1484 vs. 748), and in the PPEP (1205
vs. 594). Assuming no reinfections, we would thus obtain VE estimates fully corrected for
Exclusion bias. I couldn’t see this assumption confirmed anywhere in the text of the paper, but
given the small total number of cases it is highly likely to be true.
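To make the size of the correction concrete, here is a crude sketch using only the case counts and group sizes quoted above. It uses participants rather than person-time as denominators, so it is an assumption-laden approximation and will not reproduce the adjusted estimates in the paper itself; the function name `crude_ve` is introduced purely for illustration.

```python
def crude_ve(cases_vax: int, n_vax: int, cases_placebo: int, n_placebo: int) -> float:
    """Crude VE = 1 - (attack rate in vaccine arm) / (attack rate in placebo arm).

    ASSUMPTION: head counts stand in for person-time, which ignores
    differences in follow-up between participants.
    """
    return 1 - (cases_vax / n_vax) / (cases_placebo / n_placebo)


# Per-protocol efficacy population (cases counted from 7 days post-second dose).
ve_ppep = crude_ve(cases_vax=6, n_vax=1205, cases_placebo=14, n_placebo=594)

# Full analysis set (all cases from receipt of the first dose).
ve_fas = crude_ve(cases_vax=11, n_vax=1484, cases_placebo=18, n_placebo=748)

print(f"Crude VE, PPEP: {ve_ppep:.1%}")
print(f"Crude VE, FAS:  {ve_fas:.1%}")
```

On these crude figures VE drops from about 79% in the PPEP to about 69% in the FAS, which is consistent with the statement that the correction reduces VE without removing it.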
Summary: This paper does not have Miscategorization bias. The primary efficacy analysis
does have Exclusion bias, but secondary analysis corrects for it, perhaps completely. In the case
of this paper, it would be completely wrong to suggest that evidence for VE disappears once
the classes of bias identified by NFM are corrected for.
5. Angel et al
This observational study is a retrospective cohort study, involving workers at a health facility
in Israel. As with the Israeli population in general, BNT162b2 (Pfizer) was the only vaccine
administered. Participants were considered fully vaccinated from 7 days after receipt of the
second dose. The primary efficacy analysis compared those fully vaccinated with unvaccinated
workers who were still being followed up 28 days after the start date of December 20, 2020,
an interval meant to correspond to the recommended 21-day interval between vaccine doses.
Person-days for the primary analysis were counted from day 28 of the study for the unvaccinated
group and from day 7 after second dose in the fully vaccinated group. All those with an
earlier Covid infection were excluded. Hence, the primary analysis certainly has Exclusion bias.
However, a secondary analysis is performed comparing those “partially vaccinated” to non-
vaccinees. An individual was considered partially vaccinated from days 7-28 after the first dose.
Note that this comparison only counted person days within this 21-day window, not the entirety
of the period from day 7 after first dose. Thus, in order to compute VE for the entire period
from day 7 post-first dose one needs to aggregate the numbers in the primary and secondary
analyses, but this can be done and this already yields a partial correction for Exclusion bias.
All the necessary ingredients are in Table 2. Moreover, since the total numbers of Covid cases
are provided for both the unvaccinated (85 symptomatic and 31 asymptomatic, of which
10 and 6 respectively occurred within the first 7 days of the study) and the vaccinated (64
symptomatic and 63 asymptomatic, of which 25 and 7 respectively occurred within 7 days of
first dose), we can almost completely correct for Exclusion bias. All that’s missing is the exact
number of person days for those testing positive in this initial 7-day window. However, this
is a minor source of uncertainty. After correction for Exclusion bias, VE remains high against
symptomatic infection and at least moderate against asymptomatic infection.
NFM also claimed that this study had Uncontrolled bias. Note that the above definition of
that term leaves several options. The only thing that could possibly apply to this paper is that
“participants are allowed to self-report their infection status”. The actual weakness in the study
is that testing protocols for most fully vaccinated workers (those not deemed at “high risk of
exposure to Covid”) were relaxed in the middle of the study period, leaving the possibility of a
bias in detection of cases. Given the study environment and the fact that same-day testing was
always available to all workers, it seems a priori unlikely that there would have been a significant
detection bias, at least for symptomatic cases. The authors tried nevertheless to correct for
possible detection bias by performing a so-called propensity-score adjusted sensitivity analysis,
which basically sought to match non-vaccinees with a subset of vaccinees with similar testing
habits. Overall, there isn’t any indication in the data that this kind of bias had a significant
effect on the results.
Summary: The study does not have Miscategorization bias. The primary analysis has Ex-
clusion bias, but it can be almost fully corrected for from the data provided. After correction,
VE against symptomatic Covid infection remains high. The study does not have Uncontrolled
bias either, according to the strict definition of that term given above. However, relaxation of
testing protocols for the fully vaccinated yields the possibility of a so-called detection bias. The
data suggests, however, that this was not significant.
6. Arbel et al
This paper is concerned with the effect of treatment of Covid-19 positive patients with Nir-
matrelvir, an oral protease inhibitor, during the omicron wave of early 2022. Thus, it is not
a study of Covid-19 vaccine effectiveness at all. There are several other Covid-related papers
with Arbel as first author in the literature, so perhaps NFM listed the wrong paper.
7, 8. Bermingham et al (2023, 2023b)
The first paper is an observational study based on the entire adult population of England, a
total of slightly less than 42 million people. NFM are correct that this paper has Miscategoriza-
tion bias, with individuals being categorized as (n−1)-times dosed (unvaccinated means 0-times
dosed) up to 21 days after receipt of the n:th dose, for each n ≥ 1. I can find no information
in the paper about the number of miscategorized individuals, nor the number of Covid cases
amongst such individuals, hence no correction for the bias is possible.
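The categorization rule just described can be sketched in code as follows. This is an illustrative reading of the rule and not code from the study: in particular, I assume that a dose starts to count exactly 21 days after it is received, whereas the precise boundary convention should be checked against the paper. The function names are hypothetical.

```python
from datetime import date, timedelta

LAG_DAYS = 21  # ASSUMPTION: a dose counts from day 21 after receipt onwards


def assigned_status(dose_dates: list[date], index_date: date, lag_days: int = LAG_DAYS) -> int:
    """Number of doses credited under the lagged rule: a person stays
    (n-1)-times dosed until lag_days have passed since the n-th dose."""
    lag = timedelta(days=lag_days)
    return sum(1 for d in dose_dates if index_date - d >= lag)


def actual_status(dose_dates: list[date], index_date: date) -> int:
    """Number of doses actually received on or before index_date."""
    return sum(1 for d in dose_dates if d <= index_date)


# Someone who received a first dose 9 days before the index date:
doses = [date(2021, 3, 1)]
index = date(2021, 3, 10)
print(assigned_status(doses, index), actual_status(doses, index))
```

In this example the person has received one dose but is still counted as unvaccinated, which is precisely the Miscategorization bias at issue.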
The second paper seems to be based on exactly the same underlying dataset, but begins by
extracting a cohort of about half a million individuals. Otherwise, the methodology is the
same and thus we have the same Miscategorization bias. However, in Supplementary Figure
4 they re-calculate all VE estimates with individuals excluded from the unvaccinated category
who are within 21 days of receipt of a first dose. This still leaves us with a combination of
Miscategorization and Exclusion biases: individuals who recently received subsequent doses
remain miscategorized, while we are not provided with the number of excluded individuals.
Furthermore, the re-calculated VE numbers are only represented as dots in a figure; no precise
numbers are given. On the other hand, none of them appear markedly different from the
corresponding numbers in the primary analysis (Table 2), so there is no strong indication that
fully correcting for Miscategorization bias would have a drastic effect on VE.
NFM claim that the second paper also has Undefined bias. This is simply not true: vaccination
status categories are defined clearly in Supplementary Table 2, and are exactly the same as
in the first paper. It’s very odd that NFM would even suspect this bias to occur in one of the
two papers and not the other, given that both evidently work from the same data and adopt
the same methodology.
Summary: Both papers have Miscategorization bias. However, paper (b) does not have Undefined
bias, contrary to the claim of NFM. Paper (b) also partly corrects for the Miscategorization bias,
but a combination of Miscategorization and Exclusion bias remains.
9. Baum et al
The primary analysis in this observational study compares unvaccinated individuals with
fully vaccinated and boosted individuals. An individual is considered fully vaccinated from 14 days
after receipt of the 2nd dose of a 2-dose primary schedule and boosted from 14 days after receipt
of a third dose. Hence, the primary analysis has Exclusion rather than Miscategorization bias.
However, in the Supplementary Material (Tables S6-S12), we have more or less full correction
for this bias. VE estimates are computed, with the unvaccinated always as the reference group,
for various subgroups of the ever-vaccinated population. These subgroups partition the entirety
of the ever-vaccinated population. Moreover, since numbers of cases and person-years are pro-
vided for each subgroup, one could compute aggregate VE for the entirety of the ever-vaccinated
population. The authors don’t do this, but it is clear from the numbers in Tables S6-S12 that
such aggregate VE estimates will remain high for each of the three Covid-19 outcomes studied
(hospitalization, severe disease (defined in the paper) and death).
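The aggregation the authors omit is simple in principle: pool cases and person-years over all ever-vaccinated subgroups and compare the pooled rate with the unvaccinated rate. The sketch below shows the mechanics only. The numbers in it are made-up placeholders and are NOT taken from Tables S6-S12; the result is also a crude, unadjusted estimate, whereas the paper's own VE figures are adjusted for covariates.

```python
def aggregate_ve(subgroups: list[tuple[int, float]], reference: tuple[int, float]) -> float:
    """Crude aggregate VE for a partition of the ever-vaccinated population.

    subgroups: (cases, person_years) for each ever-vaccinated subgroup
    reference: (cases, person_years) for the unvaccinated reference group
    """
    total_cases = sum(cases for cases, _ in subgroups)
    total_py = sum(py for _, py in subgroups)
    ref_cases, ref_py = reference
    return 1 - (total_cases / total_py) / (ref_cases / ref_py)


# PLACEHOLDER values for illustration only (not from Baum et al).
vaccinated_subgroups = [
    (12, 4_000.0),    # e.g. recently dosed, not yet "fully vaccinated"
    (30, 20_000.0),   # e.g. fully vaccinated
    (8, 16_000.0),    # e.g. boosted
]
unvaccinated = (100, 10_000.0)

ve_all = aggregate_ve(vaccinated_subgroups, unvaccinated)
print(f"Crude aggregate VE (placeholder data): {ve_all:.1%}")
```

Because every ever-vaccinated person-year enters the pooled denominator, an estimate of this kind has neither Exclusion nor Miscategorization bias by construction.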
Summary: This paper does not have Miscategorization bias. The primary analysis does have
Exclusion bias, but secondary analysis corrects for it, essentially in full. Hence, this is another
paper where NFM are completely wrong to suggest that evidence for VE disappears once the
biases they identify are corrected for.
10. Buchan et al
This is a test-negative case-control observational study performed on the adult population of
Ontario, Canada. The reference group of tests from unvaccinated individuals (since only tests performed in a 20-
day period from 6-26 December 2021 were included in the study, I presume very few individuals
contributed multiple tests) was compared with those from individuals who had received either
2 or 3 Covid-19 vaccine doses. All tests from individuals who had received just 1 dose, or had
received a second dose less than 7 days before their test, were excluded. Hence this study has
Exclusion bias, but not Miscategorization bias. Tests from individuals who had received 4 doses
were also excluded, which is appropriate since no VE estimates were computed for this group.
Further, while individuals were considered “boosted” from day 7 after a 3rd dose, VE estimates
against the various Covid-19 outcomes were also computed for days 0-6 after a third dose, so
there was no exclusion bias favouring boosted individuals.
The exclusion bias for the partially vaccinated (1 dose, or less than 7 days after a second
dose) was not corrected for. However, according to eFigure in the Supplementary Material, of
an initial sample of 222,880 tests, only 3,250 were excluded because of the vaccination status of
the testee, and this includes those who had received 4 doses. So it is unlikely that correction
for the exclusion bias would drastically affect the VE estimates.
Summary: This paper does not have Miscategorization bias. It does have Exclusion bias,
but both partially vaccinated and very highly vaccinated individuals were excluded, so a priori
the bias does not necessarily favour vaccinees. Moreover, the proportion of tests excluded on
the basis of vaccination status was small (less than 2% of the total), so it is unlikely that the
exclusion bias significantly affected the VE estimates.
11. Carazo et al
This is another test-negative case-control observational study, with tests performed on health
care workers in Quebec, Canada during the period March 27 - June 4, 2022. Unlike the vast
majority of studies in this appendix, participants with a previous infection were not all ex-
cluded. In other words, this study sought not only to estimate the efficacy of the vaccines alone,
but also that of previous infection and of the vaccines and previous infection in combination.
Nevertheless, despite the fact that the study period was well after the emergence of BA.1, and
in a period when BA.2 was dominant, a large majority of the tests finally included were from
individuals with no previous infection (Table 2). For this group, estimates of the efficacy of the
vaccines alone were then computed.
The study has Exclusion bias instead of Miscategorization bias. NFM suggest that tests on
individuals within 14 days after receiving their most recent dose were excluded. This is not
quite right either: it is the correct period for exclusion of 1-dosed individuals, but for 2- and
3-dosed individuals the exclusion period was only 7 days. All 4-dosed individuals were excluded,
as in the previous paper of Buchan et al, which is appropriate since no efficacy estimates were
computed for this group. A large majority of the tests included in the study were from 3-dosed
individuals.
Supplementary Figure A1 tells us that, from an initial sample of 256,636 tests, 8,746 were ex-
cluded because of the above criteria, of which 4,983 were from 4-dosed individuals. Many more
were excluded for a variety of other reasons, and the final sample consisted of 111,239 tests from
unique individuals (37,732 cases and 73,507 controls), of whom 88,384 had no previous infection
(34,552 cases and 53,832 controls). Hence, even in a worst-case scenario, those contributing to
the Exclusion bias represent only a small fraction of the total sample size. Correction for the
bias is thus unlikely to significantly alter the VE estimates, though we can’t be sure because no
further breakdown into cases and controls, or with respect to number of doses, is provided for
these excluded tests.
Summary: The study does not have Miscategorization bias, but does have Exclusion bias.
While not enough information is given to properly correct for the bias, we do know that the
fraction of excluded individuals is small and hence that the bias is unlikely to have significantly
affected the VE estimates.
12. Chung et al
Like Buchan et al, this is a test-negative case-control observational study, performed on the
adult population of Ontario. However, this study covered a much longer period, January 11 to
November 21, 2021, which is also before the period of the Buchan study. Furthermore, it was
only concerned with the efficacy of a primary vaccine schedule (2 doses), so that all tests from
people with 3 or more doses were excluded, appropriately. Otherwise, the study has the same
Exclusion bias as Buchan et al, since tests from all individuals who had received only one dose,
or a second dose less than 7 days before the test, were excluded. The most relevant difference
from the Buchan et al study is that the proportion of tests thereby excluded was in this study
significant. Of a total of about 7.9 million tests, close to 1.6 million were excluded because of
the vaccination status of the testee (Supplementary Figure 1). We are not told how many of
these had 3 doses, nor how many were cases. So we cannot correct for the Exclusion bias and
this time we must admit the possibility of the bias being significant.
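The contrast between this study and the two preceding test-negative studies can be seen by computing the fraction of the initial sample excluded on the basis of vaccination status. The sketch below uses the counts quoted in this appendix; the Chung et al figures are the rounded values given above ("about 7.9 million" and "close to 1.6 million"), so that fraction is only approximate. Taking the initial sample as the denominator is my choice for this illustration.

```python
# (tests excluded because of vaccination status, initial number of tests)
exclusions = {
    "Buchan et al": (3_250, 222_880),
    "Carazo et al": (8_746, 256_636),
    "Chung et al": (1_600_000, 7_900_000),  # rounded figures, approximate
}

# Fraction of the initial sample excluded on vaccination-status grounds.
fractions = {study: excl / total for study, (excl, total) in exclusions.items()}

for study, frac in fractions.items():
    print(f"{study}: {frac:.1%} of initial tests excluded")
```

The excluded fraction is below 2% for Buchan et al and between 3% and 4% for Carazo et al (including 4-dosed individuals in both cases), but roughly a fifth for Chung et al, which is why the possibility of a significant bias cannot be dismissed here.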
Summary: This study does not have Miscategorization bias, but it does straightforwardly
have Exclusion bias.
13. Dagan et al
This is one of the earliest real-world observational studies of the efficacy of the BNT162b2
mRNA-vaccine (Pfizer), performed in Israel from late December 2020, through late February
2021. It is concerned with the efficacy of both 1 dose and 2 doses of the vaccine. Since in Israel
at this point, 21 days was the recommended gap between doses, the study defines individuals as
1-dosed from 14-20 days after receipt of the first dose. It defines them as 2-dosed from 7 days
after receipt of the second dose. The primary analysis computes VE separately for these two
groups, with the unvaccinated as the reference group in both cases. Hence, the primary analysis
has Exclusion bias rather than Miscategorization bias. Even the description of the exclusion
bias by NFM is not quite right, since they assert that there is a 14-day exclusion period for both
VE estimates.
More importantly, in secondary analyses, the exclusion bias is completely corrected for. The
main text of the article already presents the cumulative incidence curves (Figure 3) for all
Covid-19 outcomes studied, as well as numerical VE estimates for the period 21-27 days after
the first dose (Table 2). This still excludes those less than 14 days after their first dose, but the
remaining correction is performed in Tables S3 and S4 of the Appendix which, in particular,
present VE estimates for the entire group of vaccinees: row “0-End of follow-up” in these tables.
Aggregate VE for all vaccinees goes from 42% against asymptomatic infection up to 80% against
Covid-19-related death.
Summary: This paper does not have Miscategorization bias. The primary analysis does have
Exclusion bias, but secondary analysis corrects for it in full. Once again, we have a paper where
NFM are completely wrong to suggest that evidence for VE disappears once the biases they
identify are corrected for.
14. Ferdinands et al
This observational study is a test-negative case-control study performed on patients admitted
to hospitals, emergency departments and urgent care centers in 10 US states over an 18-month
period. NFM claim that the study has both Miscategorization and Exclusion bias, and even
Unverified bias.
As with most of the papers so far analysed, it doesn’t have Miscategorization bias, but
does have Exclusion bias in the primary analysis. NFM are not quite accurate regarding the
time periods for exclusion. Individuals are considered “partially vaccinated” from 14 days after
the first dose, and “2-dose vaccinated” from 14 days after the second dose. Here, NFM give
the correct number of days. However, VE estimates are also computed for 3- and 4-dosed
individuals, and in these cases only those receiving their latest dose within 7 days of the index
date are excluded. As well as excluding, appropriately, all those who had received more than 4
doses, the primary analysis thus excludes all those within 14 days of a first dose, or within 7 days
of a 3rd or 4th dose. But secondary analyses include these testees, who are termed as having
“indeterminate vaccination status”. In Supplementary Tables S16 and S17, VE estimates are
computed separately for groups labelled “indeterminate” (0-14 days after first dose) and “3-dose
indeterminate” (0-7 days after third dose). Thus, we can almost completely correct for Exclusion
bias: all that’s missing is data on “4-dose indeterminate” individuals. The VE estimates for the
“indeterminate” group are close to zero, but these are a small fraction of the overall sample
so they do not significantly reduce the aggregate VE estimates for all vaccinees. The text of
the paper also includes phrases like “these patients were not expected to have substantial vaccine
induced protection”, thus making clear that the authors of this paper are applying the underlying
assumption of VE close to zero soon after receipt of a first dose.
As regards Unverified bias, it seems that the study does not have that either, though we are
not 100% sure. In the section Vaccination status, it states that “Patients with no record of
vaccination before the index contact date were considered unvaccinated”, and then lower down
that “Vaccination status was ascertained from immunization registries, electronic health records
and insurance claims”. This seems to us to describe the normal procedure for determining vaccination
status - to classify someone as unvaccinated if there is no official record of a vaccination.
Moreover, Figures S1 and S2 make clear that individuals “with possibly invalid or incomplete
vaccination records prior to index date” were excluded from all analyses. Hence, the method
of assigning vaccination status in this study doesn’t seem to differ from all the others above.
Since no previous study was claimed to have Unverified bias, it is highly improbable that the
designation is correct in this case either. See also the discussion on “intentional” versus “unin-
tentional” Unverified bias in Section 2.
Summary: The study does not have Miscategorization bias, and almost surely does not have
Unverified bias. The primary analysis does have Exclusion bias, but secondary analysis almost
completely corrects for it. Exclusion bias only concerns a small fraction of the total sample,
and correcting for it does not significantly affect aggregate VE estimates for all vaccinees. The
claims of NFM are thus almost completely wrong for this study as well.
15. Haas et al
This observational study was already discussed by me in [3]. It has Exclusion bias instead of
Miscategorization bias. The information provided in the paper allows for partial correction of
the bias.
16. Heath et al
This is an RCT, intended to evaluate the efficacy (and safety) of two doses of the recombinant
S-protein-based vaccine NVX-CoV2373. The efficacy analysis does not have Miscategorization
bias, but instead has Exclusion bias, since the per-protocol (PP) population for efficacy con-
sisted only of those who received both doses of vaccine or placebo and had no confirmed cases
of symptomatic Covid-19 up to 7 days after the second dose. VE for this population against
symptomatic Covid-19 disease was computed as 82.7%. However, Figure 2A presents the cumu-
lative incidence curves for all those who received the first dose - the “intention-to-treat” (ITT)
population. In the vaccine group, 25 of 60 cases occurred in days 0-14 after the first dose,
whereas 24 of 186 cases in the placebo group occurred in this period. Thus, VE was effectively
zero (or worse) soon after vaccination, since the two groups were of approximately equal size.
The curves diverged soon thereafter, however, and while we can’t correct completely for exclusion
bias since we don’t know the exact number of person days contributed by the non-PP
participants, it is clear that VE for the entire ITT population will still be significant.
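A crude check of these statements, using only the case counts quoted above, can be sketched as follows. It assumes exactly equal group sizes and equal follow-up in the two arms (the text says only that the groups were of approximately equal size), so that case counts can be compared directly; it is therefore a rough illustration and not a substitute for a person-time analysis.

```python
# Case counts in the ITT population of Heath et al, as quoted above.
vaccine_total, vaccine_early = 60, 25     # "early" = days 0-14 after first dose
placebo_total, placebo_early = 186, 24


def crude_ve(vaccine_cases: int, placebo_cases: int) -> float:
    """ASSUMPTION: equal arm sizes and follow-up, so VE = 1 - ratio of case counts."""
    return 1 - vaccine_cases / placebo_cases


ve_early = crude_ve(vaccine_early, placebo_early)
ve_later = crude_ve(vaccine_total - vaccine_early, placebo_total - placebo_early)
ve_itt = crude_ve(vaccine_total, placebo_total)

print(f"Days 0-14 after first dose: {ve_early:.1%}")
print(f"After day 14:               {ve_later:.1%}")
print(f"Whole ITT population:       {ve_itt:.1%}")
```

On these assumptions VE is slightly negative in the first 14 days, about 78% thereafter, and about 68% for the ITT population as a whole, compared with 82.7% for the per-protocol population.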
NFM claim that this paper also has Uncontrolled bias. The only reason I can think of for
this suggestion is that this study is a continuation of an earlier one and includes
a blinded crossover. At the time of the crossover, participants had three choices: to become
unblinded, to remain blinded and not cross over (and thus receive no further doses of either
vaccine or placebo), or to cross over. However, the text makes clear that, for the efficacy analysis,
participants were censored as soon as they either became unblinded (for any reason), or
received their first crossover dose. Unblinded participants only remained in the study for the
safety analysis, not the efficacy analysis. Hence, I don’t think Uncontrolled bias exists at all.
Summary: The study has neither Miscategorization nor Uncontrolled bias, contrary to the claims of NFM.
It has Exclusion bias, but cumulative incidence curves are presented for the ITT population,
allowing for partial correction for this bias and certainty that VE would remain significant even
with full correction.
17. Khairullin et al
This is an RCT performed in Kazakhstan with a whole-virion vaccine called QazCovid-in.
2400 participants were in the vaccine arm and 600 in the placebo arm. The efficacy analysis
has Exclusion bias, since it only considered cases of symptomatic Covid-19 occurring 14 days
or more after administration of the first dose. Unusually, VE was not estimated separately
for those who received a second dose, though this comprised the vast majority of participants.
The data for the VE calculations are presented in Table 4. A total of 31 cases were recorded
from day 14 onwards in the vaccine arm, and 43 in the placebo arm, from which a high VE
was calculated. Oddly, however, 7 of 8 cases classified as “moderate” occurred in the vaccine
arm, as did the only recorded “serious” case. Moreover, all 4 cases that occurred in the first
14 days post-first dose were in the vaccine arm, and we are not told how many of these were
mild. Given that the follow-up period was 180 days, the total number of cases observed seems
low. Another oddity in this paper is that the simple formula for computing VE, presented in
the section Vaccine efficacy, does not seem to employ person-days - in other words, all cases
were given equal weight, rather than censoring cases at the time of detection. Neither is there
any attempt made to adjust the VE estimate for inadequate matching of cohorts. Indeed, the
section Statistical analysis only talks about the safety and immunogenicity analyses, so
the efficacy analysis seems to have been assigned the lowest importance. On the other hand,
Figure 2 does present cumulative incidence data.
Summary: The study has Exclusion rather than Miscategorization bias, and it could be par-
tially corrected for from the available data. However, the efficacy analysis in this study does
not seem to be performed very rigorously to begin with.
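As a rough illustration of the simple formula discussed above, the following Python sketch computes a crude VE as one minus the ratio of attack rates, with all cases weighted equally and no person-days. The denominators are an assumption: the sketch simply uses the numbers of participants in each arm (2400 and 600), whereas Table 4 of the paper may use different at-risk populations, so the output should not be read as a reproduction of the published estimate. The second call adds back the 4 cases from the first 14 days, all of which were in the vaccine arm, as a partial correction for the Exclusion bias.

```python
def crude_ve(cases_vax, n_vax, cases_placebo, n_placebo):
    """Crude VE = 1 - (attack rate in vaccine arm) / (attack rate in placebo arm)."""
    risk_vax = cases_vax / n_vax
    risk_placebo = cases_placebo / n_placebo
    return 1.0 - risk_vax / risk_placebo


# Assumption: denominators are the arm sizes quoted in the text above.
N_VAX, N_PLACEBO = 2400, 600

# Cases from day 14 onwards (31 vaccine, 43 placebo).
ve_from_day_14 = crude_ve(31, N_VAX, 43, N_PLACEBO)

# Partial correction: add the 4 cases from days 0-13, all in the vaccine arm.
ve_all_cases = crude_ve(31 + 4, N_VAX, 43, N_PLACEBO)

print(f"crude VE, cases from day 14: {ve_from_day_14:.1%}")
print(f"crude VE, all recorded cases: {ve_all_cases:.1%}")
```

Under these assumptions the crude estimate falls only slightly, from roughly 82% to roughly 80%, when the early cases are included.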
18. Kitano et al
This is the only meta-analysis on the list. It is also one of the few studies which acknowl-
edges risks with the Covid-19 vaccines and performs an overall risk-benefit analysis. However,
since the “risks” and “benefits” are essentially analyzed separately, then somehow aggregated,
and “benefits” are taken to be synonymous with vaccine effectiveness against various Covid-19
outcomes, we can extract VE estimates from the study. In fact, these are presented straight-
forwardly in Web Table 2, and they rely on a meta-analysis comprising both published studies
and “gray literature”. Taken at its word, the study has Exclusion bias rather than Miscatego-
rization bias, since individuals are considered “2-dosed” from 14 days after the second dose, and
considered “3-dosed” or “4-dosed” from 7 days after the respective dose (as asserted by NFM).
We have not checked whether the literature on which the figures in Web Table 2 are based
consistently applies the same exclusion criteria. In any case, there is no information given in the
study itself which would allow the exclusion bias to be corrected for.
Summary: The study has Exclusion rather than Miscategorization bias.
19. Liu et al
This is an observational, retrospective study conducted on the entire adult population of
Australia. The period of observation is January 1 to November 30, 2022 and VE is estimated
against both “Covid-19 death” and all-cause mortality. For each n ∈ {2, 3, 4}, an individual is
considered n-dosed from day 8 after receipt of the n-th dose (as asserted by NFM) and VE estimates
are presented for n-dosed individuals in Figures 1, 2 and 3. Thus the paper
has straightforward Exclusion bias (5-dosed individuals were also excluded, appropriately). In
the section Analysis the authors write “The 0-7 day interval was included in analyses but not
shown due to small numbers”. This suggests that correcting for this part of the exclusion bias would
not affect the results significantly, but no data is provided in the paper for the reader to verify
this independently. Nor is any correction made so as to include individuals who received only
a single dose of a 2-dose primary schedule, which one can reasonably expect to be the most
significant contributor to the exclusion bias.
Summary: The paper does not have Miscategorization bias but instead has Exclusion bias.
The authors make some remarks suggesting this bias is not significant, but the data which
would allow the reader to verify this independently is not provided.
20, 21. Lyngse et al (2022, 2022b)
Both are observational studies which investigate VE in a very particular setting, namely
secondary transmission within households.
The first study was performed in Denmark in the period June-October 2021, when the Delta
variant was dominant. Unvaccinated individuals are compared with those deemed “fully vacci-
nated”, the latter defined as having received the final (1st or 2nd) dose of a primary schedule
at least k days before the index date (i.e., the test-positive day of the primary case in their
household), where k ∈ {7, 14, 15} depends on the vaccine brand (as asserted by NFM). The
section Methods-Vaccines states: “Individuals that were in the period between first dose and
fully vaccinated were defined as partially vaccinated and excluded. Individuals that had received
a booster vaccination were also excluded”. Thus, this is a straightforward case of Exclusion bias,
and no data whatsoever is presented for the excluded individuals, so no attempt at correction
for the bias can be made.
The second study was instead performed in December 2021, when both the Delta and Omicron
variants were circulating widely in Denmark. The term “fully vaccinated” is defined as in the
first study. However, there are three relevant differences from the first study:
• VE was also computed for “boosted” individuals, defined as having received a booster
dose at least 7 days before the index date. Individuals who had received the booster
dose less than 7 days before the index date were included among the “fully vaccinated”.
• Partially vaccinated individuals were counted here as “unvaccinated”.
• The study was also concerned with the protective effect of previous infection. Hence,
unlike with most of the studies in this list which excluded all previously infected individ-
uals, the primary analysis in this study counted as “fully vaccinated” all unvaccinated
individuals who had been infected more than 14 days previously.
The first two items meet the definition of Miscategorization bias. We are told that there were 59
partially vaccinated individuals, out of a total of 87,677 participants (Table 1), and since there
were also thousands of secondary cases, this miscategorization will definitely not have significantly
affected VE estimates. We are not told how many boosted participants were miscategorized as
fully vaccinated, but the VE estimates are higher for those boosted anyway.
The third item doesn’t meet any of the definitions of NFM, but it isn’t a problem anyway
because, in the Supplementary Material (Tables S7, S8, S22), efficacy estimates are re-calculated
with previously infected individuals excluded.
Finally, NFM claim that the second study also has Unverified bias. We can find no evidence
of this. Indeed, both studies use the same Danish registry data to determine vaccination status,
so it doesn’t make sense for one to have this bias and not the other.
Summary: Lyngse (2022) does not have Miscategorization bias, but does have Exclusion bias
which cannot be corrected for. Lyngse (2022b) does have Miscategorization bias, but the par-
tially vaccinated participants miscategorized as unvaccinated are only a tiny proportion of the
total sample, so the effect of the bias on overall VE estimates is certainly negligible. Lyngse
(2022b) does not appear to have Unverified bias either.
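To make the scale of the miscategorization in Lyngse (2022b) concrete, the short Python sketch below computes the share of partially vaccinated individuals among all participants, using only the two figures quoted above from Table 1. It says nothing about how these individuals were distributed between cases and non-cases, which the text above does not report.

```python
# Figures quoted above from Table 1 of Lyngse (2022b).
partially_vaccinated = 59
total_participants = 87_677

# Share of participants counted as "unvaccinated" despite having had one dose.
share_miscategorized = partially_vaccinated / total_participants

print(f"miscategorized share of participants: {share_miscategorized:.3%}")
```

The share is well under one tenth of one percent of the sample.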
22. Mitchell et al
This observational study deals with patients hospitalized with Covid-19 during the first 2+
years of the pandemic in a collection of Canadian hospitals, operating under the auspices of
a surveillance program called CNISP. As far as exploring the effect of vaccines is concerned,
the paper doesn’t actually compute any VE estimates, rather it just computes the cumulative
incidence rates of two kinds of outcomes - ICU admission and Covid-related death - amongst
patients with different vaccination status. The relevant results are in Figure 4. Patients are
divided into three vaccination status categories: unvaccinated, fully vaccinated and fully vacci-
nated with an additional dose. In the section Definitions we read that “Fully vaccinated was
defined as a patient with symptom onset or specimen collection date that was 14 or more days
following receipt of a second dose of a 2-dose COVID-19 vaccine or a single dose of a 1-dose
COVID-19 vaccine. Fully vaccinated with an additional dose was defined as a fully vaccinated
patient with symptom onset or specimen collection date that was 14 or more days following re-
ceipt of at least 1 additional dose of a COVID-19 vaccine”. Then, in the caption below Figure
4, we read: “Partially vaccinated patients (defined as patients with a symptom onset or a speci-
men collection date that was ≥14 days following receipt of a first dose of a 2-dose COVID-19
vaccine or <14 days after receiving a second dose) were excluded from these analyses”. The
latter extract makes clear that the study has Exclusion bias. Though I cannot find it stated
explicitly anywhere, it also appears that individuals were classed as Unvaccinated up to 14 days
after a first dose of a 2-dose primary schedule. If so, then the study also has Miscategorization
bias. As I could not find any data on the Excluded or Miscategorized individuals, no correction
for these biases is possible.
Summary: The study does appear to have Miscategorization bias, but also has Exclusion
bias. No correction is possible for either.
23. Muñoz et al
This RCT, involving children from 6 months to 4 years of age, divided approximately 4500
participants in a 2:1 ratio to receive up to 3 doses of BNT162b2 or placebo. In the section
Efficacy we read: “Vaccine efficacy against the first occurrence of Covid-19 from at least 7
days after dose 3 to the data-cutoff date (June 17, 2022) was calculated”. This is the only VE
calculation performed and thus we have Exclusion bias. The bias is potentially significant, since
only about 1/3 of the participants had received a 3rd dose by the cutoff date. As far as I can
see, no data is given on Covid-19 incidence amongst the remaining 2/3 of participants, thus no
correction for the exclusion bias is possible.
Summary: The study does not have Miscategorization bias, but does have Exclusion bias.
Since about 2/3 of all participants are excluded, the bias is potentially large in this study, but
no correction is possible from the data provided.
24. Nadeem et al
This is an observational, test-negative case-control study performed on the elderly (60+)
population of Faisalabad province in Pakistan, and intended to evaluate the efficacy of 2 doses
of the Sinopharm (BBIBP-CorV) vaccine. Unvaccinated individuals were compared with those
“fully vaccinated”, defined as having received the 2nd dose at least 14 days before the test. The
paper states clearly that “partially vaccinated” individuals were excluded and, while it is never
stated explicitly, we assume this to mean everyone who was tested and was neither unvaccinated
nor “fully vaccinated” (since the study was performed in mid-2021, there are unlikely to have
been any boosted individuals, but nothing is said about this either). Hence, we probably have
straightforward Exclusion bias, which cannot be corrected for since no data at all is presented
on the excluded testees. Adding to the uncertainty in how the bias might affect the results is
that the pattern of VE estimates in this paper is unusual. In Table 3, we see very high VE
estimates against Covid-19 detection and death, but considerably lower ones against hospitalization
and ICU admission. Moreover, Table 1 suggests a considerably
healthier unvaccinated cohort, in itself unusual, and it is not clear whether or how the VE estimates are
adjusted to account for this. Finally, in the section Results it states that the vaccination rate
amongst testees was 40%, whereas the subsequent numbers imply that it was rather 40% who
were unvaccinated.
Summary: The study most probably has Exclusion rather than Miscategorization bias. No
correction for the bias is possible.
25. Nordström et al
This is an observational, retrospective total population cohort study. It is the only paper
on the list which NFM claim to have Miscategorization, Exclusion and Undefined bias. It
definitely has Exclusion bias, and we can understand why NFM concluded it also had the other
two, since there are parts of the text which seem to contradict one another as to whether
certain partially vaccinated individuals were included as being “unvaccinated”. A complicated
procedure for matching vaccinated and unvaccinated cohorts adds to the confusion, as does the
parallel use of three terms vaccinated/fully vaccinated/exposed.
However, we think a careful reading suggests that Exclusion bias is the only one present in
the VE calculations. In determining this, the following extract from the section Study design
and participants is the crucial one (the bold font is mine):
“From this cohort, each individual who was vaccinated with two doses, with no documented
SARS-CoV-2 infection and alive within 14 days of vaccination, was matched (1:1) to one ran-
domly sampled individual from the rest of the cohort on birth year and sex. Baseline for both
individuals in each matched pair was set to the date of the second dose of vaccine in the vac-
cinated individual. Matched individuals were excluded if they received a first dose of
vaccine, had a documented previous SARS-CoV-2 infection, or died within 14 days
of baseline, whereby a new individual was searched from the remaining total cohort.”
The last sentence seems to confirm that, for the purpose of the VE estimates, “unvaccinated”
means having received no doses at all. Judging from Figure 1, the term “unexposed” seems to
include all those who received at most 1 dose. Hence, “exposed” individuals were first matched
to “unexposed” ones, but an unexposed match was discarded if they were not unvaccinated.
As the whole extract suggests, and as the rest of the paper makes clear, the comparison is
between the unvaccinated and “fully vaccinated” individuals, defined as being at least 15 days
after a second dose (only vaccines with a 2-dose primary schedule were administered). The
presence of Exclusion bias is clear. Unusually, the second sentence in the quote above suggests
that person-days for fully vaccinated individuals (and their unvaccinated matches) were counted
from the day of receipt of a second dose, rather than from 15 days afterwards, which adds another
wrinkle to the Exclusion bias. Note that Figure 1 gives the number of vaccinated individuals ex-
cluded due to death in the 14-day post-second dose exclusion window, but doesn’t say anything
about the number of discarded “unexposed” matches. Hence, we can’t meaningfully attempt
to correct for the Exclusion bias. This study is already notable for its findings that VE wanes
significantly after several months, though this is interpreted as evidence in favour of boosting.
These findings are summarized in the section Research in context.
Summary: The study has Exclusion bias. After a careful reading, we conclude that VE calcu-
lations have neither Miscategorization nor Undefined bias, though we admit the authors could
have defined their terms more clearly. We can’t correct for the exclusion bias, and note that
the study already finds significant waning of VE after several months.
26. NSW Health
This is an observational, retrospective study comprising the entire adult population of New
South Wales. Note that it does not actually make any VE calculations, just reports numbers and
rates of Covid-19 cases over a 4-month period and amongst groups with different vaccination
statuses. The charts on pages 2, 3, 4 and 11 divide Covid-19 cases into the following categories:
None, One dose, Two doses and Unknown. These terms are defined on page 9. The group
“None” includes those who received a single vaccine dose less than 21 days before the date their
case was recorded, and the “two dose” group only includes those who received their second dose
at least 14 days before their case. Thus, we have Miscategorization bias. These definitions
should imply that the “One dose” group contains all those who received just one dose, at least
21 days before their case, plus all those who received a second dose less than 14 days before
their case. However, the definition of “One dose” on page 9 reads: “One dose of vaccine at least
21 days prior to onset date”, which could imply that those who received a second dose less than
14 days prior are completely excluded, thus yielding also Exclusion bias for the charts on pages
2, 3, 4 and 11. We are not sure. The miscategorization bias is significant, because on page 2 we
see that, of 39,017 cases classified as “None”, 10,090 were from partially vaccinated individuals.
The case-rate charts on pages 6, 7 and 8 have a different set of biases. Case-rates are only
presented here for two groups: “not vaccinated” and “2 dose recipients”. The captions below the
charts explain that: “Cases with unknown vaccination status are categorized as unvaccinated
(see methods). Cases who received a single dose of vaccine, regardless of when it was given, or
two doses but it was less than 14 days since receipt at the time of onset, are not included in the
rate analyses”. Hence, in these charts, we have Unverified and Exclusion bias. Since the latter
encompasses the same individuals who were miscategorized in the first set of charts, it is significant.
So is the Unverified bias, which applies to the group previously classed as “Unknown”,
because 21.7% of all cases were in this category.
Summary: NFM correctly assert that the study has both Miscategorization and Unverified
bias. However, at least some of the charts also have Exclusion bias. All the biases affect a
significant proportion of cases. Note that NFM assert that the “arbitrarily defined period” is 14
days, whereas it is in fact 21 days for miscategorization of one-dosed individuals as unvaccinated.
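The size of the miscategorization in the NSW Health charts can be checked directly from the two page-2 figures quoted above. The Python sketch below computes the share of “None” cases that came from partially vaccinated individuals, and the number of “None” cases that would remain if those individuals were moved out of the category. The variable names are ours, and no attempt is made to recompute case rates, since that would need denominators not quoted here.

```python
# Figures quoted above from page 2 of the NSW Health report.
cases_classified_none = 39_017
of_which_partially_vaccinated = 10_090

# Share of "None" cases that actually came from people with one recent dose.
share_miscategorized = of_which_partially_vaccinated / cases_classified_none

# "None" cases remaining if the partially vaccinated were reclassified.
cases_with_no_dose = cases_classified_none - of_which_partially_vaccinated

print(f"share of 'None' cases that were partially vaccinated: {share_miscategorized:.1%}")
print(f"'None' cases after reclassification: {cases_with_no_dose}")
```

Roughly a quarter of the cases in the “None” group therefore belong to individuals who had in fact received a dose.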
27. Pálinkás et al
This is an observational, retrospective study of the entire adult population of Hungary. It is
the only study on the list which is concerned with estimating VE against all-cause mortality
only. The estimates were based on deaths occurring in the period April 1 - August 15, 2021,
which was divided into an “epidemic period” (up to June 20) and a “non-epidemic period”
(June 21 - August 15). In Section 2.2 we read: “Partially vaccinated people who received the
vaccine during the follow-up period but did not become protected during the study period (did
not receive the booster or died before the 7th day after vaccine completion) were excluded from
the investigation”. The meaning of this sentence is not completely clear. Firstly, it is not quite
clear if individuals who survived to the end of the study, but received a second dose during its
final week, were excluded or not. Secondly, “study period” seems to refer to the epidemic or
non-epidemic period, and hence an individual could have been excluded from the analysis of
the first period but not of the second. This is despite the fact that VE estimates are presented
(Table 2) only for the entire period of April 1 - August 15. Indeed, Section 3.3 suggests that
the data for the “non-epidemic period” is used to correct the data for the epidemic period for
the healthy vaccinee effect. Leaving that effect aside, the fact that crude mortality rates in both
periods vary so drastically by vaccine type (Tables 1 and 3) indicates the presence of significant
confounding. The VE estimates in Table 2 are intended to account for this but, as the authors
admit, the degree of confounding inevitably reduces the reliability of the estimates.
The study therefore has Exclusion rather than Miscategorization bias, even if we are not
sure exactly what it entails. In addition to the presence of significant confounding, it appears
a large fraction of deaths were excluded. We are only told the number of deaths included in
each period, but by checking the weekly death totals for Hungary on Eurostat we can estimate
the number of excluded deaths. It appears that more than 10% of all deaths during the non-epidemic
period were excluded and perhaps close to 20% during the epidemic period.
Summary: The study has Exclusion rather than Miscategorization bias. The bias appears
to be very significant, given the large fraction of deaths excluded and the evidence for signifi-
cant confounding between unvaccinated and vaccinated cohorts.
28. Paternina-Caicedo et al
This is an observational, retrospective study on a cohort of approximately 800,000 people,
all aged over 40 and enrolled with Mutual Ser, a Colombian health insurer. Participants were
divided into three categories: unvaccinated, vaccinated and fully vaccinated. A person was
considered unvaccinated for the first 14 days after receipt of a first vaccine dose (both vaccines
studied had a 2-dose primary schedule), and considered vaccinated rather than fully vaccinated
for the first 14 days after receipt of a second dose. Hence, this study does have Miscategoriza-
tion bias. I could not find any data specifically on the number of miscategorized patients, so
no correction is possible. It is noteworthy, however, that the VE estimates are already low for
some vaccines and outcomes.
Summary: NFM are correct that this study has 14-day Miscategorization bias, and no data is
provided on the miscategorized individuals.
29. Petrás et al
This observational, retrospective cohort study was performed on health care workers in a
number of Prague hospitals. In Section 2.2, it states: “To determine the vaccine effectiveness
against breakthrough infections, participants with no previous SARS-CoV-2 infection were ar-
ranged into three cohorts according to their vaccination status (unvaccinated, partially, and fully
vaccinated)”. I cannot find any explicit and precise definition of these three categories anywhere
in the paper, but I infer from the entirety of the text that, most likely, an individual was consid-
ered partially vaccinated from the time of receipt of a first dose to that of a second dose (only
the Comirnaty vaccine, with a 2-dose primary schedule, was considered), and fully vaccinated
from the time of a second dose. If that is the case, then there is no Miscategorization bias.
There remains the possibility of Exclusion bias. Once again, the situation is not quite clear,
and my assessment is based on piecing together various statements in the text.
Firstly, in the Abstract, it reads: “The post-vaccination and post-infection protection were
assessed in a total of 11,443 hospital workers who were followed up for more than 14 days
either after their Comirnaty vaccination or study enrolment”. Further down, in Section 2.2, we
find: “The terms ‘post-vaccination’ and ‘post-infection’ protection referred to periods of more
than 14 days after vaccine administration and 90 days after previous SARS-CoV-2 infection,
respectively”. Next, in Section 2.3: “The analyses were conducted in a group of those having PCR
tests in the above periods, while not previously infected and being either more than 14 days after
the second vaccine dose or unvaccinated with >14-day follow-up”. Then, in Section 3: “The
analysis of vaccine effectiveness was conducted in 11,016 HWs followed up for more than 14
days irrespective of the vaccination status, with a total of 254 laboratory-confirmed infections”.
This total of 254 infections is consistent with the figures in Table 2. Finally, just above Table
2, we have: “Partial vaccination of PU[11] employees showed short-term effectiveness of 47.7%
(19.2−66.2%) regardless of the infection-related symptoms. However, the rate increased to
75.4% (0.7−93.9%) in PIs[12], with only two breakthrough infections reported within 15-30 days
after single-dose administration ...”.
Based on these and similar extracts, it seems to me that “follow-up” ended at the time of a
positive test and that anyone who tested positive within 14 days of their first or second vaccine
dose, as well as any unvaccinated person who tested positive within 14 days of enrolment was
excluded entirely from the VE analysis. This would suggest Exclusion bias and that the reason
for these exclusions is that the onset of such a case cannot be guaranteed to be after the time
of vaccination or enrolment, respectively. If so, then this would be the only paper on the list
that offers this motivation for exclusions, the usual one being, as we discussed in Section 2, a
general assumption that VE is close to zero soon after a first dose, or little changed soon after
a subsequent dose.
On the other hand, I am far from sure that I am interpreting things correctly. I can’t find
any explanation for why only 11,016 of the 11,443 workers with 14+ days of follow-up were
included in the VE analysis. Secondly, at the start of Section 3 it states that 549 employees had
a RT-PCR positive test result over the study period, and I don’t understand where the figure
of 254 tests included in the VE analysis comes from. It seems highly unlikely that a majority
of cases would have been discovered in the first 14 days post-enrolment or vaccination.
Summary: This study suffers, in my view, from a lack of clarity in defining terms, but my
best guess is that there is Exclusion rather than Miscategorization bias. Some of the numbers
in the data don’t make sense to me either, however.
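The two discrepancies noted above can be quantified with simple arithmetic on the figures quoted from the paper. The Python sketch below computes how many workers with 14+ days of follow-up are missing from the VE analysis, and what share of the RT-PCR positive results are not among the infections included in that analysis. The variable names are ours, and the sketch makes no assumption about why these participants or cases were left out.

```python
# Figures quoted above from the Abstract and Section 3 of the paper.
workers_followed_up = 11_443
workers_in_ve_analysis = 11_016
positive_tests_total = 549
infections_in_ve_analysis = 254

# Workers with 14+ days of follow-up who do not appear in the VE analysis.
workers_unaccounted = workers_followed_up - workers_in_ve_analysis

# Positive tests not included in the VE analysis, as a count and as a share.
infections_unaccounted = positive_tests_total - infections_in_ve_analysis
share_unaccounted = infections_unaccounted / positive_tests_total

print(f"workers not in VE analysis: {workers_unaccounted}")
print(f"positive tests not in VE analysis: {infections_unaccounted} ({share_unaccounted:.1%})")
```

More than half of the positive tests are thus unaccounted for in the VE analysis, which is what makes the 14-day exclusion window an implausible explanation on its own.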
30. Pilishvili et al
This observational, test-negative case control study was performed on health care workers in
a fixed number of locations across the US. VE was estimated for a primary schedule of either
the BNT162b2 (Pfizer) or mRNA-1273 (Moderna) mRNA vaccine. A testee was deemed fully
vaccinated if they got their 2nd dose at least 7 days before their test, and partially vaccinated
if they got their second dose within the previous 7 days, or only a first dose at least 14
days previously. A testee was deemed unvaccinated if they had taken no doses at all before their
test. The primary analysis compares the unvaccinated with the other two categories, and thus
has Exclusion bias. However, in Table 3, separate VE estimates are also presented for those in
days 0-9 after a first dose, and those in days 10-13. This corrects for the exclusion bias, except
for the fact that these two estimates are not broken down separately according to vaccine type
(Pfizer or Moderna).
NFM also claim that this study has Uncontrolled bias. Since it is not an RCT, the only
part of the definition of that bias which is a priori applicable would be that participants are
“allowed to self-administer or self-report their vaccination or infection status”. As regards infection
status, this makes no sense, because the tests were performed at the workers’ own health
care centers. But it doesn’t seem to apply to vaccination either since, at the end of the section
Study Design it says: “Information on Covid-19 vaccination dates and products received was
obtained from occupational health clinics, vaccine cards, state registries, or medical records”.
Thus, vaccination status was obtained in the usual manner from public health databases, as
one would expect given the nature of the cohort.
[11] previously uninfected
[12] previously infected
Summary: NFM claim this study has both Miscategorization and Uncontrolled bias. It has
neither, and doesn’t really have any of the 5 categories of bias they define, though one could
argue that some trace of Exclusion bias is present.
31. Polack et al
This important RCT was the original phase-3 trial of the BNT162b2 (Pfizer/BioNTech)
mRNA-vaccine in adults. The primary efficacy analysis compared those receiving placebo with
those fully vaccinated, defined as 7+ days after receipt of the second active dose. Hence, the
primary analysis, which yielded the widely touted VE estimate of 95% against laboratory-
confirmed Covid-19, has Exclusion bias. However, Figure 3 presents VE estimates for subsets
of the excluded participants, as well as an aggregate VE for all vaccinees. This represents a
complete correction for the exclusion bias. The aggregate VE of 82% is somewhat lower than
the widely quoted figure of 95%, but is still significant.
Summary: NFM claim this pivotal phase-3 trial of the Pfizer/BioNTech vaccine has both Mis-
categorization and Exclusion bias. The primary efficacy analysis has only Exclusion bias, but
this is completely corrected for in secondary analysis and aggregate VE against PCR-confirmed
Covid-19 infection remains substantial after correction.
32. Robles-Fontán et al
This is an observational study, performed on the entire 12+ population of Puerto Rico from
December 2020 to October 2021. The section Estimating effectiveness since time of
vaccination begins with: “We denoted an individual as fully vaccinated two weeks after the
date they received the final dose in the COVID-19 vaccine series. SARS-CoV-2 infections in
which the laboratory confirmation occurred after the first dose but before being fully vaccinated
were removed from the analysis”. This describes straightforward Exclusion bias. We are told
that a total of 112,726 Covid-19 infections were recorded during the study period, of which
88,704 were in people aged 12+. Supplementary Table S3 gives a total of 82,069 cases included
in the analysis, thus 6,635 cases (about 7.5%) were excluded. That is enough to substantially
affect the VE estimates, though not to completely render them “statistical illusions”, to use
NFM’s term. Moreover, in the section No evidence of confounding it is claimed that “we
examined vaccine effectiveness during the days right after the first dose”, and that this yielded
results consistent with an expectation of VE close to zero during these first days. However, I
could not find any description of this analysis in the paper or the supplementary materials.
We note that some interesting remarks are made in the section Evidence before this
study:
“Most studies compared incidence rates between the vaccinated group and the unvaccinated
group. A vaccinated group comprehended people who had completed a vaccination series for at
least 7 days while the unvaccinated group inclusion criteria were more variable going from people
with no vaccine dose, one dose with more than 7 days after the administration date, and less
than 7 days after vaccination series is completed.”
These remarks seem to give credence to the claim of NFM of there being widespread Mis-
categorization bias in the literature, but they provide no references to compare with the ones
on NFM’s list.
Summary: This study does not have Miscategorization bias, but instead has Exclusion bias.
Remarks suggest the authors performed some correction for this, but they do not provide de-
tailed numbers. Taking what they say at face value, it seems likely that fully corrected VE
estimates would remain positive, though perhaps moderately reduced.
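The count of excluded cases quoted above follows from a simple subtraction, which the Python sketch below reproduces from the two figures given in the text (infections in people aged 12+ and cases listed in Supplementary Table S3). The variable names are ours. The sketch only measures the size of the exclusion; it does not estimate how the VE would change, since the vaccination status of the excluded cases is not broken down here.

```python
# Figures quoted above from the paper and its Supplementary Table S3.
infections_aged_12_plus = 88_704
cases_in_analysis = 82_069

# Cases removed from the analysis, as a count and as a share of eligible cases.
cases_excluded = infections_aged_12_plus - cases_in_analysis
share_excluded = cases_excluded / infections_aged_12_plus

print(f"excluded cases: {cases_excluded} ({share_excluded:.1%})")
```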
33. Rosenberg et al
This observational study surveilled the population of New York State from May 1 to early
September, 2021. A person was defined as fully vaccinated if they received the last dose of a
primary schedule at least 14 days before May 1. The fully vaccinated thus constituted a closed
cohort. Unvaccinated meant having no record of vaccination in one of the public databases
employed in the study, up to September 23 (presumably leaving some time after the end of the
study for such records to be updated). In the text of the paper, “unvaccinated” is described
as synonymous with “neither partially nor fully vaccinated”. The term “partially vaccinated” is
never defined explicitly, as far as we can see. However, in the Supplementary Appendix, the
term “never vaccinated” is used synonymously with “unvaccinated” (page 4 onwards), and in the
table on page 7, this is further clarified to mean not having “received dose of COVID-19 vaccine,
in the registry as of September 23”. Thus, we are quite sure that what is being described here is
Exclusion rather than Miscategorization bias. We can find no data in the paper on the excluded
cases, so no correction for the bias is possible.
NFM also claim this study has Unverified bias. To adjudicate this, the following passage in
the section Study cohorts is the crucial one:
“Whereas vaccinated cohorts were directly observed in the NYSIIS[13] and CIR[14] databases,
three age-specific unvaccinated comparison cohorts were defined as the census population minus
persons partially or fully vaccinated by September 23; persons with Covid-19 and persons who
were hospitalized with Covid-19 were classified as unvaccinated if they did not have a matching
Covid-19 vaccination record according to the NYSIIS and CIR databases”.
This just seems to describe the normal procedure for determining vaccination status in ob-
servational studies, namely by checking public immunization records. In the NSW Health study
(#26 above), which clearly did have Unverified bias because it had a category of people with
“unknown” vaccination status who were assumed to be unvaccinated, the term “unknown” was
defined on page 1 as follows: “vaccination status was classified as not known because they did
not match to a record in the AIR15 of COVID-19 vaccine receipt”. Now this seems to describe
the same situation as in the previous extract, but the NSW study nevertheless had a separate
category “None” for people who they could somehow be sure had received no doses. Hence,
whether or not the present study has Unverified bias is unclear to us; it seems to depend on
exactly what information is included in the databases the authors used. Perhaps also residents
of New York State could get vaccinated out-of-state and not be registered as such in the State
databases, but this is also something we simply don’t know.
There is one further interesting remark in the Discussion section:
“an estimated 0.03% of persons vaccinated through April 30, 2021 received vaccines that had
not been authorized by the FDA, and these persons were analytically classified as unvaccinated”.
This seems to describe a kind of miscategorization bias, but one that does not fall under the
definition given by NFM. That the percentage is tiny means it is very unlikely to have a
significant effect on VE estimates.
13New York State Immunization Information System
14Citywide Immunization Registry
15Australian Immunization Registry
Summary: The paper has Exclusion rather than Miscategorization bias, though it does have a
type of miscategorization bias not covered by NFM’s definition. The latter is not significant. On
the other hand, no correction for the exclusion bias is possible from the available data. Whether
or not the paper has Unverified bias is unclear.
34. Stock et al
This observational study, involving pregnant women in Scotland, computes incidence rates of
various Covid-19 and pregnancy outcomes in groups with different vaccination status, but does
not actually deduce any VE estimates (which would, for example, require adjustment for im-
properly matched cohorts). Individuals are divided into three categories: unvaccinated, partially
vaccinated and fully vaccinated. An individual was classed as unvaccinated for the first 21 days
after receiving the first dose of a 2-dose primary schedule, and classed as partially vaccinated
for the 14 days after receipt of the final primary dose. Hence, NFM correctly ascribe Miscate-
gorization bias to this study. No data is provided specifically for the miscategorized individuals,
so no correction for the bias is possible. It is conceivable that such correction could be signifi-
cant, because vaccination rates in pregnant women were low during the time of the study and
the numbers in the paper suggest the partially vaccinated contributed a number of person-days
which was a substantial fraction of that contributed by the fully vaccinated. On the other hand,
with the paper’s definitions, the fully vaccinated had far better outcomes than the unvaccinated.
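
The classification rule just described can be written out as a minimal sketch. The function name, the use of integer day indices and the treatment of the boundary days (day 21 after dose 1 and day 14 after the final dose counting as already reclassified) are our own assumptions for illustration; they are not taken from the paper.

```python
from typing import Optional


def stock_status(day: int, dose1_day: Optional[int], dose2_day: Optional[int]) -> str:
    """Status assigned on `day` under the rule described in the text.

    Hypothetical helper: days are integers on a common calendar, and the
    boundary handling (strict '<') is an assumption, not the paper's wording.
    """
    # Never dosed, or still within the first 21 days after dose 1.
    if dose1_day is None or day < dose1_day + 21:
        return "unvaccinated"
    # No final dose yet, or still within 14 days after the final primary dose.
    if dose2_day is None or day < dose2_day + 14:
        return "partially vaccinated"
    return "fully vaccinated"


# A person 10 days after a first dose contributes person-time to the
# "unvaccinated" group, which is the miscategorization NFM describe.
print(stock_status(day=10, dose1_day=0, dose2_day=None))
```

Under these assumptions, any outcome occurring in the first 21 days after dose 1 is attributed to the unvaccinated group, and any outcome in the 14 days after the final dose to the partially vaccinated group.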
Summary: This paper has Miscategorization bias, correctly described by NFM. Data is not
provided which would allow for bias correction.
35. Tabarsi et al
This is an RCT of a non-mRNA vaccine called SpikoGen, with VE estimates based on outcomes
from 14 days after receipt of the second dose of a 2-dose schedule. Thus, we have
straightforward Exclusion bias, but no sign of Miscategorization bias. The VE estimates are
already very moderate and, while we aren’t given full information about the excluded cases and
the meaning of some numbers isn’t entirely clear, there are suggestions that VE may even have
been significantly negative for those not “fully vaccinated”. The flowchart in Figure 1 seems to
say that 369 participants in the vaccine arm were excluded due to testing PCR positive “before
14 days after 2nd dose”, while only 48 were excluded from the placebo arm for this reason. It is
not completely clear if “before 14 days after 2nd dose” refers to the period beginning at the time
of the 1st dose, but it appears so, since those excluded due to being “seropositive at baseline”
form a separate group in the flowchart. If this interpretation is correct, then since the ratio
of participants was approximately 3:1 between the two arms, this would indicate substantially
negative VE for those partially vaccinated, and indeed probably negative aggregate efficacy,
since only 247 and 119 cases, respectively, were observed in the two arms from 14 days after dose 2 to the
end of the study.
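
The arithmetic behind this remark can be sketched as follows. This is a crude back-of-the-envelope calculation, not an estimate from the paper. It assumes (i) the interpretation above, namely that the 369 and 48 exclusions are cases occurring between dose 1 and 14 days after dose 2, (ii) that 247 refers to the vaccine arm and 119 to the placebo arm, (iii) an exact 3:1 allocation, and (iv) equal follow-up per participant in both arms, so that person-time is ignored. The function name `crude_ve` is our own.

```python
def crude_ve(cases_vaccine: int, cases_placebo: int, allocation_ratio: float = 3.0) -> float:
    """Return 1 - (risk ratio), with arm sizes assumed to be allocation_ratio : 1.

    Hypothetical helper: ignores person-time, censoring and confidence intervals.
    """
    return 1.0 - (cases_vaccine / allocation_ratio) / cases_placebo


# Before 14 days after dose 2 (the excluded cases, under the interpretation above).
early = crude_ve(369, 48)
# From 14 days after dose 2 to the end of the study (the cases actually analysed).
late = crude_ve(247, 119)
# Both periods combined.
aggregate = crude_ve(369 + 247, 48 + 119)

print(f"early: {early:.2f}, late: {late:.2f}, aggregate: {aggregate:.2f}")
```

Under these assumptions the crude figures are roughly -1.56 for the early period, 0.31 for the analysed period and -0.23 in aggregate, which is the sense in which the excluded cases could turn a moderate positive estimate into a negative one.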
Further indication of negative efficacy soon after vaccination is provided in Table S5 and the
text below it. In the paper itself, in the section Efficacy outcomes, it says there were only 11
cases of “severe” Covid-19 from 14 days after dose 2, of which 5 were in the vaccine arm, and
no Covid-19 related deaths at all in either arm in this period. However, Table S5 lists among
“serious adverse events” 63 cases of Covid-19, of which 44 were in the vaccine arm. It is not
clear if a Covid-19 case needed to be “severe” in order to be classed as a “serious adverse event”.
Note that, if that were the case then, even in the placebo arm, a majority (13 out of 19) of
severe Covid-19 cases occurred in the 35-day period between dose 1 and 14 days after dose 2.
This seems unlikely, given that the median follow-up time in the placebo arm is given as 51
days from day 14 after dose 2, though not impossible since the study apparently overlapped
with at least one very large wave of Covid infections in Iran. The text below Table S5 also
discusses two deaths, both of which occurred in the vaccine arm soon after the first dose. One
is specifically said to have been Covid-19 related, and the other due to myocardial infarction,
an acknowledged serious side effect of some Covid-19 vaccines.
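
The counts used in the “13 out of 19” remark above follow from simple subtraction, sketched below. The calculation rests on the hypothesis already stated in the text, that every Covid-19 case in Table S5 was a “severe” case, and on the further assumption that Table S5 covers the whole period from dose 1 onwards. The variable names are our own.

```python
# Figures quoted in the text.
table_s5_total, table_s5_vaccine = 63, 44   # Covid-19 cases listed in Table S5
late_total, late_vaccine = 11, 5            # severe cases from 14 days after dose 2

# Placebo-arm counts obtained by subtraction.
placebo_all = table_s5_total - table_s5_vaccine   # all placebo cases in Table S5
placebo_late = late_total - late_vaccine          # placebo cases from 14 days after dose 2
placebo_early = placebo_all - placebo_late        # placebo cases before that point

# The same subtraction for the vaccine arm (not stated in the text, derived here).
vaccine_early = table_s5_vaccine - late_vaccine

print(placebo_early, placebo_all, vaccine_early)
```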
NFM also claim this study has Uncontrolled bias. We can find nothing which clearly supports
this claim. Self-administration or self-reporting don’t make sense a priori for an RCT, though in
the present case we didn’t find a clear description of the testing protocols during the follow-up
period. Neither unblinding nor extraneous vaccination seem to be an issue. The flowchart in
Figure 1 clearly states that those who became unblinded or received another approved Covid-19
vaccine were excluded from the per-protocol efficacy analysis. It also appears that the whole
trial ended rather abruptly due to a large number of participants seeking to become unblinded,
but the text makes clear in several places that participants were censored once they became un-
blinded. Of course, if many participants became unblinded during a short interval, it increases
the chances of errors creeping in due to not properly tracking these individuals.
Summary: The study has Exclusion bias but not Miscategorization bias. Difficulty in interpreting
parts of the text leaves us unsure to what extent the exclusion bias can be corrected
for. However, the biased VE estimates are already only moderate and there are indications that
proper correction would lower them further, perhaps even yielding negative aggregate VE. We
could find no clear evidence for Uncontrolled bias in the study.
36. Tan et al
This observational study from Singapore was first and foremost concerned with vaccine ef-
ficacy in cancer patients compared to the general population, and basically this involved com-
puting VE separately for a cohort of cancer patients and a matched cohort from the general
population. As far as statistical biases are concerned, in the section Study Design we read:
“Patients administered zero or 1 vaccine dose were considered as ‘unvaccinated/partially vaccinated,’
2 doses as ‘fully vaccinated,’ and 3 or 4 doses as ‘boosted’.” This already makes clear
that we have Miscategorization bias as the partially vaccinated are not separated from the un-
vaccinated. No breakdown of numbers between these two subgroups is given in the paper, so no
correction for this bias is possible. What is not completely clear is whether, as NFM also claim,
for each n ≥ 2, individuals who received their nth dose remained classified as (n−1)-times
dosed for the first 7 days. What suggests that this is the case is the following passage from the
section Statistical Analysis:
“For waning of vaccine effectiveness, fully vaccinated patients within 8 to 59 days were the
reference group in the delta phase, whereas boosted patients within 8 to 59 days were the reference
group in the omicron phase for better comparability. The IRRs for patients were grouped by
number of vaccine doses received (zero/single-dose, 2-dose, 3-dose, and 4-dose groups)”.
The IRRs in question are presented in Table 3. The above quote suggests what NFM claim,
as does eTable12 in the Supplementary materials. However, we can find nothing in the text to
resolve this issue definitively. If NFM are correct, as we suspect they are, then the study has
additional Miscategorization bias, over and above grouping 1-dosed with unvaccinated individ-
uals. It is also conceivable that person-days in this 7-day window were simply not counted,
which would be Exclusion bias. This seems unlikely, however.
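
To make the two layers of possible miscategorization concrete, here is a minimal sketch of the classification as NFM read it. It is an illustration under assumptions, not the study's actual procedure: the function name, the integer day indices and the choice that a dose starts counting on day 8 after receipt (suggested by the “8 to 59 days” wording in the quote) are all our own.

```python
from typing import Sequence


def tan_category(day: int, dose_days: Sequence[int], lag_days: int = 7) -> str:
    """Category assigned on `day`, under NFM's reading of the study (hypothetical helper).

    Dose n (n >= 2) is assumed to count only once more than `lag_days` days
    have passed since it was given; dose 1 counts from the day of receipt.
    """
    counted = 0
    for n, given in enumerate(sorted(dose_days), start=1):
        if n == 1:
            if day >= given:
                counted += 1
        elif day - given > lag_days:
            counted += 1
    # Zero and one dose share a single category, as in the quoted definition.
    if counted <= 1:
        return "unvaccinated/partially vaccinated"
    if counted == 2:
        return "fully vaccinated"
    return "boosted"


# Five days after a second dose, the person is still in the combined
# unvaccinated/partially vaccinated category under this reading.
print(tan_category(day=35, dose_days=[0, 30]))
```

If person-days in the 7-day window were instead dropped altogether, the function would return no category for those days, which corresponds to the Exclusion alternative mentioned above.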
Note that Table 2 suggests VE is negative against Covid-19 infection, though positive against
more severe outcomes. Such a situation makes the Miscategorization bias even more relevant.
Summary: The paper has Miscategorization bias since 1-dosed individuals are not separated
from unvaccinated ones in VE calculations. NFM are probably also correct that all vaccinated
individuals are miscategorized in the 7-day period after receipt of their most recent dose. No
correction for the bias is possible. This is especially problematic since IRR calculations in Ta-
ble 3 already suggest negative VE against Covid-19 infection, though not against more serious
outcomes.
37. Thomas et al
This is the same paper as Polack et al.
38. Wu et al
This observational study was concerned with VE in children and adolescents and, according
to the authors, relied on “electronic health record (EHR) data from a national network of U.S.
pediatric medical centers”. The primary analysis compares outcomes in unvaccinated participants
with all those who received at least 1 vaccine dose, but excludes infections deemed to have
onset within 28 days after the first dose. This is Exclusion rather than Miscategorization bias.
Furthermore, the bias is completely corrected for in Section 7 of the Supplementary Material,
with aggregate VE estimates presented in Supplement Table 12. Now it is true that a further
“dose-response” analysis is conducted in Section 11 of the Supplement, which computes VE for
two groups: all those at least 28 days out from a first dose, and all those at least 7 days out from
a second dose. If these were the only analyses performed, then we would indeed have Exclusion
bias, but they aren’t, so we don’t.
NFM also claim that this study has Uncontrolled bias. This seems to refer to the fact that the
study admits that vaccination records may be incomplete. It is the only study on the list which
explicitly does this. However, it presents this as being an issue for any observational study, and
claims to be the first study which specifically tries to address this bias in its statistical analysis.
Moreover, the bias would seem to favour the unvaccinated, since the issue is that vaccinations
are underreported. It thus seems unfair of NFM to specifically single out this study. Further,
as we discussed in Section 2, this issue seems to fall under the definition of Unverified rather
than Uncontrolled bias.
Summary: There is no Miscategorization bias. The primary analysis has Exclusion bias, but
this is completely corrected for in secondary analysis. Neither does the study have Uncontrolled
bias. It explicitly acknowledges a problem which potentially affects any observational study,
namely incompleteness of vaccination records. As discussed in Section 2, this should be cate-
gorized, if anything, as Unverified bias, but in that case every observational study that relied
on public vaccination records should have been assigned that bias by NFM. Moreover, since the
present study explicitly tries to correct for it, one could reasonably argue that it is the one that
least suffers from Unverified bias of all the observational studies on this list.
39. Yau et al
In the section Key outcomes we read:
“By 8 June 2022, Singapore had a very high proportion (88.6%) of residents who were fully
vaccinated – defined as having received the necessary doses of COVID-19 vaccines approved
under WHO Emergency Use Listing, as follows: (i) 2 doses of MRNA vaccines; (ii) 3 doses
of Sinovac/Sinopharm, and (iii) 2 doses of non-MRNA vaccines besides Sinovac/Sinopharm.
Patients who were partially vaccinated or did not receive any vaccination were deemed ‘not fully
vaccinated’ ”.
Thus, NFM correctly describe the Miscategorization bias in this study. Since no information
is provided specifically for partially vaccinated individuals, no correction for the bias is possible.