
Evolutionary genetics of malaria

Frontiers
Frontiers in Genetics


Many standard-textbook population-genetic results apply to a wide range of species. Sometimes, however, population-genetic models and principles need to be tailored to a particular species. This is particularly true for malaria, which next to tuberculosis and HIV/AIDS ranks among the economically most relevant infectious diseases. Importantly, malaria is not one disease: five human-pathogenic species of Plasmodium exist. P. falciparum not only causes the most severe form of human malaria, but also the majority of infections. The second most relevant species, P. vivax, is already considered a neglected disease in several endemic areas. All human-pathogenic species have distinct characteristics that are crucial not only for control and eradication efforts, but also for the population genetics of the disease. This is particularly true in the context of selection. Namely, fitness is determined by so-called fitness components, which are determined by the parasite's life history, which differs between malaria species. The presence of hypnozoites, i.e., dormant liver-stage parasites, which can cause disease relapses, is a distinct feature of P. vivax and P. ovale sp. In P. malariae, inactivated blood-stage parasites can cause a recrudescence years after the infection was clinically cured. To properly describe population-genetic processes, such as the spread of anti-malarial drug resistance, these features must be accounted for appropriately. Here, we introduce and extend a population-genetic framework for the evolutionary dynamics of malaria, which applies to all human-pathogenic malaria species. The model focuses on, but is not limited to, the spread of drug resistance. The framework elucidates how the presence of dormant liver-stage or inactivated blood-stage parasites, which act like seed banks, delays evolutionary processes. It is shown that, contrary to standard population-genetic theory, the processes of selection and recombination cannot be decoupled in malaria. 
Furthermore, we discuss the connection between haplotype frequencies, haplotype prevalence, transmission dynamics, and relapses or recrudescence in malaria.
Evolutionary genetics of malaria
Kristan Alexander Schneider 1* and Carola Janette Salas 2
1 Department of Applied Computer- and Biosciences, University of Applied Sciences Mittweida, Mittweida, Germany
2 Department of Parasitology, U.S. Naval Medical Research Unit No 6 (NAMRU-6), Lima, Peru
KEYWORDS
complexity of infection (COI), co-infection, mixed-species infection, recrudescence,
relapse, seed bank, hypnozoites, multiplicity of infection (MOI)
OPEN ACCESS
EDITED BY
Rongling Wu,
The Pennsylvania State University (PSU),
United States
REVIEWED BY
Chenqi Wang,
University of South Florida,
United States
Edith Christiane Bougouma,
Groupe de Recherche Action en Santé
(GRAS), Burkina Faso
*CORRESPONDENCE
Kristan Alexander Schneider,
kristan.schneider@hs-mittweida.de
SPECIALTY SECTION
This article was submitted to
Evolutionary and Population Genetics,
a section of the journal
Frontiers in Genetics
RECEIVED 29 August 2022
ACCEPTED 26 September 2022
PUBLISHED 03 November 2022
CITATION
Schneider KA and Salas CJ (2022),
Evolutionary genetics of malaria.
Front. Genet. 13:1030463.
doi: 10.3389/fgene.2022.1030463
COPYRIGHT
© 2022 Schneider and Salas. This is an
open-access article distributed under
the terms of the Creative Commons
Attribution License (CC BY). The use,
distribution or reproduction in other
forums is permitted, provided the
original author(s) and the copyright
owner(s) are credited and that the
original publication in this journal is
cited, in accordance with accepted
academic practice. No use, distribution
or reproduction is permitted which does
not comply with these terms.
Frontiers in Genetics frontiersin.org01
TYPE Original Research
PUBLISHED 03 November 2022
DOI 10.3389/fgene.2022.1030463
1 Introduction
After a decade of declining incidence, the number of annual
malaria infections has been rising since 2018, challenging the WHO
goal to reduce malaria incidence by at least 90% by 2030 (WHO,
2021a). This is partly attributed to the rapid emergence and
spread of anti-malarial drug resistance, an evolutionary-genetic
process whose understanding is a global health priority (WHO,
2021b).
Malaria is caused in humans and animals by Plasmodium
parasites. These unicellular, haploid eukaryotes are transmitted
by numerous species of female Anopheles mosquitoes. Both the
parasite and vector species are adapted to specific human or
animal hosts. Five species of Plasmodium are pathogenic to
humans, which can be transmitted by over 100 Anopheles
species (Nicoletti, 2020). Over 95% of the 240 million annual
infections and 620,000 deaths worldwide are attributed to P.
falciparum. Although the WHO recommended the use of RTS,S,
the first approved malaria vaccine, in children to prevent P.
falciparum infections in areas of moderate to high transmission,
the vaccine's efficacy is low and malaria control depends strongly
on reliable diagnostics and drug treatments to cure acute
infections (Greenwood et al., 2021). While the second most
relevant species, Plasmodium vivax, receives considerable
attention, the other species, P. ovale sp., P. malariae, and P.
knowlesi, are somewhat neglected, due to an outdated distinction
between harmful and harmless malaria species (Lover et al.,
2018).
The spread of deletions in the histidine-rich protein 2 and 3
(HRP2/3) genes of P. falciparum, which encode the antigens
targeted by rapid diagnostic tests (RDTs), as well as drug-resistant
P. falciparum and P. vivax haplotypes substantially challenge
successful malaria control. These evolutionary-genetic processes
are tightly linked to the pathogen's complex transmission cycle,
which, besides some species-specific differences, is
shared among all Plasmodia (Su et al., 2019; Beshir et al., 2022).
The transmission cycle starts with an infected mosquito
taking her blood meal. She inoculates parasites in the form of
sporozoites from her salivary glands into the human body. This
is followed by the exo-erythrocytic cycle, during which
sporozoites reach the liver to infect hepatocytes. In the
infected liver cells parasites mature into schizonts. The
erythrocytic cycle is initiated when the schizonts rupture and
merozoites are released into the bloodstream. Erythrocytes are
invaded by merozoites, which form ring stage trophozoites and
then mature into schizonts. Once they rupture, new merozoites
are released into the bloodstream. During this step of asexual
reproduction, some parasites differentiate into male or female
gametocytes, which do not reproduce in the human host. Once
a mosquito ingests male and female gametocytes, the
sporogonic cycle is initiated. Gametes released by male and
female gametocytes fertilize and form zygotes. Following a step
of meiosis, and hence recombination, the zygote becomes
tetraploid and develops into ookinetes, which migrate
through the midgut wall and transform into oocysts. In the
oocyst sporozoite budding occurs in the haploid state. Division
of each oocyst produces thousands of sporozoites that move
into the mosquito salivary glands, completing the transmission
cycle. Because gametocytes immediately release gametes, only
parasites exiting the same host recombine, potentially leading
to a high degree of inbreeding during the sexual reproduction of
the parasite (Ngwa et al., 2016).
Species-specific differences occur in the number of parasites
within an infection (parasitemia and gametocytemia counts), and
the duration of the various phases in the transmission cycle. The
replication of merozoites in 72-hour rather than 48-hour cycles
distinguishes P. malariae from P. falciparum, P. vivax, and P.
ovale sp. The onset of
gametocytogenesis and the longevity of gametocytes were
argued to accelerate drug-resistance evolution in P. falciparum
compared to P. vivax (Schneider and Escalante, 2013). Dormant
liver-stage parasites (hypnozoites) can result in disease relapses
weeks, months, or even years after the clearance of blood-stage
parasites and occur only in P. vivax and P. ovale sp. Currently,
primaquine (PQ) and tafenoquine (TQ) are the only approved
drugs to clear hypnozoites (Watson et al., 2021). Unfortunately,
patients with glucose-6-phosphate dehydrogenase (G6PD)
deficiency, which is widespread in many malaria-endemic
areas, cannot be treated with these drugs (Baird et al., 2018;
Dean et al., 2020). Extremely prolonged carriage of blood-stage
parasites causing recrudescences occurs in P. malariae (Collins
and Jeffery, 2007). It is commonly accepted, although the
alternative has not been completely ruled out, that the rebound of
parasitaemia in P. malariae is not caused by quiescent
pre-erythrocytic stages such as hypnozoites. Because of relapses
occurring in P. vivax and P. ovale sp., and prolonged blood-stage
parasite carriage in P. malariae, these species are resilient in
areas in which P. falciparum transmission cannot be sustained.
While all other human malaria species can, at least in theory, be
eradicated by concentrating on the human host, this is not possible
for P. knowlesi, which is characterized by zoonotic transmission. It
became the predominant species in several endemic countries in
Southeast Asia, which shifted from malaria control toward
elimination (Sutherland, 2016).
The characteristics of the transmission cycle render the
application of standard textbook population-genetic results
incorrect. In particular, it was shown that the process of
selection acting on parasites in the human hosts (including
selection for drug resistance) and recombination cannot be
separated (Schneider and Kim, 2010). Hence, population-
genetic theory and models have to be tailored to the malaria
transmission cycle. This has been done mainly for P. falciparum.
Because a clear path to eradication has been charted only for P.
falciparum, the other malaria species gain more importance due
to their resilient nature (Lover et al., 2018). This requires
further adapting population-genetic theory to the characteristics
of other human-pathogenic malaria species.
Schneider and Salas 10.3389/fgene.2022.1030463
Here, we extend a population-genetic framework, originally
developed for P. falciparum, to be applicable to all other malaria
species.
We exemplify the importance of species-specific differences
by clarifying the role of hypnozoites in the evolution of drug
resistance in P. vivax vs. P. falciparum. We also clarify how
haplotype frequencies (i.e., their relative abundance in the
parasite population) and prevalence (i.e., the likelihood that a
given haplotype occurs in an infection) are affected by relapses/
recrudescence in other malaria species. Based on this framework,
we discuss past and current developments with relevance for the
evolutionary genetics of malaria.
2 Methods
We extend the population-genetic framework of Schneider and
Kim (2010), Schneider and Kim (2011), and Schneider (2021) that
describes the temporal change in the distribution of parasite
haplotypes due to recombination and selection in generations of
transmission cycles. While the original framework was tailored to
P. falciparum, the extension captures the characteristics of all
human-pathogenic malaria species.
FIGURE 1
Transmission cycle of human malaria. All species have the same cycle, but the parasites' life stages have different morphology (illustrated here for P.
falciparum). In P. vivax and P. ovale sp. dormant hypnozoites remain in the liver. In P. malariae recrudescences from prolonged blood-stage parasites
occur. In P. knowlesi humans and non-human primates can be infected.
The model is based on an idealization of the complex malaria
transmission cycle (cf. Figure 1), which is illustrated in Figure 2.
Although pathogen, mosquito vector, and human hosts (and, in the
case of P. knowlesi, the animal host) are involved in transmission,
the framework does not require modeling transmission dynamics
(i.e., the interaction of mosquito vectors and human or animal
hosts) explicitly. This conceptual advantage arises because
haplotype frequencies are considered at the end of the
sporogonic cycle (cf. Figure 2). Thus, the frequency
distribution of parasite haplotypes in the mosquitoes' salivary
glands, which are ready for vector-host transmission, is followed.
Host and vector populations are assumed to be sufficiently
large and malaria infections sufficiently frequent to justify a
deterministic description of the evolutionary dynamics. Steps
of full transmission cycles correspond to steps of sexual
reproduction, because only one step of sexual reproduction
occurs during one full transmission cycle, namely inside the
mosquito vector. Many steps of asexual reproduction occur
inside the vectors and hosts.
2.1 Genetic architecture of haplotypes
The genetic architecture of haplotypes is determined by their
allelic configuration at one or several loci. We denote the number
of all possible haplotypes by $H$. E.g., $L$ biallelic loci lead to
$H = 2^L$ haplotypes. In general, if haplotypes are determined by
$L$ loci, and $n_l$ alleles are segregating at locus $l$,
$H = n_1 \cdot n_2 \cdot \ldots \cdot n_L$. The frequency of
haplotype $h$ in generation $t$ is denoted by $P^{(t)}_h$.
Collectively, the vector of haplotype frequencies is
$\boldsymbol{P}_t = (P^{(t)}_1, \ldots, P^{(t)}_H)$.
FIGURE 2
Illustrated is the idealization of the malaria transmission cycle underlying the population-genetic framework. The illustrated genetic
architecture of malaria haplotypes assumes two biallelic loci, leading to four possible haplotypes. Furthermore, two groups of hosts are illustrated.
Each host is infected by randomly drawing haplotypes from generation $t$, or a relapse/recrudescence from a previous generation occurs, which
corresponds to randomly drawing parasites from a previous generation (haplotype reservoir). With probability $G^{(t)}_g$ a host belongs to group $g$ in
generation $t$. The selective environment is different in the two groups. Recombination occurs exclusively between haplotypes exiting the same host.
After recombination, haplotypes in the mosquitoes are pooled together to derive their distribution in generation $t+1$.
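The counting rule for haplotypes can be illustrated with a minimal Python sketch. The function name `enumerate_haplotypes` and the representation of a haplotype as a tuple of allele indices are illustrative assumptions, not part of the framework itself.

```python
from itertools import product
from math import prod


def enumerate_haplotypes(alleles_per_locus):
    """List all haplotypes as tuples of allele indices (one entry per locus).

    alleles_per_locus[l] is n_l, the number of alleles segregating at locus l.
    """
    return list(product(*(range(n) for n in alleles_per_locus)))


# Hypothetical architecture: L = 2 biallelic loci, so H = 2^2 = 4.
alleles_per_locus = [2, 2]
haplotypes = enumerate_haplotypes(alleles_per_locus)
H = prod(alleles_per_locus)  # H = n_1 * n_2 * ... * n_L

print(H)           # number of possible haplotypes
print(haplotypes)  # the haplotypes themselves
```

A frequency vector $\boldsymbol{P}_t$ then has one entry per element of `haplotypes`, in the same order.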
2.2 Idealizing the transmission cycle
The idealized transmission cycle allows describing the
evolutionary genetics of malaria in generations of full
transmission cycles (Figure 2). In generation $t$, it is assumed
that all hosts are infected (or have a relapse or recrudescence) at
the same time. Moreover, host-vector transmission is also
synchronized. Inside the mosquito, parasites, which were
ingested by the mosquitoes, can recombine during one step of
sexual reproduction. This determines the distribution of
haplotypes of the parasite (sporozoite) population in the
mosquitoes' salivary glands in generation $t+1$.
2.2.1 Heterogeneity
Disease exposure and transmission intensities are
heterogeneous in endemic areas and change over time (e.g., in
the context of seasonal transmission) (Bousema et al., 2011;
Selvaraj et al., 2018). Moreover, hosts are heterogeneous
regarding their level of genetic and naturally acquired
immunity, number of co-morbidities, or the drug treatment
they receive to cure the infection (in case they receive any),
etc. (Hedrick, 2011; Gonzales et al., 2020). All of these factors can
be addressed by modeling hosts in different groups (strata). Let
$G^{(t)}_g$ be the probability that a host, in which an infection
occurs in generation $t$, belongs to group $g$. Hence,
$G^{(t)}_1 + \ldots + G^{(t)}_S = 1$ for every generation $t$.
The number of groups, $S$, has to be chosen to capture the
features important to the specific application of the framework.
For instance, when considering drug resistance evolution, a
simple distinction would be between treated and untreated
infections, i.e., $S = 2$. In the case of P. knowlesi different
groups can model human and animal hosts. In the simplest
case one would have just two groups ($S = 2$), namely humans and
animals.
2.2.2 Relapses and recrudescence
Hosts are not modelled explicitly. This becomes relevant
when considering relapses (in P. vivax and P. ovale sp.) and
recrudescence in P. malariae. In the following we use relapse and
recrudescence synonymously, unless a distinction is necessary.
In the idealized transmission cycle, a relapse in generation $t$,
which occurs after a delay of $d$ generations, is equivalent to a new
infection from the sporozoite population from $d$ generations in
the past, i.e., from generation $t-d$. Let $R^{(t)}_d$ be the
probability that an infection in generation $t$ is a relapse with a
delay of $d$ generations, where $R^{(t)}_0$ is the probability of a
new infection at time $t$. Assuming the maximum possible delay is
$D$, the relation $\sum_{d=0}^{D} R^{(t)}_d = 1$ holds for all $t$,
and $1 - R^{(t)}_0$ is the probability that a relapse occurs at
time $t$.
The framework models the haplotype distribution in
generations of transmission cycles, not in real time. The higher
the transmission intensities, the more transmission cycles occur
per year. The choice of the distribution of relapses has to take this
into account (see Results section The effect of recrudescences and
relapses). Moreover, the timing of relapses depends on the
Plasmodium species (White, 2011).
Importantly, a host might have been exposed differently to
the disease in the past, i.e., the host might belong to different
groups in generations $t-d$ and $t$. Let $G^{(t-d,t)}_{g',g}$ be the
probability that a host, who belonged to group $g'$ in generation
$t-d$, belongs to group $g$ in generation $t$ ($d \geq 0$).
Marginalisation yields
$$G^{(t)}_g = \sum_{g'=1}^{S} G^{(t-d,t)}_{g',g} \tag{1}$$
for all $t$, $d$, $g$. Hence, the probability that a relapse occurs in
generation $t$ in a host in group $g$ after a delay of $d$ generations,
when the host belonged to group $g'$, is given by
$R^{(t)}_d G^{(t-d,t)}_{g',g}$.
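The bookkeeping of relapse delays and host groups can be checked numerically. The following Python sketch uses made-up numbers for $D = 2$ and $S = 2$, and it assumes that $G^{(t-d,t)}_{g',g}$ is read as a joint probability, which is what the marginalisation (1) requires. All names (`R`, `G_joint`, `marginal_group_probs`, `relapse_probability`) are hypothetical.

```python
# R[d]: probability that an infection in generation t stems from generation t-d
# (d = 0 is a new infection). Illustrative values with maximum delay D = 2.
R = [0.7, 0.2, 0.1]

# G_joint[d][g_prev][g]: probability that the host was in group g_prev in
# generation t-d and is in group g in generation t (S = 2 groups).
G_joint = [
    [[0.6, 0.0], [0.0, 0.4]],  # d = 0: the group cannot differ from itself
    [[0.5, 0.1], [0.1, 0.3]],  # d = 1
    [[0.4, 0.2], [0.2, 0.2]],  # d = 2
]


def marginal_group_probs(joint):
    """Equation (1): sum over the previous group g' to obtain G_g^{(t)}."""
    n_prev, n_now = len(joint), len(joint[0])
    return [sum(joint[gp][g] for gp in range(n_prev)) for g in range(n_now)]


def relapse_probability(d, g_prev, g):
    """R_d^{(t)} * G_{g',g}^{(t-d,t)} for delay d and groups g' -> g."""
    return R[d] * G_joint[d][g_prev][g]


for d in range(len(R)):
    print(d, marginal_group_probs(G_joint[d]))  # same marginal for every delay

total = sum(relapse_probability(d, gp, g)
            for d in range(len(R)) for gp in range(2) for g in range(2))
print(total)  # the events (d, g', g) partition all infections in generation t
```

The marginal over $g'$ is the same for every delay $d$, as (1) demands, and the probabilities over all combinations of delay and groups add up to one.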
2.3 Vector-host transmission and
multiplicity of infection
The presence of multiple genetically distinct parasite
haplotypes within an infection is frequently referred to as
multiplicity of infection (MOI) or complexity of infection
(COI) and considered important in malaria. The terms MOI
and COI are ambiguously defined in the literature (see
Schneider et al., 2022 for a comprehensive review).
Although it is unclear whether MOI affects the clinical
pathogenesis of malaria, or whether different parasite haplotypes
compete within infections (intra-host competition), MOI
mediates the amount of meiotic recombination and scales with
transmission intensities (Pacheco et al., 2020) (see Figure 3).
Different parasite haplotypes can occur within an infection,
because they are 1) sequentially transmitted (during the course of
one disease episode) by different mosquitoes (super-infection); 2)
co-transmitted by one mosquito (co-infection); 3) mixed up with
parasites from previous infections by relapses or recrudescence.
Concerning models of MOI, the focus was mainly on super-
infections. More recently, the importance of co-infections is
being emphasized. Namely, more parasite genomics data are
being generated, which have enough resolution to study the genetic
relatedness of parasites. Such data are appropriate for molecular
surveillance of transmission routes (Ndiaye et al., 2021). Formal
population-genetic frameworks to describe the evolutionary
genetics of malaria that consider relapses do not exist.
Mathematical models describing relapses in P. vivax and P.
ovale sp. are limited to epidemiological models, e.g., the
compartmental model of Chamchod and Beier (2013), which
neglects parasite genetics. A population-genetic framework
applicable to all human-pathogenic malaria species has to be
flexible enough to accommodate super-infections, co-infections,
relapses, and recrudescence.
To set up the framework an infection is identified by a vector
$\boldsymbol{m} = (m_1, \ldots, m_H)$, where $m_h$ is the number of
times haplotype $h$ is infecting. Hence, $m_h = 0$ or $m_h > 0$ if
haplotype $h$ is absent or present in the infection, respectively.
The number $m_h$ accounts for super-infections with the same
haplotype. Moreover, it can be interpreted as the "concentration"
of haplotype $h$ if several haplotypes are co-infecting, etc.
Let $\Pr[\boldsymbol{m} \mid t]$ be the probability of an infection
with configuration $\boldsymbol{m}$ given generation $t$. The
infection might be a new infection or a relapse. The probability of
infection $\boldsymbol{m}$, given it occurs in generation $t$, when
the host belongs to group $g$, and given it is a relapse with a delay
of $d$ generations, when the host belonged to group $g'$, is denoted
by $\Pr[\boldsymbol{m} \mid t-d, g'; t, g]$. Hence, the probability of
infection $\boldsymbol{m}$ occurring in a host in group $g$ in
generation $t$, which is a relapse from generation $t-d$, when the
host belonged to group $g'$, is
$$\Pr[\boldsymbol{m}; t-d, g'; t, g] = \Pr[\boldsymbol{m} \mid t-d, g'; t, g] \, R^{(t)}_d \, G^{(t-d,t)}_{g',g}. \tag{2}$$
The conditional probability $\Pr[\boldsymbol{m} \mid t-d, g'; t, g]$
reflects the model of super- and co-infections. There are many
possible models. Super- and co-infections are both notoriously
difficult to address. Namely, the vector dynamics and the
distribution of haplotype combinations in the mosquito
population must be known. This is a difficult task and
research on the topic is currently expanding (cf. Nkhoma
et al., 2012; Wong et al., 2018; Zhu et al., 2019; Nkhoma
et al., 2020; Dia and Cheeseman, 2021; Neafsey et al., 2021).
FIGURE 3
Illustration of the relationship between inbreeding and MOI. Top: An infection with MOI = 1 (single-clone infection) leads only to recombination
between clones, i.e., effectively to no recombination. Bottom: Shown is a super-infection with four infective events (MOI = 4) and three different
haplotypes being transmitted (one haplotype is transmitted independently by two mosquitoes). Recombination between the illustrated haplotypes
leads to the creation of new haplotypes.
2.3.1 A model for super-infections
Many approaches to estimate MOI or COI by Bayesian or
maximum-likelihood methods (e.g., Hill and Babiker, 1995;
Stephens et al., 2001; Rastas et al., 2005; Li et al., 2007; Hastings
and Smith, 2008; Druet and Georges, 2010; Ross et al., 2012; Wigger
et al., 2013; Taylor et al., 2014; Galinsky et al., 2015; Ken-Dror and
Hastings, 2016; Schneider, 2018; Hashemi and Schneider, 2021) are
based on a model which assumes only super-infections, but no co-
infections. The number of super-infections $m$ is referred to as
multiplicity of infection (MOI; see Figure 3).
Let $M^{(t,g)}_m$ be the probability that a host belonging to group
$g$ is super-infected exactly $m$ times in generation $t$. This is a
probability distribution, hence
$$\sum_{m=1}^{\infty} M^{(t,g)}_m = 1 \tag{3}$$
for all $t$ and $g$. At each infectious event, exactly one haplotype is
randomly drawn from the mosquito population, i.e., from the
haplotype distribution $\boldsymbol{P}_t$. Hence, given MOI $m$ in
generation $t$, the infection $\boldsymbol{m} = (m_1, \ldots, m_H)$,
which indicates how many times haplotype $h$ was transmitted, follows
a multinomial distribution with parameters $m$ and
$\boldsymbol{P}_t$, i.e.,
$$\Pr[\boldsymbol{m} \mid m; t] = \binom{m}{\boldsymbol{m}} \boldsymbol{P}_t^{\boldsymbol{m}}, \tag{4}$$
where $\binom{m}{\boldsymbol{m}} = \frac{m!}{\prod_{h=1}^{H} m_h!}$ is a
multinomial coefficient, and
$\boldsymbol{P}_t^{\boldsymbol{m}} = \prod_{h=1}^{H} \big(P^{(t)}_h\big)^{m_h}$.
Clearly, the constraint $|\boldsymbol{m}| = \sum_{h=1}^{H} m_h = m$
must hold. If an infection is a relapse with a delay of $d$
generations, the haplotypes have to be drawn according to the
distribution $\boldsymbol{P}_{t-d}$.
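As a small numerical illustration of the multinomial model (4), the following Python sketch evaluates the probability of an infection configuration for given haplotype frequencies. The function name `multinomial_prob` and the example numbers are assumptions made for illustration only.

```python
from math import factorial, prod


def multinomial_prob(m_vec, freqs):
    """Probability of configuration m_vec under equation (4).

    m_vec[h] is the number of times haplotype h was transmitted;
    freqs[h] is the frequency P_h of haplotype h among sporozoites.
    The MOI is m = sum(m_vec).
    """
    m = sum(m_vec)
    coefficient = factorial(m)
    for m_h in m_vec:
        coefficient //= factorial(m_h)  # exact integer multinomial coefficient
    return coefficient * prod(p ** k for p, k in zip(freqs, m_vec))


# Hypothetical example: H = 4 haplotypes and an infection with MOI m = 4,
# in which haplotype 0 was transmitted twice.
freqs = [0.4, 0.3, 0.2, 0.1]
print(multinomial_prob([2, 1, 0, 1], freqs))
```

For a relapse with delay $d$, the same function would be called with the frequencies of generation $t-d$ instead of those of generation $t$.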
Therefore, the probability of infection $\boldsymbol{m}$ given it has
MOI $m = |\boldsymbol{m}|$ and occurs in generation $t$, when the host
belongs to group $g$, from a relapse with a delay of $d$ generations,
when the host belonged to group $g'$, is given by
$$\Pr[\boldsymbol{m}, m \mid t-d, g'; t, g] = M^{(t-d,g')}_m \binom{m}{\boldsymbol{m}} \boldsymbol{P}_{t-d}^{\boldsymbol{m}}, \tag{5}$$
where $M^{(t-d,g')}_m$ is the probability of MOI $m$ in generation
$t-d$ of a host in group $g'$. This model makes the expression (2)
much more explicit.
2.3.2 Choices for the distribution of super-
infections
The model (5) becomes even more explicit for
specific choices of the distribution of MOI. A popular choice
emerges from the assumption of rare and independent infections,
namely that MOI is conditionally Poisson distributed (cf.
Schneider, 2021), i.e.,
$$M^{(t,g)}_m = \frac{1}{\exp(\lambda_{t,g}) - 1} \, \frac{\lambda_{t,g}^{m}}{m!}, \tag{6}$$
where $\lambda_{t,g} > 0$ is the Poisson parameter of group $g$ in
generation $t$ and $m = 1, 2, \ldots$.
Another popular choice is the conditional negative-binomial
distribution. It is similar to the Poisson distribution but over-
dispersed (cf. 17).
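The conditional Poisson distribution (6) is straightforward to evaluate. The Python sketch below is only an illustration: the function names and the value of the Poisson parameter are assumptions, and the mean MOI $\lambda / (1 - e^{-\lambda})$ used in the second function is the standard mean of a zero-truncated Poisson distribution, not a result taken from this section.

```python
from math import exp, factorial


def conditional_poisson_pmf(m, lam):
    """Equation (6): probability of MOI m >= 1 for Poisson parameter lam > 0."""
    if m < 1:
        return 0.0  # uninfected hosts (m = 0) are excluded by conditioning
    return lam ** m / (factorial(m) * (exp(lam) - 1.0))


def mean_moi(lam):
    """Mean of the zero-truncated Poisson distribution."""
    return lam / (1.0 - exp(-lam))


lam = 1.5  # hypothetical Poisson parameter for one group in one generation
pmf = [conditional_poisson_pmf(m, lam) for m in range(1, 60)]

print(sum(pmf))                                     # numerically close to 1, cf. (3)
print(sum(m * p for m, p in enumerate(pmf, start=1)), mean_moi(lam))
```

Truncating the sum at MOI 59 is harmless here because the tail probabilities are negligible for moderate values of the Poisson parameter.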
2.4 The exo-erythrocytic and erythrocytic
cycles
Assume an infection subsumed by the vector $\boldsymbol{m}$ having
MOI $m = |\boldsymbol{m}|$. Since all steps of reproduction are clonal
inside the host, it is not necessary to model the different parasite
stages explicitly. Rather, it suffices to model the change in
haplotype frequencies inside the host as a single step.
If the host belongs to group $g$, the "absolute" frequency of
haplotype $h$ is $\frac{m_h}{m} W^{(t,g)}_{\boldsymbol{m},h}$. Here,
$W^{(t,g)}_{\boldsymbol{m},h}$ is the fitness in generation $t$ of
haplotype $h$ in infection $\boldsymbol{m}$ of a host belonging to
group $g$. It is interpreted as the expected number of gametocyte
descendants of a single copy of haplotype $h$ infecting the host at
the time a mosquito takes her blood meal.
2.4.1 Host-vector transmission
Concerning host-vector transmission, a mosquito ingests a
fraction $f$ of male and female gametocytes at her blood meal. The
gametocyte haplotypes ingested are assumed to be proportional
to the haplotype frequencies within the host. More precisely,
$f \frac{m_h}{m} W^{(t,g)}_{\boldsymbol{m},h}$ male and female
gametocytes of haplotype $h$ are ingested from infection
$\boldsymbol{m}$ in group $g$. (Note that different fractions $f$ can
also be assumed for male and female gametocytes, reflecting an
unequal sex ratio.)
2.4.2 Sporogonic cycle
Recombination occurs immediately after the blood meal (see
Figure 1), and only parasites descending from the same host can
recombine (see Figure 3). Assuming the mosquito bites a host
from group $g$ with infection $\boldsymbol{m}$, the probability that
a male gamete of haplotype $h$ fertilizes a female $i$-gamete is the
product of their relative frequencies in the mosquito's gut, i.e.,
$$\frac{f \frac{m_h}{m} W^{(t,g)}_{\boldsymbol{m},h}}{f \bar{W}^{(t,g)}_{\boldsymbol{m}}} \cdot \frac{f \frac{m_i}{m} W^{(t,g)}_{\boldsymbol{m},i}}{f \bar{W}^{(t,g)}_{\boldsymbol{m}}} = \frac{m_h W^{(t,g)}_{\boldsymbol{m},h} \, m_i W^{(t,g)}_{\boldsymbol{m},i}}{m^2 \big(\bar{W}^{(t,g)}_{\boldsymbol{m}}\big)^2}, \tag{7}$$
where
$$f \bar{W}^{(t,g)}_{\boldsymbol{m}} = f \sum_{j=1}^{H} \frac{m_j}{m} W^{(t,g)}_{\boldsymbol{m},j} \tag{8}$$
is the total amount of parasites in the mosquito's gut. Therefore,
the absolute number of such matings is obtained by multiplying
the probability of the mating by the total amount of parasites, i.e.,
$$f A^{(t,g)}_{\boldsymbol{m},h,i}, \tag{9}$$
where
$$A^{(t,g)}_{\boldsymbol{m},h,i} = \frac{m_h W^{(t,g)}_{\boldsymbol{m},h} \, m_i W^{(t,g)}_{\boldsymbol{m},i}}{m^2 \bar{W}^{(t,g)}_{\boldsymbol{m}}}. \tag{10}$$
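Equations (7) to (10) can be traced for a single infection with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `mating_tables` is hypothetical, the fitness values are made up, and the ingested fraction $f$ is left out because it only rescales (8) and (9).

```python
def mating_tables(m_vec, fitness):
    """Mating probabilities (7), mean fitness (8) and the terms A of (10).

    m_vec[h]   : number of times haplotype h was transmitted (infection m)
    fitness[h] : W_{m,h}, the fitness of haplotype h in this infection
    Returns (prob, A, w_bar), where prob[h][i] and A[h][i] refer to a mating
    of a male h-gamete with a female i-gamete.
    """
    m = sum(m_vec)
    # (m_h / m) * W_{m,h}: "absolute" within-host frequency of haplotype h
    within = [m_h / m * w for m_h, w in zip(m_vec, fitness)]
    w_bar = sum(within)  # equation (8) without the factor f
    H = len(m_vec)
    prob = [[within[h] * within[i] / w_bar ** 2 for i in range(H)]
            for h in range(H)]  # equation (7)
    A = [[within[h] * within[i] / w_bar for i in range(H)]
         for h in range(H)]  # equation (10)
    return prob, A, w_bar


# Hypothetical infection with MOI 4 in which haplotype 3 has a threefold
# fitness advantage (e.g., a resistant haplotype in a treated host).
prob, A, w_bar = mating_tables([2, 1, 0, 1], [1.0, 1.0, 1.0, 3.0])
print(w_bar)
print(sum(map(sum, prob)))  # mating probabilities add up to one
print(sum(map(sum, A)))     # the terms A add up to the mean fitness
```

The two sums printed at the end reflect that (7) is a probability distribution over pairs of gametes and that (10) equals (7) multiplied by the mean fitness (8).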
The absolute frequency of haplotype $h$ in the population of
mosquitoes, which descends from infections with configuration
$\boldsymbol{m}$, given 1) MOI $m = |\boldsymbol{m}|$, 2) the
infections occur in generation $t$, 3) in hosts in group $g$, which
4) are either novel infections (delay $d = 0$) or relapses with a
delay of $d$ generations, is
$$\Pr[\boldsymbol{m} \mid m; t-d; t, g] \sum_{j,l=1}^{H} f A^{(t,g)}_{\boldsymbol{m},j,l} \, r(jl \to h), \tag{11}$$
where $r(jl \to h)$ is the probability that a mating between
gametes with haplotypes $j$ and $l$ leads to offspring of
haplotype $h$.
The absolute number of haplotype $h$ in the mosquito
population, which descends from hosts in group $g$ with MOI
$m$, is calculated from the theorem of total probability, i.e., by
"averaging" over all possible infections $\boldsymbol{m}$ with MOI
$m$. Incorporating all relapses it is given by
$$\tilde{P}^{(g,m)}_h(t) = \sum_{d=0}^{D} R^{(t)}_d \sum_{\boldsymbol{m}: |\boldsymbol{m}| = m} \Pr[\boldsymbol{m} \mid m; t-d; t, g] \times \sum_{j,l=1}^{H} f A^{(t,g)}_{\boldsymbol{m},j,l} \, r(jl \to h). \tag{12}$$
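If an infection in generation $t$ is a relapse from generation $t-d$,
the host might have belonged to a different group $g'$ then. Noting
that
$$\Pr[\boldsymbol{m} \mid m; t-d; t, g] = \sum_{g'=1}^{S} G^{(t-d,t)}_{g',g} \Pr[\boldsymbol{m} \mid m; t-d, g'; t, g], \tag{13}$$
equation (12) can be rewritten as
$$\tilde{P}^{(g,m)}_h(t) = \sum_{d=0}^{D} R^{(t)}_d \sum_{g'=1}^{S} G^{(t-d,t)}_{g',g} \sum_{\boldsymbol{m}: |\boldsymbol{m}| = m} \Pr[\boldsymbol{m} \mid m; t-d, g'; t, g] \times \sum_{j,l=1}^{H} f A^{(t,g)}_{\boldsymbol{m},j,l} \, r(jl \to h). \tag{14}$$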
2.5 Evolutionary dynamics
To determine the number of haplotypes $h$ in generation $t+1$,
equation (14) has to be averaged over all
possible groups and values of MOI. Hence, the absolute
frequency of haplotype $h$ in the next generation's
sporozoite population is
$$\tilde{P}_h(t+1) = f \sum_{d=0}^{D} R^{(t)}_d \sum_{g,g'=1}^{S} G^{(t-d,t)}_{g',g} \sum_{m=1}^{\infty} \sum_{\boldsymbol{m}: |\boldsymbol{m}| = m} \Pr[\boldsymbol{m}, m \mid t-d, g'; t, g] \times \sum_{j,l=1}^{H} A^{(t,g)}_{\boldsymbol{m},j,l} \, r(jl \to h). \tag{15}$$
The relative frequency of haplotype $h$ in the sporozoite
population in generation $t+1$ is hence
$$P^{(t+1)}_h = \frac{\tilde{P}_h(t+1)}{\sum_{i=1}^{H} \tilde{P}_i(t+1)}. \tag{16}$$
The dynamics (16) are extremely flexible.
They allow modeling, e.g., temporal changes in selection pressures
(for instance changing treatment policies in the context of drug-
resistance evolution), temporally varying transmission intensities,
intra-host competition of parasites, super- and co-infections,
relapses, recrudescences, etc. This, however, requires specifying
the model more explicitly.
Next, we show how this is done if only super-infections but
no co-infections are considered.
2.6 Evolutionary dynamics with super-
infections
We introduce a couple of simplifying assumptions, which
make the model more explicit. First, only super- but no co-
infections are assumed, i.e., the super-infection model (5)
applies and is substituted into (15). Thus, (15) becomes
$$\tilde{P}_h(t+1) = f \sum_{d=0}^{D} R^{(t)}_d \sum_{g,g'=1}^{S} G^{(t-d,t)}_{g',g} \sum_{m=1}^{\infty} M^{(t-d,g')}_m \sum_{\boldsymbol{m}: |\boldsymbol{m}| = m} \binom{m}{\boldsymbol{m}} \boldsymbol{P}_{t-d}^{\boldsymbol{m}} \times \sum_{j,l=1}^{H} A^{(t,g)}_{\boldsymbol{m},j,l} \, r(jl \to h). \tag{17}$$
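To make the structure of (17) concrete, the following Python sketch iterates one generation in a deliberately reduced special case. Everything below is an illustrative assumption rather than the authors' implementation: a single host group, no relapses ($R^{(t)}_0 = 1$), conditionally Poisson distributed MOI truncated at `max_moi`, fitness values that do not depend on the composition of the infection, and two biallelic loci that recombine with probability `rho`. The factor $f$ is omitted because it cancels in the normalization (16).

```python
from itertools import product
from math import exp, factorial, prod

HAPS = list(product((0, 1), repeat=2))  # four haplotypes at two biallelic loci
H = len(HAPS)


def recomb(j, l, h, rho):
    """r(jl -> h): assumed two-locus model with recombination probability rho."""
    a, b = HAPS[j], HAPS[l]
    offspring = {}
    for hap, p in ((a, (1 - rho) / 2), (b, (1 - rho) / 2),
                   ((a[0], b[1]), rho / 2), ((b[0], a[1]), rho / 2)):
        offspring[hap] = offspring.get(hap, 0.0) + p
    return offspring.get(HAPS[h], 0.0)


def compositions(m, parts):
    """All vectors of `parts` non-negative integers that sum to m."""
    if parts == 1:
        yield (m,)
        return
    for k in range(m + 1):
        for rest in compositions(m - k, parts - 1):
            yield (k,) + rest


def next_generation(P, W, lam, rho, max_moi=8):
    """One step of (17) followed by the normalization (16)."""
    new = [0.0] * H
    for m in range(1, max_moi + 1):
        M_m = lam ** m / (factorial(m) * (exp(lam) - 1.0))  # equation (6)
        for m_vec in compositions(m, H):
            coef = factorial(m) // prod(factorial(k) for k in m_vec)
            pr = M_m * coef * prod(p ** k for p, k in zip(P, m_vec))
            if pr == 0.0:
                continue
            within = [k / m * w for k, w in zip(m_vec, W)]
            w_bar = sum(within)
            for j in range(H):
                for l in range(H):
                    A = within[j] * within[l] / w_bar  # equation (10)
                    if A == 0.0:
                        continue
                    for h in range(H):
                        new[h] += pr * A * recomb(j, l, h, rho)
    total = sum(new)
    return [x / total for x in new]


# Hypothetical scenario: haplotypes (1,0) and (1,1) carry a resistance allele
# at the first locus and have twice the fitness of the sensitive haplotypes.
P = [0.45, 0.45, 0.05, 0.05]
for _ in range(5):
    P = next_generation(P, W=[1.0, 1.0, 2.0, 2.0], lam=1.0, rho=0.5)
print(P)
```

In this sketch recombination only takes place among haplotypes that co-occur in the same infection, which is why the step loops over infection configurations before pairing gametes. Extending it to several groups or to relapses would add the outer sums over $g$, $g'$, and $d$ of (17), together with a stored history of past frequency vectors.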
3 Results
The framework is appropriate to investigate numerous
evolutionary-genetic aspects of malaria. It would be far too
comprehensive to exemplify the full flexibility. Hence, only
special cases are illustrated here. We assume that only super-
infections but no co-infections occur, i.e., the dynamics (17)
are assumed. First, we clarify the difference
between haplotype frequency and prevalence. Then we
focus on a simple model of drug resistance. Although it is
applicable to all malaria species, primarily it shall illustrate the
differences between P. falciparum and P. vivax, because there
were no reports on drug resistance in any of the other species
(Tseha and Tyagi, 2021).
3.1 Frequency and prevalence
The evolutionary genetics of malaria are described as the
change over time in the frequency distribution of parasite haplotypes.
For instance, monitoring the frequencies of haplotypes that
confer drug resistance is essential. However, concerning the
clinical pathogenesis, the occurrence of resistance-conferring
haplotypes in infections is more relevant. Due to super- and
co-infections, the frequency of a haplotype h, i.e., its relative
abundance among sporozoites in the mosquito population, does
not coincide with the probability that haplotype h occurs in an
infection. The latter is referred to as the haplotype's prevalence.
If only super-infections are considered, the prevalence of
haplotype h in generation t, denoted by $q_h^{(t)}$, is derived in section
Prevalence in the Mathematical Appendix. It is given by
$$q_h^{(t)} = 1 - \sum_{d=0}^{D} R_d^{(t)} \sum_{g=1}^{S} G_g^{(t-d)}\, U_g^{(t-d)}\!\left(1 - P_h^{(t-d)}\right), \tag{18}$$
where $U_g^{(t-d)}(x)$ is the probability generating function of the
MOI distribution in group g in generation t − d. This function
characterizes transmission in group g in generation t − d. From
the above expression it is clear that prevalence depends on (i) the
frequency of haplotype h, (ii) the distributions of MOI in the
various groups, and (iii) the distribution of
relapses/recrudescence. If no relapses or recrudescences occur, as is
the case for P. falciparum and P. knowlesi, the prevalence
simplifies to
$$q_h^{(t)} = 1 - \sum_{g=1}^{S} G_g^{(t)}\, U_g^{(t)}\!\left(1 - P_h^{(t)}\right). \tag{19}$$
Hence, for P. falciparum and P. knowlesi prevalence is
characterized by the haplotype frequency distribution in t, the
distribution of groups, and the MOI distributions in the groups.
We illustrate the effect of relapses on prevalence in a simple
example below.
3.2 Selection at a single locus without intra-host competition
Assume drug resistance is determined by a single locus. This
is a reasonable assumption since drug resistance is often
determined mainly by mutations at one locus. For instance, in
P. falciparum resistance to chloroquine is determined by
mutations at the Pfcrt locus, while resistance to artemisinin is
determined by mutations in the Kelch-13 propeller region
(Cui et al., 2015). The assumption is even justified for
sulfadoxine-pyrimethamine resistance, determined by the
Pfdhfr and Pfdhps loci, because mutations at the Pfdhfr locus
seem to have a much stronger effect (McCollum et al., 2012).
Assume n alleles $A_1, \ldots, A_n$ are segregating at the selected
locus. The n different alleles confer different levels of drug
resistance. Alleles at all other loci are assumed to be neutral. Thus,
the number of possible haplotypes, H, is a multiple of n, i.e.,
H = nN. Hence, N is the number of all possible haplotypes when the
resistance-conferring locus is disregarded. Let us assume that the
haplotypes are ordered such that haplotypes h = (a − 1)N + 1, ..., aN
carry allele $A_a$ at the resistance-conferring locus. Therefore,
the frequency of allele $A_a$ in generation t, denoted by $p_a^{(t)}$, is
given by
$$p_a^{(t)} = \sum_{h=(a-1)N+1}^{aN} P_h^{(t)}. \tag{20}$$
Collectively, we denote the vector of allele frequencies in
generation t by $\boldsymbol{p}_t$.
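The block-wise summation in Eq. (20) can be sketched in a few lines of Python. This is only an illustration: the function name `allele_frequencies` and the example frequencies are assumptions made here, and the haplotypes are assumed to be stored in the ordering described above.

```python
def allele_frequencies(P, n):
    """Sum haplotype frequencies block-wise, following Eq. (20).

    P is a list of H = n*N haplotype frequencies, ordered such that the
    (1-based) haplotypes (a-1)*N+1, ..., a*N carry allele A_a.
    """
    H = len(P)
    if n <= 0 or H % n != 0:
        raise ValueError("H must be a positive multiple of n")
    N = H // n
    # Python slices are 0-based, so block a covers indices (a-1)*N .. a*N-1.
    return [sum(P[(a - 1) * N:a * N]) for a in range(1, n + 1)]


# Hypothetical example: n = 2 alleles, N = 2 neutral backgrounds, H = 4.
P = [0.1, 0.2, 0.3, 0.4]
p = allele_frequencies(P, n=2)
print(p)
```

Because the allele frequencies are plain block sums, they automatically sum to one whenever the haplotype frequencies do.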
Under the assumption of no intra-host competition of
parasites these dynamics can be made more explicit. In an
infection characterized by $\boldsymbol{m}$ in a host of group g, no
intra-host competition means that the fitness of an infecting haplotype
h is independent of which other haplotypes are present in the
infection, i.e., it is independent of $\boldsymbol{m}$, or formally
$$W_{\boldsymbol{m},h}^{(t,g)} = W_h^{(t,g)}. \tag{21}$$
Furthermore, because fitness is only determined by the
resistance-conferring locus, the fitness of haplotype h depends
only on its allele at this locus. Let the fitness of haplotypes
carrying allele $A_a$ at the resistance-conferring locus be denoted by
$w_a^{(t,g)}$, i.e.,
$$w_a^{(t,g)} = W_h^{(t,g)} = W_{\boldsymbol{m},h}^{(t,g)} \quad \text{for } h = (a-1)N+1, \ldots, aN \text{ and for all } \boldsymbol{m}. \tag{22}$$
Moreover, let the average fitness of allele $A_a$ in generation t be
$$w_a^{(t)} = \sum_{g=1}^{S} w_a^{(t,g)}\, G_g^{(t)}. \tag{23}$$
As shown in the Mathematical Appendix, the dynamics of the
allele frequencies are given by
$$p_a^{(t+1)} = \frac{w_a^{(t)} \sum_{d=0}^{D} R_d^{(t)}\, p_a^{(t-d)}}{\sum_{b=1}^{n} w_b^{(t)} \sum_{d=0}^{D} R_d^{(t)}\, p_b^{(t-d)}}. \tag{24}$$
As in the case without relapses/recrudescence (cf. 17), these
dynamics are independent of the distribution of MOI. This
holds because no intra-host competition occurs and because
only super-infections are considered. If co-infections are included,
the dynamics of the allele frequencies at the selected
locus might depend on MOI even without intra-host competition,
depending on the assumed model for co-infections; a general
statement cannot be made.
Further, the dynamics (24) depend only on the
average fitnesses of the alleles, $w_a^{(t)}$. This implies that the stratification
of the host population into different groups does not need to be
modelled explicitly when considering selection at a single locus.
Note that the average fitnesses can be scaled by any positive constant
without affecting the dynamics (24). Hence, it
suffices to consider relative fitnesses, and fitness can be
normalized such that $w_1^{(t)} = 1$ in every generation.
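As a minimal sketch of Eqs. (23) and (24), the following Python snippet averages group-specific fitnesses and performs one generation step. All names (`average_fitness`, `next_frequencies`) and all numerical inputs are hypothetical choices for illustration; the snippet also checks numerically that rescaling the fitnesses leaves the result unchanged.

```python
def average_fitness(w_by_group, G):
    """Eq. (23): average the group-specific fitnesses w_by_group[g][a] over groups."""
    n = len(w_by_group[0])
    return [sum(G[g] * w_by_group[g][a] for g in range(len(G)))
            for a in range(n)]


def next_frequencies(history, w, R):
    """One step of Eq. (24).

    history[d] is the allele-frequency vector of generation t-d (d = 0..D),
    w[a] the average fitness of allele a, R[d] the probability of delay d.
    """
    n = len(w)
    # Mixture of current and past frequencies, weighted by R_d.
    mixed = [sum(R[d] * history[d][a] for d in range(len(R)))
             for a in range(n)]
    weighted = [w[a] * mixed[a] for a in range(n)]
    total = sum(weighted)
    return [x / total for x in weighted]


# Hypothetical inputs: two host groups and two alleles.
G = [0.7, 0.3]
w_by_group = [[1.0, 1.2], [1.0, 0.9]]
w = average_fitness(w_by_group, G)

history = [[0.9, 0.1], [0.95, 0.05]]   # generations t and t-1
R = [0.8, 0.2]                          # 20% of infections delayed by one cycle
p_next = next_frequencies(history, w, R)
p_scaled = next_frequencies(history, [7.0 * x for x in w], R)
print(w, p_next, p_scaled)
```

The two printed frequency vectors agree up to rounding, which is why only relative fitnesses matter in this setting.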
3.2.1 The effect of recrudescences and relapses
In the dynamics of the allele frequencies (24) the effect of
relapses or recrudescence is clearly visible. In
the case of no relapses or recrudescence, i.e., $R_d^{(t)} = 0$ for d > 0, the
dynamics simplify to
$$p_a^{(t+1)} = \frac{w_a^{(t)}\, p_a^{(t)}}{\sum_{b=1}^{n} w_b^{(t)}\, p_b^{(t)}}. \tag{25}$$
In this situation, the allele frequencies in generation t + 1 are
solely determined by the fitnesses and the allele frequencies in
generation t. Once relapses or recrudescences are considered, the
allele frequencies in generation t + 1 depend also on the allele
frequencies in previous generations. This is intuitively clear,
because relapses/recrudescence are equivalent to infections
from the sporozoite population from previous generations (see
Figure 2). Hence, relapses/recrudescence act as seed banks.
Intuitively, this will delay the evolutionary dynamics, because the
allele frequencies are averaged over several previous generations.
To further discuss the effect of relapses/recrudescence we
impose some additional assumptions. First, we assume that the
selective environment does not change over time, i.e., $w_a^{(t)} = w_a$
for all t. This is a reasonable assumption when considering
drug-resistance evolution over a time period in which treatment
policies do not change. In this case, the change in allele
frequencies can be solved explicitly only in the absence of
relapses/recrudescence. Namely, the dynamics become
$$p_a^{(t+1)} = \frac{w_a^{t+1}\, p_a^{(0)}}{\sum_{b=1}^{n} w_b^{t+1}\, p_b^{(0)}}, \tag{26}$$
where $p_a^{(0)}$ are the initial allele frequencies in generation t = 0.
From these dynamics it follows that the average fitnesses $w_a$ can
be estimated from longitudinal data of allele frequencies by fitting
a straight-line regression (see 48, 17).
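Equation (26) implies that the log-ratio of two allele frequencies is linear in time, log(p_a/p_1) at generation t = t · log(w_a/w_1) + const. The following sketch illustrates one way to exploit this, assuming an ordinary least-squares fit of the log-ratio on time; whether this is exactly the regression used in the cited works is an assumption, and the "data" are synthetic, noise-free values generated from the closed form.

```python
import math


def closed_form(p0, w, t):
    """Eq. (26), written for generation t: p_a(t) is proportional to w_a**t * p_a(0)."""
    weighted = [w[a] ** t * p0[a] for a in range(len(w))]
    total = sum(weighted)
    return [x / total for x in weighted]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, noise-free "observations" (hypothetical): one sample every 10 generations.
p0 = [0.999, 0.001]
w_true = [1.0, 1.1]
times = list(range(0, 101, 10))
log_ratio = []
for t in times:
    p = closed_form(p0, w_true, t)
    log_ratio.append(math.log(p[1] / p[0]))

# The slope estimates log(w_2 / w_1); with w_1 = 1 this gives w_2 directly.
w2_hat = math.exp(least_squares_slope(times, log_ratio))
print(w2_hat)
```

With real longitudinal data the points scatter around the line, and the quality of the fit depends on sample sizes at each time point.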
Once relapses/recrudescence are considered, the dynamics
can no longer be solved explicitly, but need to be calculated
recursively from the frequencies of the last D + 1 generations,
i.e., they become
$$p_a^{(t+1)} = \frac{w_a \sum_{d=0}^{D} R_d^{(t)}\, p_a^{(t-d)}}{\sum_{b=1}^{n} w_b \sum_{d=0}^{D} R_d^{(t)}\, p_b^{(t-d)}}. \tag{27}$$
Importantly, to be able to iterate these dynamics, the
frequencies need to be known for D generations in the past.
Hence, to calculate the frequencies in generation t = 1, initial
frequencies $p_a^{(0)}, p_a^{(-1)}, \ldots, p_a^{(-D)}$ need to be specified. Moreover,
the distribution $R_d^{(t)}$ needs to be known. In practice, the
distribution of relapses might change over time. For instance,
changes in control policies impact malaria transmission and
hence the proportion of new infections in comparison to
relapses. If transmission intensities decrease, relapses account
for a larger fraction of infections. Also, the number of
transmission cycles per year decreases. Because the
distribution of the time to relapse measured in years will not
change, the time distribution measured in units of transmission
cycles will change. In the simplest case, in which the distribution of
relapses remains constant over time, i.e., $R_d^{(t)} = R_d$, the change of allele
frequencies is given by
$$p_a^{(t+1)} = \frac{w_a \sum_{d=0}^{D} R_d\, p_a^{(t-d)}}{\sum_{b=1}^{n} w_b \sum_{d=0}^{D} R_d\, p_b^{(t-d)}}. \tag{28}$$
Unfortunately, even if the distribution of relapses is constant, the
average fitnesses can no longer be estimated by a linear
regression.
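The recursion (28) is straightforward to iterate numerically. The sketch below is an illustration under stated assumptions: the function name `iterate_dynamics`, the toy relapse distribution, and the choice to set all pre-history frequencies equal to the initial frequencies are hypothetical. As a sanity check, the case without relapses is compared with the closed form (26).

```python
def iterate_dynamics(initial, w, R, generations):
    """Iterate Eq. (28) for a constant relapse distribution R (R[d], d = 0..D).

    initial is a list [p(0), p(-1), ..., p(-D)] of allele-frequency vectors,
    most recent first. Returns the trajectory [p(0), p(1), ..., p(generations)].
    """
    D = len(R) - 1
    if len(initial) != D + 1:
        raise ValueError("need frequencies for generations 0, -1, ..., -D")
    n = len(w)
    window = [list(p) for p in initial]      # window[d] = p(t-d)
    trajectory = [list(initial[0])]
    for _ in range(generations):
        mixed = [sum(R[d] * window[d][a] for d in range(D + 1))
                 for a in range(n)]
        weighted = [w[a] * mixed[a] for a in range(n)]
        total = sum(weighted)
        p_next = [x / total for x in weighted]
        window = [p_next] + window[:-1]      # shift the window by one generation
        trajectory.append(p_next)
    return trajectory


w = [1.0, 1.1]
p0 = [0.999, 0.001]

# Without relapses (R = [1]) the recursion must agree with the closed form (26).
no_relapse = iterate_dynamics([p0], w, [1.0], 50)
closed = 1.1 ** 50 * 0.001 / (1.1 ** 50 * 0.001 + 0.999)

# Toy relapse distribution (hypothetical): 10% of infections delayed by 3 cycles.
R = [0.9, 0.0, 0.0, 0.1]
with_relapse = iterate_dynamics([p0] * 4, w, R, 50)
print(no_relapse[50][1], closed, with_relapse[50][1])
```

Keeping a sliding window of the last D + 1 generations is the only state the recursion needs, which mirrors the requirement that D past generations be specified.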
The distribution of relapses depends crucially on the specific
parasite strain (White, 2011). Consider the following example of
drug-resistance evolution with just two alleles: allele $A_1$ being the
drug-sensitive wildtype and $A_2$ the mutant allele conferring drug
resistance. The mutant allele first occurs in generation t = 0 at
frequency $p_2^{(0)} = 0.001$. Let $w_2 = 1 + s$, where s is the selective
advantage of the drug-resistant allele $A_2$. We assume s = 0.1,
i.e., the fitness is increased by 10%, which is strong selection for
population-genetic processes, but reasonable for selection for
drug resistance.
Regarding the distribution of relapses, we assume a situation
in which 1 year corresponds to 10 transmission cycles. Relapses
often occur in periodic patterns (White, 2011). We first assume a
pattern which resembles the relapse pattern described by
Hankey et al. (1953) in temperate zones of Korea. Namely,
let v be the probability that a malaria episode relapses, i.e.,
$R_0 = 1 - v$. We assume the first relapse can occur after 10 transmission
cycles, and all further relapses after 4 further transmission cycles
each, up to a maximum delay of D = 90. More precisely, $R_d = v/21$ for
d = 10, 14, 18, 22, ..., 90 and $R_d = 0$ otherwise. As a comparison we assume
a simple second pattern of relapses, in which relapses occur
4–50 generations after the initial infection with equal probability,
i.e., $R_d = v/43$ for d = 4, ..., 50. Compared with the first pattern,
relapses occur more frequently and earlier.
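The first relapse pattern can be written down explicitly as a vector (R_0, ..., R_D). The following sketch assumes a helper named `relapse_distribution` (a hypothetical name) that puts mass 1 − v on d = 0 and spreads the remaining mass v evenly over a given list of delays; for the delays 10, 14, ..., 90 this yields the weights v/21 stated above.

```python
def relapse_distribution(v, delays):
    """Return [R_0, ..., R_D] with R_0 = 1 - v and mass v spread evenly over `delays`."""
    D = max(delays)
    R = [0.0] * (D + 1)
    R[0] = 1.0 - v
    for d in delays:
        R[d] = v / len(delays)
    return R


# First pattern: relapses after 10, 14, 18, ..., 90 transmission cycles.
first_delays = list(range(10, 91, 4))
R_first = relapse_distribution(0.05, first_delays)   # v = 0.05, i.e., 5% relapses
print(len(first_delays), len(R_first), sum(R_first))
```

The list has D + 1 = 91 entries, of which only 22 are non-zero, and it sums to one up to rounding.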
The evolutionary dynamics are illustrated in Figure 4.
Without relapses (v = 0), the resistance-conferring allele spreads
in approximately 110 generations, which corresponds to 11 years
under the assumed number of 10 transmission cycles per year.
Relapses substantially slow down the spread of resistance. The
reason is that relapses act like "seed banks", which retain the
frequency distribution of previous generations. For the first
pattern (Figure 4A), 5% relapses already substantially delay
the spread of resistance to about 400 generations or 40 years.
With 20% relapses, the frequency of the mutant allele is just 75%
after 1,000 generations, corresponding to 100 years. For the
second pattern (Figure 4B), the results are qualitatively
similar, but relapses have a less profound effect, because they
occur with shorter delay after the original infection.
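The delaying effect can be explored with a small two-allele simulation of the first relapse pattern. This is a sketch under assumptions that are not fixed by the text: the function name `generations_until` is hypothetical, the mutant is assumed to be absent before generation 0, and the threshold of 0.95 used to define "spread" is an arbitrary choice, so the printed numbers need not match the values read off Figure 4.

```python
def generations_until(threshold, v, s=0.1, p_start=0.001, max_gen=5000):
    """First generation in which the mutant frequency reaches `threshold`.

    Uses the first relapse pattern (R_0 = 1 - v, R_d = v/21 for
    d = 10, 14, ..., 90) and assumes the mutant is absent before generation 0.
    Returns None if the threshold is not reached within max_gen generations.
    """
    delays = list(range(10, 91, 4))
    D = delays[-1]
    R = [0.0] * (D + 1)
    R[0] = 1.0 - v
    for d in delays:
        R[d] = v / len(delays)
    # window[d] holds the mutant frequency of generation t-d.
    window = [p_start] + [0.0] * D
    for t in range(max_gen + 1):
        if window[0] >= threshold:
            return t
        mixed = sum(R[d] * window[d] for d in range(D + 1))
        # Two alleles with w_1 = 1 and w_2 = 1 + s; the wildtype mixture is 1 - mixed.
        p_next = (1.0 + s) * mixed / ((1.0 + s) * mixed + (1.0 - mixed))
        window = [p_next] + window[:-1]
    return None


t_no_relapse = generations_until(0.95, v=0.0)
t_five_percent = generations_until(0.95, v=0.05)
print(t_no_relapse, t_five_percent)
```

Varying `v` and the threshold shows how sensitive the time to spread is to the assumed fraction of relapses.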
These results provide formal evidence that drug resistance
spreads faster in P. falciparum, where no relapses occur, than in
P. vivax, where relapses are common. In fact, while drug
resistance is a major concern in P. falciparum, it is less
common in P. vivax (Schneider and Escalante, 2013).
The pattern of relapses depends on 1) genetic factors
mediating the frequency of their occurrence; 2) transmission
intensities determining the number of malaria generations
(transmission cycles per year); 3) the fractions of new
infections and relapses; and 4) treatment policies.
In particular, if a drug is partnered with primaquine (PQ) or
tafenoquine (TQ) for radical cure, the fraction of relapses
is reduced, accelerating the spread of resistance to the primary
treatment. However, since PQ or TQ also act on gametocytes,
they prevent transmission and reduce the selective advantage of
drug resistance (cf. 23).
3.2.2 Prevalence
Next consider the prevalences corresponding to the
evolutionary dynamics illustrated in Figure 4. The
evolutionary dynamics are determined by the average fitnesses
across the groups of hosts and the distribution of relapses.
Consequently, it was not necessary to specify the groups
explicitly. However, prevalence given by (18) depends on the
generating functions of MOI in the
different groups. In the simplest case, which we consider here,
the whole population consists of only one group (S = 1).
Furthermore, we assume that the MOI distribution does not
change over time and follows a conditional Poisson distribution
(cf. Eq. (6)) with parameter λ. The generating function of this
distribution is given by
$$U(x) = \frac{\exp(\lambda x) - 1}{\exp(\lambda) - 1} \tag{29}$$
(cf. 17).
The prevalence of the resistance-conferring allele is obtained
from (18) by assuming that haplotypes are
characterized by a single locus. Hence,
$$q_2^{(t)} = 1 - \sum_{d=0}^{D} R_d\, U\!\left(1 - p_2^{(t-d)}\right) = \sum_{d=0}^{D} R_d\, \frac{1 - \exp\!\left(-\lambda\, p_2^{(t-d)}\right)}{1 - \exp(-\lambda)}. \tag{30}$$
The prevalences corresponding to the dynamics illustrated in
Figure 4A, are depicted in Figure 5, assuming different values of
the Poisson parameter λ, corresponding to different transmission
intensities.
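Equations (29) and (30) translate directly into code. The following sketch uses hypothetical function names and hypothetical input values; the case λ = 0 is handled as the limit U(x) = x, in which every infection consists of a single infective event.

```python
import math


def pgf_conditional_poisson(x, lam):
    """Eq. (29): generating function of the conditional Poisson distribution."""
    if lam == 0.0:
        return x          # limit lam -> 0: every infection has MOI = 1
    return math.expm1(lam * x) / math.expm1(lam)


def prevalence(p_history, R, lam):
    """Eq. (30): p_history[d] is the allele frequency in generation t-d."""
    return 1.0 - sum(R[d] * pgf_conditional_poisson(1.0 - p_history[d], lam)
                     for d in range(len(R)))


# No relapses, lam = 1: prevalence exceeds the frequency 0.2.
q_no_relapse = prevalence([0.2], [1.0], 1.0)

# Hypothetical relapse case with low lam: 20% of infections stem from an
# earlier generation in which the allele frequency was only 0.05.
q_relapse = prevalence([0.2, 0.05], [0.8, 0.2], 0.1)
print(q_no_relapse, q_relapse)
```

In the second call the prevalence falls below the current frequency of 0.2, because part of the infections reflect the lower frequency of a past generation.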
The case λ = 0 implies that only single infections (one
infective event) occur, in which case prevalence and
frequency coincide. As shown in Schneider (2021), prevalence
always exceeds frequency in the case in which no relapses occur
(Figures 5A,F,K). This is intuitive, because the likelihood of
observing a parasite variant in an infection increases as the
average number of super-infections increases. If transmission
intensities are intermediate to high (λ1), prevalence is
considerably higher than frequency (Figure 5F). If the
frequency of the resistance-conferring allele is small, the
difference between frequency and prevalence is small in
absolute terms, but high in relative terms (compare Figure 5F
with Figure 5K).
If relapses occur, the pattern is similar; however, prevalence
can be lower than frequency (see Figures 5F–J). The reason is that
prevalence is also determined by the frequency distribution of
past generations. This occurs only if the average number of
super-infections is small (λ slightly larger than 0) and is
increasingly pronounced if relapses are more frequent. In
general, the difference between frequency and prevalence
becomes smaller in absolute and relative terms as the fraction
FIGURE 4
Effect of relapses on the evolutionary dynamics. Shown is the frequency of the resistance-conferring allele as a function of time assuming
different proportions, v, of relapses (colors) for the first (A) and second (B) patterns of relapses.
FIGURE 5
Prevalence. Panels (A–E) show the prevalence of the resistance-conferring allele corresponding to the dynamics in Figure 4A for different
values of the Poisson parameter λ (colors). Panels (A–E) correspond to the dynamics with 0%, 5%, 10%, 15%, and 20% relapses, respectively. Panels
(F–J) show the corresponding difference between prevalence and frequency, and panels (K–O) show the corresponding relative difference
(prevalence minus frequency divided by frequency) in percent.
of relapses increases. If this fraction is high (v = 0.15 or v = 0.2), the
particular pattern of relapses leads to oscillations in the relative
difference between prevalence and frequency if the frequency of
the resistance-conferring allele is low (see Figures 5N,O).
4 Discussion
We introduced a general framework to model evolutionary-genetic
processes in malaria, which is flexible enough to capture the
characteristics of all human-pathogenic Plasmodium species. Such a
framework is justified since standard population-genetic theory can
only be approximately applied to malaria. The reason is rooted in the
malaria transmission cycle, which involves one step of sexual
reproduction in the mosquito vectors. A high degree of selfing
occurs during this step, because only parasites which descend
from the same human host can recombine (cf. Figure 3). The
framework extends the one introduced in (Schneider and Kim,
2010; Schneider and Kim, 2011; Schneider, 2021), which is only
applicable to P. falciparum, because it ignores relapses from dormant
liver stages as they occur in P. vivax and P. ovale sp., and
recrudescence from prolonged blood-stage parasites as they occur
in P. malariae. These previously widely neglected species are resilient
because of relapses and recrudescence, and hence are gaining more
importance in the context of malaria eradication. We demonstrated
the importance of relapses/recrudescence by contrasting
drug-resistance evolution in P. vivax and P. falciparum.
The necessity to extend the population-genetic framework
toward other malaria species is clearly justified by the results
presented here. Even in the simplest case of resistance being
determined by a single locus, relapses have a profound effect on
the evolutionary dynamics, when assuming the same hypothetical
drug pressure in both species. Namely, relapses substantially delay
the spread of resistance, because they are equivalent (at least in the
idealization of the model) to infections with regard to past parasite
frequency distributions. In other words, relapses act as seed banks.
Dormancy by seed banks is known in evolutionary biology as a
bet-hedging strategy that allows organisms to survive through
sub-optimal conditions (Shoemaker and Lennon, 2018), in the case of
malaria the absence of the vector. Seed banks are also known to slow
down evolutionary processes and influence recombination (Živković
and Tellier, 2012; Koopmann et al., 2017; Tellier, 2019). Malaria is no
exception. Although exploring the effect of relapses/recrudescence
on recombination was beyond the scope of this work,
the effect is rather obvious. Because relapses/recrudescence slow
down the evolutionary dynamics, more genetic variation is
maintained, leading to a higher level of recombination. In fact,
higher levels of genetic variation in P. vivax than in P. falciparum are a
common empirical observation (e.g., Pacheco et al., 2020).
Our results have to be understood in a qualitative rather than a
quantitative context. Namely, the pattern of relapses has a substantial
influence on the evolutionary dynamics. Hence, to adequately predict
the spread of resistance, good empirical estimates of the pattern of
relapses are necessary. However, empirically distinguishing
re-infections (consecutive independent infections), recrudescence (a
rebound of parasitaemia due to incomplete clearance of
merozoites), and relapses is notoriously difficult. With more
advanced molecular methods becoming available to produce
deep-sequencing data (e.g., Zhong et al., 2018; Gruenberg et al., 2019),
heuristic methods to distinguish recrudescence from reinfections have
been proposed (Lin et al., 2015). Haplotype-based statistical
models have also been proposed (e.g., Plucinski et al., 2015). In principle,
the framework here can be used to further develop statistical methods
to distinguish reinfections from relapses.
To obtain quantitative predictions it is also important to estimate
other model parameters. In the context of drug resistance, this
includes fitness parameters, metabolic costs of resistance, and the
proportion of asymptomatic or untreated infections. The latter can be
estimated by routine diagnostics using reliable methods such as
ultra-sensitive PCR (e.g., Gruenberg et al., 2020). However, the
transmission potential, determined by the abundance of gametocytes,
also has to be determined (cf. 9). Selection parameters of drug-resistant
haplotypes can be determined from longitudinal molecular data by a
linear regression in P. falciparum (McCollum et al., 2012; Schneider,
2021). Disentangling the fitness parameters into metabolic costs and
selective advantages of resistance is more difficult. Namely, costs and
selective advantages as found in in vitro studies (cf. Cortese and Plowe,
1998) do not scale linearly with in vivo observations. In principle,
costs can be estimated by contrasting different populations with
different drug usage. Comparing such results with in vitro studies
helps to identify the functional relationship between in vitro
measurements and in vivo observations. Notably, fitness estimates
from a linear regression apply mainly to P. falciparum. For other
malaria species the estimates have to be adapted to the evolutionary
dynamics which account for relapses/recrudescence.
Note that the application to modelling drug resistance here
had only the purpose of contrasting the absence and presence of
relapses. Therefore, only a simplistic model was assumed for drug
resistance, i.e., resistance was assumed to be determined by a
single biallelic locus. The examples here did not exhibit the full
flexibility of the model. Drug resistance can also occur in a stepwise
fashion, as found in sulfadoxine-pyrimethamine resistant P.
falciparum haplotypes (Cortese and Plowe, 1998), where
resistance is caused by mutations at several codons in the
Pfdhfr and Pfdhps loci. To capture this situation,
resistance-conferring haplotypes have to be modelled by two multiallelic
loci, where each two-locus haplotype is associated with its own
metabolic costs and fitness advantage. Moreover, the mutant
haplotypes have to be introduced into the model at different time
points. A simple example can be found in (Schneider, 2021).
Relapses are irrelevant in P. falciparum, and recrudescences
can be neglected, because they occur shortly after the initial
infection and do not need to be modeled explicitly. Nevertheless,
if transmission intensities are high, which is mainly relevant for
P. falciparum, the assumption of non-overlapping generations
(transmission cycles) is questionable. In the extended
framework, relapses can be reinterpreted to mimic overlapping
generations. This explains, at least partially, why drug resistance
in P. falciparum does not necessarily spread first in areas of high
transmission (as they occur in Africa) with many more
transmission cycles per year.
Reinterpreting relapses in the framework is also important when
applied to P. knowlesi, which is primarily pathogenic to non-human
primates, but became the dominant human-pathogenic malaria
species in some endemic areas (Sutherland, 2016). The zoonotic
animal-host reservoir renders P. knowlesi resilient. Different
transmission dynamics between humans and animal hosts can
mediate the duration of a transmission cycle. If the number of
transmission cycles per year differs among human and non-human
primate hosts, this discrepancy can be compensated by modeling
overlapping generations by relapses.
We also discussed the differences between frequency and prevalence of
parasite haplotypes. The former is the relative abundance of a
haplotype in the parasite population, the latter the likelihood that
the haplotype occurs in an infection. Studying the haplotype
frequency distribution over time is the aim of evolutionary
genetics. From a clinical or epidemiological point of view,
prevalence is more relevant. The latter is determined by the
haplotype frequency distribution and the distribution of super-
or co-infections. This was already emphasized in the context of
seasonal malaria transmission in (Schneider, 2021) for P. falciparum.
It was shown that the prevalence of a haplotype always exceeds its
frequency. This changes if relapses/recrudescence occur and was
exemplified here by the hypothetical dynamics of drug-resistance
evolution.
The applications of the framework introduced here are
manifold. For instance, in the context of drug resistance, the
framework allows investigating the evolution of multi-drug
resistance determined by several loci and changing
drug-treatment policies. Patterns of selection, e.g., genetic
hitchhiking, can also be studied using this framework. The illustrated
applications covered only the simplest assumptions, e.g., no
intra-host competition and super- but no co-infections.
Intra-host competition plays an important role in the spread of
HRP2/3 gene deletions associated with false-negative malaria rapid
diagnostic tests (RDTs) (Gamboa et al., 2010). Namely, if the
treatment guidelines require verifying suspected infections by
RDTs before treatment with artemisinin combination therapies
(ACTs), as recommended by the WHO (World Health
Organization, 2017), false-negative results can lead to delayed
treatment. Similarly, intra-host competition seems relevant when
considering selection on merozoite surface proteins (Goh et al., 2021).
Intra-host dynamics enter the model via the definition of fitness.
It is not necessary to define an evolutionary-genetic model which
captures two timescales, the evolutionary dynamics in terms of
generations of transmission cycles and the timescale of an infectious
episode, in the same model, as was done, e.g., in (Kim et al., 2014).
Rather, the framework can be used in a multi-scale model, which
takes input from a separate intra-host model.
Similarly, the framework does not require modelling the mosquito
dynamics explicitly. Rather, they enter via the distribution of super-
and co-infections. Considering only super-infections has the
conceptual advantage that it is a well-defined model. It is
frequently used in statistical approaches to estimate haplotype
frequency distributions and MOI (cf. e.g., Hill and Babiker, 1995;
Stephens et al., 2001; Li et al., 2007; Hastings and Smith, 2008; Wigger
et al., 2013; Schneider, 2018; Hashemi and Schneider, 2021). Ignoring
co-infections is justified if the distribution of haplotypes in the
mosquitoes is uncorrelated or when considering only few loci.
However, if one aims to include genetic relatedness, it is
important to specify a model for co-infections. This becomes
increasingly popular as more high-quality genomic data are
becoming available in malaria, which have enough resolution to
study genetic relatedness (cf. Nkhoma et al., 2012; Wong et al.,
2018; Zhu et al., 2019; Nkhoma et al., 2020; Dia and Cheeseman,
2021; Neafsey et al., 2021).
Although the framework is very general, it also has several
limitations. First, it ignores mutations. This is not a strong
restriction, because in many applications one is interested in
de novo mutations which occur at discrete time points. This is
captured by the model by introducing new haplotypes
(i.e., extending the model) at certain times. Moreover, constant
mutation rates, e.g., to study mutation-selection balance, can be
easily introduced. Another limitation is the deterministic nature
of the framework. However, when aiming to study stochastic effects such
as genetic drift, it is rather straightforward to develop a stochastic
version of the framework. Third, the model ignores mitotic
recombination during merozoite production inside the host.
This plays an important role in some applications, particularly
in the structural rearrangement of Var genes (Claessens et al.,
2014). These hypervariable genes are responsible for
generating important antigen profiles for parasite-host
interactions (Warimwe et al., 2009). In any case, the
framework introduced here allows studying manifold
evolutionary-genetic aspects of malaria. Importantly, it allows
us to specify benchmark scenarios. More empirical evidence is
required to refine relevant parametrizations of the framework.
Data availability statement
The original contributions presented in the study are
included in the article/Supplementary Material, further
inquiries can be directed to the corresponding author.
Author contributions
KS conceptualized the work, developed the mathematical
model, performed the mathematical analysis, produced all
figures, and wrote the manuscript. CS assisted in
conceptualizing the work and wrote the manuscript.
Funding
This study was funded by the Armed Forces Health Surveillance
Division (AFHSD), Global Emerging Infections Surveillance (GEIS)
Branch, ProMIS ID P0082_22_N6. This work was also supported by
grants of the German Academic Exchange (DAAD;
https://www.daad.de/de/; Project-ID 57417782, Project-ID: 57599539), the
Sächsisches Staatsministerium für Wissenschaft, Kultur und
Tourismus and Sächsische Aufbaubank – Förderbank (SMWK-SAB;
https://www.smwk.sachsen.de/; https://www.sab.sachsen.de/;
project "Innovationsvorhaben zur Profilschärfung an Hochschulen
für angewandte Wissenschaften", Project-ID 100257255; project
"Innovationsvorhaben zur Profilschärfung 2022", Project-ID:
100613388), the Federal Ministry of Education and Research
(BMBF) and the DLR (Project-ID 01DQ20002;
https://www.bmbf.de/; https://www.dlr.de/).
Acknowledgments
The authors gratefully acknowledge the constructive
comments of the two reviewers and the editor.
Conflict of interest
The authors declare that the research was conducted in the
absence of any commercial or financial relationships that could
be construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the
authors and do not necessarily represent those of their affiliated
organizations, or those of the publisher, the editors and the
reviewers. Any product that may be evaluated in this article, or
claim that may be made by its manufacturer, is not guaranteed or
endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found
online at: https://www.frontiersin.org/articles/10.3389/fgene.2022.1030463/full#supplementary-material
References
Baird, J. K., Battle, K. E., and Howes, R. E. (2018). Primaquine ineligibility in anti-relapse
therapy of plasmodium vivax malaria: The problem of g6pd deficiency and cytochrome
p-450 2d6 polymorphisms. Malar. J. 17, 42–46. doi:10.1186/s12936-018-2190-z
Beshir, K. B., Parr, J. B., Cunningham, J., Cheng, Q., and Rogier, E. (2022).
Screening strategies and laboratory assays to support plasmodium falciparum
histidine-rich protein deletion surveillance: Where we are and what is needed.
Malar. J. 21, 201–212. doi:10.1186/s12936-022-04226-2
Bousema, T., Kreuels, B., and Gosling, R. (2011). Adjusting for heterogeneity of malaria
transmission in longitudinal studies. J. Infect. Dis. 204, 1–3. doi:10.1093/infdis/jir225
Chamchod, F., and Beier, J. C. (2013). Modeling plasmodium vivax: Relapses,
treatment, seasonality, and g6pd deficiency. J. Theor. Biol. 316, 25–34.
doi:10.1016/j.jtbi.2012.08.024
Claessens, A., Hamilton, W. L., Kekre, M., Otto, T. D., Faizullabhoy, A., Rayner,
J. C., et al. (2014). Generation of antigenic diversity in plasmodium falciparum by
structured rearrangement of var genes during mitosis. PLoS Genet. 10, e1004812.
doi:10.1371/journal.pgen.1004812
Collins, W. E., and Jeffery, G. M. (2007). Plasmodium malariae: Parasite and
disease. Clin. Microbiol. Rev. 20, 579–592. doi:10.1128/CMR.00027-07
Cortese, J. F., and Plowe, C. V. (1998). Antifolate resistance due to new and
known plasmodium falciparum dihydrofolate reductase mutations expressed
in yeast. Mol. Biochem. Parasitol. 94, 205–214.
doi:10.1016/s0166-6851(98)00075-9
Cui, L., Mharakurwa, S., Ndiaye, D., Rathod, P. K., and Rosenthal, P. J. (2015).
Antimalarial drug resistance: Literature review and activities and findings of the
icemr network. Am. J. Trop. Med. Hyg. 93, 57–68. doi:10.4269/ajtmh.15-0007
Dean, L., and Kane, M. (2020). "Tafenoquine therapy and g6pd genotype," in
Medical genetics summaries. Editors V. M. Pratt, S. A. Scott, M. Pirmohamed,
B. Esquivel, M. S. Kane, and B. L. Kattman (United States: National Center for
Biotechnology Information).
Dia, A., and Cheeseman, I. H. (2021). Single-cell genome sequencing of protozoan
parasites. Trends Parasitol. 37, 803–814. doi:10.1016/j.pt.2021.05.013
Druet, T., and Georges, M. (2010). A hidden markov model combining linkage and
linkage disequilibrium information for haplotype reconstruction and quantitative trait
locus fine mapping. Genetics 184, 789–798. doi:10.1534/genetics.109.108431
Galinsky, K., Valim, C., Salmier, A., de Thoisy, B., Musset, L., Legrand, E., et al.
(2015). Coil: A methodology for evaluating malarial complexity of infection using
likelihood from single nucleotide polymorphism data. Malar. J. 14, 4. doi:10.1186/
1475-2875-14-4
Gamboa, D., Ho, M. F., Bendezu, J., Torres, K., Chiodini, P. L., Barnwell, J. W.,
et al. (2010). A large proportion of p. falciparum isolates in the amazon region of
Peru lack pfhrp2 and pfhrp3: Implications for malaria rapid diagnostic tests. PloS
one 5, e8091. doi:10.1371/journal.pone.0008091
Goh, X. T., Lim, Y. A., Lee, P. C., Nissapatorn, V., and Chua, K. H. (2021).
Diversity and natural selection of merozoite surface protein-1 in three species of
human malaria parasites: Contribution from south-east Asian isolates. Mol.
Biochem. Parasitol. 244, 111390. doi:10.1016/j.molbiopara.2021.111390
Gonzales, S. J., Reyes, R. A., Braddom, A. E., Batugedara, G., Bol, S., and Bunnik,
E. M. (2020). Naturally acquired humoral immunity against plasmodium
falciparum malaria. Front. Immunol. 11, 594653. doi:10.3389/fimmu.2020.594653
Greenwood, B., Cairns, M., Chaponda, M., Chico, R. M., Dicko, A., Ouedraogo, J. B.,
et al. (2021). Combining malaria vaccination with chemoprevention: A promising new
approach to malaria control. Malar. J. 20, 361–367. doi:10.1186/s12936-021-03888-8
Gruenberg, M., Lerch, A., Beck, H. P., and Felger, I. (2019). Amplicon deep
sequencing improves plasmodium falciparum genotyping in clinical trials of
antimalarial drugs. Sci. Rep. 9, 17790. doi:10.1038/s41598-019-54203-0
Gruenberg, M., Moniz, C. A., Hofmann, N. E., Koepfli, C., Robinson, L. J., Nate,
E., et al. (2020). Utility of ultra-sensitive qpcr to detect plasmodium falciparum and
plasmodium vivax infections under different transmission intensities. Malar. J. 19,
319. doi:10.1186/s12936-020-03374-7
Hankey, D. D., Jones, R., Jr, Coatney, G. R., Alving, A. S., Coker, W. G., Garrison,
P. L., et al. (1953). Korean vivax malaria. i. natural history and response to
chloroquine. Am. J. Trop. Med. Hyg. 2, 958–969. doi:10.4269/ajtmh.1953.2.958
Hashemi, M., and Schneider, K. A. (2021). Bias-corrected maximum-likelihood
estimation of multiplicity of infection and lineage frequencies. PloS one 16,
e0261889. doi:10.1371/journal.pone.0261889
Hastings, I. M., and Smith, T. A. (2008). MalHaploFreq: A computer programme
for estimating malaria haplotype frequencies from blood samples. Malar. J. 7, 130.
doi:10.1186/1475-2875-7-130
Hedrick, P. W. (2011). Population genetics of malaria resistance in humans.
Heredity 107, 283–304. doi:10.1038/hdy.2011.16
Hill, W. G., and Babiker, H. A. (1995). Estimation of numbers of malaria clones in
blood samples. Proc. Biol. Sci. 262, 249–257. doi:10.1098/rspb.1995.0203
Ken-Dror, G., and Hastings, I. M. (2016). Markov chain Monte Carlo and expectation
maximization approaches for estimation of haplotype frequencies for multiply infected
human blood samples. Malar. J. 15, 430. doi:10.1186/s12936-016-1473-5
Kim, Y., Escalante, A. A., and Schneider, K. A. (2014). A population genetic model
for the initial spread of partially resistant malaria parasites under anti-malarial
combination therapy and weak intrahost competition. PLOS ONE 9,
e101601–e101615. doi:10.1371/journal.pone.0101601
Koopmann, B., Müller, J., Tellier, A., and Živković, D. (2017). Fisher–Wright
model with deterministic seed bank and selection. Theor. Popul. Biol. 114, 29–39.
doi:10.1016/j.tpb.2016.11.005
Li, X., Foulkes, A. S., Yucel, R. M., and Rich, S. M. (2007). An expectation
maximization approach to estimate malaria haplotype frequencies in multiply
infected children. Stat. Appl. Genet. Mol. Biol. 6, 33. doi:10.2202/1544-6115.1321
Lin, J. T., Hathaway, N. J., Saunders, D. L., Lon, C., Balasubramanian, S., Kharabora, O.,
et al. (2015). Using amplicon deep sequencing to detect genetic signatures of plasmodium
vivax relapse. J. Infect. Dis. 212, 999–1008. doi:10.1093/infdis/jiv142
Lover, A. A., Baird, J. K., Gosling, R., and Price, R. N. (2018). Malaria elimination: Time
to target all species. Am. J. Trop. Med. Hyg. 99, 17–23. doi:10.4269/ajtmh.17-0869
McCollum, A. M., Schneider, K. A., Griffing, S. M., Zhou, Z., Kariuki, S., Ter-Kuile,
F., et al. (2012). Differences in selective pressure on dhps and dhfr drug resistant
mutations in Western Kenya. Malar. J. 11, 77. doi:10.1186/1475-2875-11-77
Ndiaye, Y. D., Hartl, D. L., McGregor, D., Badiane, A., Fall, F. B., Daniels, R. F.,
et al. (2021). Genetic surveillance for monitoring the impact of drug use on
plasmodium falciparum populations. Int. J. Parasitol. Drugs Drug Resist. 17,
12–22. doi:10.1016/j.ijpddr.2021.07.004
Neafsey, D. E., Taylor, A. R., and MacInnis, B. L. (2021). Advances and
opportunities in malaria population genomics. Nat. Rev. Genet. 22, 502–517.
doi:10.1038/s41576-021-00349-5
Ngwa, C. J., Rosa, T., and Pradel, G. (2016). The biology of malaria gametocytes.
Rijeka, Croatia: IntechOpen.
Nicoletti, M. (2020). Three scenarios in insect-borne diseases. Insect-Borne Dis.
21st Century 2020, 99–251. doi:10.1016/b978-0-12-818706-7.00005-x
Nkhoma, S. C., Nair, S., Cheeseman, I. H., Rohr-Allegrini, C., Singlam, S., Nosten,
F., et al. (2012). Close kinship within multiple-genotype malaria parasite infections.
Proc. Biol. Sci. 279, 2589–2598. doi:10.1098/rspb.2012.0113
Nkhoma, S. C., Trevino, S. G., Gorena, K. M., Nair, S., Khoswe, S., Jett, C., et al.
(2020). Co-Transmission of related malaria parasite lineages shapes within-host
parasite diversity. Cell Host Microbe 27, 93–103.e4. doi:10.1016/j.chom.2019.12.001
Pacheco, M. A., Forero-Peña, D. A., Schneider, K. A., Chavero, M., Gamardo, A.,
Figuera, L., et al. (2020). Malaria in Venezuela: Changes in the complexity of
infection reflects the increment in transmission intensity. Malar. J. 19, 176.
doi:10.1186/s12936-020-03247-z
Plucinski, M. M., Morton, L., Bushman, M., Dimbu, P. R., and Udhayakumar, V.
(2015). Robust algorithm for systematic classification of malaria late treatment
failures as recrudescence or reinfection using microsatellite genotyping. Antimicrob.
Agents Chemother. 59, 6096–6100. doi:10.1128/AAC.00072-15
Rastas, P., Koivisto, M., Mannila, H., and Ukkonen, E. (2005). “A hidden markov
technique for haplotype reconstruction,” in Algorithms in bioinformatics. Editors
R. Casadio and G. Myers (Berlin, Heidelberg: Springer), 140–151. Lecture Notes in
Computer Science. doi:10.1007/11557067_12
Ross, A., Koepfli, C., Li, X., Schoepflin, S., Siba, P., Mueller, I., et al. (2012).
Estimating the numbers of malaria infections in blood samples using high-
resolution genotyping data. Plos One 7, e42496. doi:10.1371/journal.pone.0042496
Schneider, K. A. (2021). Charles Darwin meets Ronald Ross: A population-genetic
framework for the evolutionary dynamics of malaria. (Cham: Springer International
Publishing), 149–191. chap. 6. doi:10.1007/978-3-030-50826-5_6
Schneider, K. A., and Escalante, A. A. (2013). Fitness components and natural
selection: Why are there different patterns on the emergence of drug resistance in
plasmodium falciparum and plasmodium vivax? Malar. J. 12, 1511.
doi:10.1186/1475-2875-12-15
Schneider, K. A., and Kim, Y. (2010). An analytical model for genetic hitchhiking
in the evolution of antimalarial drug resistance. Theor. Popul. Biol. 78, 93–108.
doi:10.1016/j.tpb.2010.06.005
Schneider, K. A., and Kim, Y. (2011). Approximations for the hitchhiking effect
caused by the evolution of antimalarial-drug resistance. J. Math. Biol. 62, 789–832.
doi:10.1007/s00285-010-0353-9
Schneider, K. A. (2018). Large and finite sample properties of a maximum-
likelihood estimator for multiplicity of infection. PloS one 13, e0194148.
doi:10.1371/journal.pone.0194148
Schneider, K. A., Tsoungui Obama, H. C. J., Kamanga, G., Kayanula, L., and Adil
Mahmoud Yousif, N. (2022). The many definitions of multiplicity of infection.
Front. Epidemiol. 2, 961593. doi:10.3389/fepid.2022.961593
Selvaraj, P., Wenger, E. A., and Gerardin, J. (2018). Seasonality and heterogeneity
of malaria transmission determine success of interventions in high-endemic
settings: A modeling study. BMC Infect. Dis. 18, 413–414.
doi:10.1186/s12879-018-3319-y
Shoemaker, W. R., and Lennon, J. T. (2018). Evolution with a seed bank: The
population genetic consequences of microbial dormancy. Evol. Appl. 11, 60–75.
doi:10.1111/eva.12557
Stephens, M., Smith, N. J., and Donnelly, P. (2001). A new statistical method for
haplotype reconstruction from population data. Am. J. Hum. Genet. 68, 978–989.
doi:10.1086/319501
Su, Xz, Lane, K. D., Xia, L., Sá, J. M., and Wellems, T. E. (2019). Plasmodium genomics
and genetics: New insights into malaria pathogenesis, drug resistance, epidemiology, and
evolution. Clin. Microbiol. Rev. 32, e00019. doi:10.1128/CMR.00019-19
Sutherland, C. J. (2016). Persistent parasitism: The adaptive biology of malariae
and ovale malaria. Trends Parasitol. 32, 808–819. doi:10.1016/j.pt.2016.07.001
Taylor, A. R., Flegg, J. A., Nsobya, S. L., Yeka, A., Kamya, M. R., Rosenthal, P. J.,
et al. (2014). Estimation of malaria haplotype and genotype frequencies: A statistical
approach to overcome the challenge associated with multiclonal infections. Malar.
J. 13, 102. doi:10.1186/1475-2875-13-102
Tellier, A. (2019). Persistent seed banking as eco-evolutionary determinant of
plant nucleotide diversity: Novel population genetics insights. New Phytol. 221,
725–730. doi:10.1111/nph.15424
Tseha, S. T. (2021). “Plasmodium species and drug resistance,” in Plasmodium
species and drug resistance. Editor R. K. Tyagi (Rijeka: IntechOpen). chap. 2.
doi:10.5772/intechopen.98344
Warimwe, G. M., Keane, T. M., Fegan, G., Musyoki, J. N., Newton, C. R., Pain, A., et al.
(2009). Plasmodium falciparum var gene expression is modified by host immunity. Proc.
Natl. Acad. Sci. U. S. A. 106, 21801–21806. doi:10.1073/pnas.0907590106
Watson, J. A., Nekkab, N., and White, M. (2021). Tafenoquine for the prevention
of plasmodium vivax malaria relapse. Lancet. Microbe 2, e175–e176.
doi:10.1016/S2666-5247(21)00062-8
White, N. J. (2011). Determinants of relapse periodicity in plasmodium vivax
malaria. Malar. J. 10, 297. doi:10.1186/1475-2875-10-297
WHO (2021). Global technical strategy for malaria 2016–2030. Geneva,
Switzerland: World Health Organization.
WHO (2021). World malaria report 2020: 20 years of global progress and
challenges. Available at: www.who.int/teams/global-malaria-programme/reports/
world-malaria-report-2020.
Wigger, L., Vogt, J. E., and Roth, V. (2013). Malaria haplotype frequency
estimation. Stat. Med. 32, 3737–3751. doi:10.1002/sim.5792
Wong, W., Wenger, E. A., Hartl, D. L., and Wirth, D. F. (2018). Modeling the genetic
relatedness of Plasmodium falciparum parasites following meiotic recombination and
cotransmission. PLoS Comput. Biol. 14, e1005923. doi:10.1371/journal.pcbi.1005923
World Health Organization (2017). A framework for malaria elimination.
Zhong, D., Lo, E., Wang, X., Yewhalaw, D., Zhou, G., Atieli, H. E., et al. (2018). Multiplicity
and molecular epidemiology of plasmodium vivax and plasmodium falciparum infections
in east Africa. Malar. J. 17, 185. doi:10.1186/s12936-018-2337-y
Zhu, S. J., Hendry, J. A., Almagro-Garcia, J., Pearson, R. D., Amato, R., Miles,
A., et al. (2019). The origins and relatedness structure of mixed infections vary
with local prevalence of P. falciparum malaria. eLife 8, e40845.
doi:10.7554/eLife.40845
Živković, D., and Tellier, A. (2012). Germ banks affect the inference of past
demographic events. Mol. Ecol. 21, 5434–5446. doi:10.1111/mec.12039
... Namely, in this context, MOI or complexity of infection (COI) is established as a fundamental metric that scales with transmission intensity. Here, MOI is referred to as the total number of super-infections due to multiple infective contacts during one disease episode, following the definition of [2] (see Fig 1). Note, that the same pathogen 'lineage' might super-infect a host several times, which is accounted for by this definition. ...
... Several methods to estimate the distribution of MOI and frequency distribution of parasite lineages (e.g. allele or haplotype frequencies) from molecular data exist (see the introductions in [2,3] for an overview). In the case of malaria, molecular data is usually generated from disease-positive blood samples and contains allelic information at molecular markers. ...
... Thus, the sample size of $X^*_b$ is $N^{(b)}_{+}$. For each dataset $X_b$, the maximum-likelihood estimate (MLE), $\hat{\theta}_b$, based on the extended model (2), is calculated (see Deriving the maximum-likelihood estimate). Furthermore, from the MLE $\hat{\theta}_b$ the average MOI $\hat{c}_b$ is calculated according to (2). ...
Background Molecular surveillance of infectious diseases allows the monitoring of pathogens beyond the granularity of traditional epidemiological approaches and is well-established for some of the most relevant infectious diseases such as malaria. The presence of genetically distinct pathogenic variants within an infection, referred to as multiplicity of infection (MOI) or complexity of infection (COI) is common in malaria and similar infectious diseases. It is an important metric that scales with transmission intensities, potentially affects the clinical pathogenesis, and a confounding factor when monitoring the frequency and prevalence of pathogenic variants. Several statistical methods exist to estimate MOI and the frequency distribution of pathogen variants. However, a common problem is the quality of the underlying molecular data. If molecular assays fail not randomly, it is likely to underestimate MOI and the prevalence of pathogen variants. Methods and findings A statistical model is introduced, which explicitly addresses data quality, by assuming a probability by which a pathogen variant remains undetected in a molecular assay. This is different from the assumption of missing at random, for which a molecular assay either performs perfectly or fails completely. The method is applicable to a single molecular marker and allows to estimate allele-frequency spectra, the distribution of MOI, and the probability of variants to remain undetected (incomplete information). Based on the statistical model, expressions for the prevalence of pathogen variants are derived and differences between frequency and prevalence are discussed. The usual desirable asymptotic properties of the maximum-likelihood estimator (MLE) are established by rewriting the model into an exponential family. The MLE has promising finite sample properties in terms of bias and variance. The covariance matrix of the estimator is close to the Cramér-Rao lower bound (inverse Fisher information). 
Importantly, the estimator’s variance is larger than that of a similar method which disregards incomplete information, but its bias is smaller. Conclusions Although the model introduced here has convenient properties, in terms of the mean squared error it does not outperform a simple standard method that neglects missing information. Thus, the new method is recommendable only for data sets in which the molecular assays produced poor-quality results. This will be particularly true if the model is extended to accommodate information from multiple molecular markers at the same time, and incomplete information at one or more markers leads to a strong depletion of sample size.
... The global COVID-19 pandemic underlined the importance of molecular surveillance in disease control and prevention, as reflected by the recently released WHO Global genomic surveillance strategy [1]. Molecular surveillance is well-established in some of the most relevant infectious diseases in terms of incidence, mortality, and economic burden. ...
... The BIMEP aims to eliminate malaria from Bioko Island in the near future, and the continuous surveillance of circulating parasites is essential for prevention of parasite resurgence from low density reservoirs or strains circumventing RDT diagnosis by lacking functional expression of hrp2 or hrp3 [5,12,33,34]. Molecular monitoring techniques can help identify those infections missed by RDTs, such as qPCR, which is typically a specialized, resource-rich laboratory technique. ...
Background Effective malaria control requires accurate identification of Plasmodium infections to tailor interventions appropriately. Rapid diagnostic tests (RDTs) are crucial tools for this purpose due to their small size and ease-of-use functionality. These tests typically target the Plasmodium falciparum histidine-rich protein 2 (HRP2) antigen. However, some strains of P. falciparum have deletions in the hrp2 and hrp3 genes, which may result in a false negative diagnosis using HRP2-based RDTs. Additionally, RDTs have a detection limit of 100 parasites per microlitre, insufficient for identifying low-density infections that sustain malaria transmission. This study explores integrating molecular monitoring using a novel cartridge-based PCR test, PlasmoPod, using samples from a malaria indicator survey (MIS) on Bioko Island, Equatorial Guinea to enhance detection of low-density infections and inform targeted malaria control strategies. Methods The study utilized a combination of RDTs and the DiaxxoPCR device for molecular monitoring. The device DiaxxoPCR uses a prefilled cartridge system, termed PlasmoPod for a malaria-based assay that employs a qPCR assay targeting 18S rDNA/rRNA. Samples from the 2023 MIS were extracted from dried blood spots (DBS), qPCR run in duplicate on the PlasmoPod. Epidemiological data from the MIS were merged with molecular data and the association between MIS variables to malaria infection by qPCR, and low-density infections were measured. Results The integration of molecular monitoring revealed a proportion of low-density infections that circumvented RDTs diagnosis. Notably, individuals in urban communities and those reporting recent fever were more likely to harbour low-density, asymptomatic malaria infections. Findings suggest that urban residents, although less associated to malaria infection than rural residents by both RDT and qPCR, may be serving as a transmission reservoir. 
The relationship between low-density infections and individuals who recently reported fever may reflect recent anti-malarial treatment or natural clearance, and thus have lingering parasites in their blood. Conclusion The study highlights the limitations of HRP2-based RDTs in detecting low density infections and underscores the potential of molecular tools like PlasmoPod in malaria surveillance. By identifying elusive transmission reservoirs and tracking parasite importation, molecular monitoring can play a crucial role in achieving malaria elimination. The findings advocate for the broader implementation of molecular diagnostics in malaria programs, especially in areas with low transmission, to enhance the detection and targeting of hidden reservoirs of infection.
... When selective pressure is highest (i.e. low transmission settings), these parasites may have an increased evolutionary advantage due to the increased difficulty to detect infections in the general population, leading to their continued contribution to transmission [33,34]. In the event of any interruption in malaria control efforts before complete elimination, these reservoirs can act as a source to undo progress, and result in a resurgence to pre-control levels in malaria burden [35][36][37]. ...
Background: Effective malaria control requires accurate identification of Plasmodium infections to tailor interventions appropriately. Rapid diagnostic tests (RDTs) are crucial tools for this purpose due to their small size and ease-of-use functionality. These tests typically target the Plasmodium falciparum histidine-rich protein 2 (HRP2) antigen. However, some strains of P. falciparum have deletions in the hrp2 and hrp3 genes, which may result in a false negative diagnosis using HRP2-based RDTs. Additionally, RDTs have a detection limit of less than 100 parasites per microliter, insufficient for identifying low density infections that sustain malaria transmission. This study explores integrating molecular monitoring using a novel cartridge-based PCR test, PlasmoPod, using samples from a malaria indicator surveys (MIS) on Bioko Island, Equatorial Guinea to enhance detection of low density infections and inform targeted malaria control strategies. Methods: The study utilized a combination of RDTs and the DiaxxoPCR device for molecular monitoring. The PlasmoPod employs qPCR targeting 18S rDNA/rRNA, capable of detecting low parasite density infections and is significantly more sensitive than HRP2-based RDTs. Samples from the 2023 MIS were extracted from dried blood spots (DBS), qPCR run in duplicate on the PlasmoPod. Epidemiological data from the MIS were merged with molecular data and the association between various risk factors to malaria infection by qPCR, and risk factors to low density infections were measured. Results: The integration of molecular monitoring revealed a proportion of low density infections that circumvented RDTs diagnosis. Notably, individuals in urban communities and those reporting recent fever were more likely to harbor low density, asymptomatic malaria infections. Findings suggest that urban residents, although less associated to malaria infection than rural residents, may be serving as a transmission reservoir. 
The relationship between low density infections and individuals who recently reported fever may reflect recent antimalarial treatment or natural clearance, and thus have lingering parasites in their blood. Conclusion: The study highlights the limitations of HRP2-based RDTs in detecting low density infections and underscores the potential of molecular tools like PlasmoPod in malaria surveillance. By identifying elusive transmission reservoirs and tracking parasite importation, molecular monitoring can play a crucial role in achieving malaria elimination. The findings advocate for the broader implementation of molecular diagnostics in malaria programs, especially in areas with low transmission, to enhance the detection and targeting of hidden reservoirs of infection.
... However, so far, empirical evidence on the impact of MOI on the clinical pathogenesis of malaria remains inconclusive (Pacheco et al., 2016). Nevertheless, the distribution of MOI influences the evolutionary dynamics of malaria (Schneider, 2021; Schneider and Salas, 2022) and mediates evolutionary-genetic patterns and pathogen genetic diversity (e.g., patterns of genetic hitchhiking and linkage disequilibrium) as it affects the effective rate of recombination as illustrated in Figure 1 (Schneider and Kim, 2010; Alizon et al., 2013). Moreover, there is an important link between the frequency distribution of pathogen variants, their occurrence within infections (i.e., prevalence), and MOI. ...
Introduction The presence of multiple genetically distinct variants (lineages) within an infection (multiplicity of infection, MOI) is common in infectious diseases such as malaria. MOI is considered an epidemiologically and clinically relevant quantity that scales with transmission intensity and potentially impacts the clinical pathogenesis of the disease. Several statistical methods to estimate MOI assume that the number of infectious events per person follows a Poisson distribution. However, this has been criticized since empirical evidence suggests that the number of mosquito bites per person is over-dispersed compared to the Poisson distribution. Methods We introduce a statistical model that does not assume that MOI follows a parametric distribution, i.e., the most flexible possible approach. The method is designed to estimate the distribution of MOI and allele frequency distributions from a single molecular marker. We derive the likelihood function and propose a maximum likelihood approach to estimate the desired parameters. The expectation maximization algorithm (EM algorithm) is used to numerically calculate the maximum likelihood estimate. Results By numerical simulations, we evaluate the performance of the proposed method in comparison to an established method that assumes a Poisson distribution for MOI. Our results suggest that the Poisson model performs sufficiently well if MOI is not highly over-dispersed. Hence, any model extension will not greatly improve the estimation of MOI. However, if MOI is highly over-dispersed, the method is less biased. We exemplify the method by analyzing three empirical evidence in P. falciparum data sets from drug resistance studies in Venezuela, Cameroon, and Kenya. Based on the allele frequency estimates, we estimate the heterozygosity and the average MOI for the respective microsatellite markers. 
Discussion In conclusion, the proposed non-parametric method to estimate the distribution of MOI is appropriate when the transmission intensities in the population are heterogeneous, yielding an over-dispersed distribution. If MOI is not highly over-dispersed, the Poisson model is sufficiently accurate and cannot be improved by other methods. The EM algorithm provides a numerically stable method to derive MOI estimates and is made available as an R script.
... Additionally, estimate the frequency/prevalence of resistance, e.g., [19,39], but also to explain and reconstruct the history of drug-resistance evolution [20]. Importantly, the population genetics of Plasmodium differs from standard population genetics, due to the organism's specific transmission cycle [40][41][42]. More precisely, the processes of selection and recombination cannot be decoupled [43]. ...
Molecular/genetic methods are becoming increasingly important for surveillance of diseases like malaria. Such methods allow to monitor routes of disease transmission or the origin and spread of variants associated with drug resistance. A confounding factor in molecular disease surveillance is the presence of multiple distinct variants in the same infection (multiplicity of infection – MOI), which leads to ambiguity when reconstructing which pathogenic variants are present in an infection. Heuristic approaches often ignore ambiguous infections, which leads to biased results. To avoid such bias, we introduce a statistical framework to estimate haplotype frequencies alongside MOI from a pair of multi-allelic molecular markers. Estimates are based on maximum-likelihood using the expectation-maximization (EM)-algorithm. The estimates can be used as plug-ins to construct pairwise linkage disequilibrium (LD) maps. The finite-sample properties of the proposed method are studied by systematic numerical simulations. These reveal that the EM-algorithm is a numerically stable method in our case and that the proposed method is accurate (little bias) and precise (small variance) for a reasonable sample size. In fact, the results suggest that the estimator is asymptotically unbiased. Furthermore, the method is appropriate to estimate LD (by D′, r ² , Q * , or conditional asymmetric LD). Furthermore, as an illustration, we apply the new method to a previously-published dataset from Cameroon concerning sulfadoxine-pyrimethamine (SP) resistance. The results are in accordance with the SP drug pressure at the time and the observed spread of resistance in the country, yielding further evidence for the adequacy of the proposed method. The method is particularly useful for deriving LD maps from data with many ambiguous observations due to MOI. Importantly, the method per se is not restricted to malaria, but applicable to any disease with a similar transmission pattern. 
The method and several extensions are implemented in an easy-to-use R script. Author summary Advances in genetics render molecular disease surveillance increasingly popular. Unlike traditional incidence-based epidemiological data, genetic information provides fine-grained resolution, which allows monitoring and reconstructing routes of transmission, the spread of drug resistance, etc. Molecular surveillance is particularly popular in highly relevant diseases such as malaria. The presence of multiple distinct pathogenic variants within one infection, i.e., multiplicity of infection (MOI), is a confounding factor hampering the analysis of molecular data in the context of disease surveillance. Namely, due to MOI ambiguity concerning the pathogenic variants being present in mixed-clone infections arise. These are often disregarded by heuristic approaches to molecular disease surveillance and lead to biased results. To avoid such bias we introduce a method to estimate the distribution of MOI and frequencies of pathogenic variants based on a concise probabilistic model. The method is designed for two multi-allelic genetic markers, which is the appropriate genetic architecture to derive pairwise linkage-disequilibrium maps, which are informative on population structure or evolutionary processes, such as the spread of drug resistance. We validate the appropriateness of our method by numerical simulations and apply it to a malaria dataset from Cameroon, concerning sulfadoxine-pyrimethamine resistance, the drug used for intermittent preventive treatment during pregnancy.
Article
The presence of multiple genetically different pathogenic variants within the same individual host is common in infectious diseases. Although this is neglected in some diseases, it is well recognized in others like malaria, where it is typically referred to as multiplicity of infection (MOI) or complexity of infection (COI). In malaria, with the advent of molecular surveillance, data are increasingly available with enough resolution to capture MOI and integrate it into molecular surveillance strategies. The distribution of MOI on the population level scales with transmission intensity, while MOI on the individual level is a confounding factor when monitoring haplotypes of particular interest, e.g., those associated with drug resistance. Particularly in high-transmission areas, MOI leads to a discrepancy between the likelihood of a haplotype being observed in an infection (prevalence) and its abundance in the pathogen population (frequency). Despite its importance, MOI is not universally defined. Competing definitions vary from verbal ones to those based on concise statistical frameworks. Heuristic approaches to MOI are popular, although they do not exploit the full potential of available data and are typically biased, potentially leading to erroneous inferences. We introduce a formal statistical framework and suggest a concise definition of MOI and its distribution on the host-population level. We show how it relates to alternative definitions such as the number of distinct haplotypes within an infection or the maximum number of alleles detectable across a set of genetic markers, and how these alternatives can be derived from the general framework. Different statistical methods to estimate the distribution of MOI and pathogenic variants at the population level are discussed. The estimates can be used as plug-ins to reconstruct the most probable MOI and the set of infecting haplotypes in individual infections.
Furthermore, the relation between the prevalence of pathogenic variants and their frequency (relative abundance) in the pathogen population in the context of MOI is clarified, with particular regard to seasonality in transmission intensities. The framework introduced here helps to guide the correct interpretation of results emerging from different definitions of MOI. In particular, it facilitates comparisons between studies based on different analytical methods.
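The discrepancy between prevalence and frequency can be made concrete with a small numerical sketch. The abstract does not specify a distribution for MOI. The code below assumes, purely for illustration, that MOI follows a positive (zero-truncated) Poisson distribution with parameter `lam` and that each infecting haplotype is drawn independently from the pathogen population. Under these assumptions, a haplotype with frequency p is absent from an infection with probability (e^{λ(1−p)} − 1)/(e^{λ} − 1). The function names are hypothetical.

```python
import math


def prevalence(freq, lam):
    """Probability that a haplotype of frequency `freq` occurs in an infection.

    Assumes MOI ~ positive (zero-truncated) Poisson(lam) and independent
    sampling of haplotypes per infective event (illustrative assumptions).
    """
    # P(absent) = (exp(lam * (1 - freq)) - 1) / (exp(lam) - 1)
    return 1.0 - math.expm1(lam * (1.0 - freq)) / math.expm1(lam)


def mean_moi(lam):
    """Mean of the positive Poisson distribution with parameter lam."""
    return lam / -math.expm1(-lam)


# Prevalence exceeds frequency more strongly as transmission (lam) increases.
for lam in (0.1, 1.0, 3.0):
    print(f"lam={lam}: mean MOI={mean_moi(lam):.2f}, "
          f"prevalence of a 20% haplotype={prevalence(0.2, lam):.3f}")
```

In this sketch, prevalence approaches frequency as `lam` tends to zero (almost all infections are single-clone) and exceeds it increasingly as `lam` grows, which mirrors the statement that the discrepancy is most pronounced in high-transmission areas.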
Article
Rapid diagnostic tests (RDTs) detecting Plasmodium falciparum histidine-rich protein 2 (HRP2) have been an important tool for malaria diagnosis, especially in resource-limited settings lacking quality microscopy. Plasmodium falciparum parasites with deletion of the pfhrp2 gene encoding this antigen have now been identified in dozens of countries across Asia, Africa, and South America, with new reports revealing a high prevalence of deletions in some selected regions. To determine whether HRP2-based RDTs are appropriate for continued use in a locality, focused surveys and/or surveillance activities of the endemic P. falciparum population are needed. Various survey and laboratory methods have been used to determine parasite HRP2 phenotype and pfhrp2 genotype, and the data collected by these different methods need to be interpreted in the context of the survey and assay used. Expression of the HRP2 antigen can be evaluated using point-of-care RDTs or laboratory-based immunoassays, but confirmation of a deletion (or mutation) of pfhrp2 requires more intensive laboratory molecular assays, and new tools and strategies for rigorous but practical data collection are particularly needed for large surveys. Because malaria diagnostic strategies are typically developed at the national level, nationally representative surveys and/or surveillance that encompass broad geographical areas and large populations may be required. Here, we discuss contemporary assays for the phenotypic and genotypic evaluation of P. falciparum HRP2 status, consider their strengths and weaknesses, and highlight key concepts relevant to the timely and resource-conscious workflows required for efficient diagnostic policy decision making.
Article
Background: The UN’s Sustainable Development Goals are devoted to eradicating a range of infectious diseases to achieve global well-being. These efforts require monitoring disease transmission at a level that differentiates between pathogen variants at the genetic/molecular level. In fact, the advantages of genetic (molecular) measures like multiplicity of infection (MOI) over traditional metrics, e.g., R0, are being increasingly recognized. MOI refers to the presence of multiple pathogen variants within an infection due to multiple infective contacts. Maximum-likelihood (ML) methods have been proposed to derive MOI and pathogen-lineage frequencies from molecular data. However, these methods are biased. Methods and findings: Based on a single molecular marker, we derive a bias-corrected ML estimator for MOI and pathogen-lineage frequencies. We further improve these estimators by heuristic adjustments that compensate for shortcomings in the derivation of the bias correction, which implicitly assumes that the data lie in the interior of the observational space. The finite-sample properties of the different variants of the bias-corrected estimators are investigated by a systematic simulation study. In particular, we investigate the performance of the estimators in terms of bias, variance, and robustness against model violations. The corrections successfully remove bias except for extreme parameters that likely yield uninformative data, which cannot sustain accurate parameter estimation. Heuristic adjustments further improve the bias correction, particularly for small sample sizes. The bias corrections also reduce the estimators’ variances, which coincide with the Cramér-Rao lower bound. The estimators are reasonably robust against model violations. Conclusions: Applying bias corrections can substantially improve the quality of MOI estimates, particularly in areas of low as well as areas of high transmission; in both cases, uncorrected estimates tend to be biased.
The bias-corrected estimators are (almost) unbiased and their variance coincides with the Cramér-Rao lower bound, suggesting that no further improvements are possible unless additional information is provided. Additional information can be obtained by combining data from several molecular markers, or by including information that allows stratifying the data into heterogeneous groups.
Article
Malaria control has stalled in a number of African countries, and novel approaches to malaria control are needed for these areas. A recent trial conducted in young children in Burkina Faso and Mali found that a combination of the RTS,S/AS01E malaria vaccine and seasonal malaria chemoprevention led to a substantial reduction in clinical cases of malaria, severe malaria, and malaria deaths compared with either intervention given alone. These encouraging results suggest that there may be other epidemiological/clinical situations in which a combination of malaria vaccination and chemoprevention could be beneficial. Some of these potential opportunities are considered in this paper. These include combining vaccination with intermittent preventive treatment of malaria in infants, with intermittent preventive treatment of malaria in pregnancy (through vaccination of women of child-bearing age before or during pregnancy), or with post-discharge malaria chemoprevention in the management of children recently admitted to hospital with severe anaemia. Other potential uses of the combination are prevention of malaria in children at particular risk from the adverse effects of clinical malaria, such as those with sickle cell disease, and during the final stages of a malaria elimination programme when vaccination could be combined with repeated rounds of mass drug administration. The combination of a pre-erythrocytic stage malaria vaccine with an effective chemopreventive regimen could make a valuable contribution to malaria control and elimination in a variety of clinical or epidemiological situations, and the potential of this approach to malaria control needs to be explored.
Article
The use of antimalarial drugs is an effective strategy in the fight against malaria. However, selection of drug resistant parasites is a constant threat to the continued use of this approach. Antimalarial drugs are used not only to treat infections but also as part of population-level strategies to reduce malaria transmission toward elimination. While there is strong evidence that the ongoing use of antimalarial drugs increases the risk of the emergence and spread of drug-resistant parasites, it is less clear how population-level use of drug-based interventions like seasonal malaria chemoprevention (SMC) or mass drug administration (MDA) may contribute to drug resistance or loss of drug efficacy. Critical to sustained use of drug-based strategies for reducing the burden of malaria is the surveillance of population-level signals related to transmission reduction and resistance selection. Here we focus on Plasmodium falciparum and discuss the genetic signatures of a parasite population that are correlated with changes in transmission and related to drug pressure and resistance as a result of drug use. We review the evidence for MDA and SMC contributing to malaria burden reduction and drug resistance selection and examine the use and impact of these interventions in Senegal. Throughout we consider best strategies for ongoing surveillance of both population and resistance signals in the context of different parasite population parameters. Finally, we propose a roadmap for ongoing surveillance during population-level drug-based interventions to reduce the global malaria burden.
Chapter
Malaria is a leading public health problem in tropical and subtropical countries of the world. In 2019, there were an estimated 229 million malaria cases and 409,000 deaths due to malaria worldwide. The objective of this chapter is to discuss the different Plasmodium parasites that cause human malaria. In addition, the chapter discusses antimalarial drug resistance. Human malaria is caused by five Plasmodium species, namely P. falciparum, P. malariae, P. vivax, P. ovale and P. knowlesi. In addition to these parasites, malaria in humans may also arise from zoonotic malaria parasites, which include P. inui and P. cynomolgi. The Plasmodium life cycle involves a vertebrate host and a mosquito vector. The malaria parasites differ in their epidemiology, virulence and drug resistance pattern. P. falciparum is the deadliest of the parasites that cause human malaria and accounted for nearly all malarial deaths in 2018. One of the major challenges to controlling malaria is the emergence and spread of antimalarial drug-resistant Plasmodium parasites. P. vivax and P. falciparum have already developed resistance against conventional antimalarial drugs such as chloroquine, sulfadoxine-pyrimethamine, and atovaquone. Chloroquine resistance is connected with mutations in pfcrt. Resistance to sulfadoxine and pyrimethamine is associated with multiple mutations in the pfdhps and pfdhfr genes. In response to the evolution of drug-resistant Plasmodium parasites, artemisinin-based combination therapies (ACTs) have been used for the treatment of uncomplicated falciparum malaria since the beginning of the 21st century. However, artemisinin-resistant P. falciparum strains have recently been observed in different parts of the world, which indicates that artemisinin resistance could spread worldwide. Therefore, novel antimalarial drugs have to be sought to replace the ACTs if Plasmodium parasites develop resistance to ACTs in the future.
Article
Despite considerable genetic variation within hosts, most parasite genome sequencing studies focus on bulk samples composed of millions of cells. Analysis of bulk samples is biased toward the dominant genotype, concealing cell-to-cell variation and rare variants. To tackle this, single-cell sequencing approaches have been developed and tailored to specific host–parasite systems. These are allowing the genetic diversity and kinship in complex parasite populations to be deciphered and for de novo genetic variation to be captured. Here, we outline the methodologies being used for single-cell sequencing of parasitic protozoans, such as Plasmodium and Leishmania spp., and how these tools are being applied to understand parasite biology.
Article
The present study aimed to examine the genetic diversity of human malaria parasites (i.e., P. falciparum, P. vivax and P. knowlesi) in Malaysia and southern Thailand, targeting the 19-kDa C-terminal region of Merozoite Surface Protein-1 (MSP-1₁₉). This region is essential for the recognition and invasion of erythrocytes, and it is considered one of the leading candidates for asexual blood-stage vaccines. However, genetic data on MSP-1₁₉ among human malaria parasites in Malaysia are limited, and there is also a need to update the current sequence diversity of this gene region among the Thailand isolates. In this study, genomic DNA was extracted from 384 microscopy-positive blood samples collected from patients who attended hospitals or clinics in Malaysia and malaria clinics in Thailand from 2008 to 2016. MSP-1₁₉ was amplified using PCR followed by bidirectional sequencing. DNA sequences identified in the present study were subjected to median-joining network analysis together with MSP-1₁₉ sequences obtained from GenBank. DNA sequence analysis revealed that PfMSP-1₁₉ of Malaysian and Thailand isolates was not genetically conserved: a high number of haplotypes was detected and positive selection was prevalent in PfMSP-1₁₉, which calls into question its suitability as a vaccine candidate. A novel haplotype (Q/TNG/L) was also detected in a P. falciparum isolate from Thailand. In contrast, PvMSP-1₁₉ was highly conserved; however, a non-synonymous substitution (A1657S) was reported for the first time among Malaysian isolates. As for PkMSP-1₁₉, the presence of purifying selection and low nucleotide diversity indicated that it might be a potential vaccine target for P. knowlesi.
Article
Almost 20 years have passed since the first reference genome assemblies were published for Plasmodium falciparum, the deadliest malaria parasite, and Anopheles gambiae, the most important mosquito vector of malaria in sub-Saharan Africa. Reference genomes now exist for all human malaria parasites and nearly half of the ~40 important vectors around the world. As a foundation for genetic diversity studies, these reference genomes have helped advance our understanding of basic disease biology and drug and insecticide resistance, and have informed vaccine development efforts. Population genomic data are increasingly being used to guide our understanding of malaria epidemiology, for example by assessing connectivity between populations and the efficacy of parasite and vector interventions. The potential value of these applications to malaria control strategies, together with the increasing diversity of genomic data types and contexts in which data are being generated, raise both opportunities and challenges in the field. This Review discusses advances in malaria genomics and explores how population genomic data could be harnessed to further support global disease control efforts. In this Review, Neafsey, Taylor and MacInnis discuss how population genomics approaches are currently used to study malaria parasites and mosquito vectors. They explore information that can be derived from such genomics approaches and discuss the use of relatedness-based measures of population variation to understand parasite and vector dynamics at highly resolved spatiotemporal scales.